DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is responsive to the original application filed on 10/24/2018.  Currently, claims 1-20 are pending.  

Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-20 are non-provisionally rejected on the ground of nonstatutory double patenting as being anticipated by claims 1-20 of Issued Patent No. 10,558,913 (reference patent 1) and claims 1-18 and 20 of Issued Patent No. 11,010,669 (reference patent 2). Although the claims at issue are not identical, they are not patentably distinct from each other because the claims of the reference patents anticipate the claims of the instant application.
The table below shows a claim comparison among the instant application and the two reference patents.

16/169,963 (current app)
Patent No. 10,558,913
Patent No. 11,010,669
(Claim 1)
A method that includes one or more processing devices performing operations comprising: training a neural network model for computing a risk indicator from predictor variables, wherein the neural network model is a memory structure comprising nodes connected via one or more layers, wherein training the neural network model to generate a trained neural network model comprises: accessing training vectors having elements representing training predictor variables and training outputs, wherein a particular training vector comprises (i) particular values for the predictor variables, respectively, and (ii) a particular training output corresponding to the particular values, and performing iterative adjustments of parameters of the neural network model to minimize a loss function of the neural network model subject to a path constraint, the path constraint requiring a monotonic relationship between (i) values of each predictor variable from the training vectors and (ii) the training outputs of the training vectors, wherein one or more of the iterative adjustments comprises adjusting the parameters of the neural network model so that a value of a modified loss function in a current iteration is smaller than the value of the modified loss function in another iteration, and wherein the modified loss function comprises the loss function of the neural network model and the path constraint; receiving, from a remote computing device, a risk assessment query for a target entity; and computing, responsive to the risk assessment query, an output risk indicator for the target entity by applying the trained neural network model to predictor variables associated with the target entity
(Claim 1)
A method that includes one or more processing devices performing operations comprising: training a neural network model for computing a risk indicator from predictor variables, wherein the neural network model is a memory structure comprising nodes connected via one or more layers, wherein training the neural network model to generate a trained neural network model comprises: accessing training vectors having elements representing training predictor variables and training outputs, wherein a particular training vector comprises (i) particular values for the predictor variables, respectively, and (ii) a particular training output corresponding to the particular values, and performing iterative adjustments of parameters of the neural network model to minimize a loss function of the neural network model subject to a path constraint, the path constraint requiring a monotonic relationship between (i) values of each predictor variable from the training vectors and (ii) the training outputs of the training vectors, wherein one or more of the iterative adjustments comprises adjusting the parameters of the neural network model so that a value of a modified loss function in a current iteration is smaller than the value of the modified loss function in another iteration, and wherein the modified loss function comprises the loss function of the neural network model and the path constraint; receiving, from a remote computing device, a risk assessment query for a target entity; computing, responsive to the risk assessment query, an output risk indicator for the target entity by applying the trained neural network model to predictor variables associated with the target entity; and transmitting, to the remote computing device, a responsive message including the output risk indicator, wherein the output risk indicator is usable for controlling access to one or more interactive computing environments by the target entity
(Claim 16)
A method that includes one or more processing devices performing operations comprising: training a neural network model for computing a risk indicator from predictor variables, wherein the neural network model is a memory structure comprising nodes connected via one or more layers, wherein training the neural network model to generate a trained neural network model comprises: accessing training vectors having elements representing training predictor variables and training outputs, wherein a particular training vector comprises (i) particular values for the predictor variables, respectively, and (ii) a particular training output corresponding to the particular values, and performing iterative adjustments of parameters of the neural network model to minimize a modified loss function comprising a loss function of the neural network model and a Lagrangian expression approximating a path constraint, the Lagrangian expression comprising a hyperparameter that at least adjusts enforcing the path constraint and minimizing the modified loss function, and wherein the path constraint requires a monotonic relationship between (i) values of each predictor variable from the training vectors and (ii) the training outputs of the training vectors, wherein one or more of the iterative adjustments comprises adjusting the parameters of the neural network model so that a value of the modified loss function in a current iteration is smaller than the value of the modified loss function in another iteration; and causing the trained neural network model to be applied to predictor variables associated with a target entity to generate an output risk indicator for the target entity
(Claim 2)
The method of claim 1, wherein the neural network model comprises at least an input layer, one or more hidden layers, and an output layer, and wherein the parameters for the neural network model comprise weights of connections among the input layer, the one or more hidden layers, and the output layer.
(Claim 2)
The method of claim 1, wherein the neural network model comprises at least an input layer, one or more hidden layers, and an output layer, and wherein the parameters for the neural network model comprise weights of connections among the input layer, the one or more hidden layers, and the output layer
(Claim 20)
The method of claim 16, wherein the neural network model comprises at least an input layer, one or more hidden layers, and an output layer, wherein the parameters for the neural network model comprise weights applied to the nodes of the input layer, the one or more hidden layers, and the output layer, and wherein the path constraint comprises, for each path comprising a respective set of nodes across the layers of the neural network model from the input layer to the output layer, a positive product of the respective weights applied to the respective set of nodes in the path
(Claim 3)
The method of claim 2, wherein the path constraint comprises, for each path comprising a respective set of nodes across the layers of the neural network model from the input layer to the output layer, a positive product of the respective weights applied to the respective set of nodes in the path
(Claim 3)
The method of claim 2, wherein the path constraint comprises, for each path comprising a respective set of nodes across the layers of the neural network model from the input layer to the output layer, a positive product of the respective weights applied to the respective set of nodes in the path
(Claim 20)
The method of claim 16, wherein the neural network model comprises at least an input layer, one or more hidden layers, and an output layer, wherein the parameters for the neural network model comprise weights applied to the nodes of the input layer, the one or more hidden layers, and the output layer, and wherein the path constraint comprises, for each path comprising a respective set of nodes across the layers of the neural network model from the input layer to the output layer, a positive product of the respective weights applied to the respective set of nodes in the path
(Claim 4)
The method of claim 1, wherein the path constraint is approximated by a smooth differentiable expression in the modified loss function
See claim 4
See claim 17
(Claim 5)
The method of claim 4, wherein the smooth differentiable expression is introduced into the modified loss function through a hyperparameter, and wherein training the neural network model further comprises: setting the hyperparameter to a random initial value prior to performing the iterative adjustments; and in one or more of the iterative adjustments, determining a particular set of parameter values for the parameters of the neural network model based on the random initial value of the hyperparameter
See claim 5
See claim 18
(Claim 6)
The method of claim 5, wherein training the neural network model further comprises: determining a value of the loss function of the neural network model based on the particular set of parameter values associated with the random initial value of the hyperparameter; determining that the value of the loss function is greater than a threshold loss function value; updating the hyperparameter by decrementing the value of the hyperparameter; and determining an additional set of parameter values for the neural network model based on the updated hyperparameter
See claim 6
See claim 6
(Claim 7)
The method of claim 5, wherein training the neural network model further comprises: determining that the path constraint is violated by the particular set of parameter values for the neural network model; updating the hyperparameter by incrementing the value of the hyperparameter; and determining an additional set of parameter values for the neural network model based on the updated hyperparameter
See claim 7
See claim 7
(Claim 8)
The method of claim 5, wherein the hyperparameter is a Lagrangian multiplier	
See claim 8

(Claim 9)
A system comprising: a processing device; and a memory device in which instructions executable by the processing device are stored for causing the processing device to: train a neural network model for computing a risk indicator from predictor variables, wherein the neural network model is a memory structure comprising nodes connected via one or more layers, wherein training the neural network model to generate a trained neural network model comprises: access training vectors having elements representing training predictor variables and training outputs, wherein a particular training vector comprises (i) particular values for the predictor variables, respectively, and (ii) a particular training output corresponding to the particular values, and perform iterative adjustments of parameters of the neural network model to minimize a loss function of the neural network model subject to a path constraint, the path constraint requiring a monotonic relationship between (i) values of each predictor variable from the training vectors and (ii) the training outputs of the training vectors, wherein one or more of the iterative adjustments comprises adjusting the parameters of the neural network model so that a value of a modified loss function in a current iteration is smaller than the value of the modified loss function in another iteration, and wherein the modified loss function comprises the loss function of the neural network model and the path constraint; and compute, responsive to a risk assessment query for a target entity received from a remote computing device, an output risk indicator for the target entity by applying the trained neural network model to predictor variables associated with the target entity
(Claim 9)
A system comprising: a processing device; and a memory device in which instructions executable by the processing device are stored for causing the processing device to: train a neural network model for computing a risk indicator from predictor variables, wherein the neural network model is a memory structure comprising nodes connected via one or more layers, wherein training the neural network model to generate a trained neural network model comprises: access training vectors having elements representing training predictor variables and training outputs, wherein a particular training vector comprises (i) particular values for the predictor variables, respectively, and (ii) a particular training output corresponding to the particular values, and perform iterative adjustments of parameters of the neural network model to minimize a loss function of the neural network model subject to a path constraint, the path constraint requiring a monotonic relationship between (i) values of each predictor variable from the training vectors and (ii) the training outputs of the training vectors, wherein one or more of the iterative adjustments comprises adjusting the parameters of the neural network model so that a value of a modified loss function in a current iteration is smaller than the value of the modified loss function in another iteration, and wherein the modified loss function comprises the loss function of the neural network model and the path constraint; compute, responsive to a risk assessment query for a target entity received from a remote computing device, an output risk indicator for the target entity by applying the trained neural network model to predictor variables associated with the target entity; and transmit, to the remote computing device, a responsive message including the output risk indicator, wherein the output risk indicator is usable for controlling access to one or more interactive computing environments by the target entity
(Claim 10)
A system comprising: a processing device; and a memory device in which instructions executable by the processing device are stored for causing the processing device to: train a neural network model for computing a risk indicator from predictor variables, wherein the neural network model is a memory structure comprising nodes connected via one or more layers, wherein training the neural network model to generate a trained neural network model comprises: access training vectors having elements representing training predictor variables and training outputs, wherein a particular training vector comprises (i) particular values for the predictor variables, respectively, and (ii) a particular training output corresponding to the particular values, and perform iterative adjustments of parameters of the neural network model to minimize a modified loss function comprising a loss function of the neural network model and a Lagrangian expression approximating a path constraint, the Lagrangian expression comprising a hyperparameter for at least adjusting enforcement of the path constraint and minimizing the modified loss function, and wherein the path constraint requires a monotonic relationship between (i) values of each predictor variable from the training vectors and (ii) the training outputs of the training vectors, wherein one or more of the iterative adjustments comprises adjusting the parameters of the neural network model so that a value of the modified loss function in a current iteration is smaller than the value of the modified loss function in another iteration; and causing the trained neural network model to be applied to predictor variables associated with a target entity to generate an output risk indicator for the target entity
(Claim 10)
The system of claim 9, wherein the neural network model comprises at least an input layer, one or more hidden layers, and an output layer, and wherein the parameters for the neural network model comprise weights of connections among the input layer, the one or more hidden layers, and the output layer
See claim 10
See claim 11
(Claim 11)
The system of claim 10, wherein the path constraint comprises, for each path comprising a respective set of nodes across the layers of the neural network model from the input layer to the output layer, a positive product of the respective weights applied to the respective set of nodes in the path
See claim 11
See claim 12
(Claim 12)
The system of claim 9, wherein the instructions further cause the processing device to transmit, to the remote computing device, a responsive message including the output risk indicator, wherein the output risk indicator is usable for controlling access to one or more interactive computing environments by the target entity.
See claim 12
See claim 13
(Claim 13)
The system of claim 9, wherein the path constraint is approximated by a smooth differentiable expression in the modified loss function, and wherein the smooth differentiable expression is introduced into the modified loss function through a hyperparameter
See claim 13
See claim 14
(Claim 14)
The system of claim 13, wherein training the neural network model further comprises, adding one or more regularization terms into the modified loss function through the hyperparameter, wherein the one or more regularization terms represent quantitative measurements of the parameters of the neural network model, wherein the one or more of the iterative adjustments comprises adjusting the parameters of the neural network model so that a value of the modified loss function with the regularization terms in a current iteration is smaller than the value of the modified loss function with the regularization terms in another iteration
See claim 14
See claim 15
(Claim 15)
The system of claim 14, wherein the one or more regularization terms comprise one or more of: a function of an L-2 norm of a weight vector comprising the weights of the neural network model, and a function of an L-1 norm of the weight vector
See claim 15
See claim 9
(Claim 16)
A non-transitory computer-readable storage medium having program code that is executable by a processor device to cause a computing device to perform operations, the operations comprising: training a neural network model for computing a risk indicator from predictor variables, wherein the neural network model is a memory structure comprising nodes connected via one or more layers, wherein training the neural network model to generate a trained neural network comprises: accessing training vectors having elements representing training predictor variables and training outputs, wherein a particular training vector comprises (i) particular values for the predictor variables, respectively, and (ii) a particular training output corresponding to the particular values, and performing iterative adjustments of parameters of the neural network model to minimize a loss function of the neural network model subject to a path constraint, the path constraint requiring a monotonic relationship between (i) values of each predictor variable from the training vectors and (ii) the training outputs of the training vectors, wherein one or more of the iterative adjustments comprises adjusting the parameters of the neural network model so that a value of a modified loss function in a current iteration is smaller than the value of the modified loss function in another iteration, and wherein the modified loss function comprises the loss function of the neural network model and the path constraint; computing, responsive to a risk assessment query for a target entity received from a remote computing device, an output risk indicator for the target entity by applying the trained neural network model to predictor variables associated with the target entity; and transmitting, to the remote computing device, a responsive message including the output risk indicator
(Claim 16)
A non-transitory computer-readable storage medium having program code that is executable by a processor device to cause a computing device to perform operations, the operations comprising: training a neural network model for computing a risk indicator from predictor variables, wherein the neural network model is a memory structure comprising nodes connected via one or more layers, wherein training the neural network model to generate a trained neural network comprises: accessing training vectors having elements representing training predictor variables and training outputs, wherein a particular training vector comprises (i) particular values for the predictor variables, respectively, and (ii) a particular training output corresponding to the particular values, and performing iterative adjustments of parameters of the neural network model to minimize a loss function of the neural network model subject to a path constraint, the path constraint requiring a monotonic relationship between (i) values of each predictor variable from the training vectors and (ii) the training outputs of the training vectors, wherein one or more of the iterative adjustments comprises adjusting the parameters of the neural network model so that a value of a modified loss function in a current iteration is smaller than the value of the modified loss function in another iteration, and wherein the modified loss function comprises the loss function of the neural network model and the path constraint; computing, responsive to a risk assessment query for a target entity received from a remote computing device, an output risk indicator for the target entity by applying the trained neural network model to predictor variables associated with the target entity; and transmitting, to the remote computing device, a responsive message including the output risk indicator, wherein the output risk indicator is usable for controlling access to one or more interactive computing environments by the target entity.
(Claim 1)
A non-transitory computer-readable storage medium having program code that is executable by a processor device to cause a computing device to perform operations, the operations comprising: training a neural network model for computing a risk indicator from predictor variables, wherein the neural network model is a memory structure comprising nodes connected via one or more layers, wherein training the neural network model to generate a trained neural network model comprises: accessing training vectors having elements representing training predictor variables and training outputs, wherein a particular training vector comprises (i) particular values for the predictor variables, respectively, and (ii) a particular training output corresponding to the particular values, and performing iterative adjustments of parameters of the neural network model to minimize a modified loss function comprising a loss function of the neural network model and a Lagrangian expression approximating a path constraint, the Lagrangian expression comprising a hyperparameter for at least adjusting enforcement of the path constraint and minimizing the modified loss function, and wherein the path constraint requires a monotonic relationship between (i) values of each predictor variable from the training vectors and (ii) the training outputs of the training vectors, wherein one or more of the iterative adjustments comprises adjusting the parameters of the neural network model so that a value of the modified loss function in a current iteration is smaller than the value of the modified loss function in another iteration; and causing the trained neural network model to be applied to predictor variables associated with a target entity to generate an output risk indicator for the target entity
(Claim 17)
The non-transitory computer-readable storage medium of claim 16, wherein the path constraint is approximated by a smooth differentiable expression in the modified loss function
See claim 17
See claim 4
(Claim 18)
The non-transitory computer-readable storage medium of claim 17, wherein the smooth differentiable expression is introduced into the modified loss function through a hyperparameter, and wherein training the neural network model further comprises: setting the hyperparameter to a random initial value prior to performing the iterative adjustments; and in one or more of the iterative adjustments, determining a particular set of parameter values for the parameters of the neural network model based on the random initial value of the hyperparameter
See claim 18
See claims 4 and 5
(Claim 19)
The non-transitory computer-readable storage medium of claim 18, wherein training the neural network model further comprises, adding one or more regularization terms into the modified loss function through the hyperparameter, wherein the one or more regularization terms represent quantitative measurements of the parameters of the neural network model, wherein the one or more of the iterative adjustments comprises adjusting the parameters of the neural network model so that a value of the modified loss function with the regularization terms in a current iteration is smaller than the value of the modified loss function with the regularization terms in another iteration
See claim 19
See claim 8
(Claim 20)
The non-transitory computer-readable storage medium of claim 16, wherein the neural network model comprises at least an input layer, one or more hidden layers, and an output layer, wherein the parameters for the neural network model comprise weights of connections among the input layer, the one or more hidden layers, and the output layer, and wherein the path constraint comprises, for each path comprising a respective set of nodes across the layers of the neural network model from the input layer to the output layer, a positive product of the respective weights applied to the respective set of nodes in the path.
See claim 20
See claims 2 and 3



As such, the claims of Issued Patent Nos. 10,558,913 and 11,010,669 anticipate all of the limitations of claims 1-20 of the instant application.
This is a non-provisional nonstatutory double patenting rejection because the patentably indistinct claims have been patented.
Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-4, 9-12, 16, 17, and 20 are rejected under 35 U.S.C. § 103 as being obvious over McBurnett et al. (WO 2016160539, hereinafter “McBurnett”) in view of He et al. (US 20180101766 A1, hereinafter “He”).

Regarding claim 1, McBurnett discloses [a] method that includes one or more processing devices performing operations comprising: ([0004]: "systems and methods"; and [0020]: "a computing environment 100")
training a neural network model ([0034]: "tune the numeric weights of the neural network through a process referred to as training") for computing a risk indicator ([0036]: "any type of neural network for assessing risk") from predictor variables ([0038]: "input nodes corresponding to predictor variables"),
wherein the neural network model is a memory structure comprising nodes connected via one or more layers, ([0038]: "can include one or more hidden layers of interconnected nodes")
wherein training the neural network model to generate a trained neural network model comprises: accessing training vectors having elements representing training predictor variables and training outputs, ([0056]: "input nodes X1 through Xn represent predictor variables", the ensemble of the predictor variables forming a vector; and [0057]: "map the predictor variables X1 through Xn [...] and providing the risk indicator Y", with Y being the output)
wherein a particular training vector comprises (i) particular values for the predictor variables, respectively, and (ii) a particular training output corresponding to the particular values, and ([0057]: the output Y corresponds to the input predictor variables X1,...,Xn)
performing iterative adjustments of parameters of the neural network model to minimize a loss function of the neural network model subject to a path constraint, the path constraint requiring a monotonic relationship between (i) values of each predictor variable from the training vectors and (ii) the training outputs of the training vectors, ([0004]: "can be optimized by iteratively adjusting the neural network such that a monotonic relationship exists between each of the predictor variable and the risk indicator", adjusting the neural network teaches adjusting its parameters, and the optimization teaches a minimization of a function, which is the loss function)
wherein one or more of the iterative adjustments comprises adjusting the parameters of the neural network model so that a value of a modified loss function in a current iteration is smaller than the value of the modified loss function in another iteration, and ([0079]: "iteratively adjust a number of nodes [...] in the neural network [...] return to block 302, and the operations associated with blocks [...] 306 [...] can be performed in iteration", wherein the operation in block 306, namely generation of the neural network comprises the training of the neural network in further view of paragraph [0034]: "generating the neural network [...] represented as one or more layers of interconnected nodes [...] have numeric weights that can be tuned [...] through a process referred as training")
wherein the modified loss function comprises the loss function of the neural network model and the path constraint; ([0034]: "generating the neural network [...] represented as one or more layers of interconnected nodes [...] have numeric weights that can be tuned [...] through a process referred as training", wherein the training as part of an optimization teaches a loss function of the neural network model; paragraph [0077]: "in block 310, [...] determine if a relationship between the predictor variables and the risk indicator is monotonic (e.g., in block 308)" teaches a loss function for the path constraint of monotony, measured as a rate of change, as taught by paragraph [0045]: "optimization [...] include instruction for causing [...] to determine a rate of change (e.g., a derivative or partial derivative) of the risk indicator with respect to each predictor variable through every path in the neural network that each predictor variable can follow to affect the risk indicator"; the two loss functions are combined as part of the optimization loop in Figure 3, as taught by paragraph [0079]: "iteratively adjust a number of nodes [...] in the neural network [...] return to block 302, and the operations associated with blocks [...] 306 [...] can be performed in iteration", wherein the operation in block 306, namely generation of the neural network comprises the training of the neural network in further view of paragraph [0034])
receiving, from a remote computing device, ([0020]: "the risk assessment application 102 [...] is executed by the risk assessment server 104 [...] risk assessment application 102 can obtain the data used for risk assessment from [...] the user device 108", with the user device being the remote device) a risk assessment query for a target entity; and ([0020]: "the risk assessment application 102 [...] is executed by the risk assessment server 104 [...] risk assessment application 102 can obtain the data used for risk assessment from [...] the user device 108", with the user device being the remote device)
computing, responsive to the risk assessment query, ([0023]: "the user device 108 may send data to [...] the risk assessment application 102 [...] to be [...] processed") an output risk indicator for the target entity by applying the trained neural network model to predictor variables associated with the target entity ([0081]: "to use the neural network to accurately determine risk indicators using predictor variables", the entity associated with the predictor variables being the target entity; examples of target entities and predictor variables are taught by paragraph [0017]: "data about the activities or characteristics of the entity").
McBurnett fails to explicitly disclose to minimize a loss function of the neural network model.
He discloses to minimize a loss function of the neural network model ([0022]; “The hypothesis takes the form of a nonlinear model whose parameters are determined so that the error between the observed data and the values determined by the model are minimized. That minimization problem forms the heart of every deep learning architecture (e.g., feed forward, recurrent, convolutional). A major complexity is that the optimization problem is nonconvex containing a large number of local optima and an even larger number of saddle points. In addition, due the large number of input data and neurons the number of parameters describing the model can be very high resulting in a very large-scale optimization problem which classical methods cannot solve in reasonable time. Stochastic Gradient Descent (SGD) has been used extensively to solve that type of optimization problems. However SGD uses first order information (i.e., gradient information of the loss function) and as such it cannot escape from saddle points encountered during the process of reaching a local minimum of the loss function” (emphasis added); and [0038]; “At step 605, a loss function corresponding to the deep learning network is defined. In general the loss function measures the compatibility between a prediction and a ground truth label. The exact loss function will depend on the type of problem being solved by the network; however, in general, any loss function known in the art may be employed. For example, for classification problems, a Support Vector Machine (e.g., using the Weston Watkins formulation) or a Softmax classifier may be used. For predicting real-valued quantities, the loss function may use regression-based methods. For example, in one embodiment, the loss function measures the loss between the predicted quantity and the ground truth before measuring the L2 squared norm, or L1 norm of the difference”).
McBurnett and He are analogous art because both are concerned with the use of machine learning to analyze data.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in machine learning to combine the minimization of a loss function of He with the method of McBurnett to yield the predictable result of performing iterative adjustments of parameters of the neural network model to minimize a loss function of the neural network model subject to a path constraint, the path constraint requiring a monotonic relationship between (i) values of each predictor variable from the training vectors and (ii) the training outputs of the training vectors. The motivation for doing so would be to minimize the error between the observed data and the values determined by the model (He; [0022]).
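
For illustration only, the combination characterized above (iterative parameter adjustments that reduce a modified loss function made up of a base loss and a path constraint term) may be sketched as follows. The sketch is not taken from McBurnett, He, or the application. The two-node tanh network, the hinge-style penalty on path products, the weight `lam`, the finite-difference gradient, and all function names are assumptions introduced here for the example.

```python
import math

def predict(params, x):
    # Hypothetical network: one input, two tanh hidden nodes, one output.
    w1, w2, v1, v2 = params
    return v1 * math.tanh(w1 * x) + v2 * math.tanh(w2 * x)

def base_loss(params, data):
    # Mean squared error between model outputs and training outputs.
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)

def path_penalty(params):
    # Assumed penalty: grows when a path's weight product is negative.
    w1, w2, v1, v2 = params
    return sum(max(0.0, -p) for p in (w1 * v1, w2 * v2))

def modified_loss(params, data, lam=10.0):
    # Modified loss = base loss + weighted path constraint term.
    return base_loss(params, data) + lam * path_penalty(params)

def numeric_grad(f, params, eps=1e-6):
    # Central finite differences, used only to keep the sketch short.
    grad = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def train(params, data, steps=200, lr=0.1):
    history = [modified_loss(params, data)]
    for _ in range(steps):
        grad = numeric_grad(lambda p: modified_loss(p, data), params)
        step, accepted = lr, False
        # Halve the step until the modified loss is smaller than before.
        while step > 1e-12 and not accepted:
            candidate = [p - step * g for p, g in zip(params, grad)]
            value = modified_loss(candidate, data)
            if value < history[-1]:
                params, accepted = candidate, True
                history.append(value)
            else:
                step /= 2
        if not accepted:
            break  # no smaller value found; stop adjusting
    return params, history

# Made-up training vectors: (predictor value, training output).
data = [(-1.0, -0.8), (-0.5, -0.4), (0.0, 0.0), (0.5, 0.4), (1.0, 0.8)]
start = [0.5, -0.5, 0.5, 0.5]  # second path starts with a negative product
final, history = train(start, data)
```

Because an adjustment is accepted only when it lowers the modified loss, each recorded value in `history` is smaller than the one before it, which mirrors the "smaller than the value of the modified loss function in another iteration" language of the claim.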

Regarding claim 9, it is a system claim corresponding to the steps of claim 1 and is rejected for the same reasons as claim 1.

Regarding claim 16, McBurnett discloses [a] non-transitory computer-readable storage medium having program code that is executable by a processor device to cause a computing device to perform operations, the operations comprising: ([0025]; “can include one or more instructions stored on a computer-readable storage medium and executable by processors of one or more computing devices”)
training a neural network model for computing a risk indicator from predictor variables, ([0034]: "tune the numeric weights of the neural network through a process referred to as training"; and [0036]: "any type of neural network for assessing risk"; and [0038]: "input nodes corresponding to predictor variables")
wherein the neural network model is a memory structure comprising nodes connected via one or more layers, ([0038]: "can include one or more hidden layers of interconnected nodes")
wherein training the neural network model to generate a trained neural network comprises: accessing training vectors having elements representing training predictor variables and training outputs, ([0056]: "input nodes X1 through Xn represent predictor variables", the ensemble of the predictor variables forming a vector; and [0057]: "map the predictor variables X1 through Xn [...] and providing the risk indicator Y", with Y being the output)
wherein a particular training vector comprises (i) particular values for the predictor variables, respectively, and (ii) a particular training output corresponding to the particular values, and ([0057]: the output Y corresponds to the input predictor variables X1,...,Xn)
performing iterative adjustments of parameters of the neural network model to minimize a loss function of the neural network model subject to a path constraint, the path constraint requiring a monotonic relationship between (i) values of each predictor variable from the training vectors and (ii) the training outputs of the training vectors, ([0004]: "can be optimized by iteratively adjusting the neural network such that a monotonic relationship exists between each of the predictor variable and the risk indicator", adjusting the neural network teaches adjusting its parameters, and the optimization teaches a minimization of a function, which is the loss function)
wherein one or more of the iterative adjustments comprises adjusting the parameters of the neural network model so that a value of a modified loss function in a current iteration is smaller than the value of the modified loss function in another iteration, and ([0079]: "iteratively adjust a number of nodes [...] in the neural network [...] return to block 302, and the operations associated with blocks [...] 306 [...] can be performed in iteration", wherein the operation in block 306, namely generation of the neural network comprises the training of the neural network in further view of paragraph [0034]: "generating the neural network [...] represented as one or more layers of interconnected nodes [...] have numeric weights that can be tuned [...] through a process referred as training") 
wherein the modified loss function comprises the loss function of the neural network model and the path constraint; ([0034]: "generating the neural network [...] represented as one or more layers of interconnected nodes [...] have numeric weights that can be tuned [...] through a process referred as training", wherein the training as part of an optimization teaches a loss function of the neural network model; paragraph [0077]: "in block 310, [...] determine if a relationship between the predictor variables and the risk indicator is monotonic (e.g., in block 308)" teaches a loss function for the path constraint of monotony, measured as a rate of change, as taught by paragraph [0045]: "optimization [...] include instruction for causing [...] to determine a rate of change (e.g., a derivative or partial derivative) of the risk indicator with respect to each predictor variable through every path in the neural network that each predictor variable can follow to affect the risk indicator"; the two loss functions are combined as part of the optimization loop in Figure 3, as taught by paragraph [0079]: "iteratively adjust a number of nodes [...] in the neural network [...] return to block 302, and the operations associated with blocks [...] 306 [...] can be performed in iteration", wherein the operation in block 306, namely generation of the neural network comprises the training of the neural network in further view of paragraph [0034])
computing, responsive to a risk assessment query ([0023]: "the user device 108 may send data to [...] the risk assessment application 102 [...] to be [...] processed") for a target entity received from a remote computing device, an output risk indicator for the target entity by applying the trained neural network model to predictor variables associated with the target entity; and ([0081]: "to use the neural network to accurately determine risk indicators using predictor variables", the entity associated with the predictor variables being the target entity; examples of target entities and predictor variables are taught by paragraph [0017]: "data about the activities or characteristics of the entity")
transmitting, to the remote computing device, a responsive message including the output risk indicator ([0057]: "and providing the risk indicator Y", the risk indicator being the message, and being provided implicitly teaches that it is transmitted over the network 110, as it is the sole interface of the risk assessment server 104, in view of Figures 1, 7).
McBurnett fails to explicitly disclose to minimize a loss function of the neural network model.
He discloses to minimize a loss function of the neural network model ([0022]; “The hypothesis takes the form of a nonlinear model whose parameters are determined so that the error between the observed data and the values determined by the model are minimized. That minimization problem forms the heart of every deep learning architecture (e.g., feed forward, recurrent, convolutional). A major complexity is that the optimization problem is nonconvex containing a large number of local optima and an even larger number of saddle points. In addition, due the large number of input data and neurons the number of parameters describing the model can be very high resulting in a very large-scale optimization problem which classical methods cannot solve in reasonable time. Stochastic Gradient Descent (SGD) has been used extensively to solve that type of optimization problems. However SGD uses first order information (i.e., gradient information of the loss function) and as such it cannot escape from saddle points encountered during the process of reaching a local minimum of the loss function” (emphasis added); and [0038]; “At step 605, a loss function corresponding to the deep learning network is defined. In general the loss function measures the compatibility between a prediction and a ground truth label. The exact loss function will depend on the type of problem being solved by the network; however, in general, any loss function known in the art may be employed. For example, for classification problems, a Support Vector Machine (e.g., using the Weston Watkins formulation) or a Softmax classifier may be used. For predicting real-valued quantities, the loss function may use regression-based methods. For example, in one embodiment, the loss function measures the loss between the predicted quantity and the ground truth before measuring the L2 squared norm, or L1 norm of the difference”).
The motivation to combine McBurnett and He is the same as discussed above with respect to claim 1.

Regarding claims 2 and 10, the rejections of claims 1 and 9 are incorporated and McBurnett further discloses wherein the neural network model comprises at least an input layer, one or more hidden layers, and an output layer, and wherein the parameters for the neural network model comprise weights of connections among the input layer, the one or more hidden layers, and the output layer ([0014] and [0034]).

Regarding claims 3 and 11, the rejections of claims 1, 2, 9, and 10 are incorporated and McBurnett further discloses wherein the path constraint comprises, for each path comprising a respective set of nodes across the layers of the neural network model from the input layer to the output layer, a positive product of the respective weights applied to the respective set of nodes in the path ([0071] teaches that the dependence is in the sense of an increase between the output of the neural network and each of the predictive variables; and [0057] teaches the products between the weights and the activations inside the neural network; and [0045] teaches that the smooth differentiable function is calculated for every path in the neural network that each predictor variable can follow to affect the risk indicator).
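
For illustration only, one reading of the "positive product of the respective weights" limitation is that every input-to-output path through a fully connected network has a weight product greater than zero. The sketch below enumerates those paths for small weight matrices. It is not drawn from McBurnett or the application. The nested-list layout of `weight_layers`, the function names, and the example weights are assumptions made for the example.

```python
from itertools import product

def path_products(weight_layers):
    """Yield (path, product) for every input-to-output path.

    Assumed layout: weight_layers[k][i][j] is the weight on the connection
    from node i of layer k to node j of layer k + 1.
    """
    # Node counts per layer: input layer first, then each following layer.
    sizes = [len(weight_layers[0])] + [len(w[0]) for w in weight_layers]
    for path in product(*(range(n) for n in sizes)):
        value = 1.0
        for k, layer in enumerate(weight_layers):
            value *= layer[path[k]][path[k + 1]]
        yield path, value

def satisfies_path_constraint(weight_layers):
    # True only when the weight product is positive along every path.
    return all(value > 0 for _, value in path_products(weight_layers))

# Two inputs, two hidden nodes, one output (made-up weights).
ok_weights = [[[1.0, -2.0], [0.5, -1.0]], [[3.0], [-4.0]]]
bad_weights = [[[1.0, 2.0], [0.5, -1.0]], [[3.0], [-4.0]]]

print(satisfies_path_constraint(ok_weights))
print(satisfies_path_constraint(bad_weights))
```

In `ok_weights` the negative weights occur in pairs along each path, so every product is positive. In `bad_weights` the path from input 0 through hidden node 1 multiplies 2.0 by -4.0, which violates the condition.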

Regarding claims 4 and 12, the rejections of claims 1 and 9 are incorporated and McBurnett further discloses wherein the path constraint is approximated by a smooth differentiable expression in the modified loss function ([0045]: "optimization [...] include instruction for causing [...] to determine a rate of change (e.g., a derivative or partial derivative) of the risk indicator with respect to each predictor variable through every path in the neural network that each predictor variable can follow to affect the risk indicator").
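
For illustration only, a "smooth differentiable expression" approximating a path constraint could take the form of a softplus surrogate for a hard penalty on a negative path product. Neither McBurnett nor the application is asserted to use this particular expression. The softplus form, the sharpness parameter `beta`, and the function names are assumptions made for the example.

```python
import math

def hard_penalty(p):
    # Non-smooth penalty: zero for a positive path product p, -p otherwise.
    return max(0.0, -p)

def smooth_penalty(p, beta=20.0):
    # Softplus surrogate (1/beta) * log(1 + exp(-beta * p)), written in a
    # numerically stable form; differentiable for every p.
    z = -beta * p
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / beta

for p in (-1.0, 0.0, 1.0):
    print(p, hard_penalty(p), smooth_penalty(p))
```

The surrogate is never smaller than the hard penalty and approaches it as `beta` grows, so a gradient-based optimizer can act on it even where the hard penalty has a kink at zero.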

Regarding claim 17, the rejection of claim 16 is incorporated and McBurnett further discloses wherein the path constraint is approximated by a smooth differentiable expression in the modified loss function ([0045]: "optimization [...] include instruction for causing [...] to determine a rate of change (e.g., a derivative or partial derivative) of the risk indicator with respect to each predictor variable through every path in the neural network that each predictor variable can follow to affect the risk indicator").

Regarding claim 20, the rejection of claim 16 is incorporated and McBurnett further discloses wherein the neural network model comprises at least an input layer, one or more hidden layers, and an output layer, wherein the parameters for the neural network model comprise weights of connections among the input layer, the one or more hidden layers, and the output layer, and wherein the path constraint comprises, for each path comprising a respective set of nodes across the layers of the neural network model from the input layer to the output layer, a positive product of the respective weights applied to the respective set of nodes in the path ([0014] and [0034]; and [0071] teaches that the dependence is in the sense of an increase between the output of the neural network and each of the predictive variables; and [0057] teaches the products between the weights and the activations inside the neural network; and [0045] teaches that the smooth differentiable function is calculated for every path in the neural network that each predictor variable can follow to affect the risk indicator).

Allowable Subject Matter

Claims 5-8, 13-15, 18, and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Chen et al., “Credit rating with a monotonicity-constrained support vector machine model”, June 12, 2014, Expert Systems with Applications, Volume 41, Issue 16, 15 November 2014, Pages 7235-7247.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Brent Hoover whose telephone number is (303)297-4403.  The examiner can normally be reached on Monday - Friday 9-5 MST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar, can be reached at 571-270-3169.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
 
/BRENT JOHNSTON HOOVER/Examiner, Art Unit 2127