DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments
Acknowledgement is made of Applicant's claim amendments filed on 4/25/2022. The claim amendments are entered. Claims 1, 4-39, and 42-61 are now pending. Claims 2, 3, 40, and 41 have been cancelled. Claims 1, 4, 5, 9, 21, 23, 27, 36, 37, 39, 60, and 61 have been amended.

Applicant has sufficiently amended the specification to address the typo and to include the requisite trademark designations. Accordingly, the specification objections are withdrawn. 

Applicant has sufficiently amended the claims to address the various claim objections.  Accordingly, the claim objections are withdrawn.

Applicant has sufficiently amended the drawings and the specification to provide support for the various reference labels. Applicant’s clear annotations and indications of the amendments in both the specification and particularly within the drawing figures themselves are much appreciated. These annotations and indications made it clear as to where support was given and where labels were removed or revised. Accordingly, the drawing objections are sufficiently addressed and so the drawing objections are withdrawn.  
Response to Arguments
Applicant's arguments filed on 4/25/2022 have been fully considered but they are not persuasive.

Applicant argues that Choi allegedly does not teach the newly amended claim limitations because it allegedly does not teach the separate training of the node (Applicant’s reply pgs. 23-25). This argument is not persuasive because Choi is not being used to teach that element. Instead, Baker ‘960 is being used to teach the C2 limitation regarding the separate training of the node, while Choi is being used to teach the C1 and C3 limitations as shown in the mapping below. Likewise, Applicant also goes into further detail about the separate training of the new node including its initialization and why Choi does not apply because it allegedly does not teach this claim limitation (Applicant’s reply pgs. 24-25). This argument is not persuasive because, as explained above, Choi is not being used to teach the C2 limitation regarding the separate training of the node.

Applicant argues that there is allegedly not a motivation to combine Baker ’960 paragraph [0142] with the cited references (Applicant’s reply pgs. 25-26). This argument is not persuasive because [0142] was not cited in the rejection. As such, the argument is moot. 
It is noted that PHOSITA would be motivated to combine Baker ‘960 with Choi because both relate to extending or expanding the neural network to improve the performance of the neural network, wherein such extension or expansion can be performed in an incremental manner via the addition of the new nodes on an as needed basis. As such, it is conceivable for PHOSITA to combine the references. 
Applicant argues that claims 1, 39, 54, and 60 allegedly were not similar before the amendments and argues regarding a potential distinction between target datum and target-specific improvement network subcomponent (AKA node) prior to the amendments (Applicant’s reply pg. 26). This argument is not persuasive. Claims 1 and 39 have now been amended so that they mirror each other except for the fact that claim 1 is a method claim and claim 39 is a system claim with the corresponding system components. As such, the argument regarding the similarity among the claims of the group before the amendments is moot because the claims have been changed by the amendments. An updated rejection is provided below. It is noted that, contrary to Applicant’s arguments, clear indications were given in the previous Office Action regarding the similarity between the claims and a mapping was provided for the differences. 
With the new amendments, the rejection has been updated to reflect the changes in claim 39 and the relationship between it and claim 54. Claim 54 has been mapped to clearly show the rejection rather than a reference to claim 39 since the two claims now differ. The same applies to claim 60, since claims 54 and 60 mirror each other except for the fact that claim 54 is a method claim and claim 60 is a system claim with the corresponding system components.
Regarding the potential distinction of the term target datum and target-specific improvement network subcomponent between the various claims 1, 39, 54, and 60, this argument is not persuasive. It is clear upon a review of the prior claim set that these claims are substantially similar except for the substitution of the target datum/target training datum for the target-specific improvement network subcomponent, and for the various differences between method and system claims. The phrase target in correlation with datum denotes a very broad concept because it can include any element that one considers a target datum or data. Indeed, the claim limitation does not provide further clarification regarding the very broad term target datum. So, for example, target datum can be target training data or target node (i.e. the target-specific improvement network subcomponent) that can be trained or any other such element that one considers a target datum. As such, a target datum can be interpreted as a target node, wherein such target node can be trained. Thus, Applicant’s argument that Choi allegedly cannot teach the target datum because it also teaches a target node is not persuasive. Likewise, in another example, the target datum can be training data, e.g. text or images as part of a general training data set, which is also taught in Choi. As such, Choi teaches the claim limitation. 
In addition, Applicant also argues that Choi allegedly does not teach the selection process and addition of the target node (Applicant’s reply pg. 26). This argument is not persuasive because Choi teaches the claim limitations as shown in the mapping below. 
It is noted that, with the amendments, various limitations of the claims have changed. However, the limitations that have not changed still show the similarity discussed above.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 4-10, 14-20, 25-28, 30-32, 35-39, 42, 43, and 46-61 are rejected under 35 U.S.C. 103 as being unpatentable over Choi et al. (U.S. Pat. App. Pre-Grant Pub. No. 2016/0155049, hereinafter Choi) in view of Baker (WIPO No. WO 2018/226527, hereinafter Baker) and Baker (WIPO No. WO 2019/067960, hereinafter Baker ‘960). 

Regarding claim 1, Choi teaches:
A method of training a neural network, the method comprising, by a programmed computer system ([0082] and [0086]-[0088]: describing training of a neural network (NN).): 
(a) training, at least partially, a base neural network on a first set of training data ([0082]-[0084] and [0129]: describing training of NN using training data.), 
wherein training the base neural network comprises computing for each datum in the first set of training data, activation values for nodes in the base neural network ([0074] and [0099]-[0102]: describing generation of activation data for the various nodes in the NN.) and 
…, 
wherein the base neural network comprises an input layer, an output layer, and one or more inner layers between the input and output layers ([0076]-[0081]: describing that the NN has input layer, hidden layers, and output layer.); 
(b) after step (a) and based on the training, selecting, based on specified criteria, a target node of the base neural network for targeted improvement ([0087]-[0089], [0100], and [0103]: describing the selection of a target node for improvement, e.g. extension, “based on a variety of information”, i.e. criteria.); and 
(c) after step (b), adding a target-specific improvement network sub-component to the base network to form an expanded neural network, wherein the target-specific improvement network sub-component comprises one or more nodes ([0090]-[0091], [0135], and [0138]: describing adding in an additional/new node to expand the NN.) and 
wherein the target-specific improvement network sub-component, when added to the base neural network, improves performance of the base network ([0085], [0126], and [0132]: describing improved performance of the NN based on its extension by adding in additional node(s).), 
wherein adding the target-specific improvement network sub-component comprises, by the programmed computer system: (c1) selecting the target-specific improvement network sub-component ([0087]-[0090], [0100], and [0103]: describing the selection of the additional node.); 
…; and 
(c3) after step (c2), merging the target-specific improvement network sub-component with the base neural network to form the expanded neural network ([0092]-[0093]: describing connecting the new node to the NN to create an extended NN.).

While the cited reference teaches the above limitations of claim 1, it does not explicitly teach: “estimates of partial derivatives of an objective function for the base neural network for the nodes in the base neural network” on lines 5-7. Baker discloses the claim limitations, teaching: a determination of the partial derivatives of an objective function, e.g. an error cost function, for each node in a base NN (Baker [0022]-[0024]). 
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the cited reference to include the partial derivatives computation for the base NN in Baker. Doing so would enable a technique to “improve the performance of a network that has converged such that the gradient of the network and all the partial derivatives are zero…. The present system and method can create a new network by splitting the candidate nodes or arcs that diverge from zero and then trains the resulting network with each selected node trained on the corresponding cluster of the data.” (Baker Abstract). 


While the cited references in combination teach the above limitations of claim 1, they do not explicitly teach: “(c2) after step (c1), training the target-specific improvement network sub-component separately from the base network” on lines 19-20. Baker ‘960 teaches: an incremental process for adding in new elements, e.g. nodes, in a NN, wherein the original node and a copy of the original node can be trained separately (Baker ‘960 [0160]-[0162]), and wherein the process can be used to improve performance of the NN (Baker ‘960 [0556]).
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the separate training in Baker ‘960. Doing so would improve development of the NN, wherein “the computer system incrementally adds new members to an ensemble or grows any machine learning system [i.e. NN] by adding new elements” (Baker [0160]), which would provide “for incrementally improving the performance of a machine learning system [i.e. NN] through creating and combining ensembles” (Baker [0061]). 
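
For illustration only, the sequence recited in limitations (c2) and (c3), i.e. training a new node separately from the base network and then merging it, might be sketched as follows. This sketch is not taken from Choi, Baker, or Baker ‘960. The names `train_node_separately` and `merge_node`, the single-layer list representation, the sigmoid activation, and the cross-entropy update are all hypothetical assumptions chosen for brevity.

```python
import copy
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_node_separately(inputs, targets, weights, bias, lr=0.5, epochs=200):
    """Fit one new node on its own; no base-network weight is read or written."""
    w, b = list(weights), bias
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            a = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = a - t  # gradient of cross-entropy loss w.r.t. the pre-activation
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return {"weights": w, "bias": b}


def merge_node(base_layer, new_node):
    """Return an expanded layer: copies of the base nodes plus the new node."""
    return copy.deepcopy(base_layer) + [new_node]


# Hypothetical base layer with a single node (values are arbitrary).
base_layer = [{"weights": [0.3, -0.2], "bias": 0.1}]
snapshot = copy.deepcopy(base_layer)

# (c2) train the new node separately, then (c3) merge it into the layer.
new_node = train_node_separately(
    [[0.0, 0.0], [1.0, 1.0]], [0.0, 1.0], weights=[0.0, 0.0], bias=0.0
)
expanded_layer = merge_node(base_layer, new_node)
```

In this sketch the base layer is left unchanged by the separate training step, and the expanded layer contains one more node than the base layer.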

Regarding claim 4, the rejection of claim 1 is incorporated. Choi teaches:
The method of claim 1, further comprising, by the programmed computer system, after step (c3), training the expanded neural network ([0087] and [0096]: describing training of the extended neural network.).




Regarding claim 5, the rejection of claim 1 is incorporated. Choi teaches:
The method of claim 1, further comprising, by the programmed computer system, after step (c), training the expanded neural network ([0087] and [0096]: describing training of the extended neural network.).

Regarding claim 6, the rejection of claim 1 is incorporated. Choi teaches:
The method of claim 1, wherein the specified criteria for selecting the target node comprises selecting the target node upon a determination that the target node made a classification error for a first datum in the first set of training data ([0083]-[0084], [0124], and [0126]: describing an error determination between an actual output value and an expected value for training in correlation with the nodes in the NN for extension of the NN, wherein such error can relate to a classification.).

Regarding claim 7, the rejection of claim 6 is incorporated. Choi teaches:
The method of claim 6, wherein the target node is an output node of the base neural network ([0080]: describing the nodes in NN that can be targeted for extension, e.g. output of the hidden layer nodes.).

Regarding claim 8, the rejection of claim 6 is incorporated. Choi teaches:
	The method of claim 6, wherein the target node is on a first inner layer of the base neural network ([0080] and [0090]: describing the nodes in NN that can be targeted for extension, e.g. hidden layer nodes. Wherein a hidden layer node can denote an inner layer of the NN.).
Regarding claim 9, the rejection of claim 8 is incorporated. Choi teaches:
The method of claim 8, wherein the specified criteria for selecting the target node comprises a comparison of the activation value for the target node to a threshold value ([0103]-[0106]: describing comparison of the activation value in correlation with a predetermined threshold to determine the target node to select for extension. Wherein frequency of activation values is also considered in the comparison ([0099]-[0102]).).
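
For illustration only, a threshold-based selection of the kind discussed for claim 9 might be sketched as follows. The function name `select_target_nodes`, the two thresholds, and the definition of activation frequency as the fraction of data whose activation exceeds `act_threshold` are assumptions made for this sketch and are not quoted from Choi.

```python
def select_target_nodes(activations, act_threshold, freq_threshold):
    """Return ids of nodes whose activation frequency meets freq_threshold.

    activations maps a node id to its activation values over a data set.
    Activation frequency is assumed here to be the fraction of data for
    which the node's activation exceeds act_threshold.
    """
    selected = []
    for node_id, values in activations.items():
        frequency = sum(1 for v in values if v > act_threshold) / len(values)
        if frequency >= freq_threshold:
            selected.append(node_id)
    return selected


# Hypothetical activation values for two hidden nodes over four data items.
activations = {
    "h1": [0.9, 0.8, 0.1, 0.95],
    "h2": [0.1, 0.2, 0.7, 0.0],
}
targets = select_target_nodes(activations, act_threshold=0.5, freq_threshold=0.75)
```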

Regarding claim 10, the rejection of claim 6 is incorporated. The cited references in combination do not explicitly teach: “comprises a detector node that detects instances of the first datum and data that is within a threshold distance of the first datum.” Baker ‘960 discloses the claim limitations, teaching: detector nodes in a NN (Baker ‘960 [0170], [0424], and [0552]), wherein the detector nodes can determine training data and split of the training data such that the data can be within some threshold distance of each other (Baker ‘960 [0120] and [0552]).
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the detector node in Baker ‘960. Doing so would enable a technique “for improving the performance of a classifier” using a detector node (Baker ‘960 [0146]).

Regarding claim 14, the rejection of claim 1 is incorporated. Choi teaches:
	The method of claim 1, wherein the target-specific improvement network sub-component comprises a second node that is a copy of the target node, such that incoming and outgoing connections, and corresponding weight values, for the target node are initially copied to the second node ([0095] and [0116]-[0117]: describing the copying/duplication of additional nodes from a selected node such that the connection weights and input and output connections between the selected node and additional nodes can be copied/duplicated with each other.), and 
	wherein there is a directional relationship regularization link between the target node and the second node (Figs. 7-9: showing a directional connection link between the selected nodes and additional nodes. Wherein regularization of the directional link can be achieved by the weights and errors associated with the link, which allows the link to regulate the connections between the nodes based on the weights and errors ([0116]-[0117], [0119], and [0122]).).
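
For illustration only, copying a target node's incoming and outgoing connections and weights to a second node, as recited in claim 14, might be sketched as follows. The dictionary layout and the name `duplicate_node` are hypothetical, and the regularization link itself is not modeled in this sketch.

```python
import copy


def duplicate_node(net, target, new):
    """Return a copy of net in which node `new` duplicates `target`'s connections."""
    expanded = copy.deepcopy(net)
    # Copy incoming connections (source -> weight) and outgoing ones (destination -> weight).
    expanded["incoming"][new] = dict(net["incoming"][target])
    expanded["outgoing"][new] = dict(net["outgoing"][target])
    return expanded


# Hypothetical network fragment: hidden node h1 with two inputs and one output.
net = {
    "incoming": {"h1": {"x1": 0.4, "x2": -0.7}},
    "outgoing": {"h1": {"y": 1.5}},
}
expanded = duplicate_node(net, "h1", "h1_copy")
```

The copied weights are only initial values in this sketch. Later training may move the second node's weights away from those of the target node.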

Regarding claim 15, the rejection of claim 14 is incorporated. Choi teaches:
	The method of claim 14, wherein the directional relationship regularization link comprises a bidirectional relationship regularization link ([0083]: describing a forward direction and a backward direction, i.e. a bi-directionality, for the connection links and their corresponding weight of the nodes in the NN. The bi-directionality links being utilized to estimate and minimize errors, i.e. regularization of the links.).
Regarding claim 16, the rejection of claim 14 is incorporated. The cited references in combination do not explicitly teach: “wherein the directional relationship regularization link enforces an "is-not-equal-to" relationship between activation values of the target node and the second node on a second set of training data, such that the second node is trained to produce different activation values than the target node on data in the second set of training data”. Baker ‘960 discloses the claim limitations, teaching: regularization link via, e.g. soft-tying techniques, such that the activation values of the various nodes can be different from each other (Baker ‘960 [0240] and [0273]).
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the different activation values in Baker ‘960. Doing so would enable soft tying of various parameters, e.g. activation values, of different data values (Baker ‘960 [0081]).

Regarding claim 17, the rejection of claim 1 is incorporated. The cited references in combination do not explicitly teach: “comprises selecting the target node upon a determination that an average, over the first set of training data, of the estimate of the partial derivative of the objective function for the base network with respect to an activation function for the target node is less than a threshold value.” Baker ‘960 discloses the claim limitations, teaching: an average computation for the partial derivatives over a training data set (Baker ‘960 [0528]-[0529] and [0367]-[0370]). Wherein the activation values can be less than a threshold value (Baker ‘960 [0400]) and the partial derivatives can be related to an objective function such as an error cost function (Baker ‘960 [0206] and [0459]). 
	Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the average determination in Baker ‘960. Doing so would enable “accurate estimates for a small percentage of the partial derivatives” (Baker ‘960 [0370]).
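
For illustration only, the averaging criterion quoted from claim 17 might be sketched as follows. The name `nodes_with_small_mean_gradient` is hypothetical, and comparing the magnitude of the average (rather than the signed average) to the threshold is an assumption of this sketch, not a statement of what the claim or Baker ‘960 requires.

```python
def nodes_with_small_mean_gradient(grad_estimates, threshold):
    """Return ids of nodes whose averaged partial-derivative estimate is small.

    grad_estimates maps a node id to per-datum estimates of the partial
    derivative of the objective with respect to that node's activation.
    """
    selected = []
    for node_id, grads in grad_estimates.items():
        mean_grad = sum(grads) / len(grads)
        if abs(mean_grad) < threshold:  # assumption: magnitude of the average
            selected.append(node_id)
    return selected


# Hypothetical estimates: n1 averages to zero although individual data do not.
grad_estimates = {
    "n1": [0.5, -0.5, 0.4, -0.4],
    "n2": [0.3, 0.2, 0.4],
}
candidates = nodes_with_small_mean_gradient(grad_estimates, threshold=0.01)
```

Node `n1` shows why such a criterion can be informative: the average over the data is zero even though the per-datum derivatives are not.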


Regarding claim 18, the rejection of claim 17 is incorporated. Choi teaches:
	The method of claim 17, further comprising, by the programmed computer system:
	prior to step (c), selecting a target datum ([0089], [0100]-[0103], and [0134]: describing target data/criteria, e.g. having an activation frequency value within some predetermined threshold or a performance metric.); and 
	selecting the target-specific improvement network sub-component based on a combination of the selected target node and the selected target datum ([0089]-[0091], [0103], [0105], and [0135]: describing that a new node can be generated based on considerations involving the selected node and the target data/criteria.).

Regarding claim 19, the rejection of claim 18 is incorporated. The cited references in combination do not explicitly teach: “comprises selecting an arbitrary datum in the first set of training data for which a magnitude of a partial derivative for the target node is non-zero and greater than a magnitude of the partial derivative averaged over a set of data.” Baker ’960 discloses the claim limitations, teaching: a data selector that can select arbitrary data (Baker ‘960 [0135]) and can have a partial derivative with a magnitude greater than a specified value (Baker ‘960 [0353] and [0371]) and non-zero (Baker ‘960 [0360] and [0716]). Wherein the data can comprise an arbitrary set of data (Baker ‘960 [0359] and [0709]).
	Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the partial derivative and arbitrary data in Baker ‘960. Doing so would enable “improving the aggressive development of machine learning systems…. [Wherein] various systems and methods can be utilized to separate the process of detailed learning and knowledge acquisition and the process of imposing restrictions and smoothing estimates, thereby allowing machine learning systems to aggressively learn from training data, while mitigating the effects of overfitting on the training data.” (Baker ‘960 Abstract).
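
For illustration only, the datum selection quoted from claim 19 might be sketched as follows. The name `candidate_data_indices` and the use of a plain list of per-datum derivatives for a single target node are hypothetical assumptions. Any of the returned indices could then be chosen arbitrarily.

```python
def candidate_data_indices(per_datum_derivs):
    """Return indices of data whose derivative magnitude is non-zero and
    greater than the magnitude averaged over the whole set."""
    mean_magnitude = sum(abs(d) for d in per_datum_derivs) / len(per_datum_derivs)
    return [
        i
        for i, d in enumerate(per_datum_derivs)
        if d != 0.0 and abs(d) > mean_magnitude
    ]


# Hypothetical per-datum partial derivatives for one target node.
derivs = [0.0, 0.1, -0.6, 0.1]
eligible = candidate_data_indices(derivs)
```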

Regarding claim 20, the rejection of claim 18 is incorporated. The cited references in combination do not explicitly teach: “comprises selecting a datum in the first set of training data for which a value of an absolute value of the derivative for the target node is greater than a threshold value.” Baker ’960 discloses the claim limitations, teaching: selection of a node via an absolute value determination of a partial derivative in correlation with the node that has “an absolute value above some specified threshold” (Baker ‘960 [0459]). Wherein the computation on the node can comprise training data (Baker ‘960 [0457], [0460], and [0465]-[0466]). 
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the absolute value in Baker ‘960. Doing so would enable “improving the aggressive development of machine learning systems…. [Wherein] various systems and methods can be utilized to separate the process of detailed learning and knowledge acquisition and the process of imposing restrictions and smoothing estimates, thereby allowing machine learning systems to aggressively learn from training data, while mitigating the effects of overfitting on the training data.” (Baker ‘960 Abstract). 



Regarding claim 25, the rejection of claim 1 is incorporated. Choi teaches:
The method of claim 1, wherein the target-specific improvement network sub-component, when added to the base neural network, changes a layer structure of the base neural network ([0082] and [0085]: describing that the structure of the initial NN can change via extension of the NN with additional nodes, wherein such an extension can provide an improvement to the performance of the NN.).

Regarding claim 26, the rejection of claim 1 is incorporated. Choi teaches:
The method of claim 1, wherein: 
the target-specific improvement network sub-component comprises a second node ([0090]-[0091] and [0102]: describing the generation of new nodes.); 
there is a node-to-node relationship regularization link between the second node and the target node (Figs. 7-9: showing a connection link between the selected nodes and additional nodes. Wherein the link can regulate elements such as weights or errors between the nodes ([0116]-[0117], [0119], and [0122]).); and 
the node-to-node relationship regularization link imposes a node-specific regularization cost on the second node for a training datum if the activation value computed for the target node during a prior feed forward computation for the training datum violates a specified relation for the node-to-node relationship regularization link ([0082]-[0084]: describing that the connection links with corresponding weights can be analyzed and updated as needed to reduce errors in correlation with the new and selected nodes and the training data. Wherein the computation to reduce the errors can denote a regularization cost related to a violation, e.g. an error between an actual output vs. an expected output, and the computation can occur in a prior feed forward computation in order to perform a back propagation of the errors. Whereby the connection links with corresponding weights are related to an activation function ([0074]).).

Regarding claim 27, the rejection of claim 26 is incorporated. Choi teaches:
The method of claim 26, wherein: 
there is a node-to-node relationship regularization link between every node of the base network and a corresponding node of the expanded network (Figs. 7-9: showing a connection link between the selected nodes and additional nodes. Wherein the link can regulate elements such as weights or errors between the nodes ([0116]-[0117], [0119], and [0122]).); and 
the node-to-node relationship regularization links impose node-specific regularization costs for the training datum on each node in the expanded network if the activation value computed for the corresponding node in the base network for the training datum during a prior feed forward computation violates a specified relation for the node-to-node relationship regularization link ([0082]-[0084]: describing that the connection links with corresponding weights can be analyzed and updated as needed to reduce errors in correlation with the new and selected nodes and the training data. Wherein the computation to reduce the errors can denote a cost related to a violation, e.g. an error between an actual output vs. an expected output, and the computation can occur in a prior feed forward computation in order to perform a back propagation of the errors. Whereby the connection links with corresponding weights are related to an activation function ([0074]).).

Regarding claim 28, the rejection of claim 26 is incorporated. Choi teaches:
The method of claim 26, wherein the specified relation is that an activation value for the second node for the training datum equals the activation value for the target node for the training datum ([0095]: describing the additional nodes can have duplicate connection weight values as the selected node. Wherein such weights can comprise activation function values ([0074]).).

Regarding claim 30, the rejection of claim 26 is incorporated. The cited references in combination do not explicitly teach: “wherein a strength of the node-to-node relationship regularization link is controlled by a node-to-node relationship regularization link hyperparameter”. Baker ‘960 discloses the claim limitations, teaching: that the strength of the neural network and its respective linked nodes and objectives are related to hyperparameter values (Baker ‘960 [0239], [0243], [0250], [0273], and [0519]-[0520]).
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the regularization and hyperparameter in Baker ‘960. Doing so would enable “improving the aggressive development of machine learning systems…. [Wherein] various systems and methods can be utilized to separate the process of detailed learning and knowledge acquisition and the process of imposing restrictions and smoothing estimates, thereby allowing machine learning systems to aggressively learn from training data, while mitigating the effects of overfitting on the training data.” (Baker ‘960 Abstract). 
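
For illustration only, a node-to-node regularization cost whose strength is set by a hyperparameter, in the sense discussed for claims 26, 28, and 30, might be sketched as follows. The quadratic penalty, the name `is_equal_to_cost`, and the parameter `strength` are assumptions of this sketch and are not drawn from Choi or Baker ‘960.

```python
def is_equal_to_cost(target_activation, second_activation, strength):
    """Penalty on the second node when its activation differs from the
    target node's activation computed in a prior feed-forward pass."""
    diff = second_activation - target_activation
    return strength * diff * diff


def is_equal_to_cost_grad(target_activation, second_activation, strength):
    """Derivative of the penalty with respect to the second node's activation."""
    return 2.0 * strength * (second_activation - target_activation)


# The relation holds: no cost. The relation is violated: cost scales with strength.
no_cost = is_equal_to_cost(0.5, 0.5, strength=2.0)
weak = is_equal_to_cost(0.0, 1.0, strength=0.5)
strong = is_equal_to_cost(0.0, 1.0, strength=4.0)
```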


Regarding claim 31, the rejection of claim 26 is incorporated. The cited references in combination do not explicitly teach: “wherein a value of the node-to-node relationship regularization link hyperparameter is controlled by an intelligent learning management system”. Baker ‘960 discloses the claim limitations, teaching: that the computer system can evaluate and determine the hyperparameter values that can control a strength in correlation with the neural network and its respective linked nodes and objectives (Baker ‘960 [0236], [0243], [0319], and [0519]). Wherein the computer system can comprise “a machine learning system”, i.e. an intelligent learning management system (Baker ‘960 [0066]). 
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the intelligent system in Baker ‘960. Doing so would enable “hyperparameter tuning” to optimize a machine learning system (Baker ‘960 [0265]).

Regarding claim 32, the rejection of claim 1 is incorporated. The cited references in combination do not explicitly teach: “determining a range, over a selected set of data, of a value of a summation of incoming connections to the target node; and upon a determination that the range is greater than a threshold value, creating a second node, wherein incoming connections for the second node are initialized by copying the incoming connections and weights from the target node, and wherein a bias for the second node is initialized to discriminate a first datum in the selected set of data from a second datum in the selected set of data.” Baker ‘960 discloses the claim limitations, teaching: 
“determining a range, over a selected set of data (Baker ‘960 [0255], [0400], and [0449]: describing a range of data), of a value of a summation of incoming connections to the target node (Baker ‘960 [0320], [0400], and [0415]: describing a summation of the data. Wherein a summing neuron can perform a summation (Baker ‘960 [0456].); and 
upon a determination that the range is greater than a threshold value (Baker ‘960 [0261]-[0262] and [0400]: describing that the range, e.g. via a standard deviation, can be greater than some specified value or threshold.), 
creating a second node, wherein incoming connections for the second node are initialized by copying the incoming connections and weights from the target node (Baker ‘960 [0456], [0461], and [0492]: describing a copy of a connection of the new/additional nodes with the previous nodes that they are connected to.), and 
wherein a bias for the second node is initialized to discriminate a first datum in the selected set of data from a second datum in the selected set of data (Baker ‘960 [0260] and [0529]-[0531]: describing a bias for the nodes, wherein the bias can correlate with a splitting of the data to enable the nodes to be more decisive, i.e. discriminatory towards the data.).”
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the bias and range determinations in Baker ‘960. Doing so would enable “improving the aggressive development of machine learning systems…. [Wherein] various systems and methods can be utilized to separate the process of detailed learning and knowledge acquisition and the process of imposing restrictions and smoothing estimates, thereby allowing machine learning systems to aggressively learn from training data, while mitigating the effects of overfitting on the training data.” (Baker ‘960 Abstract).
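
For illustration only, the range test and second-node initialization quoted from claim 32 might be sketched as follows. The name `maybe_create_second_node` is hypothetical, and placing the new bias midway between the smallest and largest weighted sums is one assumed way of making the second node discriminate between two data items. It is not asserted to be the method of the application or of Baker ‘960.

```python
def weighted_sum(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))


def maybe_create_second_node(weights, data, threshold):
    """Create a second node if the range of the target node's summed input
    over `data` exceeds `threshold`; otherwise return None."""
    sums = [weighted_sum(weights, x) for x in data]
    lowest, highest = min(sums), max(sums)
    if highest - lowest <= threshold:
        return None
    # Copy the incoming weights; set the bias so the second node's
    # pre-activation is negative for the lowest-sum datum and positive
    # for the highest-sum datum (assumed initialization).
    return {"weights": list(weights), "bias": -(lowest + highest) / 2.0}


# Hypothetical target-node weights and selected data.
weights = [1.0, 1.0]
data = [[0.0, 0.0], [2.0, 2.0], [1.0, 0.0]]
second = maybe_create_second_node(weights, data, threshold=3.0)
```

With these values the summed inputs are 0, 4, and 1, so the range of 4 exceeds the threshold of 3 and a second node is created with a bias of -2.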

Regarding claim 35, the rejection of claim 1 is incorporated. Choi teaches:
The method of claim 1, wherein adding the target-specific improvement network sub-component comprises, by the programmed computer system: 
creating an expanded network that doubles the base network by having two nodes for each node in the base network, such that each node in the base network has first and second corresponding nodes in the expanded network ([0088], [0120], [0135], and [0138]: describing the generation of new additional nodes for the expanded neural networks, wherein “a plurality of nodes are selected and a plurality of new nodes are generated”. Whereby doing so can result in a doubling of the original NN.); and 
creating a node-to-node relationship regularization link from one or more nodes in the base network to each of the one or more node’s first and second corresponding nodes (Figs. 7-9: showing a connection link between the selected nodes and additional nodes. Wherein the link can regulate elements such as weights or errors between the nodes ([0116]-[0117], [0119], and [0122]).).

Regarding claim 36, the rejection of claim 35 is incorporated. Choi teaches:
The method of claim 35, wherein creating the node-to-node relationship regularization link from the one or more nodes in the base network to the first and second corresponding nodes in the expanded network comprises creating an is-equal-to regularization link from the one or more nodes in the base network to the first and second corresponding nodes in the expanded network ([0094] and [0116]-[0117]: describing that the connection link edges between the new additional nodes in the extended NN and the selected nodes in the base network can be equal as a result of equal connection weights.).

Regarding claim 37, the rejection of claim 35 is incorporated. Choi teaches:
The method of claim 35, wherein creating the node-to-node relationship regularization link from the one or more nodes in the base network to the one or more nodes first and second corresponding nodes in the expanded network comprises creating a directional is-not-equal-to regularization link from the one or more nodes in the base network to the one or more nodes first and second corresponding nodes in the expanded network ([0083]: describing a forward direction and a backward direction, i.e. a bi-directionality, for the connection links and their corresponding weight of the nodes in the NN. The bi-directionality links being utilized to estimate and minimize errors, i.e. regularization of the links. Wherein the relationship in the direction link is-not-equal since the connection weights differ, i.e. are not equal, during a training of the neural network because the weights are being updated/changed in the backward direction to minimize the errors.).

Regarding claim 38, the rejection of claim 37 is incorporated. Choi teaches:
The method of claim 37, wherein the one or more directional is-not-equal-to regularization links comprise one or more bidirectional is-not-equal-to regularization links ([0083]: describing a forward direction and a backward direction, i.e. a bi-directionality, for the connection links and their corresponding weight of the nodes in the NN. The bi-directionality links being utilized to estimate and minimize errors, i.e. regularization of the links. Wherein the relationship in the direction link is-not-equal since the connection weights differ, i.e. are not equal, during a training of the neural network because the weights are being updated/changed in the backward direction to minimize the errors.).

Regarding independent claim 39, claim 39 is substantially similar to independent claim 1 and therefore is rejected on the same grounds as claim 1. Claim 39 is a system claim that corresponds to method claim 1. A mapping is shown below for the limitations of claim 39 that differ from claim 1. Choi teaches:
A computer system for training a neural network, the computer system comprising: 
one or more processor units ([0153] and [0163]-[0164]: describing various processors.); and 
memory in communication with the one or more processor units, where the memory stores computer instructions that, when executed by the one or more processor units, cause the one or more processor units, to ([0163]-[0164] and [0166]-[0167]: describing memory with instructions that can be executed by the processing units.): 
…
wherein the memory stores computer instructions that, when executed by the one or more processor units, causes the one or more processor units ([0163]-[0164] and [0166]-[0167]: describing memory with instructions that can be executed by the processing units.) to add the target-specific improvement network sub-component by ([0088] and [0149]: describing the addition of a new/additional node to a neural network.): ….

Regarding claim 42, claim 42 is substantially similar to claim 6 and therefore is rejected on the same ground as claim 6. Claim 42 is a system claim that corresponds to method claim 6.

Regarding claim 43, claim 43 is substantially similar to claim 10 and therefore is rejected on the same ground as claim 10. Claim 43 is a system claim that corresponds to method claim 10.

Regarding claim 46, claim 46 is substantially similar to claim 14 and therefore is rejected on the same ground as claim 14. Claim 46 is a system claim that corresponds to method claim 14.

Regarding claim 47, claim 47 is substantially similar to claim 15 and therefore is rejected on the same ground as claim 15. Claim 47 is a system claim that corresponds to method claim 15.

Regarding claim 48, claim 48 is substantially similar to claim 16 and therefore is rejected on the same ground as claim 16. Claim 48 is a system claim that corresponds to method claim 16.

Regarding claim 49, claim 49 is substantially similar to claim 17 and therefore is rejected on the same ground as claim 17. Claim 49 is a system claim that corresponds to method claim 17.

Regarding claim 50, claim 50 is substantially similar to claim 26 and therefore is rejected on the same ground as claim 26. Claim 50 is a system claim that corresponds to method claim 26.

Regarding claim 51, claim 51 is substantially similar to claim 27 and therefore is rejected on the same ground as claim 27. Claim 51 is a system claim that corresponds to method claim 27.

Regarding claim 52, claim 52 is substantially similar to claim 32 and therefore is rejected on the same ground as claim 32. Claim 52 is a system claim that corresponds to method claim 32.

Regarding claim 53, claim 53 is substantially similar to claim 35 and therefore is rejected on the same ground as claim 35. Claim 53 is a system claim that corresponds to method claim 35.

Regarding independent claim 54, Choi teaches: 
A method of training a neural network, the method comprising, by a programmed computer system ([0082] and [0086]-[0088]: describing training of a neural network (NN).): 
(a) training, at least partially, a base neural network on a first set of training data ([0082]-[0084] and [0129]: describing training of NN using training data.), 
wherein training the base neural network comprises computing for each datum in the first set of training data, activation values for nodes in the base neural network ([0074] and [0099]-[0102]: describing generation of activation data for the various nodes in the NN.) and
…, 
wherein the base neural network comprises an input layer, an output layer, and one or more inner layers between the input and output layers ([0076]-[0081]: describing that the NN has input layer, hidden layers, and output layer.); 
(b) after step (a), selecting a target training datum ([0085]-[0086], [0149], and [0154]: describing examples of selecting a target training datum, e.g. a target node or a set of text or digital images related to handwritings out of a general training data set stored in memory to train a neural network for performing a particular task, e.g. recognizing English handwriting patterns.); and 
(c) after step (b), adding a target-specific improvement network sub-component to the base neural network to form an expanded neural network, wherein the target-specific improvement network sub-component comprises one or more nodes ([0090]-[0091], [0135], and [0138]: describing adding in an additional/new node to expand the NN.) and wherein the target-specific improvement network sub-component, when added to the neural network, improves performance of the neural network ([0085], [0126], and [0132]: describing improved performance of the NN based on its extension by adding in additional node(s).) for the target training datum ([0085]-[0086], [0100], and [0126]: describing improved performance of the NN in correlation with the target training datum related to the various tasks or nodes. Wherein the datum was previously described above.). 

While the cited reference teaches the above limitations of claim 54, it does not explicitly teach: “estimates of partial derivatives of an objective function for the base neural network for the nodes in the base neural network” on lines 5-7. Baker discloses the claim limitations, teaching: a determination of the partial derivatives of an objective function, e.g. an error cost function, for each node in a base NN (Baker [0022]-[0024]). 
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the cited reference to include the partial derivatives computation for the base NN in Baker. Doing so would enable a technique to “improve the performance of a network that has converged such that the gradient of the network and all the partial derivatives are zero…. The present system and method can create a new network by splitting the candidate nodes or arcs that diverge from zero and then trains the resulting network with each selected node trained on the corresponding cluster of the data.” (Baker Abstract). 

Regarding claim 55, the rejection of claim 54 is incorporated. Choi teaches:
The method of claim 54, wherein the step of adding the target-specific improvement network sub-component comprises: 
(c1) selecting the target-specific improvement network sub-component ([0087]-[0090], [0100], and [0103]: describing the selection of the additional node.); 
(c2) after step (c1), training the target-specific improvement network sub-component ([0135] and [0138]: describing training of the extended NN comprising the additional node. Similarly, see also [0126] and [0129] as shown in Fig. 11A.); and
(c3) after step (c2), merging the target-specific improvement network sub-component with the neural network to form an expanded neural network ([0092]-[0093]: describing connecting the new node to the NN to create an extended NN.).

Regarding claim 56, claim 56 is substantially similar to claim 4 and therefore is rejected on the same ground as claim 4. Claim 56 is a method claim that corresponds to another method claim 4.

Regarding claim 57, the rejection of claim 54 is incorporated. Baker further teaches:
The method of claim 54, wherein the target training datum is not a member of the set of training data (Baker [0034]-[0035]: describing that the training data is a member either in a first group or in a second group of training data.).
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in Choi to include the target training data computation in Baker. Doing so would enable “different sub-networks of a neural network could be trained with the different groups of data” (Baker [0035]). Whereby doing so can “improve the performance of a network that has converged such that the gradient of the network and all the partial derivatives are zero…. The present system and method can create a new network by splitting the candidate nodes or arcs that diverge from zero and then trains the resulting network with each selected node trained on the corresponding cluster of the data.” (Baker Abstract).

Regarding claim 58, the rejection of claim 54 is incorporated. Baker further teaches:
The method of claim 54, wherein the target training datum is a member of the set of training data (Baker [0034]-[0035] and [0049]: describing that training data is a member in a respective group, e.g. a first group or a second group of training data.).
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in Choi to include the target training data computation in Baker. Doing so would enable “different sub-networks of a neural network could be trained with the different groups of data” (Baker [0035]). Whereby doing so can “improve the performance of a network that has converged such that the gradient of the network and all the partial derivatives are zero…. The present system and method can create a new network by splitting the candidate nodes or arcs that diverge from zero and then trains the resulting network with each selected node trained on the corresponding cluster of the data.” (Baker Abstract).

Regarding claim 59, the rejection of claim 58 is incorporated. The cited references in combination do not explicitly teach: “comprises selecting a data item in the set of training data on which a target node of the base network made a classification error”. Baker ‘960 discloses the claim limitations, teaching: selecting a data example for a model in which a main classifier makes an error (Baker ‘960 [0382]-[0383]). Wherein the model can be a neural network with various nodes (Baker ‘960 [0341] and [0388]).
	Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the classification error in Baker ‘960. Doing so would enable “improving the aggressive development of machine learning systems…. [Wherein] various systems and methods can be utilized to separate the process of detailed learning and knowledge acquisition and the process of imposing restrictions and smoothing estimates, thereby allowing machine learning systems to aggressively learn from training data, while mitigating the effects of overfitting on the training data.” (Baker ‘960 Abstract).

Regarding independent claim 60, claim 60 is substantially similar to independent claim 54 and therefore is rejected on the same grounds as claim 54. Claim 60 is a system claim that corresponds to method claim 54.
A mapping is shown below for the limitations of claim 60 that differ from claim 54. Choi teaches:
A computer system for training a neural network, the computer system comprising: 
one or more processor units ([0153] and [0163]-[0164]: describing various processors.); and 
memory in communication with the one or more processor units, where the memory stores computer instructions that, when executed by the one or more processor units, cause the one or more processor units, to ([0163]-[0164] and [0166]-[0167]: describing memory with instructions that can be executed by the processing units.): ….

Regarding claim 61, the rejection of claim 60 is incorporated. Choi teaches:
The computer system of claim 60, wherein the memory stores computer instructions that, when executed by the one or more processor units, causes the one or more processor units to add the target-specific improvement network sub-component by: 
(c1) select the target-specific improvement network sub-component ([0087]-[0090], [0100], and [0103]: describing the selection of the additional node.); 
(c2) after step (c1), train the target-specific improvement network sub-component ([0135] and [0138]: describing training of the extended NN comprising the additional node. Similarly, see also [0126] and [0129] as shown in Fig. 11A.); and
(c3) after step (c2), merge the target-specific improvement network sub-component with the base neural network to form the expanded neural network ([0092]-[0093]: describing connecting the new node to the NN to create an extended NN.).

Claims 11, 24, and 44 are rejected under 35 U.S.C. 103 as being unpatentable over Choi et al. (U.S. Pat. App. Pre-Grant Pub. No. 2016/0155049, hereinafter Choi), Baker (WIPO No. WO 2018/226527, hereinafter Baker), and Baker (WIPO No. WO 2019/067960, hereinafter Baker ‘960) in view of Baker (WIPO No. WO 2019/067542, hereinafter Baker ‘542). 

Regarding claim 11, the rejection of claim 6 is incorporated. The cited references in combination do not explicitly teach: “comprises a discriminator node that discriminates between the first datum and a second datum in the first set of training data”. Baker ‘542 discloses the claim limitations, teaching: a “discrimination node” that can discriminate with regard to a pair of data items (Baker ‘542 [0055]-[0056]). 
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the discrimination node in Baker ‘542. Doing so would enable “a combined machine-learning system comprising an ensemble of machine-learning systems 102A-C and a joint optimization network 104, in which the members of the ensemble are neural networks trained to optimize a joint objective from the joint optimization network. Each member 102A, 102B, 102C of the ensemble illustrated in Figure 1 is a neural network that has been pre-trained or that may be trained to optimize its individual objective 103A, 103B, or 103C….” (Baker ‘542 [0011]). The ensemble machine learning being optimized for continued learning (Baker ‘542 [0054]). 

Regarding claim 24, the rejection of claim 1 is incorporated. The cited references in combination do not explicitly teach: “comprises training the target-specific improvement network sub-component with one-shot learning”. Baker ‘542 discloses the claim limitations, teaching: a process comprising “one-shot learning, a node, called herein a "template node," is added to a neural network based on a single data item example” (Baker ‘542 [0054]-[0055]). Wherein the node can continue learning via the one-shot learning process (Baker ‘542 [0056]).
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the one-shot learning in Baker ‘542. Doing so would enable the additional ensemble node in the neural network to “continue[] learning from additional training data items” (Baker ‘542 [0054]).

Regarding claim 44, claim 44 is substantially similar to claim 11 and therefore is rejected on the same ground as claim 11. Claim 44 is a system claim that corresponds to method claim 11.

Claims 12, 13, and 45 are rejected under 35 U.S.C. 103 as being unpatentable over Choi et al. (U.S. Pat. App. Pre-Grant Pub. No. 2016/0155049, hereinafter Choi), Baker (WIPO No. WO 2018/226527, hereinafter Baker), and Baker (WIPO No. WO 2019/067960, hereinafter Baker ‘960) in view of Baker (WIPO No. WO 2019/005507, hereinafter Baker ‘507).

Regarding claim 12, the rejection of claim 1 is incorporated. Choi teaches:
	The method of claim 1 wherein the target-specific improvement network sub-component comprises: 
	…; and 
	an error-correction node that passes through an activation value from the target node unless certain conditions apply ([0083]-[0084]: describing nodes for propagating weights based on an error calculation to minimize errors. Wherein the weights being propagated operate in correlation with activation values of the nodes ([0074]).), wherein the conditions comprise 
	(i) the target node made a classification choice ([0084], [0112], [0124], and [0126]: describing output values or classifications that can be made by the NN and its nodes.) and 
(ii) the error prediction node predict that the classification choice by the target node is erroneous ([0084]: determining that an error has occurred when an actual output is different than an expected output.). 

While the cited references in combination teach the above limitations of claim 12, they do not explicitly teach: “an error-prediction node that is trained to detect training data on which the target node makes classification errors” on lines 3-4. Baker ‘507 discloses the claim limitations, teaching: a node that can predict the errors related to the activation functions for data items in the various layers comprising the node (Baker ‘507 [0020]). Wherein a prediction of the error can be associated with the training data (Baker ‘507 [0042]) which can include classification data (Baker ‘507 [0029]). 
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the error-prediction node in Baker ‘507. Doing so would enable guidance in a neural network that “can be provided by aligning sets of nodes or entire layers in a network being trained with sets of nodes in a reference system. This guidance facilitates the trained network to more efficiently learn features learned by the reference system using fewer parameters and with faster training. The guidance also enables training of a new system with a deeper network, i.e., more layers, which tend to perform better than shallow networks. Also, with fewer parameters, the new network has fewer tendencies to overfit the training data.” (Baker ‘507 Abstract). 

Regarding claim 13, the rejection of claim 12 is incorporated. Choi teaches:
	The method of claim 12, wherein the error-correction node reverses an output of the target node relative to a threshold value when conditions (i) and (ii) apply ([0083]-[0085]: describing error computations and backpropagation through the respective nodes to improve performance of the NN based on a predetermined level.).
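
For illustration only, the behavior recited for the error-correction node in claims 12 and 13 can be sketched as follows. This is a minimal sketch and is not drawn from Choi, Baker ‘507, or Applicant's specification: the function name error_correction_node and its parameters are hypothetical, and reading “reverses an output of the target node relative to a threshold value” as a reflection of the activation about the threshold is an assumption made for the example.

```python
def error_correction_node(target_activation, made_choice, error_predicted,
                          threshold=0.5):
    """Pass the target node's activation through unless both conditions hold.

    Condition (i): the target node made a classification choice.
    Condition (ii): the error-prediction node predicts the choice is wrong.
    When both hold, the activation is reflected about the threshold
    (an assumed reading of "reverses ... relative to a threshold value").
    """
    if made_choice and error_predicted:
        return 2.0 * threshold - target_activation
    return target_activation
```

With a threshold of 0.5, an activation of 0.75 is passed through unchanged when no error is predicted and becomes 0.25 when both conditions apply, so the output moves to the opposite side of the threshold.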

Regarding claim 45, claim 45 is substantially similar to claim 12 and therefore is rejected on the same ground as claim 12. Claim 45 is a system claim that corresponds to method claim 12.

Claims 21-23 are rejected under 35 U.S.C. 103 as being unpatentable over Choi et al. (U.S. Pat. App. Pre-Grant Pub. No. 2016/0155049, hereinafter Choi), Baker (WIPO No. WO 2018/226527, hereinafter Baker), and Baker (WIPO No. WO 2019/067960, hereinafter Baker ‘960) in view of Baker (WIPO No. WO 2018/226492, hereinafter Baker ‘492).

Regarding claim 21, the rejection of claim 1 is incorporated. Choi teaches: 
	The method of claim 1, wherein merging the target-specific improvement network sub-component with the base neural network comprises establishing an incoming connection from a first node in the base network to a first node of the target-specific improvement network sub-component ([0091]-[0093], [0116]-[0117] and [0121]: describing the connection between the new nodes and the selected nodes in the base network.), ….

While the cited references in combination teach the above limitations of claim 21, they do not explicitly teach: “wherein a weight for the incoming connection is initialized to zero prior to training of the expanded network”. Baker ‘492 discloses the claim limitations, teaching: “[a] weight for a new incoming arc may be initially set to zero prior to subsequently training the updated deep neural network” (Baker ‘492 [0223]). 
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the zero-initialized weight for the incoming connection in Baker ‘492. Doing so would enable techniques to “improve a trained base deep neural network by structurally changing the base deep neural network to create an updated deep neural network, such that the updated deep neural network has no degradation in performance relative to the base deep neural network on the training data. The updated deep neural network is subsequently training.” (Baker ‘492 Abstract). 

Regarding claim 22, the rejection of claim 21 is incorporated. Choi teaches:
The method of claim 21, wherein merging the target-specific improvement network sub-component with the base neural network further comprises establishing an outgoing connection from the first node of the target-specific improvement network sub-component to a second node of the base network ([0091]-[0093], [0116]-[0117] and [0121]: describing the connections between the new nodes and the selected nodes in the base network.), ….

While the cited references in combination teach the above limitations of claim 22, they do not explicitly teach: “wherein a weight for the outgoing connection is initialized to zero prior to training of the expanded network”. Baker ‘492 discloses the claim limitations, teaching: “a weight of the new outgoing arc may be initially set to zero prior to subsequently training the updated deep neural network” (Baker ‘492 [0223]).
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the zero-initialized weight for the outgoing connection in Baker ‘492. Doing so would ensure that the “arc weight is initialized to zero, so there is no immediate change in the activations, so no change in performance” when new node elements are added to the NN (Baker ‘492 [0121]).
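
For illustration only, the effect of a zero-initialized outgoing weight can be sketched as follows. This is a minimal sketch and is not drawn from Baker ‘492, Choi, or Applicant's specification: the one-hidden-layer network, the ReLU activation, and the function names (forward, add_node_zero_outgoing) are assumptions made for the example.

```python
def forward(x, w_hidden, w_out):
    """Tiny network: ReLU hidden nodes followed by one linear output node."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)))
              for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))


def add_node_zero_outgoing(w_hidden, w_out, incoming):
    """Add one hidden node whose outgoing weight is initialized to zero.

    Returns new weight lists; the original lists are left unmodified.
    """
    return w_hidden + [list(incoming)], w_out + [0.0]
```

Because the new node's contribution to the output is multiplied by zero, the expanded network computes the same output as the base network until subsequent training changes that weight.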

Regarding claim 23, the rejection of claim 22 is incorporated. Choi teaches:
The method of claim 22, wherein the target node is the second node of the base network, such that there is an outgoing connection from the first node of the target-specific improvement network sub-component to the target node ([0090]-[0093] and [0107]: describing connection of a new node to a selected node.).

Claim 29 is rejected under 35 U.S.C. 103 as being unpatentable over Choi et al. (U.S. Pat. App. Pre-Grant Pub. No. 2016/0155049, hereinafter Choi), Baker (WIPO No. WO 2018/226527, hereinafter Baker), and Baker (WIPO No. WO 2019/067960, hereinafter Baker ‘960) in view of Baker (WIPO No. WO 2018/231708, hereinafter Baker ‘708).

Regarding claim 29, the rejection of claim 28 is incorporated. The cited references in combination do not explicitly teach: “wherein the node-specific regularization cost comprises an absolute value of a difference between the activation value for the second node for the training datum and the activation value for the target node for the training datum”. Baker ‘708 discloses the claim limitations, teaching: an absolute value of a difference for a gate node that comprises several nodes with their respective activation values (Baker ‘708 [0043]-[0045]). 
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the absolute value in Baker ‘708. Doing so would enable techniques to “improve the robustness of a network that has been trained to convergence, particularly with respect to small or imperceptible changes to the input data. Various techniques … can include adding biases to the input nodes of the network, increasing the minibatch size of the training data, adding special nodes to the network that have activations that do not necessarily change with each data example of the training data, splitting the training data based upon the gradient direction, and making other intentionally adversarial changes to the input of the neural network.” (Baker ‘708 Abstract).
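
For illustration only, the node-specific regularization cost recited in claim 29 can be sketched as follows. This is a minimal sketch and is not drawn from Baker ‘708 or Applicant's specification: the function name node_regularization_cost and the strength multiplier are hypothetical.

```python
def node_regularization_cost(act_second, act_target, strength=1.0):
    """Absolute value of the difference between two activation values.

    act_second and act_target are the activation values of the second
    node and the target node for the same training datum; strength is an
    assumed scaling factor for the penalty.
    """
    return strength * abs(act_second - act_target)
```

The cost is zero when the two nodes have identical activation values for the datum and grows linearly as the activation values diverge.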

Claims 33 and 34 are rejected under 35 U.S.C. 103 as being unpatentable over Choi et al. (U.S. Pat. App. Pre-Grant Pub. No. 2016/0155049, hereinafter Choi), Baker (WIPO No. WO 2018/226527, hereinafter Baker), and Baker (WIPO No. WO 2019/067960, hereinafter Baker ‘960) in view of Baker (WIPO No. WO 2018/231708, hereinafter Baker ‘708).

Regarding claim 33, the rejection of claim 32 is incorporated. The cited references in combination do not explicitly teach: “wherein the second datum is selected as datum in the selected set of data that maximizes an absolute value of a difference between the value of the summation of the incoming connections to the target node for the target node and the value of the summation of the incoming connections to the target node for the second datum”. Baker ‘708 discloses the claim limitations, teaching: a selection of values based on an absolute value of a difference for a gate node that comprises several nodes with their respective activation values (Baker ‘708 [0043]-[0045]). Wherein the activation values for the nodes can be summed (Baker ‘708 [0035], [0043], and [0090]). 
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the method of training the NN in the combined cited references to include the absolute value in Baker ‘708. Doing so would enable techniques to “improve the robustness of a network that has been trained to convergence, particularly with respect to small or imperceptible changes to the input data. Various techniques … can include adding biases to the input nodes of the network, increasing the minibatch size of the training data, adding special nodes to the network that have activations that do not necessarily change with each data example of the training data, splitting the training data based upon the gradient direction, and making other intentionally adversarial changes to the input of the neural network.” (Baker ‘708 Abstract).
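
For illustration only, the selection recited in claim 33 can be sketched as follows. This is a minimal sketch and is not drawn from Baker ‘708 or Applicant's specification: the function names (summation, select_second_datum) are hypothetical, and comparing each candidate against the summation for the first datum is an assumed reading of the claim language.

```python
def summation(weights, bias, x):
    """Summation of the incoming connections to the target node for datum x."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def select_second_datum(weights, bias, first, candidates):
    """Pick the candidate whose summation differs most from the first datum's.

    The comparison uses the absolute value of the difference between the
    two summations at the target node.
    """
    s_first = summation(weights, bias, first)
    return max(candidates,
               key=lambda x: abs(summation(weights, bias, x) - s_first))
```

For example, with weights [1.0, 2.0], a bias of 0.0, and a first datum [1.0, 1.0], the candidates [0.0, 0.0], [1.0, 2.0], and [4.0, 3.0] have absolute differences of 3.0, 2.0, and 7.0, so [4.0, 3.0] would be selected.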

Regarding claim 34, the rejection of claim 33 is incorporated. Choi teaches:
The method of claim 33, wherein adding the target-specific improvement network sub-component further comprises adding a connection from the second node to the target node ([0090]-[0093] and [0107]: describing connection of a new node to a selected node.).

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

The prior art made of record and not relied upon is considered pertinent to Applicant's disclosure:
Azimi-Sadjadi et al., “Recursive Dynamic Node Creation in Multilayer Neural Networks”: describing a recursive least squares process for creating/adding new nodes into a multilayered neural network. Wherein the process comprises computing the index of performance of nodes in the neural network over training time t for a set of training data. The process also includes calculating time and weight updates for the nodes in order to create/add the new nodes. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SELENE A HAEDI whose telephone number is (571)270-5762. The examiner can normally be reached M-F 11 AM - 7 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, OMAR FERNANDEZ RIVAS can be reached on (571)272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/S.H./Examiner, Art Unit 2128

/OMAR F FERNANDEZ RIVAS/Supervisory Patent Examiner, Art Unit 2128