DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is responsive to the original application filed on 3/19/2019 and the Remarks and Amendments filed on 6/14/2022.  

Claim Objections

Claims 21, 22, and 23 are objected to because of the following informalities:  Claims 21, 22, and 23 recite the limitations “designating a node as active in response to an L2 norm of a feature vector in output the node being higher than a predetermined threshold; or designating a node as active in response to an estimate of variance of arithmetic mean of weights of every node under a fixed training cycle being high enough warrant active status” (emphasis added), which are grammatically confusing. For better clarity, the following amendment is suggested: “designating a node as active in response to an L2 norm of a feature vector in an output of the node being higher than a predetermined threshold; or designating a node as active in response to an estimate of a variance of an arithmetic mean of weights of every node under a fixed training cycle being high enough to warrant active status.” (emphasis added).  Appropriate correction is required.
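For illustration of the first alternative in the suggested amendment, the following is a minimal sketch and not part of the claim language. It assumes each node exposes its output as a plain list of numbers; the node names, the sample values, and the threshold of 1.0 are hypothetical.

```python
import math


def l2_norm(vector):
    """Euclidean (L2) norm of a feature vector."""
    return math.sqrt(sum(x * x for x in vector))


def active_by_l2_norm(node_outputs, threshold):
    """Return the names of nodes whose output feature vector has an
    L2 norm higher than the predetermined threshold.

    node_outputs: dict mapping a (hypothetical) node name to the
    feature vector found in that node's output.
    """
    return [name for name, vec in node_outputs.items()
            if l2_norm(vec) > threshold]


# Hypothetical outputs: node_a has norm 5.0, node_b has a norm well below 1.0.
outputs = {"node_a": [3.0, 4.0], "node_b": [0.1, 0.2]}
active_nodes = active_by_l2_norm(outputs, threshold=1.0)
```

Under these assumed values only node_a would be designated as active.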

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 21, 22, and 23 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claims 21, 22, and 23 recite the limitation “designating a node as active in response to an estimate of variance of arithmetic mean of weights of every node under a fixed training cycle being high enough warrant active status” (emphasis added). The term “high enough” is a relative term which renders the claim indefinite. The term “high enough” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. What does it mean for the estimate to be “high enough” to “warrant” active status?  This limitation is unclear.  For examination purposes, the limitation will be interpreted to designate a node as active if an estimate of a variance of an arithmetic mean of weights of every node under a fixed training cycle is above a threshold.  Appropriate correction is required.
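The interpretation adopted for examination purposes can be sketched as follows. This is only one possible reading, offered under stated assumptions: the arithmetic mean of a node's weights is taken at each snapshot within a fixed training cycle, the sample variance of those means serves as the estimate, and the node names, snapshot values, and threshold of 0.1 are hypothetical.

```python
from statistics import mean, variance


def variance_of_mean_weight(weight_snapshots):
    """Sample variance, across snapshots, of the arithmetic mean of a
    node's weights. Requires at least two snapshots."""
    means = [mean(weights) for weights in weight_snapshots]
    return variance(means)


def active_by_variance(history, threshold):
    """Return the names of nodes whose variance estimate is above threshold.

    history: dict mapping a (hypothetical) node name to the list of
    weight snapshots recorded for it during one fixed training cycle.
    """
    return [name for name, snapshots in history.items()
            if variance_of_mean_weight(snapshots) > threshold]


# Hypothetical history: node_a's mean weight moves 0 -> 1 -> 2,
# while node_b's mean weight stays constant at 0.5.
history = {
    "node_a": [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
    "node_b": [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],
}
active_nodes = active_by_variance(history, threshold=0.1)
```

A fixed numeric threshold of this kind is what replaces the relative term “high enough” in the adopted interpretation.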

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.



Claims 1-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Zoph et al. (US 20200265315 A1, hereinafter “Zoph”).

	Regarding claim 1, Zoph discloses [a] neural architecture search method comprising: ([0005]; “This specification describes how a system implemented as computer programs on one or more computers in one or more locations can determine, using a controller neural network, an architecture for a neural network that is configured to perform a particular neural network task”; and Abstract)
selecting a neural architecture for training as part of an automated machine learning process; ([0006]; “select a neural network architecture that will result in a high-performing neural network for a particular task”; and [0007]; “In particular, by limiting the search space to paths within a large model and therefore sharing parameter values between candidate architectures during a given round of search, the system effectively constrains the search space and limits the computational resources required for training while still being able to determine effective architectures that result in high-performing neural networks”; and [0031]; “In particular, the system 100 maintains large neural network data 140 that defines the large neural network as a directed acyclic graph (DAG), i.e., the neural network data 140 represents a DAG that defines the architecture of the large neural network and, therefore, the search space for the architecture search process”, the DAG is used to select a neural architecture for training)
training the selected neural network architecture with a training set ([0046]; “The training engine 120 then trains the large neural network with the architecture defined by the sampled output sequence active to determine updated large neural network parameter values 142 for those components that are active during the training”; and Figure 3, 310; and [0077]; “As described above, the system can train the architecture for a specified number of iterations or for one pass through the training data”)
collecting statistical parameters on individual nodes of the neural architecture during the training; (Figure 2A; the figure discloses, under a broadest reasonable interpretation of the claim language, collecting statistical parameters or output measurements of the individual nodes (shown as nodes 104 in the figure) at various time steps during the training; and [0056]; the controller, as described in the paragraph, collects statistical information for each node in figure 2a during various time steps during the training; and [0057]; “generate an output for the time step that defines a score distribution over possible values of the output at the time step”; and [0058-0061])
determining, based on the statistical parameters, active nodes of the neural architecture to form a candidate neural architecture; and ([0030]; “By selecting a subset of components of the large neural network that should be active during processing, the system 100 identifies a high-quality architecture that is computationally feasible and that can be trained to generate high-quality network outputs.”, the subsets of components that are selected are the active nodes of the neural architecture that are used to form the candidate architecture; and [0046]; “The training engine 120 then trains the large neural network with the architecture defined by the sampled output sequence active to determine updated large neural network parameter values 142 for those components that are active during the training”; and [0077]; “update the large neural network parameters of the components that are designated as active by the sampled output sequence”; and [0066]; “the system can select more than one of the incoming edges to the node to be active in order to form a skip connection”)
validating the candidate neural architecture to produce a trained neural architecture to be used in an application or a service ([0048]; “the system 100 can evaluate the performance of the architecture defined by each new output sequence on the validation set 104 and then select the highest-performing architecture as the final architecture.”, the evaluation being the validating of the candidate architecture; and [0092]; “Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front end component, e.g., a client computer having a graphical user interface, a web browser, or an app through which a user can interact with an implementation of the subject matter described in this specification”, which discloses that the NN architecture search system can be implemented in an application or service; and [0026]; “The neural architecture search system 100 is a system that obtains training data 102 for training a neural network to perform a particular task and a validation set 104 for evaluating the performance of the neural network on the particular task and uses the training data 102 and the validation set 104 to determine an architecture for a neural network that is configured to perform the particular task.”).

Regarding claim 8, it is a system claim corresponding to the steps of claim 1, and is rejected for the same reasons as claim 1.

Regarding claim 15, it is a non-transitory computer-readable media claim corresponding to the steps of claim 1, and is rejected for the same reasons as claim 1.

Regarding claims 2, 9, and 16, the rejection of claims 1, 8, and 15 are incorporated and Zoph further discloses receiving an input dataset; forming the training set as a first subset of the input dataset to be used for the training of the neural architecture; and forming a validation set as a second subset of the input data set to be used for validating the candidate neural architecture ([0026]; “The neural architecture search system 100 is a system that obtains training data 102 for training a neural network to perform a particular task and a validation set 104 for evaluating the performance of the neural network on the particular task and uses the training data 102 and the validation set 104 to determine an architecture for a neural network that is configured to perform the particular task.”; and [0027-0028]).
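The limitation of forming a training set and a validation set as first and second subsets of an input dataset can be sketched as below. This is a generic illustration and is not drawn from Zoph or from the specification; the function name, the 80/20 split, and the fixed seed are assumptions.

```python
import random


def split_dataset(dataset, train_fraction=0.8, seed=0):
    """Shuffle the input dataset and split it into two disjoint subsets:
    a training set (first subset) and a validation set (second subset)."""
    items = list(dataset)
    random.Random(seed).shuffle(items)  # seeded so the split is repeatable
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]


# Hypothetical input dataset of ten examples.
training_set, validation_set = split_dataset(range(10))
```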

Regarding claims 3, 10, and 17, the rejection of claims 1, 8, and 15 are incorporated and Zoph further discloses determining that the candidate neural architecture is not validated; and iteratively repeating the selecting, training, collecting, determining and validating until an updated candidate neural architecture is validated ([0046]; “For example, the training engine 120 can train the large neural network for an entire pass through the training data 102 or for a specified number of training iterations”, which discloses the iterative training processes; and [0048]; “In implementations where multiple new output sequences are generated, the system 100 can evaluate the performance of the architecture defined by each new output sequence on the validation set 104 and then select the highest-performing architecture as the final architecture. Alternatively, the system 100 can further train each selected architecture and then evaluate the performance of each of the architectures after the further training”; and [0070]; “For each output sequence in the batch, the system evaluates the performance of the architecture defined by the sequence to determine a performance metric for the trained instance on the particular neural network task (step 304). For example, the performance metric can be an accuracy of an instance of the large neural network having the architecture on the validation set or a subset of the validation set as measured by an appropriate accuracy measure. For example, the accuracy can be based on a perplexity measure when the outputs are sequences or a classification error rate when the task is a classification task”, wherein determining whether an architecture is not validated is when the accuracy or error rate is below a threshold. This process is repeated until an architecture is sufficiently accurate and therefore validated).
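The reading applied above, in which a candidate is treated as not validated while its accuracy is below a threshold and the process repeats until a candidate is validated, can be sketched as follows. The architecture names, the stand-in accuracy values, and the 0.85 threshold are hypothetical, and the `evaluate` callable stands in for training a candidate and measuring it on the validation set.

```python
def search_until_validated(candidates, evaluate, min_accuracy):
    """Evaluate candidate architectures in turn and return the first one
    whose validation accuracy meets the threshold, together with that
    accuracy and the number of iterations used."""
    for iteration, candidate in enumerate(candidates, start=1):
        accuracy = evaluate(candidate)
        if accuracy >= min_accuracy:
            return candidate, accuracy, iteration
    # No candidate was validated within the supplied list.
    return None, None, len(candidates)


# Stand-in accuracies; a real evaluate() would train and validate each candidate.
stub_accuracy = {"arch_1": 0.62, "arch_2": 0.71, "arch_3": 0.90}
result = search_until_validated(list(stub_accuracy), stub_accuracy.get,
                                min_accuracy=0.85)
```

With these assumed values the first two candidates are not validated and the loop stops at the third.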

Regarding claims 4 and 18, the rejection of claims 1, 3, 15, and 17 are incorporated and Zoph further discloses wherein during each iteration, a different neural architecture is selected ([0046]; “For example, the training engine 120 can train the large neural network for an entire pass through the training data 102 or for a specified number of training iterations”, which discloses the iterative training processes; and [0048]; “In implementations where multiple new output sequences are generated, the system 100 can evaluate the performance of the architecture defined by each new output sequence on the validation set 104 and then select the highest-performing architecture as the final architecture. Alternatively, the system 100 can further train each selected architecture and then evaluate the performance of each of the architectures after the further training”).

Regarding claim 11, the rejection of claims 8 and 10 are incorporated and Zoph further discloses wherein during each iteration, the controller selects a different neural architecture compared to previous iterations ([0046]; “For example, the training engine 120 can train the large neural network for an entire pass through the training data 102 or for a specified number of training iterations”, which discloses the iterative training processes; and [0048]; “In implementations where multiple new output sequences are generated, the system 100 can evaluate the performance of the architecture defined by each new output sequence on the validation set 104 and then select the highest-performing architecture as the final architecture. Alternatively, the system 100 can further train each selected architecture and then evaluate the performance of each of the architectures after the further training”; and Figure 1; the figure discloses the controller).

Regarding claims 5, 12, and 19, the rejection of claims 1, 3, 4, 8, 10, 11, 15, 17, and 18 are incorporated and Zoph further discloses wherein during each iteration, the updated neural architecture is comprised of a first set of nodes and a second set of nodes, (Figure 2A;  the figure discloses the two sets of nodes) wherein the first set of nodes includes active nodes of at least one neural architecture trained in one or more previous iterations and the second set of nodes includes active nodes of a neural architecture trained in a current iteration ([0030]; “By selecting a subset of components of the large neural network that should be active during processing, the system 100 identifies a high-quality architecture that is computationally feasible and that can be trained to generate high-quality network outputs.”, the subsets of components that are selected are the active nodes of the neural architecture that are used to form the candidate architecture; and [0046]; “The training engine 120 then trains the large neural network with the architecture defined by the sampled output sequence active to determine updated large neural network parameter values 142 for those components that are active during the training”; and [0077]; “update the large neural network parameters of the components that are designated as active by the sampled output sequence”; and [0066]; “the system can select more than one of the incoming edges to the node to be active in order to form a skip connection”).

Regarding claims 7 and 14, the rejection of claims 1, 3, 8, and 10 are incorporated and Zoph further discloses storing, in a database, statistical information on the selecting, collecting, determining and validating of the candidate neural architecture; ([0083-0084])
wherein determining active nodes of any further candidate neural architecture further comprises: accessing the database to retrieve the statistical information; and ([0083-0084])
determining an updated neural architecture based on the statistical information (Figure 1, 130; and [0043-0044]).

Response to Arguments

Applicant’s arguments and amendments, filed on 6/14/2022, with respect to the objection to claims 1-20 have been fully considered and are persuasive.  The objection to claims 1-20 is withdrawn.  However, newly presented claims 21, 22, and 23 are presently objected to based on grammatical clarity issues as discussed above.

Applicant’s arguments and amendments, filed on 6/14/2022, with respect to the 35 USC § 101 rejection of claims 1-20 have been fully considered and are persuasive.  The 35 USC § 101 rejection of claims 1-20 is withdrawn.  

Applicant’s arguments and amendments, filed on 6/14/2022, with respect to the 35 USC § 102(a)(2) rejection of claims 1-5, 7-12, and 14-19 have been fully considered and are not persuasive.  

Beginning on page 8 of the remarks, filed on 6/14/2022, Applicant argues that “Zoph defines at [0040] how "active" nodes are identified: "Thus, the components specified as active by a given output sequence are (i) any components that are fixed and are not part of the search process and (ii) the active components within the DAG, i.e., the parameter matrices corresponding to the connectivity defined by the output sequence and the components that perform the operations specified by the output sequence." This Zoph definition of its active does not call for or rely upon any collected statistical parameters. There is no teaching or suggestion that Zoph uses the collected analytics from Zoph [0056]-[0061] to determine whether a node is or is not an active node”.  Examiner respectfully disagrees with Applicant’s analysis for Zoph teaching the limitations “collecting statistical parameters on individual nodes of the neural architecture during the training; determining, based on the statistical parameters, active nodes of the neural architecture to form a candidate neural architecture”.  

First, Zoph discloses the limitation “collecting statistical parameters on individual nodes of the neural architecture during the training” at least at paragraphs [0056] and [0058-0061] as well as in Figure 2A of the drawings.  Specifically, Zoph discloses that the collected statistical parameters are “a score distribution over possible output values at the time step” from which an output value at the time step in the output sequence of one of the plurality of nodes in the neural network architecture is determined.  Further, paragraph [0055] and Figure 2A of Zoph disclose the different time steps that correspond to different nodes of the neural network.  Each time step has a corresponding output associated with a corresponding node.  It is these individual outputs that are used in determining whether or not the node is “active” at the particular time step.  

Turning to the limitation “determining, based on the statistical parameters, active nodes of the neural architecture to form a candidate neural architecture”, Zoph discloses this limitation in at least paragraphs [0030], [0046], [0066], and [0077].  The Examiner did not cite to paragraph [0040] in the rejection to demonstrate how active nodes are identified.  Notably, however, paragraph [0039] of Zoph discloses “Collectively, the outputs in a given output sequence define a subset of components that are active within the large neural network. Output sequences are discussed in more detail below with reference to FIGS. 2A-2B.”  This suggests that the outputs in a given sequence, such as at a particular time step that is associated with a particular node in the neural network, define (or identify) a subset of components (or nodes) that are active. This determination of whether or not a node is active is based on statistical parameters or score distributions from which an output value at the time step in the output sequence of one of the plurality of nodes in the neural network architecture is determined.  Thus, the determination of whether a node is “active” is based on the statistical parameters or score distributions that result in an output sequence at a particular time step associated with a particular node in the neural network.  
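The relationship relied upon above, in which a score distribution at each time step yields an output value and the resulting output sequence identifies the active components, can be sketched as follows. This is a simplified illustration of the Examiner's reading and is not Zoph's implementation; the softmax normalization, the per-step choices, and the raw scores are all assumptions.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into a probability distribution over possible outputs."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_output_sequence(step_scores, step_choices, rng):
    """For each time step, sample one output value from that step's score
    distribution. The sampled sequence names the components treated as active."""
    sequence = []
    for scores, choices in zip(step_scores, step_choices):
        distribution = softmax(scores)
        sequence.append(rng.choices(choices, weights=distribution, k=1)[0])
    return sequence


# Hypothetical search space: step 1 picks an operation, step 2 picks an incoming edge.
step_choices = [["tanh", "relu"], ["edge_from_1", "edge_from_2"]]
step_scores = [[2.0, 0.5], [0.1, 1.2]]
sequence = sample_output_sequence(step_scores, step_choices, random.Random(0))
```

Each entry of `sequence` is drawn from the distribution for its own time step, so the active components follow from the per-step score distributions.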

Further, paragraph [0077] discloses “The system trains an architecture defined by the sampled output sequence to update the large neural network parameters of the components that are designated as active by the sampled output sequence”, which discloses that the nodes are determined or designated as “active” by the “output sequence” or statistical parameters collected or sampled at a node at a specific time period.

In sum, Applicant’s arguments are not persuasive, and the 35 USC § 102(a)(2) rejection of claims 1-5, 7-12, and 14-19 STANDS.

Examiner’s Comment

Claims 21, 22, and 23 would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and upon properly overcoming the 112(b) rejection of the claims.

Conclusion

Claims 21, 22, and 23 have been searched, but no prior art was uncovered.
The closest prior art of record to claims 21, 22, and 23, Doussof et al. (US 20220076103 A1), discloses at paragraph [0009] determining an active node based on comparing a neuron’s output against a threshold, but fails to explicitly disclose designating a node as active in response to an estimate of variance of arithmetic mean of weights of every node under a fixed training cycle being high enough warrant active status as claimed.  

Further, Nachum et al. (US 20190147339 A1) discloses a shrinking process for neural networks that considers active and inactive nodes, but fails to explicitly disclose designating a node as active in response to an L2 norm of a feature vector in output the node being higher than a predetermined threshold; or designating a node as active in response to an estimate of variance of arithmetic mean of weights of every node under a fixed training cycle being high enough warrant active status as claimed.  

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Doussof et al. (US 20220076103 A1).
Nachum et al. (US 20190147339 A1).

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Brent Hoover whose telephone number is (303)297-4403. The examiner can normally be reached Monday - Friday 9-5 MST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar, can be reached on 571-270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BRENT JOHNSTON HOOVER/Examiner, Art Unit 2127