DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112

The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 4, 11, and 17 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. 
Representative Claim 4 recites, at least in part: 
“…exploitation reward…updating the RNN based on a combination of the exploitation reward and the exploration preference variable…”
The above claim language does not appear to be clearly described in the as-filed specification. 
MPEP 2161.01(I) recites: 
The written description requirement of 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph, applies to all claims including original claims that are part of the disclosure as filed. Ariad, 598 F.3d at 1349, 94 USPQ2d at 1170. As stated by the Federal Circuit, "[a]lthough many original claims will satisfy the written description requirement, certain claims may not." Id. at 1349, 94 USPQ2d at 1170-71; see also LizardTech, Inc. v. Earth Res. Mapping, Inc., 424 F.3d 1336, 1343-46, 76 USPQ2d 1724, 1730-33 (Fed. Cir. 2005); Regents of the Univ. of Cal. v. Eli Lilly & Co., 119 F.3d 1559, 1568, 43 USPQ2d 1398, 1405-06 (Fed. Cir. 1997)("The description requirement of the patent statute requires a description of an invention, not an indication of a result that one might achieve if one made that invention."). Problems satisfying the written description requirement for original claims often occur when claim language is generic or functional, or both. Ariad, 598 F.3d at 1349, 94 USPQ2d at 1171.
Similarly, original claims may lack written description when the claims define the invention in functional language specifying a desired result but the specification does not sufficiently describe how the function is performed or the result is achieved. For software, this can occur when the algorithm or steps/procedure for performing the computer function are not explained at all or are not explained in sufficient detail (simply restating the function recited in the claim is not necessarily sufficient). In other words, the algorithm or steps/procedure taken to perform the function must be described with sufficient detail so that one of ordinary skill in the art would understand how the inventor intended the function to be performed.

As described in the MPEP, the above claim language does not appear to be described in such a clear, concise, and exact manner that a person of ordinary skill in the art would understand how the inventor intended the function (of at least Claim 4) to be performed. 
	To clarify, at least Claim 4 appears to require two separate and distinct variables: the “exploration preference variable” and the “exploitation reward”. Additionally, the RNN is updated based on a “combination” of the above two variables. 
	As an initial point, the term “exploitation reward” is only mentioned in paragraph [0062] of the instant as-filed specification and this paragraph simply reflects the claim language at issue. That is, paragraph [0062] does not further describe or clarify the functionality encompassed by at least Claim 4. 
	As a second point, at least representative Claim 4 appears to require that the RNN controller is updated based on “…a combination…” of the exploration and exploitation variables. However, neither the specification nor the claim language itself clearly describes what the “combination” encompasses or how the inventor intended the function of “combining” to update the RNN to be performed. 
	For at least the above reasons, Claim 4 lacks written description and therefore a rejection under 35 U.S.C. 112(a) is appropriate. 
	The examiner notes, for clarity of record, that Claims 11 and 17 recite similar subject matter to that of representative Claim 4. Therefore, Claims 11 and 17 are rejected under similar grounds. 
	For purposes of examination, the functionality encompassed by “combination” of the two variables to update the RNN as required by Claim 4 is interpreted, under BRI, as any combination of a reward and current inputs or outputs which update an RNN controller using reinforcement learning. 
clearly shows the intended interpretation. 


The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 2, 9, and 16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Representative Claim 2 recites, at least in part: 
Wherein the output model is based on a comparison of the performance levels of each of the trained models.
There are several issues with this limitation: 
1. “the output model” appears to lack antecedent basis. Specifically, claim 1, which claim 2 depends on, recites “outputting one of the first trained model or the second trained model…” Under the broadest reasonable interpretation, claim 2’s “the output model” does NOT appear to encompass either “the first trained model” or “the second trained model” and therefore it is unclear what “the output model” refers to. 
2. “a comparison” is indefinite because it is unclear if claim 2’s “a comparison” is different or the same as “a comparison” as recited in the last limitation of Claim 1. 
3. “each of the trained models” is indefinite because it is unclear whether it refers to only “the third or more trained models” or whether “each of the trained models” also encompasses “the first trained model” and “the second trained model.” 
For at least the above three reasons, Claim 2 is indefinite and therefore a rejection under 35 U.S.C. 112(b) is appropriate. 
For clarity of record, the examiner notes that Claims 9 and 16 recite similar subject matter to that of representative Claim 2. Therefore, Claims 9 and 16 are rejected under similar grounds as described with respect to Claim 2 above. 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

For clarity of record and ease of reading, the examiner notes the following: 
Any text that is bolded is a limitation of a claim. 
The “teaching” or reference citation, along with any necessary examiner notes, is contained within the parentheses “()” following the bolded text.
Any text that is underlined is emphasized language from reference(s) used and/or particular important examiner notes. While NOT fully reflective of the rejection as a whole, these underlined passages are indicative or otherwise reflective of key evidence.   

Claims 1-5, 7-12, 14-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Zoph et al. (“Neural Architecture Search with Reinforcement Learning”, NPL 2017 (see applicant-provided IDS filed 09/10/2020)) in view of Veniat et al. (“Learning Time/Memory-Efficient Deep Architectures with Budgeted Super Networks”, NPL 2018).

With respect to Claim 1, Zoph teaches A method for generating a trained model, the method comprising: receiving hyper-parameters for a search model, the hyper-parameters comprising operational training parameters for the search model (Zoph Pg. 3 Section 3.1 “In Neural Architecture Search, we use a controller to generate architectural hyperparameters of neural networks…Let’s suppose we would like to predict feedforward neural networks with only convolutional layers, we can use the controller to generate their hyperparameters as a sequence of tokens”. The examiner notes that by generating the hyperparameters the controller is receiving the hyperparameters that define the architecture of a neural network and thus teaches “receiving hyper-parameters for a search model, the hyper-parameters comprising operational training parameters for the search model”.).

generating, by the search model, a first model based on the hyper- parameters and a first sample of an architecture space…(Zoph Pg. 3 Section 3.1 “Once controller RNN finishes generating an architecture, a neural network with this architecture is built and trained.”)…the architecture space representative of a set of model components which may be collected together into combinations to form a trainable model… (Zoph Pg. 3 c.f. Figure 2. Note that the controller RNN “samples” a CNN. The examiner notes that the components that the RNN samples teaches “architecture space”)…and the first sample a first subset of the model components (Zoph Pg. 3 c.f. Figure 2 note “sample” in the caption. Further Zoph Pg. 3 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained…” That is, an iteration of Zoph’s controller RNN generates a sample of components, and this sample of components during a respective iteration of the RNN controller teaches at least “the first sample a first subset of the model components”.).

training the first model (Zoph Pg. 3 Section 3.1 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained…”).

determining a performance level of the trained first model (Zoph Pg. 3 Section 3.1 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained. At convergence, the accuracy of the network on a held-out validation set is recorded”. The examiner notes that recording the accuracy of the generated, built, and trained model teaches “determining a performance level of the trained first model”.).
…
updating an exploration preference variable based on [model accuracy] (Zoph Pg. 3 Section 3.2 “The list of tokens that the controller predicts can be viewed as a list of actions a1:T to design an architecture for a child network…We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…Since the reward signal R is non-differentiable, we need to use a policy gradient method to iteratively update θc…” The examiner notes that Zoph’s “reward” and the updating of the reward using reinforcement learning teaches “…updating an exploration preference variable based on [model accuracy]”.).

generating a second model, by the search model, by exploring the architecture space based on the exploration preference variable and the performance level of the trained first model, exploration of the architecture space comprising generating a second sample of the architecture space, the second sample comprising a second subset of the model components (Zoph Pg. 3 Section 3.1 “Once controller RNN finishes generating an architecture, a neural network with this architecture is built and trained.” Zoph Pg. 3 c.f. Figure 2. Note that the controller RNN “samples” a CNN. The examiner notes that the components that the RNN samples teaches “architecture space”. Zoph Pg. 3 c.f. Figure 2 note “sample” in the caption. Further Zoph Pg. 3 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained…” Zoph Pg. 3 Section 3.2 “training with REINFORCE”. “The list of tokens that the controller predicts can be viewed as a list of actions a1:T to design an architecture for a child network. At convergence, this child network will achieve accuracy R…We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…Where m is the number of different architectures that the controller samples in one batch and T is the number of hyperparameters our controller has to predict to design a neural network architecture. The validation accuracy that the k-th neural network architecture achieves after being trained on a training dataset is Rk.” The examiner notes that the iterative process of sampling an architecture, building the sampled architecture, training the sampled architecture, and calculating or otherwise determining an accuracy of the sampled model, to maximize the reward seen by the controller to achieve an optimal neural network teaches “generating a second model, by the search model, by exploring the architecture space based on the exploration preference variable and the performance level of the trained first model, exploration of the architecture space comprising generating a second sample of the architecture space, the second sample comprising a second subset of the model components”.).

training the second model (Zoph Pg. 3 Section 3.1 “Once controller RNN finishes generating an architecture, a neural network with this architecture is built and trained.”).
determining a performance level of the trained second model (Zoph Pg. 3 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained. At convergence, the accuracy of the network on a held-out validation set is recorded”.).
outputting one of the first trained model or the second trained model based on a comparison the performance levels (Zoph Pg. 3 Section 3.2 “We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…” The examiner notes that a person of ordinary skill in the art would infer that the “optimal” architecture is the neural architecture that best maximizes the reward. And, because the reward is representative of an individual sampled, built, and trained neural architecture’s accuracy, the optimal architecture that is output is based on a comparison of the accuracy levels for each sampled, built, and trained architecture and thus teaches “outputting one of the first trained model or the second trained model based on a comparison the performance levels”.).
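For illustration only, the iterative process relied upon above (sampling an architecture, building and training it, recording its accuracy, and selecting among the trained models by comparing accuracies) may be sketched as follows. The sketch is not taken from Zoph: the names ARCHITECTURE_SPACE, sample_architecture, train_and_evaluate, and search are hypothetical, the samples are drawn uniformly rather than by an RNN controller, and a toy scoring function stands in for actual training.

```python
import random

# Hypothetical architecture space: one choice per component slot.
ARCHITECTURE_SPACE = {
    "filter_size": [1, 3, 5, 7],
    "num_filters": [24, 36, 48, 64],
    "stride": [1, 2, 3],
}


def sample_architecture(rng):
    """Draw one sample (a subset of model components) from the space."""
    return {slot: rng.choice(options) for slot, options in ARCHITECTURE_SPACE.items()}


def train_and_evaluate(architecture):
    """Stand-in for building/training a child network and recording accuracy.

    Assumption: a deterministic toy score replaces real training.
    """
    score = 0.5
    score += 0.1 if architecture["filter_size"] == 3 else 0.0
    score += architecture["num_filters"] / 640.0
    score -= 0.05 * (architecture["stride"] - 1)
    return score


def search(num_models, seed=0):
    """Sample, train, and score num_models architectures; return the best one."""
    rng = random.Random(seed)
    results = []
    for _ in range(num_models):
        architecture = sample_architecture(rng)
        accuracy = train_and_evaluate(architecture)
        results.append((architecture, accuracy))
    # The output model is chosen by comparing the recorded performance levels.
    best_architecture, best_accuracy = max(results, key=lambda pair: pair[1])
    return best_architecture, best_accuracy, results
```

In this sketch the comparison of performance levels is the single max over the recorded accuracies; any number of sampled models (two, three, or more) is handled by the same loop.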

	Zoph, however, does not appear to explicitly disclose: 
determining a first availability of one or more resources for generating models; 
…based on the first availability of the one or more resources; 

Veniat, however, teaches determining a first availability of one or more resources for generating models (Veniat Abstract “We propose to focus on the…For instance our approach is able to solve the following tasks: learn a neural network able to predict well in less than 100 milliseconds or learn an efficient model that fits in a 50Mb memory. Our contribution is a novel family of models called Budgeted Super Networks (BSN). They are learned using gradient descent techniques applied on a budgeted learning objective function which integrates a maximum authorized cost…We present a set of experiments on computer vision problems and analyze the ability of our technique to deal with three different costs: the computation cost, the memory consumption cost and a distributed computation cost…” Veniat Pg. 3492 Col. 2 (bottom of page into Pg. 3493) “Our model called Budgeted Super Network (BSN) is based on the following principles: (i) the user provides a (big) Super Network…defining a large set of possible final network architectures as well as a maximum authorized cost. (ii) Since finding the best architecture that satisfies the cost constraint is an intractable combinatorial problem…we relax this optimization problem and propose a stochastic model…that can be optimized using policy gradient-inspired methods…We show that the optimal solution of this stochastic problem corresponds to the optimal constrained network architecture…” The examiner notes that receiving a budget of computational time and/or maximum authorized cost teaches “determining a first availability of one or more resources for generating models”.).
…based on the first availability of the one or more resources (Veniat Pg. 3492 Col. 2 (bottom of page into Pg. 3493) “Our model called Budgeted Super Network (BSN) is based on the following principles: (i) the user provides a (big) Super Network…defining a large set of possible final network architectures as well as a maximum authorized cost. (ii) Since finding the best architecture that satisfies the cost constraint is an intractable combinatorial problem…we relax this optimization problem and propose a stochastic model…that can be optimized using policy gradient-inspired methods…We show that the optimal solution of this stochastic problem corresponds to the optimal constrained network architecture…”) 
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the neural architecture search as taught by Zoph with the computational cost/time requirement as taught by Veniat because this would reduce the total cost of producing a neural architecture and would further result in a higher accuracy model (Veniat Abstract “We particularly show that our model can discover neural network architectures that have a better accuracy than the ResNet and Convolutional Neural Fabrics architectures…at a lower cost.”). 
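For illustration only, a budgeted learning objective that “integrates a maximum authorized cost” may be sketched as a prediction loss plus a penalty on the amount by which an architecture exceeds the budget. The function name budgeted_objective, its parameters, and the hinge-style penalty form are assumptions made for this sketch and are not quoted from Veniat.

```python
def budgeted_objective(prediction_loss, architecture_cost, max_cost, penalty_weight=1.0):
    """Return the prediction loss plus a penalty for exceeding the authorized cost.

    Assumption: the penalty is penalty_weight * max(0, cost - budget), so an
    architecture that stays within the budget is scored on its loss alone.
    """
    over_budget = max(0.0, architecture_cost - max_cost)
    return prediction_loss + penalty_weight * over_budget


# An architecture within a 200 ms budget is not penalized,
# while one that needs 250 ms is penalized for the 50 ms overrun.
within = budgeted_objective(0.3, 150.0, 200.0, penalty_weight=0.01)
over = budgeted_objective(0.3, 250.0, 200.0, penalty_weight=0.01)
```

Under this form, the search is steered toward architectures whose cost does not exceed the user-provided budget, which is the sense in which the budget acts as an availability of resources in the mapping above.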

With respect to Claim 2, the combination of Zoph and Veniat teaches generating a third or more models …(Zoph Pg. 3 Section 3.1 “Once controller RNN finishes generating an architecture, a neural network with this architecture is built and trained.” Zoph Pg. 3 c.f. Figure 2. Note that the controller RNN “samples” a CNN. The examiner notes that the components that the RNN samples teaches “architecture space”. Zoph Pg. 3 c.f. Figure 2 note “sample” in the caption. Further Zoph Pg. 3 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained…” Zoph Pg. 3 Section 3.2 “training with REINFORCE”. The examiner notes that a person of ordinary skill in the art would infer that generating a third or more models is simply three or more iterations of Zoph’s RNN controller.) …based on a second or more availabilities of the one or more resources for generating models (Veniat Pg. 3492 Col. 2 (bottom of page into Pg. 3493) “Our model called Budgeted Super Network (BSN) is based on the following principles: (i) the user provides a (big) Super Network…defining a large set of possible final network architectures as well as a maximum authorized cost. (ii) Since finding the best architecture that satisfies the cost constraint is an intractable combinatorial problem…we relax this optimization problem and propose a stochastic model…that can be optimized using policy gradient-inspired methods…We show that the optimal solution of this stochastic problem corresponds to the optimal constrained network architecture…”). 
Training the third or more models (Zoph Pg. 3 Section 3.1 “Once controller RNN finishes generating an architecture, a neural network with this architecture is built and trained.”). 
Determining a respective performance level of each trained model of the trained third or more models (Zoph Pg. 3 Section 3.2 “We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…” The examiner notes that a person of ordinary skill in the art would infer that the “optimal” architecture is the neural architecture that best maximizes the reward. And, because the reward is representative of an individual sampled, built, and trained neural architecture’s accuracy…).
Wherein the output model is based on a comparison of the performance levels of each of the trained models (Zoph Pg. 3 Section 3.2 “We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…” The examiner notes that a person of ordinary skill in the art would infer that the “optimal” architecture is the neural architecture that best maximizes the reward. And, because the reward is representative of an individual sampled, built, and trained neural architecture’s accuracy, the optimal architecture that is output is based on a comparison of the accuracy levels for each sampled, built, and trained architecture.).
With respect to Claim 3, the combination of Zoph and Veniat teaches wherein the hyper-parameters are received by a controller comprising a recurrent neural network (RNN) and the method further comprising updating the RNN with the exploration preference variable (Zoph Pg. 3 Section 3.1 c.f. Figure 2 note the architecture of the controller which shows an RNN controller. Zoph Pg. 3 Section 3.2 “The list of tokens that the controller predicts can be viewed as a list of actions a1:T to design an architecture for a child network…We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…Since the reward signal R is non-differentiable, we need to use a policy gradient method to iteratively update θc…”).
With respect to Claim 4, the combination of Zoph and Veniat teaches wherein the controller updates the RNN using reinforcement learning, the method further comprising (Zoph Pg. 3 Section 3.1 c.f. Figure 2 note the architecture of the controller which shows an RNN controller. Zoph Pg. 3 Section 3.2 “The list of tokens that the controller predicts can be viewed as a list of actions a1:T to design an architecture for a child network…We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…Since the reward signal R is non-differentiable, we need to use a policy gradient method to iteratively update θc…”)
determining an exploitation reward based on the generated second model, the exploitation reward corresponding to a preference for including known model components in a generated model (Zoph Pg. 3 Section 3.2 “We can use this accuracy [for each child network] as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward, represented by J(θc): $J(\theta_c) = E_{p(a_{1:T},\,\theta_c)}[R]$…” The examiner notes the expected reward, J(θc), teaches “an exploitation reward based on the generated second model, the exploitation reward corresponding to a preference for including known model components in a generated model…” To clarify, the expected reward (e.g. exploitation reward) is the reward signal that represents how optimized the RNN controller is. In turn, as can be seen by the equation, the expected reward (exploitation reward) is based on the accuracies of previous models (i.e. known components). Indeed, Zoph Pg. 2 Section 1 provides interpretation: “Using this accuracy as the reward signal, we can compute the policy gradient to update the controller. As a result, in the next iteration, the controller will give higher probabilities to architectures that receive high accuracies…”).
Updating the RNN based on a combination of the exploitation reward and the exploration preference variable, wherein the controller using the updated RNN to explore the architecture space (Zoph Pg. 2 Section 1 “Using this accuracy as the reward signal, we can compute the policy gradient to update the controller. As a result, in the next iteration, the controller will give higher probabilities to architectures that receive high accuracies…” Zoph Pg. 3 last equation (see policy gradient using REINFORCE rule). The examiner notes that using the policy gradient (i.e. the difference between the expected reward and current reward) to update the RNN controller (e.g. to update its expected reward) teaches “updating the RNN based on a combination of the exploitation reward and the exploration preference variable, wherein the controller using the updated RNN to explore the architecture space” where the “combination” is the difference between the exploitation reward (e.g. expected reward, J(θc)) and exploration preference variable (e.g. the current reward, R).).
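For illustration only, a REINFORCE-style policy gradient update of the kind relied upon above may be sketched on a much simpler policy than an RNN controller. The sketch assumes a single categorical choice among a few candidate architectures, a fixed reward per choice, and a constant baseline; the names softmax and reinforce_step and all parameter values are hypothetical and are not taken from Zoph.

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, rewards_by_action, rng, batch_size=8,
                   learning_rate=0.5, baseline=0.0):
    """Apply one policy gradient ascent step to the logits of a categorical policy.

    Each sampled action contributes grad(log p(action)) * (reward - baseline),
    averaged over the batch. For a softmax policy, the gradient of
    log p(action) with respect to logit i is (1 if i == action else 0) - p_i.
    """
    probs = softmax(logits)
    grads = [0.0] * len(logits)
    for _ in range(batch_size):
        action = rng.choices(range(len(logits)), weights=probs)[0]
        advantage = rewards_by_action[action] - baseline
        for i in range(len(logits)):
            indicator = 1.0 if i == action else 0.0
            grads[i] += (indicator - probs[i]) * advantage / batch_size
    return [logit + learning_rate * grad for logit, grad in zip(logits, grads)]
```

Repeated calls shift probability toward the action with the highest reward, which corresponds to the cited statement that the controller “will give higher probabilities to architectures that receive high accuracies”.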
With respect to Claim 5, the combination of Zoph and Veniat teaches wherein the one or more resources for generating models includes at least one of remaining compute time or compute power (Veniat Pg. 3494 Col. 2 Section 3.1 “Let us consider H a binary matrix of size N x N. Let us denote $C(H \odot E) \in \mathbb{R}^{+}$ the cost associated to the computation of the S-network…let us also define C the maximum cost the user would allow. For instance, when solving the problem of learning a model with a computation time lower than 200ms then C is equal to 200ms…” Note Equation 2. That is, Veniat’s equation attempts to find an architecture E that minimizes, for example, the… This is based on a difference between the current cost of some neural architecture E and the computational time requirement. This difference teaches the “remaining compute time or compute power”.). 
With respect to Claim 7, the combination of Zoph and Veniat teaches wherein the exploration preference variable is proportional to the availability of the one or more resources for generating models (Veniat Pg. 3495 see eq. 6 and/or 7. Note how the gradient (eq. 6) is proportional, that is, changes directly, with respect to the change of the sampled structure and the predicted loss. In other words, a person of ordinary skill in the art would infer by observing eq. 6 that the exploration preference variable (e.g. current reward) changes in direct relationship to the cost requirement.). 

With respect to Claim 8, Zoph teaches a system for generating a trained model, the system comprising: one or more processors; and a memory comprising instructions which cause the one or more processors to: receive hyper-parameters for a search model, the hyper-parameters comprising operational training parameters for the search model (Zoph Pg. 3 Section 3.1 “In Neural Architecture Search, we use a controller to generate architectural hyperparameters of neural networks…Let’s suppose we would like to predict feedforward neural networks with only convolutional layers, we can use the controller to generate their hyperparameters as a sequence of tokens”. The examiner notes that by generating the hyperparameters the controller is receiving the hyperparameters that define the architecture of a neural network and thus teaches “receiving hyper-parameters for a search model, the hyper-parameters comprising operational training parameters for the search model”.).

generate, by the search model, a first model based on the hyper- parameters and a first sample of an architecture space…(Zoph Pg. 3 Section 3.1 “Once controller RNN finishes generating an architecture, a neural network with this architecture is built and trained.”)…the architecture space representative of a set of model components which may be collected together into combinations to form a trainable model… (Zoph Pg. 3 c.f. Figure 2. Note that the controller RNN “samples” a CNN. The examiner notes that the components that the RNN samples teaches “architecture space”)…and the first sample a first subset of the model components (Zoph Pg. 3 c.f. Figure 2 note “sample” in the caption. Further Zoph Pg. 3 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained…” That is, an iteration of Zoph’s controller RNN generates a sample of components, and this sample of components during a respective iteration of the RNN controller teaches at least “the first sample a first subset of the model components”.).

train the first model (Zoph Pg. 3 Section 3.1 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained…”).

determine a performance level of the trained first model (Zoph Pg. 3 Section 3.1 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained. At convergence, the accuracy of the network on a held-out validation set is recorded”. The examiner notes that recording the accuracy of the generated, built, and trained model teaches “determining a performance level of the trained first model”.).
…
update an exploration preference variable based on [model accuracy] (Zoph Pg. 3 Section 3.2 “The list of tokens that the controller predicts can be viewed as a list of actions a1:T to design an architecture for a child network…We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…Since the reward signal R is non-differentiable, we need to use a policy gradient method to iteratively update θc…” The examiner notes that Zoph’s “reward” and the updating of the reward using reinforcement learning teach “…updating an exploration preference variable based on [model accuracy]”.).

generate a second model, by the search model, by exploring the architecture space based on the exploration preference variable and the performance level of the trained first model, exploration of the architecture space comprising generating a second sample of the architecture space, the second sample comprising a second subset of the model components (Zoph Pg. 3 Section 3.1 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained.” Zoph Pg. 3 c.f. Figure 2. Note that the controller RNN “samples” a CNN. The examiner notes that the components that the RNN samples teach “architecture space”. Zoph Pg. 3 c.f. Figure 2, note “sample” in the caption. Further, Zoph Pg. 3 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained…” Zoph Pg. 3 Section 3.2 “training with REINFORCE”. “The list of tokens that the controller predicts can be viewed as a list of actions a1:T to design an architecture for a child network. At convergence, this child network will achieve accuracy R…We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…Where m is the number of different architectures that the controller samples in one batch and T is the number of hyperparameters our controller has to predict to design a neural network architecture. The validation accuracy that the k-th neural network architecture achieves after being trained on a training dataset is Rk.” The examiner notes that the iterative process of sampling an architecture, building the sampled architecture, training the sampled architecture, and calculating or otherwise determining an accuracy of the sampled model, to maximize the reward seen by the controller to achieve an optimal neural network, teaches “generating a second model, by the search model, by exploring the architecture space based on the exploration preference variable and the performance level of the trained first model, exploration of the architecture space comprising generating a second sample of the architecture space, the second sample comprising a second subset of the model components”.).

train the second model (Zoph Pg. 3 Section 3.1 “Once controller RNN finishes generating an architecture, a neural network with this architecture is built and trained.”).
determine a performance level of the trained second model (Zoph Pg. 3 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained. At convergence, the accuracy of the network on a held-out validation set is recorded”.).
output one of the first trained model or the second trained model based on a comparison the performance levels (Zoph Pg. 3 Section 3.2 “We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…” The examiner notes that a person of ordinary skill in the art would infer that the “optimal” architecture is the neural architecture that best maximizes the reward. And, because the reward is representative of an individual sampled, built, and trained neural architecture’s accuracy, the optimal architecture that is output is based on a comparison of the accuracy levels for each sampled, built, and trained architecture and thus teaches “outputting one of the first trained model or the second trained model based on a comparison the performance levels”.).
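
For illustration only, and not as part of the cited disclosure, the iterative process mapped above (sample a subset of components, train the resulting model, record its accuracy, and output the model with the best recorded accuracy) may be sketched as follows. The component names, the scoring rule in `train_and_evaluate`, and all function names are hypothetical stand-ins; in particular, random sampling is used here in place of a learned controller.

```python
import random

# Hypothetical architecture space: a set of model components.
ARCHITECTURE_SPACE = ["conv3x3", "conv5x5", "maxpool", "skip", "dense"]


def sample_architecture(rng, k=3):
    """Draw a subset of k components (one 'sample' of the architecture space)."""
    return tuple(sorted(rng.sample(ARCHITECTURE_SPACE, k)))


def train_and_evaluate(architecture):
    """Stand-in for building and training a child network and recording its
    held-out validation accuracy. The per-component scores are invented."""
    scores = {"conv3x3": 0.30, "conv5x5": 0.25, "maxpool": 0.10,
              "skip": 0.20, "dense": 0.05}
    return sum(scores[c] for c in architecture)


def search(num_models=2, seed=0):
    """Sample, train, and score num_models architectures, then return the
    best one based on a comparison of the recorded performance levels."""
    rng = random.Random(seed)
    results = []
    for _ in range(num_models):
        arch = sample_architecture(rng)
        results.append((train_and_evaluate(arch), arch))
    best_accuracy, best_arch = max(results)
    return best_arch, best_accuracy, results
```

Calling `search(num_models=2)` corresponds to the first and second models of the claim; a larger `num_models` corresponds to the "third or more models" of the dependent claims.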

	Zoph, however, does not appear to explicitly disclose: 
determine a first availability of one or more resources for generating models; 
…based on the first availability of the one or more resources; 

determine a first availability of one or more resources for generating models (Veniat Abstract “We propose to focus on the problem of discovering neural network architectures efficient in terms of both prediction quality and cost. For instance our approach is able to solve the following tasks: learn a neural network able to predict well in less than 100 milliseconds or learn an efficient model that fits in a 50Mb memory. Our contribution is a novel family of models called Budgeted Super Networks (BSN). They are learned using gradient descent techniques applied on a budgeted learning objective function which integrates a maximum authorized cost…We present a set of experiments on computer vision problems and analyze the ability of our technique to deal with three different costs: the computation cost, the memory consumption cost and a distributed computation cost…” Veniat Pg. 3492 Col. 2 (bottom of page into Pg. 3493) “Our model called Budgeted Super Network (BSN) is based on the following principles: (i) the user provides a (big) Super Network…defining a large set of possible final network architectures as well as a maximum authorized cost. (ii) Since finding the best architecture that satisfies the cost constraint is an intractable combinatorial problem…we relax this optimization problem and propose a stochastic model…that can be optimized using policy gradient-inspired methods…We show that the optimal solution of this stochastic problem corresponds to the optimal constrained network architecture…” The examiner notes that receiving a budget of computational time and/or maximum authorized cost teaches “determining a first availability of one or more resources for generating models”.).
…based on the first availability of the one or more resources (Veniat Pg. 3492 Col. 2 (bottom of page into Pg. 3493) “Our model called Budgeted Super Network (BSN) is based on the following principles: (i) the user provides a (big) Super Network…defining a large set of possible final network architectures as well as a maximum authorized cost. (ii) Since finding the best architecture that satisfies the cost constraint is an intractable combinatorial problem…we relax this optimization problem and propose a stochastic model…that can be optimized using policy gradient-inspired methods…We show that the optimal solution of this stochastic problem corresponds to the optimal constrained network architecture…”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the neural architecture search as taught by Zoph with the computational cost/time requirement as taught by Veniat because this would reduce the total cost of producing a neural architecture and would further result in a higher accuracy model (Veniat Abstract “We particularly show that our model can discover neural network architectures that have a better accuracy than the ResNet and Convolutional Neural Fabrics architectures…at a lower cost.”).

With respect to Claim 9, the combination of Zoph and Veniat teach generate a third or more models …(Zoph Pg. 3 Section 3.1 “Once controller RNN finishes generating an architecture, a neural network with this architecture is built and trained.” Zoph Pg. 3 c.f. Figure 2. Note that the controller RNN “samples” a CNN. The examiner notes that the components that the RNN samples teaches “architecture space”. Zoph Pg. 3 c.f. Figure 2 note “sample” in the caption. Further Zoph Pg. 3 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is build and trained…” Zoph Pg. 3 Section 3.2 “training with REINFORCE”. The examiner notes based on a second or more availabilities of the one or more resources for generating models (Veniat Pg. 3492 Col. 2 (bottom of page into Pg. 3493) “Our model called Budgeted Super Network (BSN) is based on the following principles: (i) the user provides a (big) Super Network…defining a large set of possible final network architectures as well as a maximum authorized cost. (ii) Since finding the best architecture that satisfies the cost constraint is an intractable combinatorial problem…we relax this optimization problem and propose a stochastic model…that can be optimized using policy gradient-inspired methods…We show that the optimal solution of this stochastic problem corresponds to the optimal constrained network architecture…”). 
Train the third or more models (Zoph Pg. 3 Section 3.1 “Once controller RNN finishes generating an architecture, a neural network with this architecture is built and trained.”). 
Determine a respective performance level of each trained model of the trained third or more models (Zoph Pg. 3 Section 3.2 “We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…” The examiner notes that a person of ordinary skill in the art would infer that the “optimal” architecture is the neural architecture that best maximizes the 
Wherein the output model is based on a comparison of the performance levels of each of the trained models (Zoph Pg. 3 Section 3.2 “We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…” The examiner notes that a person of ordinary skill in the art would infer that the “optimal” architecture is the neural architecture that best maximizes the reward. And, because the reward is representative of an individual sampled, built, and trained neural architecture’s accuracy, the optimal architecture that is output is based on a comparison of the accuracy levels for each sampled, built, and trained architecture.).
With respect to Claim 10, the combination of Zoph and Veniat teaches wherein the hyper-parameters are received by a controller comprising a recurrent neural network (RNN) and the method further comprising updating the RNN with the exploration preference variable (Zoph Pg. 3 Section 3.1 c.f. Figure 2, note the architecture of the controller, which shows an RNN controller. Zoph Pg. 3 Section 3.2 “The list of tokens that the controller predicts can be viewed as a list of actions a1:T to design an architecture for a child network…We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…Since the reward signal R is non-differentiable, we need to use a policy gradient method to iteratively update θc…”).
With respect to Claim 11, the combination of Zoph and Veniat teaches wherein the controller updates the RNN using reinforcement learning, the memory comprising further instructions to: (Zoph Pg. 3 Section 3.1 c.f. Figure 2, note the architecture of the controller, which shows an RNN controller. Zoph Pg. 3 Section 3.2 “The list of tokens that the controller predicts can be viewed as a list of actions a1:T to design an architecture for a child network…We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…Since the reward signal R is non-differentiable, we need to use a policy gradient method to iteratively update θc…”)
determine an exploitation reward based on the generated second model, the exploitation reward corresponding to a preference for including known model components in a generated model (Zoph Pg. 3 Section 3.2 “We can use this accuracy [for each child network] as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward, represented by J(θc): $J(\theta_c) = E_{p(a_{1:T},\, \theta_c)}[R]$…” The examiner notes that the expected reward, J(θc), teaches “an exploitation reward based on the generated second model, the exploitation reward corresponding to a preference for including known model components in a generated model…” To clarify, the expected reward (e.g. exploitation reward) is the reward signal that represents how optimized the RNN controller is. In turn, as can be seen by the equation, the expected reward (exploitation reward) is based on the accuracies of previous models (i.e. known components). Indeed, Zoph Pg. 2 Section 1 provides interpretation: “Using this accuracy as the reward signal, we can compute the policy gradient to update the controller. As a result, in the next iteration, the controller will give higher probabilities to architectures that receive high accuracies…”).
Update the RNN based on a combination of the exploitation reward and the exploration preference variable, wherein the controller using the updated RNN to explore the architecture space (Zoph Pg. 2 Section 1 “Using this accuracy as the reward signal, we can compute the policy gradient to update the controller. As a result, in the next iteration, the controller will give higher probabilities to architectures that receive high accuracies…” Zoph Pg. 3 last equation (see policy gradient using REINFORCE rule). The examiner notes that using the policy gradient (i.e. the difference between the expected reward and current reward) to update the RNN controller (e.g. to update its expected reward) teaches “updating the RNN based on a combination of the exploitation reward and the exploration preference variable, wherein the controller using the updated RNN to explore the architecture space” where the “combination” is the difference between the exploitation reward (e.g. expected reward, Jθc) and exploration preference variable (e.g. the current reward, R).).
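
For illustration only, the kind of policy-gradient update referred to above may be sketched as follows. This is a minimal sketch under stated assumptions, not Zoph's implementation: the policy is a softmax over a single discrete choice rather than an RNN, the scalar `baseline` stands in for the expected reward, and the function names and learning rate are hypothetical. The update moves each logit by the learning rate times the difference between the current reward and the baseline times the gradient of the log-probability of the sampled action.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(logits, action, reward, baseline, lr=0.1):
    """One REINFORCE-style step for a softmax policy (hypothetical helper).

    For a softmax policy, the gradient of log p(action) with respect to
    logit i is (1 if i == action else 0) - p_i. The step is scaled by the
    advantage, i.e. the current reward minus the baseline.
    """
    probs = softmax(logits)
    advantage = reward - baseline
    return [
        z + lr * advantage * ((1.0 if i == action else 0.0) - p)
        for i, (z, p) in enumerate(zip(logits, probs))
    ]
```

With a reward above the baseline, the probability of the sampled action rises on the next iteration; with a reward below the baseline, it falls. This is consistent with the quoted statement that the controller "will give higher probabilities to architectures that receive high accuracies".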
With respect to Claim 12, the combination of Zoph and Veniat teaches wherein the one or more resources for generating models includes at least one of remaining compute time or compute power (Veniat Pg. 3494 Col. 2 Section 3.1 “Let us consider H a binary matrix of size N x N. Let us denote $C(H \odot E) \in \mathbb{R}^{+}$ the cost associated to the computation of the S-network…let us also define C the maximum cost the user would allow. For instance, when solving the problem of learning a model with a computation time lower than 200ms then C is equal to 200ms…” Note Equation 2, which is based on a difference between the current cost of some neural architecture E and the computational time requirement. This difference teaches the “remaining compute time or compute power”.).
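
For illustration only, the difference relied upon above may be sketched as follows. The sketch assumes a hinge-style penalty, in which only cost in excess of the maximum authorized cost is penalized, and a weighting factor `lam`; both the exact form and the function names are assumptions and should be checked against Veniat's Equation 2.

```python
def remaining_budget(cost, max_cost):
    """Budget left after accounting for the cost of a sampled architecture,
    e.g. remaining compute time in milliseconds. Negative when over budget."""
    return max_cost - cost


def budgeted_loss(prediction_loss, cost, max_cost, lam=1.0):
    """Prediction loss plus a penalty on cost in excess of the budget.

    Assumed hinge form: lam * max(0, cost - max_cost). An architecture that
    fits within the budget incurs no penalty.
    """
    return prediction_loss + lam * max(0.0, cost - max_cost)
```

For example, with a maximum authorized cost of 200 ms, an architecture costing 150 ms leaves a remaining budget of 50 ms and incurs no penalty, whereas one costing 250 ms is penalized in proportion to the 50 ms overrun.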
With respect to Claim 14, the combination of Zoph and Veniat teaches wherein the exploration preference variable is proportional to the availability of the one or more resources for generating models (Veniat Pg. 3495 see eq. 6 and/or 7. Note how the gradient (eq. 6) is proportional to, that is, changes directly with, the change in the sampled structure and the predicted loss. In other words, a person of ordinary skill in the art would infer by observing eq. 6 that the exploration preference variable (e.g. current reward) changes in direct relationship to the cost requirement).

With respect to Claim 15, Zoph teaches a non-transitory computer readable medium comprising instructions that, when executed by one or more processors, cause the one or more processors to: receive hyper-parameters for a search model, the hyper-parameters comprising operational training parameters for the search model (Zoph Pg. 3 Section 3.1 “In Neural Architecture Search, we use a controller to generate architectural hyperparameters of neural networks…Let’s suppose we would like to predict feedforward neural networks with only convolutional layers, we can use the controller to generate their hyperparameters as a sequence of tokens”. The examiner notes that by generating the hyperparameters the controller is receiving the hyperparameters that define the architecture of a neural network and thus teaches

generate, by the search model, a first model based on the hyper-parameters and a first sample of an architecture space…(Zoph Pg. 3 Section 3.1 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained.”)…the architecture space representative of a set of model components which may be collected together into combinations to form a trainable model… (Zoph Pg. 3 c.f. Figure 2. Note that the controller RNN “samples” a CNN. The examiner notes that the components that the RNN samples teach “architecture space”)…and the first sample a first subset of the model components (Zoph Pg. 3 c.f. Figure 2, note “sample” in the caption. Further, Zoph Pg. 3 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained…” That is, an iteration of Zoph’s controller RNN generates a sample of components, and this sample of components during a respective iteration of the RNN controller teaches at least “the first sample a first subset of the model components”.).

train the first model (Zoph Pg. 3 Section 3.1 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained…”).

determine a performance level of the trained first model (Zoph Pg. 3 Section 3.1 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained. At convergence, the accuracy of the network on a held-out validation set is recorded”. The examiner notes that recording the accuracy of the generated, built, and trained model teaches “determining a performance level of the trained first model”.).
…
update an exploration preference variable based on [model accuracy] (Zoph Pg. 3 Section 3.2 “The list of tokens that the controller predicts can be viewed as a list of actions a1:T to design an architecture for a child network…We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…Since the reward signal R is non-differentiable, we need to use a policy gradient method to iteratively update θc…” The examiner notes that Zoph’s “reward” and the updating of the reward using reinforcement learning teach “…updating an exploration preference variable based on [model accuracy]”.).

generate a second model, by the search model, by exploring the architecture space based on the exploration preference variable and the performance level of the trained first model, exploration of the architecture space comprising generating a second sample of the architecture space, the second sample comprising a second subset of the model components (Zoph Pg. 3 Section 3.1 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained.” Zoph Pg. 3 c.f. Figure 2. Note that the controller RNN “samples” a CNN. The examiner notes that the components that the RNN samples teach “architecture space”. Zoph Pg. 3 c.f. Figure 2, note “sample” in the caption. Further, Zoph Pg. 3 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained…” Zoph Pg. 3 Section 3.2 “training with REINFORCE”. “The list of tokens that the controller predicts can be viewed as a list of actions a1:T to design an architecture for a child network. At convergence, this child network will achieve accuracy R…We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…Where m is the number of different architectures that the controller samples in one batch and T is the number of hyperparameters our controller has to predict to design a neural network architecture. The validation accuracy that the k-th neural network architecture achieves after being trained on a training dataset is Rk.” The examiner notes that the iterative process of sampling an architecture, building the sampled architecture, training the sampled architecture, and calculating or otherwise determining an accuracy of the sampled model, to maximize the reward seen by the controller to achieve an optimal neural network, teaches “generating a second model, by the search model, by exploring the architecture space based on the exploration preference variable and the performance level of the trained first model, exploration of the architecture space comprising generating a second sample of the architecture space, the second sample comprising a second subset of the model components”.).
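
As an aid to reading the quoted passage, the symbols m, T, and Rk appear in what is commonly written as the sampled REINFORCE estimate of the policy gradient. The expression below is a standard form offered for orientation only; it is not copied from the cited page, its notation follows the expected-reward expression quoted elsewhere in this action, and it should be verified against Zoph Pg. 3 Section 3.2.

```latex
\nabla_{\theta_c} J(\theta_c) \approx \frac{1}{m} \sum_{k=1}^{m} \sum_{t=1}^{T}
  \nabla_{\theta_c} \log p\left(a_t \mid a_{(t-1):1},\, \theta_c\right) R_k
```

Here m is the number of architectures sampled in one batch, T is the number of hyperparameters predicted per architecture, and R_k is the validation accuracy of the k-th sampled architecture.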

train the second model (Zoph Pg. 3 Section 3.1 “Once controller RNN finishes generating an architecture, a neural network with this architecture is built and trained.”).
determine a performance level of the trained second model (Zoph Pg. 3 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is built and trained. At convergence, the accuracy of the network on a held-out validation set is recorded”.).
output one of the first trained model or the second trained model based on a comparison the performance levels (Zoph Pg. 3 Section 3.2 “We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…” The examiner notes that a person of ordinary skill in the art would infer that the “optimal” architecture is the neural architecture that best maximizes the reward. And, because the reward is representative of an individual sampled, built, and trained neural architecture’s accuracy, the optimal architecture that is output is based on a comparison of the accuracy levels for each sampled, built, and trained architecture and thus teaches “outputting one of the first trained model or the second trained model based on a comparison the performance levels”.).

	Zoph, however, does not appear to explicitly disclose: 
determine a first availability of one or more resources for generating models; 
…based on the first availability of the one or more resources; 

determine a first availability of one or more resources for generating models (Veniat Abstract “We propose to focus on the problem of discovering neural network architectures efficient in terms of both prediction quality and cost. For instance our approach is able to solve the following tasks: learn a neural network able to predict well in less than 100 milliseconds or learn an efficient model that fits in a 50Mb memory. Our contribution is a novel family of models called Budgeted Super Networks (BSN). They are learned using gradient descent techniques applied on a budgeted learning objective function which integrates a maximum authorized cost…We present a set of experiments on computer vision problems and analyze the ability of our technique to deal with three different costs: the computation cost, the memory consumption cost and a distributed computation cost…” Veniat Pg. 3492 Col. 2 (bottom of page into Pg. 3493) “Our model called Budgeted Super Network (BSN) is based on the following principles: (i) the user provides a (big) Super Network…defining a large set of possible final network architectures as well as a maximum authorized cost. (ii) Since finding the best architecture that satisfies the cost constraint is an intractable combinatorial problem…we relax this optimization problem and propose a stochastic model…that can be optimized using policy gradient-inspired methods…We show that the optimal solution of this stochastic problem corresponds to the optimal constrained network architecture…” The examiner notes that receiving a budget of computational time and/or maximum authorized cost teaches “determining a first availability of one or more resources for generating models”.).
…based on the first availability of the one or more resources (Veniat Pg. 3492 Col. 2 (bottom of page into Pg. 3493) “Our model called Budgeted Super Network (BSN) is based on the following principles: (i) the user provides a (big) Super Network…defining a large set of possible final network architectures as well as a maximum authorized cost. (ii) Since finding the best architecture that satisfies the cost constraint is an intractable combinatorial problem…we relax this optimization problem and propose a stochastic model…that can be optimized using policy gradient-inspired methods…We show that the optimal solution of this stochastic problem corresponds to the optimal constrained network architecture…”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the neural architecture search as taught by Zoph with the computational cost/time requirement as taught by Veniat because this would reduce the total cost of producing a neural architecture and would further result in a higher accuracy model (Veniat Abstract “We particularly show that our model can discover neural network architectures that have a better accuracy than the ResNet and Convolutional Neural Fabrics architectures…at a lower cost.”).

With respect to Claim 16, the combination of Zoph and Veniat teach generate a third or more models …(Zoph Pg. 3 Section 3.1 “Once controller RNN finishes generating an architecture, a neural network with this architecture is built and trained.” Zoph Pg. 3 c.f. Figure 2. Note that the controller RNN “samples” a CNN. The examiner notes that the components that the RNN samples teaches “architecture space”. Zoph Pg. 3 c.f. Figure 2 note “sample” in the caption. Further Zoph Pg. 3 “Once the controller RNN finishes generating an architecture, a neural network with this architecture is build and trained…” Zoph Pg. 3 Section 3.2 “training with REINFORCE”. The examiner notes based on a second or more availabilities of the one or more resources for generating models (Veniat Pg. 3492 Col. 2 (bottom of page into Pg. 3493) “Our model called Budgeted Super Network (BSN) is based on the following principles: (i) the user provides a (big) Super Network…defining a large set of possible final network architectures as well as a maximum authorized cost. (ii) Since finding the best architecture that satisfies the cost constraint is an intractable combinatorial problem…we relax this optimization problem and propose a stochastic model…that can be optimized using policy gradient-inspired methods…We show that the optimal solution of this stochastic problem corresponds to the optimal constrained network architecture…”). 
Train the third or more models (Zoph Pg. 3 Section 3.1 “Once controller RNN finishes generating an architecture, a neural network with this architecture is built and trained.”). 
Determine a respective performance level of each trained model of the trained third or more models (Zoph Pg. 3 Section 3.2 “We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…” The examiner notes that a person of ordinary skill in the art would infer that the “optimal” architecture is the neural architecture that best maximizes the 
Wherein the output model is based on a comparison of the performance levels of each of the trained models (Zoph Pg. 3 Section 3.2 “We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…” The examiner notes that a person of ordinary skill in the art would infer that the “optimal” architecture is the neural architecture that best maximizes the reward. And, because the reward is representative of an individual sampled, built, and trained neural architecture’s accuracy, the optimal architecture that is output is based on a comparison of the accuracy levels for each sampled, built, and trained architecture.).
With respect to Claim 17, the combination of Zoph and Veniat teaches wherein the hyper-parameters are received by a controller comprising a recurrent neural network (RNN) (Zoph Pg. 3 Section 3.1 c.f. Figure 2, note the architecture of the controller, which shows an RNN controller. Zoph Pg. 3 Section 3.2 “The list of tokens that the controller predicts can be viewed as a list of actions a1:T to design an architecture for a child network…We can use this accuracy R as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward…Since the reward signal R is non-differentiable, we need to use a policy gradient method to iteratively update θc…”).
And further comprising instructions to: determine an exploitation reward based on the generated second model, the exploitation reward corresponding to a preference for including known model components in a generated model (Zoph Pg. 3 Section 3.2 “We can use this accuracy [for each child network] as the reward signal and use reinforcement learning to train the controller. More concretely, to find the optimal architecture, we ask our controller to maximize its expected reward, represented by J(θc): $J(\theta_c) = E_{p(a_{1:T},\,\theta_c)}[R]$…” The examiner notes the expected reward, J(θc), teaches “an exploitation reward based on the generated second model, the exploitation reward corresponding to a preference for including known model components in a generated model…” To clarify, the expected reward (e.g. exploitation reward) is the reward signal that represents how optimized the RNN controller is. In turn, as can be seen by the equation, the expected reward (exploitation reward) is based on the accuracies of previous models (i.e. known components). Indeed, Zoph Pg. 2 Section 1 provides interpretation: “Using this accuracy as the reward signal, we can compute the policy gradient to update the controller. As a result, in the next iteration, the controller will give higher probabilities to architectures that receive high accuracies…”).
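For illustration only, the relationship between the sampled accuracies and the expected reward J(θc) discussed above can be sketched in Python. The sketch is not Zoph's implementation. It assumes a single-decision controller (T = 1) whose parameters are a list of logits, a hypothetical reward function that returns a fixed accuracy for each candidate architecture, and a sample-mean baseline. All function names are hypothetical.

```python
import math
import random


def softmax(logits):
    """Convert controller logits into sampling probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, reward_fn, rng, num_samples=32, lr=0.5):
    """One policy-gradient (REINFORCE-style) update of the toy controller.

    Returns the updated logits and the Monte Carlo estimate of the
    expected reward J(theta_c) under the logits passed in.
    """
    probs = softmax(logits)
    actions = rng.choices(range(len(probs)), weights=probs, k=num_samples)
    rewards = [reward_fn(a) for a in actions]

    # Sample mean of the rewards: an estimate of E[R] under the current policy.
    expected_reward = sum(rewards) / num_samples
    baseline = expected_reward  # assumed variance-reduction baseline

    # d/d(logit_k) of log p(a) for a softmax policy is 1[k == a] - p_k.
    grad = [0.0] * len(logits)
    for a, r in zip(actions, rewards):
        for k in range(len(logits)):
            indicator = 1.0 if k == a else 0.0
            grad[k] += (indicator - probs[k]) * (r - baseline) / num_samples

    # Gradient ascent: raise the probability of high-accuracy architectures.
    new_logits = [l + lr * g for l, g in zip(logits, grad)]
    return new_logits, expected_reward
```

Under these assumptions, repeated calls shift probability toward the candidate with the highest accuracy, which mirrors the quoted statement that the controller "will give higher probabilities to architectures that receive high accuracies".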
Update the RNN based on a combination of the exploitation reward and the exploration preference variable, wherein the controller using the updated RNN to explore the architecture space (Zoph Pg. 2 Section 1 “Using this accuracy as the reward signal, we can compute the policy gradient to update the controller. As a result, in the next iteration, the controller will give higher probabilities to architectures that receive high accuracies…” Zoph Pg. 3 last equation (see policy gradient using REINFORCE rule). The examiner notes that using the policy gradient (i.e. the difference 
With respect to Claim 18, the combination of Zoph and Veniat teaches wherein the one or more resources for generating models includes at least one of remaining compute time or compute power (Veniat Pg. 3494 Col. 2 Section 3.1 “Let us consider H a binary matrix of size N x N. Let us denote $C(H \odot E) \in \mathbb{R}^{+}$ the cost associated to the computation of the S-network…let us also define C the maximum cost the user would allow. For instance, when solving the problem of learning a model with a computation time lower than 200ms then C is equal to 200ms…” Note Equation 2. That is, Veniat’s equation attempts to find an architecture E that minimizes, for example, the computational time. Additionally, as can be seen in, for example, Equation 2, this is based on a difference between the current cost of some neural architecture E and the computational time requirement. This difference teaches the “remaining compute time or compute power”.).
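For illustration only, the difference between the maximum allowed cost C and the cost C(H ⊙ E) of a sampled architecture can be sketched in Python. The additive per-edge cost model, the millisecond units, and the names `masked_cost`, `remaining_budget`, and `edge_cost_ms` are assumptions made for this sketch and are not taken from Veniat.

```python
def masked_cost(H, E, edge_cost_ms):
    """Cost of the sub-network H ⊙ E under an assumed additive cost model.

    H is a binary N x N selection mask, E is a binary N x N edge matrix,
    and edge_cost_ms[i][j] is a hypothetical computation time for edge (i, j).
    """
    n = len(H)
    return sum(
        edge_cost_ms[i][j]
        for i in range(n)
        for j in range(n)
        if H[i][j] * E[i][j]  # element-wise (Hadamard) product keeps selected edges
    )


def remaining_budget(H, E, edge_cost_ms, max_cost_ms):
    """Difference between the user's maximum cost C and the current cost."""
    return max_cost_ms - masked_cost(H, E, edge_cost_ms)


# Example: a 3-node super network with a 200 ms budget.
E = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
H = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
costs = [[0, 50, 120], [0, 0, 80], [0, 0, 0]]
print(remaining_budget(H, E, costs, max_cost_ms=200))
```

A negative result would indicate that the sampled architecture exceeds the budget under the assumed cost model.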
With respect to Claim 20, the combination of Zoph and Veniat teaches wherein the exploration preference variable is proportional to the availability of the one or more resources for generating models (Veniat Pg. 3495 see eq. 6 and/or 7. Note how the gradient (eq. 6) is proportional, that is, changes directly, with respect to the change of the sampled structure and the predicted loss. In other words, a person of . 

Claims 6, 13, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Zoph et al. (“Neural Architecture Search with Reinforcement learning”, NPL 2017 (See applicant provided IDS filed 09/10/2020)) in view of Veniat et al. (“Learning Time/Memory-Efficient Deep Architectures with Budgeted Super Networks”, NPL 2018) and further in view of Jin et al. (“Auto-Keras: Efficient Neural Architecture Search with Network Morphism”, NPL 2018). 
With respect to Claim 6, the combination of Zoph and Veniat teaches all of the limitations of Claim 1 as described above. 
The combination of Zoph and Veniat, however, does not appear to explicitly disclose: 
Wherein determining the first availability of the one or more resources comprises making an application programming interface (API) call to one or more external resources. 
Jin, however, teaches wherein determining the first availability of the one or more resources comprises making an application programming interface (API) call to one or more external resources (Jin Figure 2 note “Application Programming Interface”. Jin Pg. 8 Col. 2 Section 4.2 “Interface”. “The design of the API follows the classic design of the scikit-Learn API…The training of a neural network requires as few as three lines of code calling the constructor, the fit and predict function respectively. Users can also specify the model trainer’s hyperparameters using the default parameters to the functions…Third, for advanced users, they can specify all kinds of hyperparameters of the search process and neural network optimization process by the default parameters in the interface…” Jin Pg. 3 Col. 2 Section 2. “The general neural architecture search problem we studied in this paper is defined as: given a neural architecture search space F, the input data X, and the cost metric Cost(), we aim at finding an optimal neural network…with its trained parameter…which could achieve the lowest cost metric value on the given dataset X.” Jin Pg. 9 Col. 1 Section 4.4 GPU memory adaption. “Since different deployment environments of Auto-Keras have different limitations on the GPU memory usage, the size of the neural network needs to be limited according to the GPU memory limitation…To tackle this challenge, we implemented a memory estimation function on our own data structure for the neural architectures. An integer value is used to mark the upper bounds of the neural architecture size…” The examiner notes that because the API of Jin allows the user to specify, for example, the hyperparameters of the search process and the neural network optimization process, and because Jin discloses that an efficient neural network has a defined cost metric, a person of ordinary skill in the art would infer that a user using the API to specify the hyperparameters would include at least the cost metric and therefore Jin teaches “wherein determining the first availability of the one or more resources comprises making an application programming interface (API) call to one or more external resources”.).
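For illustration only, the interface pattern quoted from Jin (a constructor with default parameters, a fit function, a predict function, and an integer upper bound on the estimated model size) can be sketched in Python. The class `ToyArchitectureSearch`, its parameters, and its prototype-based "architectures" are hypothetical and are not the Auto-Keras API. The size estimate and the training-error cost metric are assumptions of this sketch.

```python
class ToyArchitectureSearch:
    """Hypothetical scikit-learn-style searcher (not the Auto-Keras API)."""

    def __init__(self, candidate_sizes=(1, 2, 4, 8), max_size=16):
        # Hyperparameters are supplied through default parameters.
        self.candidate_sizes = candidate_sizes
        self.max_size = max_size  # integer upper bound on estimated model size
        self.prototypes_ = None
        self.cost_ = None

    @staticmethod
    def estimate_size(k, n_features):
        """Assumed size estimate: stored values for k prototypes."""
        return k * n_features

    @staticmethod
    def _nearest_label(prototypes, x):
        nearest = min(
            prototypes,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)),
        )
        return nearest[1]

    def fit(self, X, y):
        """Pick the lowest-cost candidate whose estimated size fits the bound."""
        self.prototypes_, self.cost_ = None, None
        n_features = len(X[0])
        for k in self.candidate_sizes:
            if self.estimate_size(k, n_features) > self.max_size:
                continue  # candidate exceeds the size limit
            prototypes = list(zip(X[:k], y[:k]))
            errors = sum(
                self._nearest_label(prototypes, x) != t for x, t in zip(X, y)
            )
            cost = errors / len(X)  # assumed cost metric: training error rate
            if self.cost_ is None or cost < self.cost_:
                self.cost_, self.prototypes_ = cost, prototypes
        if self.prototypes_ is None:
            raise ValueError("no candidate fits within max_size")
        return self

    def predict(self, X):
        return [self._nearest_label(self.prototypes_, x) for x in X]


# Three calls, mirroring the constructor / fit / predict pattern.
searcher = ToyArchitectureSearch(max_size=2)
searcher.fit([[0.0], [1.0], [0.1], [0.9]], [0, 1, 0, 1])
print(searcher.predict([[0.2], [0.8]]))
```

In this sketch the size limit is a value the caller passes in; it does not show the searcher querying any external resource.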
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the neural architecture search as taught by the combination of Zoph and Veniat modified with the API call to external 
With respect to Claim 13, the combination of Zoph, Veniat, and Jin teaches wherein determining the first availability of the one or more resources comprises making an application programming interface (API) call to one or more external resources (Jin Figure 2 note “Application Programming Interface”. Jin Pg. 8 Col. 2 Section 4.2 “Interface”. “The design of the API follows the classic design of the scikit-Learn API…The training of a neural network requires as few as three lines of code calling the constructor, the fit and predict function respectively. Users can also specify the model trainer’s hyperparameters using the default parameters to the functions…Third, for advanced users, they can specify all kinds of hyperparameters of the search process and neural network optimization process by the default parameters in the interface…” Jin Pg. 3 Col. 2 Section 2. “The general neural architecture search problem we studied in this paper is defined as: given a neural architecture search space F, the input data X, and the cost metric Cost(), we aim at finding an optimal neural network…with its trained parameter…which could achieve the lowest cost metric value on the given dataset X.” Jin Pg. 9 Col. 1 Section 4.4 GPU memory adaption. “Since different deployment environments of Auto-Keras have different limitations on the GPU memory usage, the size of the neural network needs to be limited according to the GPU memory limitation…To tackle this challenge, we implemented a memory estimation function on our own data structure for the neural architectures. An integer value is used to mark the upper bounds of the neural architecture size…” The examiner notes that because the API of Jin allows the user to specify, for example, the hyperparameters of the search process and the neural network optimization process, and because Jin discloses that an efficient neural network has a defined cost metric, a person of ordinary skill in the art would infer that a user using the API to specify the hyperparameters would include at least the cost metric and therefore Jin teaches “wherein determining the first availability of the one or more resources comprises making an application programming interface (API) call to one or more external resources”.).
With respect to Claim 19, the combination of Zoph, Veniat, and Jin teaches wherein determining the first availability of the one or more resources comprises making an application programming interface (API) call to one or more external resources (Jin Figure 2 note “Application Programming Interface”. Jin Pg. 8 Col. 2 Section 4.2 “Interface”. “The design of the API follows the classic design of the scikit-Learn API…The training of a neural network requires as few as three lines of code calling the constructor, the fit and predict function respectively. Users can also specify the model trainer’s hyperparameters using the default parameters to the functions…Third, for advanced users, they can specify all kinds of hyperparameters of the search process and neural network optimization process by the default parameters in the interface…” Jin Pg. 3 Col. 2 Section 2. “The general neural architecture search problem we studied in this paper is defined as: given a neural architecture search space F, the input data X, and the cost metric Cost(), we aim at finding an optimal neural network…with its trained parameter…which could achieve the lowest cost metric value on the given dataset X.” Jin Pg. 9 Col. 1 Section 4.4 GPU memory adaption. “Since different deployment environments of Auto-Keras have different limitations on the GPU memory usage, the size of the neural network needs to be limited according to the GPU memory limitation…To tackle this challenge, we implemented a memory estimation function on our own data structure for the neural architectures. An integer value is used to mark the upper bounds of the neural architecture size…” The examiner notes that because the API of Jin allows the user to specify, for example, the hyperparameters of the search process and the neural network optimization process, and because Jin discloses that an efficient neural network has a defined cost metric, a person of ordinary skill in the art would infer that a user using the API to specify the hyperparameters would include at least the cost metric and therefore Jin teaches “wherein determining the first availability of the one or more resources comprises making an application programming interface (API) call to one or more external resources”.).

Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
1. Cai, Ermao et al. “NeuralPower: Predict and Deploy Energy-Efficient Convolutional Neural Networks”, NPL 2017. Similar inventive concept. Note especially Figure 2, which shows that, based on a neural architecture search, a detailed power, runtime, and energy analysis is performed, which then informs how the controller (e.g. machine learners) builds candidate CNN architectures. 
2. Cai, Han et al. “ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware”, NPL 2019. Similar inventive concept. Note especially Pg. 2 “We propose a novel gradient-based approach…for handling hardware objectives [e.g. availability of resources]. Given different hardware platforms: CPU/GPU/Mobile, ProxylessNAS enables hardware-aware neural network specialization…” 
3. Zela, Arber et al. “Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter search”, NPL 2018. Note especially Section 4.3 “First, we studied the rank correlation of the final validation error of all configurations that were trained on any particular pair of budgets…” 



Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FEN TAMULONIS whose telephone number is (571)272-0934. The examiner can normally be reached 7:30AM-5:30PM MON-FRI EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo, can be reached on (571)-272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/FEN CHRISTOPHER TAMULONIS/Examiner, Art Unit 2126   
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126