DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Amendments
	Per Applicant’s request, claims 1-3, 5-6, 7-10, 12-17, and 19 have been amended. Claim 20 has been canceled. Claim 21 has been added. Claims 1-19 and 21 are pending and have been considered.
	Note: On page 9 of the remarks, in the section titled “Claim Objections”, Applicant states claims 5-6 have been amended. However, claims 5-6 have the status identifier “Original” instead of “Currently Amended.” MPEP § 714, subsection II, Part C requires applicants to use the status identifier (currently amended) if the claims are being amended. Examiner will consider claims 5 and 6 as being amended, with the numeral 105 being deleted from each claim. Applicant is respectfully reminded to include correct status identifiers for all claims as discussed in MPEP § 714, subsection II, Part C.

Information Disclosure Statement
The information disclosure statement filed 10/20/2021 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Drawings
The drawings are objected to for the following reasons:
Replacement Figure 5 (filed 10/20/2021) has been entered, but amended specification paragraph [0047] (filed 10/20/2021) has not been entered. The drawings now contain reference elements 510 and 512, which are not contained in the specification.

Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Objections
Claim 21 is objected to because of the following informalities:  
In line 2, “wehrein” should recite “wherein”. 
Line 3 should recite “a number of [[or]] processing elements”. For examining purposes, Examiner will interpret claim 21, line 3 as reciting “a number of processing elements”.
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.



Claims 1-19 and 21 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

	The limitation “implementation attribute” is recited by Claim 1, line 8, Claim 7, line 3, Claim 8, second-to-last line, Claim 14, line 4, Claim 15, second-to-last line, and Claim 21, in lines 2 and 3.
	In specification paragraph [0025], the last three lines state, “Additionally, the term hyperparameters is also used for parameters that define the capacity of the neural network (e.g., the number of hidden layers in a neural network) and hence are related to the network topology. These hyperparameters are referred to as "model-capacity" hyperparameters herein and include all implementation attributes (e.g., bit width).” It is unclear how the implementation attributes differ from hyperparameters. For examining purposes, Examiner interprets hyperparameters and implementation attributes as synonyms.
	Claims 2-6 are rejected for failing to cure the deficiencies of claim 1 upon which they depend. 
Claims 9-13 are rejected for failing to cure the deficiencies of claim 8 upon which they depend.
Claims 16-19 are rejected for failing to cure the deficiencies of claim 15 upon which they depend.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-6, 8-13, and 15-19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Hsu et al. (“MONAS: Multi-Objective Neural Architecture Search using Reinforcement Learning”).

Regarding CLAIM 1, Hsu teaches: A method of implementing a neural network, comprising: 
selecting a first neural network architecture from a search space; (A first neural network architecture is a convolutional neural network architecture. Page 3, ¶ Target Networks & Search Space: “We demonstrate the generality of MONAS on two CNN models” and Page 6, § 4. Conclusion, line 2: “MONAS is general, and we applied it to two different kinds of CNN architectures”. In Algorithm 1 on page 2, during the first iteration of the for-loop, the first step selects an architecture from a search space, where RN is an RNN “robot network” and TN is a “target network”)
training the neural network having the first neural network architecture to obtain an accuracy and an implementation cost, the implementation cost based on a programmable device of an inference platform; (In Algorithm 1 on page 2, during the first iteration of the for-loop, the first step assigns the first neural network architecture to the CNN target network, as indicated by the leftward arrow. The second step trains the CNN target network. The third step obtains rewards. Further, p. 4 first paragraph: “In this work, we consider three rewards signals in the MONAS framework: the validation accuracy, the peak power, and average energy cost during the inference of the target CNN model,” where peak power and average energy costs are types of implementation costs.)
selecting a second neural network architecture from the search space (In Algorithm 1 on page 2, during the second iteration of the for-loop, in the first step the RNN robot network assigns a second neural network architecture to the neural network.) 
based on the accuracy and the implementation cost; and (In Algorithm 1 on page 2, the fourth step of the for-loop shows updating the RN with Reward_i with policy gradient method. In the section “Policy Gradient” on p. 3, Hsu states, “We take the RNN model as an actor with parameter θ… θ is updated by the policy gradient method to maximize the expected reward $\bar{R}_\theta$” where rewards are accuracy, energy, and/or power (p. 4, Reward Function).)
outputting weights, hyperparameters, and an implementation attribute for the neural network having the second neural network architecture. (Outputting weights: The last line of Algorithm 1 on p. 2 returns the CNN with the maximum reward. This is interpreted as outputting weights. Outputting hyperparameters: P. 2 above Algorithm 1 states, “we train a target network (TN) according to the hyperparameters output by the RNN.” The RNN robot network outputs the hyperparameters in the first line of the for-loop. Outputting an implementation attribute: The broadest reasonable interpretation of “an implementation attribute” in light of instant specification paragraphs [0025]-[0026] includes any hyperparameter. The limitation has been met because Hsu teaches outputting hyperparameters.)
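
For illustration only, the iteration that the above mapping walks through (select an architecture, train it, obtain an accuracy and a cost, and use both to bias the next selection) can be sketched as follows. This is a simplified sketch and not Hsu's implementation: the search space, the `train_and_measure` stand-in, the weighted reward with `alpha`, and the preference-weight update (used here in place of a policy gradient step) are all assumptions introduced for the example.

```python
import random

# Hypothetical search space: one hyperparameter (a filter count) per architecture.
SEARCH_SPACE = [24, 36, 48, 64]


def train_and_measure(num_filters):
    """Stand-in for training a target network; returns made-up (accuracy, cost)."""
    accuracy = 0.5 + 0.4 * (num_filters / 64)   # larger networks score higher
    cost = num_filters / 64                     # and cost more at inference
    return accuracy, cost


def reward(accuracy, cost, alpha=0.5):
    """Assumed reward: a weighted trade-off between accuracy and cost."""
    return alpha * accuracy - (1 - alpha) * cost


def search(iterations=20, seed=0):
    """Repeat select -> train -> reward -> update; return the best candidate seen."""
    rng = random.Random(seed)
    prefs = {arch: 1.0 for arch in SEARCH_SPACE}  # selection preferences
    best = None
    for _ in range(iterations):
        # Select an architecture, biased by the preferences learned so far.
        arch = rng.choices(SEARCH_SPACE, weights=[prefs[a] for a in SEARCH_SPACE])[0]
        acc, cost = train_and_measure(arch)
        r = reward(acc, cost)
        # Crude update: raise or lower the preference by the reward (kept positive).
        prefs[arch] = max(0.05, prefs[arch] + r)
        if best is None or r > best[1]:
            best = (arch, r, acc, cost)
    return best
```

In this sketch, each later selection depends on the accuracy and cost of earlier candidates, which is the relationship the mapping relies on for the "second neural network architecture" limitation.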

Regarding CLAIM 2, Hsu teaches: The method of claim 1, wherein the step of selecting the first neural network architecture is performed by a reinforcement agent, (P. 2, § 2, line 3: RNN robot network)
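wherein the reinforcement agent selects the first neural network architecture from the search space with a probability P, and wherein the reinforcement agent adjusts the probability P (P. 3, last paragraph, line 2: “P(τ |θ) is the conditional probability of outputting a τ under θ.” Tau is the output model from one MONAS iteration and theta is a parameter of the RNN model as an actor.)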
based on a function of the accuracy and the implementation cost. (Equation 1 on p. 3 shows that theta gets updated based on the expected reward, which is a function of the accuracy and the implementation cost.)
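
For illustration only, the way a policy gradient update adjusts a selection probability based on a reward can be sketched as below. The sketch uses a plain softmax over a list of scores in place of Hsu's RNN, and the function names, `baseline`, and learning rate `lr` are assumptions made for the example. It relies on the standard identity that the gradient of the log of a softmax probability with respect to score i is (1 if i is the chosen index, else 0) minus the probability of i.

```python
import math


def softmax(logits):
    """Convert scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, action, reward, baseline=0.0, lr=0.1):
    """One REINFORCE-style update of the scores for a single sampled action.

    A reward above the baseline raises the probability of `action`;
    a reward below the baseline lowers it.
    """
    probs = softmax(logits)
    advantage = reward - baseline
    return [
        logit + lr * advantage * ((1.0 if i == action else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]
```

Here the reward passed to `reinforce_step` would be whatever function of accuracy and implementation cost the search uses; the sketch itself does not fix that function.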

	Regarding CLAIM 3, Hsu teaches: The method of claim 2, wherein the reinforcement agent is a recurrent neural network (RNN). (P. 2, § 2, line 2-3: “In the generation stage, we use a RNN as a robot network (RN), which generates a hyperparameter sequence for a CNN.”)

	Regarding CLAIM 4, Hsu teaches: The method of claim 1, wherein the first neural network architecture is one of a plurality of neural network architectures, (The plurality of neural network architectures include the combinations of hyperparameters in Table 1 on p. 3)
wherein the step of training includes evaluating the plurality of neural network architectures using a fitness function. (On p. 4, the rewards in equations 3-5 are fitness functions. These rewards are incorporated into the training of the target network TN because its hyperparameters are selected by the RNN in Algorithm 1 on p. 2.)

Regarding CLAIM 5, Hsu teaches: The method of claim 1, wherein the step of selecting the first neural network architecture is performed by a tuning agent, (P. 2, § 2, line 3: RNN robot network.)
and wherein the tuning agent selects hyperparameters for the second neural network architecture based on a function of the accuracy and the implementation cost. (Equations 1 and 2 on p. 3 update the RNN based on reward functions Equations 3 to 5 on p. 4.)

Regarding CLAIM 6, Hsu teaches: The method of claim 5, wherein the tuning agent selects the hyperparameters using a grid search, random search, or Bayesian search. (The broadest reasonable interpretation of the claim is that the tuning agent selects the hyperparameters using a random search or a Bayesian search. The tuning agent performs a Bayesian search as shown by the conditional log-likelihood in Equation 2 on p. 3. Also, p. 2, last 2 lines state, “The output of the LSTM is fed into a softmax layer, and a prediction is made by sampling the probabilities over the values of the hyperparameter of that time step.” Here, a search is made by sampling the probabilities, i.e., a random search.)
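
For illustration only, the sampling step quoted above (a softmax over candidate values at each time step, followed by a random draw) can be sketched as follows. The per-step scores would come from the LSTM in Hsu; here they are passed in as plain lists, and the function names and the example candidate values are assumptions made for the sketch.

```python
import math
import random


def softmax(scores):
    """Convert scores to probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_hyperparameters(step_scores, step_values, rng):
    """Draw one value per time step according to that step's softmax probabilities."""
    chosen = []
    for scores, values in zip(step_scores, step_values):
        probs = softmax(scores)
        chosen.append(rng.choices(values, weights=probs, k=1)[0])
    return chosen


# Usage with two hypothetical time steps and equal scores (uniform sampling).
values = [[24, 36, 48, 64], [1, 3, 5, 7]]
scores = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
picked = sample_hyperparameters(scores, values, random.Random(0))
```

With equal scores every candidate value is equally likely; as the scores diverge, the draw concentrates on the higher-scoring values.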

Regarding CLAIM 8, Hsu teaches: A non-transitory computer readable medium comprising instructions, which when executed in a computer system, causes the computer system to carry out a method of implementing a neural network, comprising: (Hsu teaches a computer system on p. 4, § 3, Experiment Setup: “All experiments are implemented with TensorFlow, running on Intel XEON E5-2620v4 processor equipped with NVIDIA GTX1080Ti GPU cards.” The experimental results are evidence of a non-transitory computer readable medium comprising instructions.)
selecting a first neural network architecture from a search space; (A first neural network architecture is a convolutional neural network architecture. Page 3, ¶ Target Networks & Search Space: “We demonstrate the generality of MONAS on two CNN models” and Page 6, § 4. Conclusion, line 2: “MONAS is general, and we applied it to two different kinds of CNN architectures”. In Algorithm 1 on page 2, during the first iteration of the for-loop, the first step selects an architecture from a search space, where RN is an RNN “robot network” and TN is a “target network”)
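training the neural network having the first neural network architecture to obtain an accuracy and an implementation cost, the implementation cost based on a programmable device of an inference platform; (In Algorithm 1 on page 2, during the first iteration of the for-loop, the first step assigns the first neural network architecture to the CNN target network, as indicated by the leftward arrow. The second step trains the CNN target network. The third step obtains rewards. Further, p. 4 first paragraph: “In this work, we consider three rewards signals in the MONAS framework: the validation accuracy, the peak power, and average energy cost during the inference of the target CNN model,” where peak power and average energy costs are types of implementation costs.)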
selecting a second neural network architecture from the search space (In Algorithm 1 on page 2, during the second iteration of the for-loop, in the first step the RNN robot network assigns a second neural network architecture to the neural network.) 
based on the accuracy and the implementation cost; and (In Algorithm 1 on page 2, the fourth step of the for-loop shows updating the RN with Reward_i with policy gradient method. In the section “Policy Gradient” on p. 3, Hsu states, “We take the RNN model as an actor with parameter θ… θ is updated by the policy gradient method to maximize the expected reward $\bar{R}_\theta$” where rewards are accuracy, energy, and/or power (p. 4, Reward Function).)
outputting weights, hyperparameters and an implementation attribute for the neural network having the second neural network architecture. (Outputting weights: The last line of Algorithm 1 on p. 2 returns the CNN with the maximum reward. This is interpreted as outputting weights. Outputting hyperparameters: P. 2 above Algorithm 1 states, “we train a target network (TN) according to the hyperparameters output by the RNN.” The RNN robot network outputs the hyperparameters in the first line of the for-loop. Outputting an implementation attribute: The broadest reasonable interpretation of “an implementation attribute” in light of instant specification paragraphs [0025]-[0026] includes any hyperparameter. The limitation has been met because Hsu teaches outputting hyperparameters.)

Regarding CLAIM 9, Hsu teaches: The non-transitory computer readable medium of claim 8, wherein the step of selecting the first neural network architecture is performed by a reinforcement agent, (P. 2, § 2, line 3: RNN robot network)
wherein the reinforcement agent selects the first neural network architecture from the search space with a probability P, and wherein the reinforcement agent adjusts the probability P (P. 3, last paragraph, line 2: “P(τ |θ) is the conditional probability of outputting a τ under θ.” Tau is the output model from one MONAS iteration and theta is a parameter of the RNN model as an actor.)
based on a function of the accuracy and the implementation cost. (Equation 1 on p. 3 shows that theta gets updated based on the expected reward, which is a function of the accuracy and the implementation cost.)

	Regarding CLAIM 10, Hsu teaches: The non-transitory computer readable medium of claim 9, wherein the reinforcement agent is a recurrent neural network (RNN). (P. 2, § 2, line 2-3: “In the generation stage, we use a RNN as a robot network (RN), which generates a hyperparameter sequence for a CNN.”)

	Regarding CLAIM 11, Hsu teaches: The non-transitory computer readable medium of claim 8, wherein the first neural network architecture is one of a plurality of neural network architectures, (The plurality of neural network architectures include the combinations of hyperparameters in Table 1 on p. 3)
wherein the step of training includes evaluating the plurality of neural network architectures using a fitness function. (On p. 4, the rewards in equations 3-5 are fitness functions. These rewards are incorporated into the training of the target network TN because its hyperparameters are selected by the RNN in Algorithm 1 on p. 2.)

Regarding CLAIM 12, Hsu teaches: The non-transitory computer readable medium of claim 8, wherein the step of selecting the first neural network architecture is performed by a tuning agent, (P. 2, § 2, line 3: RNN robot network.)
and wherein the tuning agent selects hyperparameters for the second neural network architecture based on a function of the accuracy and the implementation cost. (Equations 1 and 2 on p. 3 update the RNN based on reward functions Equations 3 to 5 on p. 4.)

	Regarding CLAIM 13, Hsu teaches: The non-transitory computer readable medium of claim 12, wherein the tuning agent selects the hyperparameters using a grid search, random search, or Bayesian search. (The broadest reasonable interpretation of the claim is that the tuning agent selects the hyperparameters using a random search or a Bayesian search. The tuning agent performs a Bayesian search as shown by the conditional log-likelihood in Equation 2 on p. 3. Also, p. 2, last 2 lines state, “The output of the LSTM is fed into a softmax layer, and a prediction is made by sampling the probabilities over the values of the hyperparameter of that time step.” Here, a search is made by sampling the probabilities, i.e., a random search.)

Regarding CLAIM 15, Hsu teaches: A computer system, comprising: a memory having program code stored therein; and a processor, configured to execute the program code, to implement a neural network by: (Hsu teaches a computer system on p. 4, § 3, Experiment Setup: “All experiments are implemented with TensorFlow, running on Intel XEON E5-2620v4 processor equipped with NVIDIA GTX1080Ti GPU cards.” The experimental results are evidence of a non-transitory computer readable medium comprising instructions.)
selecting a first neural network architecture from a search space; (A first neural network architecture is a convolutional neural network architecture. Page 3, ¶ Target Networks & Search Space: “We demonstrate the generality of MONAS on two CNN models” and Page 6, § 4. Conclusion, line 2: “MONAS is general, and we applied it to two different kinds of CNN architectures”. In Algorithm 1 on page 2, during the first iteration of the for-loop, the first step selects an architecture from a search space, where RN is an RNN “robot network” and TN is a “target network”)
training the neural network having the first neural network architecture to obtain an accuracy and an implementation cost, the implementation cost based on a programmable device of an inference platform; (In Algorithm 1 on page 2, during the first iteration of the for-loop, the first step assigns the first neural network architecture to the CNN target network, as indicated by the leftward arrow. The second step trains the CNN target network. The third step obtains rewards. Further, p. 4 first paragraph: “In this work, we consider three rewards signals in the MONAS framework: the validation accuracy, the peak power, and average energy cost during the inference of the target CNN model,” where peak power and average energy costs are types of implementation costs.)
selecting a second neural network architecture from the search space (In Algorithm 1 on page 2, during the second iteration of the for-loop, in the first step the RNN robot network assigns a second neural network architecture to the neural network.)
based on the accuracy and the implementation cost; and (In Algorithm 1 on page 2, the fourth step of the for-loop shows updating the RN with Reward_i with policy gradient method. In the section “Policy Gradient” on p. 3, Hsu states, “We take the RNN model as an actor with parameter θ… θ is updated by the policy gradient method to maximize the expected reward $\bar{R}_\theta$” where rewards are accuracy, energy, and/or power (p. 4, Reward Function).)
outputting weights, hyperparameters, and an implementation attribute for the neural network having the second neural network architecture. (Outputting weights: The last line of Algorithm 1 on p. 2 returns the CNN with the maximum reward. This is interpreted as outputting weights. Outputting hyperparameters: P. 2 above Algorithm 1 states, “we train a target network (TN) according to the hyperparameters output by the RNN.” The RNN robot network outputs the hyperparameters in the first line of the for-loop. Outputting an implementation attribute: The broadest reasonable interpretation of “an implementation attribute” in light of instant specification paragraphs [0025]-[0026] includes any hyperparameter. The limitation has been met because Hsu teaches outputting hyperparameters.)
	
	Regarding CLAIM 16, Hsu teaches: The computer system of claim 15, wherein the processor is configured to execute the code to select the first neural network architecture using a reinforcement agent, (P. 2, § 2, line 3: RNN robot network)
wherein the reinforcement agent selects the first neural network architecture from the search space with a probability P, and wherein the reinforcement agent adjusts the probability P (P. 3, last paragraph, line 2: “P(τ |θ) is the conditional probability of outputting a τ under θ.” Tau is the output model from one MONAS iteration and theta is a parameter of the RNN model as an actor.)
based on a function of the accuracy and the implementation cost. (Equation 1 on p. 3 shows that theta gets updated based on the expected reward, which is a function of the accuracy and the implementation cost.)

Regarding CLAIM 17, Hsu teaches: The computer system of claim 16, wherein the reinforcement agent is a recurrent neural network (RNN). (P. 2, § 2, line 2-3: “In the generation stage, we use a RNN as a robot network (RN), which generates a hyperparameter sequence for a CNN.”)

Regarding CLAIM 18, Hsu teaches: The computer system of claim 15, wherein the first neural network architecture is one of a plurality of neural network architectures, (The plurality of neural network architectures include the combinations of hyperparameters in Table 1 on p. 3)
wherein the processor executes the code to perform the training by evaluating the plurality of neural network architectures using a fitness function. (On p. 4, the rewards in equations 3-5 are fitness functions. These rewards are incorporated into the training of the target network TN because its hyperparameters are selected by the RNN in Algorithm 1 on p. 2.)

	Regarding CLAIM 19, Hsu teaches: The computer system of claim 15, wherein the processor executes the code to select the first neural network architecture using a tuning agent, (P. 2, § 2, line 3: RNN robot network.)
and wherein the tuning agent selects hyperparameters for the second neural network architecture based on a function of the accuracy and the implementation cost. (Equations 1 and 2 on p. 3 update the RNN based on reward functions Equations 3 to 5 on p. 4.)

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Hsu et al. (“MONAS: Multi-Objective Neural Architecture Search using Reinforcement Learning”) in view of Qiu et al. (“Going Deeper with Embedded FPGA Platform for Convolutional Neural Network”).

Regarding CLAIM 7, Hsu teaches: The method of claim 1. 
The caption for Fig. 2 on p. 5 states, “Each point is a generated model” and p. 4, § 3, Experiment Setup states, “All experiments are implemented with TensorFlow, running on Intel XEON E5-2620v4 processor equipped with NVIDIA GTX1080Ti GPU cards.”
	However, Hsu does not explicitly teach: further comprising: generating a circuit design based on the weights, the hyperparameters, and the implementation attribute of the neural network; and implementing the circuit design for the programmable device.
	But Qiu teaches: further comprising: generating a circuit design based on the weights, the hyperparameters, and the implementation attribute of the neural network; and (Weights – P. 29, col. 1, two paragraphs above section 5: “Since FC layers contribute to most of memory footprint, it is necessary to reduce weights of FC layers while maintaining comparable accuracy.” Hyperparameters – P. 26, col. 1, 10 lines from the bottom: “A data arrangement method is proposed to further ensure a high utilization of the external memory bandwidth” (emphasis added). A circuit design is broadly interpreted as Figure 4(a) on p. 31, the overall architecture shown having an external memory. The circuit design is based at least on a bandwidth constraint. Implementation attribute – The broadest reasonable interpretation of “implementation attribute” allows Examiner to consider this limitation as any hyperparameter. This limitation has been met because Qiu teaches generating a circuit design based on the hyperparameters of the neural network.)
implementing the circuit design for the programmable device. (Qiu teaches that the circuit design was implemented on an FPGA on p. 33, col. 1, last paragraph, lines 1-4: “We use 16-bit dynamic-precision quantization and Xilinx Zynq ZC706 for the implementation. Xilinx Zynq platform consists of a Xilinx Kintex-7 FPGA, dual ARM Cortex-A9 Processor, and 1 GB DDR3 memory. It offers a bandwidth of up to 4.2GB/s.”)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Qiu’s system into Hsu’s system by designing the neural network for an FPGA implementation with reduced weights of FC layers and high utilization of external memory bandwidth. A motivation for the combination is to efficiently use a neural network on an embedded system. (Qiu P. 35, Conclusion, lines 1-5: “The limited bandwidth is one of the bottlenecks of accelerating deep CNN models on embedded systems. In this paper, we make an in-depth investigation of the memory footprint and bandwidth problem in order to accelerate state-of-the-art CNN models for Image-Net classification on the embedded FPGA platform.”)

	Regarding CLAIM 14, Hsu teaches: The non-transitory computer readable medium of claim 8. 
The caption for Fig. 2 on p. 5 states, “Each point is a generated model” and p. 4, § 3, Experiment Setup states, “All experiments are implemented with TensorFlow, running on Intel XEON E5-2620v4 processor equipped with NVIDIA GTX1080Ti GPU cards.”
	However, Hsu does not explicitly teach: further comprising: generating a circuit design based on the weights, the hyperparameters, and the implementation attribute of the neural network; and implementing the circuit design for the programmable device.
	But Qiu teaches: further comprising: generating a circuit design based on the weights, the hyperparameters, and the implementation attribute of the neural network; and  (Weights – P. 29, col. 1, two paragraphs above section 5: “Since FC layers contribute to most of memory footprint, it is necessary to reduce weights of FC layers while maintaining comparable accuracy.” Hyperparameters – P. 26, col. 1, 10 lines from the bottom: “A data arrangement method is proposed to further ensure a high utilization of the external memory bandwidth” (emphasis added). A circuit design is broadly interpreted as Figure 4(a) on p. 31, the overall architecture shown having an external memory. The circuit design is based at least on a bandwidth constraint. Implementation attribute – The broadest reasonable interpretation of “implementation attribute” allows Examiner to consider this limitation as any hyperparameter. This limitation has been met because Qiu teaches generating a circuit design based on the hyperparameters of the neural network.)
implementing the circuit design for the programmable device. (Qiu teaches that the circuit design was implemented on an FPGA on p. 33, col. 1, last paragraph, lines 1-4: “We use 16-bit dynamic-precision quantization and Xilinx Zynq ZC706 for the implementation. Xilinx Zynq platform consists of a Xilinx Kintex-7 FPGA, dual ARM Cortex-A9 Processor, and 1 GB DDR3 memory. It offers a bandwidth of up to 4.2GB/s.”)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Qiu’s system into Hsu’s system by designing the neural network for an FPGA implementation with reduced weights of FC layers and high utilization of external memory bandwidth. A motivation for the combination is to efficiently use a neural network on an embedded system. (Qiu P. 35, Conclusion, lines 1-5: “The limited bandwidth is one of the bottlenecks of accelerating deep CNN models on embedded systems. In this paper, we make an in-depth investigation of the memory footprint and bandwidth problem in order to accelerate state-of-the-art CNN models for Image-Net classification on the embedded FPGA platform.”)

Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Hsu et al. (“MONAS: Multi-Objective Neural Architecture Search using Reinforcement Learning”) in view of Zoph et al. (“Neural Architecture Search with Reinforcement Learning”).

Regarding CLAIM 21, Hsu teaches: The method of claim 1 further comprises learning the weights, hyperparameters, and the implementation attribute for the neural network, and (Hsu teaches the RNN robot network (RN) updates itself based on the reward of the target network (TN). Hsu teaches this on p. 2, § 2, first paragraph. This updating is shown in Algorithm 1, four to six lines down from the start of the for-loop:

[Image: excerpt from Hsu, Algorithm 1 (media_image1.png)]
)
wherein the implementation attribute corresponds to one of a bit width, a number of processing elements, and an error tolerance attribute.
However, Zoph teaches: wherein the implementation attribute corresponds to one of a bit width, a number of processing elements, and an error tolerance attribute. (Zoph teaches controller recurrent neural networks (RNNs) that predict the architecture/hyperparameters for generating another neural network by using a Neural Architecture Search. The broadest reasonable interpretation of a number of processing elements includes a number of filters for the neural network. On p. 3, in Fig. 2, the controller RNN outputs a number of filters, as shown below with an arrow.
    [Image: Zoph Fig. 2, annotated with an arrow at the number-of-filters output (media_image2.png)]

Zoph Fig. 2 (annotated)
Zoph’s Neural Architecture Search is explained in more detail by the paragraph before Fig. 2 and the caption of Fig. 2. Additionally, at the bottom of page 6, in the paragraph “Search space,” Zoph teaches: “For every convolutional layer, the controller RNN has to select… a number of filters in [24, 36, 48, 64].”)
	Zoph is in the same field of endeavor as the claimed invention, namely, neural network generation. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have output a number of filters, as is taught by Zoph, from Hsu’s RNN-based robot network. This combination is especially reasonable because Hsu’s MONAS framework is built on top of a system similar to Zoph’s Neural Architecture Search. (Hsu, p. 2, § 2, “Framework Overview: Our MONAS framework is built on top of a two-stage reinforcement learning framework similar to NAS (Zoph and Le, 2017a).”) 
A motivation for the combination is to discover the optimal network architecture. (Zoph p. 2, § 3: “In the following section, we will first describe a simple method of using a recurrent network to generate convolutional architectures. We will show how the recurrent network can be trained with a policy gradient method to maximize the expected accuracy of the sampled architectures.”)
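For illustration only, the per-layer selection quoted from Zoph’s “Search space” paragraph can be sketched as follows. This is a minimal sketch, not Zoph’s or Hsu’s implementation: the function name `sample_filter_counts` is hypothetical, and a uniform random choice is assumed as a stand-in for the trained controller RNN. Only the candidate list [24, 36, 48, 64] is taken from the reference.

```python
import random

# Candidate filter counts quoted from Zoph's "Search space" paragraph.
FILTER_CHOICES = [24, 36, 48, 64]


def sample_filter_counts(num_layers, rng):
    """Pick one filter count per convolutional layer.

    Assumption: a uniform random choice stands in for the controller RNN,
    which in Zoph is trained with a policy gradient method.
    """
    return [rng.choice(FILTER_CHOICES) for _ in range(num_layers)]


rng = random.Random(0)                  # fixed seed so the sketch is repeatable
arch = sample_filter_counts(3, rng)     # one filter count for each of 3 layers
print(arch)
```

Each entry of the returned list is a number of filters for one layer, which is the quantity Examiner reads onto “a number of processing elements.”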

Response to Arguments
Examiner will respond to Applicant’s remarks, claim amendments, replacement drawings, and specification amendments filed 10/20/2021.

Objections to the Drawings (Remarks p. 9)
Applicant’s Argument # 1
The drawings stand objected to as FIG. 3 and FIG. 5 include the same reference elements (e.g., 306, 308, 310, and 312). In response, FIG. 5 has been amended to include reference elements 506, 508, 510, and 512. 
Examiner’s Response # 1


Applicant’s Argument # 2
Further, paragraph 0047 of the Applicant’s originally filed specification has been amended to include reference elements 510 and 512. Accordingly, the Applicant requests that the objection to the drawings be withdrawn. The replacement drawings have been entered.
Examiner’s Response # 2
In paragraph 0047 of the amendment to the specification, filed 10/20/2021, the amendment recites “At step 508, a determination is made whether to end and at step 510, the trained neural network is output.” The reference elements recited in this paragraph do not correspond to the reference elements in replacement FIG. 5. To correspond to replacement FIG. 5, amended paragraph 0047 should recite step 510 in place of step 508 and step 512 in place of step 510.
	Paragraph 0047 will not be entered, and the drawings are objected to because the specification does not recite reference elements 510 and 512.

No Response to an Objection to the Drawings: In the non-final rejection mailed 07/20/2021, on page 3, lines 3-4, Examiner made the following objection to the drawings: 
“The drawings are objected to because the paragraph [0057] recites “PMU 122” while the Fig. 8 shows PMU 11. These reference numerals for the PMU must match.”
	Applicant did not address this objection in the remarks. Applicant is required to respond to every objection and rejection made in the Office action. This objection to the drawings is maintained.

Claim Objections (Remarks p. 9): Objections to claims 2-3, 5-6, 9-10, 12-13, 16-17, and 19 are withdrawn due to the claim amendments. The objection to claim 20 is moot because the claim is cancelled.

Rejections of Claims under 35 U.S.C. § 112 (Remarks p. 10): Rejections of claims 3, 7, 10, 14, and 17 are withdrawn due to the claim amendments. 

Rejections of Claims under 35 U.S.C. § 102 and § 103 (Remarks p. 10):
Applicant’s Argument:

    [Image: Applicant’s argument, reproduced from Remarks p. 10 (media_image3.png)]

Examiner’s response: Upon further consideration, Examiner has determined that Hsu teaches the limitation “outputting an implementation attribute for the neural network having the second neural network architecture”. The term “implementation attribute” is indefinite. Examiner is interpreting it as any hyperparameter. Since the Hsu reference teaches outputting a hyperparameter, Hsu also teaches outputting an implementation attribute. Examiner has likewise interpreted the term “implementation attribute” as any hyperparameter in the rejections of independent claims 8 and 15 and in the rejections of dependent claims 7 and 14. The prior art rejections are maintained.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Merity et al. (US 20180336453 A1) teaches an RNN candidate architecture generator.
Elsken et al. (US 20210012183 A1) teaches a method for ascertaining a suitable network configuration for a neural network using a Pareto frontier.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
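The timing rule in the preceding paragraph can be sketched as a short calculation. This is a minimal sketch under stated assumptions: the function names `add_months` and `reply_period_expiry` are hypothetical, all dates in the usage lines are made up for illustration, and the sketch ignores extensions of time under 37 CFR 1.136(a) and any adjustment for weekends or holidays.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the same day-of-month `months` later, clamped to the month's end."""
    idx = d.month - 1 + months
    year = d.year + idx // 12
    month = idx % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def reply_period_expiry(mailed, first_reply=None, advisory_mailed=None):
    """Expiry of the shortened statutory period as described in the paragraph above.

    Default: three months from the mailing date of the final action.
    If a first reply is filed within two months and the advisory action is
    mailed after the three-month date, the period expires on the advisory
    mailing date, but never later than six months from the final action.
    """
    two = add_months(mailed, 2)
    three = add_months(mailed, 3)
    six = add_months(mailed, 6)
    if (first_reply is not None and advisory_mailed is not None
            and first_reply <= two and advisory_mailed > three):
        return min(advisory_mailed, six)
    return three


# Hypothetical dates, not taken from this application.
mailed = date(2022, 1, 10)
print(reply_period_expiry(mailed))
print(reply_period_expiry(mailed, date(2022, 3, 1), date(2022, 4, 20)))
```

With no early reply the sketch returns the three-month date; with an early reply and a late advisory action it returns the advisory mailing date, capped at six months.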
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Asher H. Jablon whose telephone number is (571)270-7648. The examiner can normally be reached Monday - Friday, 9:00 am - 6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ASHER H. JABLON/Examiner, Art Unit 2127

/ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2127