Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 04/11/2022 has been entered.
 
Amendments
	Per Applicant’s request, claims 1, 7-8, and 14-15 have been amended. Claim 21 has been canceled. Claim 22 has been added. Claims 1-19 and 22 are pending and have been considered.
	Note: The claims filed 04/11/2022 are identical to the claims filed 03/14/2022 except for the status identifier for claim 19. Applicant has submitted two sets of claims without clarifying which one should be entered and which one should be canceled, as required by MPEP 706.07(h)(III)(D). The claims filed 04/11/2022 will be examined. The amendments to the specification filed 03/14/2022 and 04/11/2022 are identical.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-19 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Hsu et al. (“MONAS: Multi-Objective Neural Architecture Search using Reinforcement Learning”, cited in the PTO-892 filed 07/20/2021) in view of Qiu et al. (“Going Deeper with Embedded FPGA Platform for Convolutional Neural Network”, cited in PTO-892 filed 07/20/2021), and Lee et al. (US 20190042948 A1).

Regarding CLAIM 1, Hsu teaches: A method of implementing a neural network, comprising: 
selecting a first neural network architecture from a search space; (Abstract; In Algorithm 1 on page 2, the first step in the for-loop teaches the claim limitation.  The paragraph above Algorithm 1 discloses RN stands for “robot network” and it is performed by a recurrent neural network (RNN). TN stands for “target network” and it is performed by a convolutional neural network (CNN). Hsu clarifies the target network is a CNN on p. 3, section “Target Networks & Search Space”.)
training the neural network having the first neural network architecture to obtain an accuracy and an implementation cost, the implementation cost based on a programmable device of an inference platform; (In Algorithm 1 on page 2, the second step in the for-loop discloses training the target network, and the third step of the for-loop discloses obtaining rewards. The paragraph above the algorithm and p. 4, ¶ 1 teach the rewards are accuracy and energy consumption. P. 4, § 3, lines 1-2 teach “a programmable device of an inference platform.” P. 4, “Adaptability”, line 3 teaches the platform has a peak power constraint of 70 watts.)
selecting a second neural network architecture from the search space based on the accuracy and the implementation cost; (In Algorithm 1 on page 2, the fourth step of the for-loop teaches updating the robot network based on the rewards. In the next iteration of the for-loop, in the first step, the updated RNN will select a second architecture.)
outputting weights, a number of layers,… and an error tolerance attribute for the neural network having the second neural network architecture; and (Outputting weights: The last line of Algorithm 1 on p. 2 returns the trained target network. A trained network has weights. Outputting a number of layers: P. 2-3, § 2.1, Fig. 1, and Table 1 teach outputting hyperparameters from a search space for an AlexNet type of CNN. A “number of layers” is taught by the number of filters. Outputting an error tolerance attribute: An error tolerance is interpreted as the classification error, which is 1.0 minus the accuracy. On p. 5, Fig. 2(c) shows accuracies of generated models on the x-axis.)
However, Hsu does not explicitly teach: outputting a number of channels per layer, a bit width, a number of processing elements for the neural network
implementing a circuit design in the programmable device, the circuit design generated based on… the number of channels per layer, the bit width, the number of processing elements,
But Qiu teaches: outputting a bit width, a number of processing elements for the neural network (Bit width: P. 28, col. 1, § 3.2, lines 1-2 and the last paragraph of the section introduce quantization; P. 29, § 5 in col. 1-2 teaches a weight quantization phase and a data quantization phase. Number of PEs: P. 30, col. 1, last 2 lines to col. 2, line 2 introduce PEs; P. 31, col. 2, § 6.3.2, lines 1-3 and the bullet point “PE En” teach enabling a number of PEs; P. 32, col. 1, middle paragraph teaches: “Table 4 shows the generated the instructions with the example in Figure 5 (a). Instruction 1… PE En enables two PEs working in parallel.”)
 implementing a circuit design in the programmable device, the circuit design generated based on… the bit width, the number of processing elements, (The programmable device as claimed is taught by P. 30, col. 1, § 6.1, all of paragraph “PL” (Programmable Logic). Implementing the circuit design is generally taught by P. 30, col. 2, paragraphs “PS” (Processing System), “Data Preparation”, and “Data Processing”. Note that lines 2-3 of “PS” teaches storing model parameters, data, and instructions in memory, and lines 3-4 of “Data Processing” teaches loading the data and instructions and executing them on PEs.)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have implemented Hsu’s neural network model on Qiu’s programmable logic, which is an embedded FPGA system. A motivation for the combination is that Qiu’s embedded system accelerates both the convolutional layers and the fully connected layers in order to reduce resource demands for CNNs (Qiu, P. 26, col. 2, § 1, ¶ 2; and P. 28, col. 2, § 3.4, last paragraph). It would also have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have included Qiu’s parameters of bit width and number of processing elements in Hsu’s system. A motivation is to reduce memory footprint, conserve computational resources, and conserve energy (Qiu, P. 28, col. 1, § 3.2, lines 1-6; and P. 31, col. 2, last 3 lines).
	However, Hsu and Qiu do not explicitly teach: outputting a number of channels per layer for the neural network; the circuit design generated based on the number of channels per layer
	But Lee teaches: outputting a number of channels per layer for the neural network; the circuit design generated based on the number of channels per layer (¶ [0063], last sentence teaches channels per layer is a parameter of a convolutional neural network. All of ¶ [0069] teaches number of channels in a deep neural network.)
	Additionally, Lee teaches quantization levels in ¶ [0106] and [0094]; one processing element per channel in ¶ [0072], first sentence; and deploying the neural network on a hardware accelerator in ¶ [0089]-[0090].
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Lee’s parameters of a number of channels per layer into Hsu’s search algorithm for neural architectures. A motivation for the combination is that adjusting neural network parameters allows a designer to trade off accuracy for efficiency.
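For illustration only, the iterative search described in the claim 1 mapping above (select an architecture, train it, obtain accuracy and cost rewards, update the selector, then select again) can be sketched as follows. This is a minimal hypothetical sketch and is not taken from Hsu, Qiu, or Lee. The search space values, the stand-in `train_and_measure` function, the weighted `reward`, and the preference-weight update are all assumptions made for the example. The `error_tolerance` field follows the interpretation stated above, namely 1.0 minus the accuracy.

```python
import random

# Hypothetical search space; names and values are illustrative assumptions only.
SEARCH_SPACE = {
    "num_layers": [2, 4, 8],
    "channels_per_layer": [16, 32, 64],
    "bit_width": [4, 8, 16],
    "num_processing_elements": [1, 2, 4],
}

def sample_architecture(rng, preferences):
    """Pick one value per parameter, weighted by the selector's preferences."""
    return {
        name: rng.choices(values, weights=preferences[name], k=1)[0]
        for name, values in SEARCH_SPACE.items()
    }

def train_and_measure(arch):
    """Stand-in for training: returns a made-up (accuracy, cost) pair."""
    size = arch["num_layers"] * arch["channels_per_layer"]
    accuracy = min(0.99, 0.5 + 0.001 * size * arch["bit_width"] / 16)
    cost = size * arch["bit_width"] / arch["num_processing_elements"]
    return accuracy, cost

def reward(accuracy, cost, alpha=0.7, cost_scale=8192.0):
    """Assumed reward: weighted accuracy minus weighted, normalized cost."""
    return alpha * accuracy - (1 - alpha) * cost / cost_scale

def search(iterations=50, seed=0):
    rng = random.Random(seed)
    preferences = {n: [1.0] * len(v) for n, v in SEARCH_SPACE.items()}
    best = None
    for _ in range(iterations):
        arch = sample_architecture(rng, preferences)   # select an architecture
        accuracy, cost = train_and_measure(arch)       # train, obtain rewards
        r = reward(accuracy, cost)
        # Update the selector: shift preference toward the chosen values
        # in proportion to the reward, keeping every weight positive.
        for name, value in arch.items():
            idx = SEARCH_SPACE[name].index(value)
            preferences[name][idx] = max(0.05, preferences[name][idx] + r)
        if best is None or r > best["reward"]:
            best = {**arch, "reward": r, "accuracy": accuracy,
                    "error_tolerance": 1.0 - accuracy}
    return best

if __name__ == "__main__":
    print(search())
```

In this sketch, each pass through the loop selects the next architecture using preferences that were updated from the previous pass's accuracy and cost.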

Regarding CLAIM 2, the combination of Hsu, Qiu and Lee teaches: The method of claim 1, 
Hsu teaches: wherein the step of selecting the first neural network architecture is performed by a reinforcement agent, (P. 2, § 2, lines 2-3 teaches an RNN robot network)
wherein the reinforcement agent selects the first neural network architecture from the search space with a probability P, and wherein the reinforcement agent adjusts the probability P (P. 3, last paragraph, line 2: “P(τ |θ) is the conditional probability of outputting a τ under θ.” Tau is the output model from one MONAS iteration and theta is a parameter of the RNN model as an actor.)
based on a function of the accuracy and the implementation cost. (Equation 1 on p. 3 shows that theta (θ) gets updated based on the expected reward, which is a function of the accuracy and the implementation cost.)
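For illustration only, the idea of adjusting a selection probability P based on a reward that combines accuracy and implementation cost can be sketched with a generic REINFORCE-style update on a softmax policy. This is a hypothetical sketch under stated assumptions and does not reproduce Hsu's Equation 1. The function names, the learning rate `lr`, the `baseline`, and the weighting `alpha` in `combined_reward` are assumptions made for the example.

```python
import math

def softmax(logits):
    """Convert logits into selection probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def combined_reward(accuracy, cost, alpha=0.5):
    """Assumed reward: higher accuracy raises it, higher cost lowers it."""
    return alpha * accuracy - (1 - alpha) * cost

def reinforce_step(logits, chosen, reward, baseline=0.0, lr=0.5):
    """One gradient-ascent step on reward-weighted log-probability.

    For a softmax policy, the gradient of log p(chosen) with respect to
    logit i is (1 if i == chosen else 0) - p(i).
    """
    probs = softmax(logits)
    advantage = reward - baseline
    return [
        logit + lr * advantage * ((1.0 if i == chosen else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]

if __name__ == "__main__":
    logits = [0.0, 0.0, 0.0]            # three candidate architectures
    r = combined_reward(accuracy=0.9, cost=0.2)
    updated = reinforce_step(logits, chosen=1, reward=r)
    print(softmax(logits)[1], softmax(updated)[1])
```

With a positive reward, the probability of the chosen candidate rises after the step. With a negative reward, it falls.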

	Regarding CLAIM 3, the combination of Hsu, Qiu, and Lee teaches: The method of claim 2, 
Hsu teaches: wherein the reinforcement agent is a recurrent neural network (RNN). (P. 2, § 2, lines 2-3 teach an RNN robot network.)

	Regarding CLAIM 4, the combination of Hsu, Qiu, and Lee teaches: The method of claim 1, 
Hsu teaches: wherein the first neural network architecture is one of a plurality of neural network architectures, (The plurality of neural network architectures includes the combinations of hyperparameters in Table 1 on p. 3)
wherein the step of training includes evaluating the plurality of neural network architectures using a fitness function. (On p. 4, the rewards in equations 3-5 are fitness functions. These rewards are incorporated into the training of the target network TN because its hyperparameters are selected by the RNN in Algorithm 1 on p. 2.)

Regarding CLAIM 5, the combination of Hsu, Qiu, and Lee teaches: The method of claim 1, 
Hsu teaches: wherein the step of selecting the first neural network architecture is performed by a tuning agent, (P. 2, § 2, lines 2-3 teach an RNN robot network.)
and wherein the tuning agent selects hyperparameters for the second neural network architecture based on a function of the accuracy and the implementation cost. (In § 2.2 on pp. 3-4, Equations 1 and 2 update the RNN based on reward functions Equations 3 to 5. See p. 4, “Reward Function”)

Regarding CLAIM 6, the combination of Hsu, Qiu, and Lee teaches: The method of claim 5, 
Hsu teaches: wherein the tuning agent selects the hyperparameters using a grid search, random search, or Bayesian search. (P. 3, line 2 under equation (2) discloses a search based on a conditional probability, which is a Bayesian search. On P. 4, in “Adaptability”, ¶ 1, last 2 lines teach a random search.)
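For illustration only, two of the hyperparameter search strategies recited in claim 6, grid search and random search, can be sketched as follows. This is a hypothetical sketch and is not taken from any cited reference. The hyperparameter names, the value lists, and the `toy_score` function are assumptions made for the example. A Bayesian search is not shown.

```python
import itertools
import random

def toy_score(candidate):
    """Assumed scoring function: prefers 32 filters and small kernels."""
    return -abs(candidate["num_filters"] - 32) - 0.1 * candidate["kernel_size"]

def grid_search(space, score):
    """Evaluate every combination in the space and keep the best one."""
    names = list(space)
    best = None
    for combo in itertools.product(*(space[n] for n in names)):
        candidate = dict(zip(names, combo))
        s = score(candidate)
        if best is None or s > best[0]:
            best = (s, candidate)
    return best

def random_search(space, score, trials, seed=0):
    """Evaluate a fixed number of randomly drawn combinations."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        candidate = {n: rng.choice(v) for n, v in space.items()}
        s = score(candidate)
        if best is None or s > best[0]:
            best = (s, candidate)
    return best

if __name__ == "__main__":
    space = {"num_filters": [16, 32, 64], "kernel_size": [3, 5]}
    print(grid_search(space, toy_score))
    print(random_search(space, toy_score, trials=5))
```

Grid search visits all combinations, so its result is never worse than that of random search over the same space. Random search trades that guarantee for a fixed evaluation budget.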

	Regarding CLAIM 7, the combination of Hsu, Qiu, and Lee teaches: The method of claim 1, 
Hsu teaches: generating based on the weights, the number of layers, … and the error tolerance attribute. (Weights: The last line of Algorithm 1 on p. 2 returns the trained target network. A trained network has weights. Number of layers: P. 2-3, § 2.1, Fig. 1, and Table 1 teach outputting hyperparameters from a search space for an AlexNet type of CNN. A “number of layers” is taught by the number of filters. Error tolerance attribute: An error tolerance is interpreted as the classification error, which is 1.0 minus the accuracy. On p. 5, Fig. 2(c) shows accuracies of generated models on the x-axis.)
However, Hsu does not explicitly teach: generating the circuit design based on… the number of channels per layer, the bit width, the number of processing elements 
	But Qiu teaches: further comprising: generating the circuit design based on… the bit width, the number of processing elements (P. 30, col. 2, “Data Processing”, lines 3-4. See the claim 1 mapping for bit width and number of processing elements.) 
However, neither Hsu nor Qiu teaches: generating the circuit design based on the number of channels per layer
But Lee teaches: generating the circuit design based on the number of channels per layer, the bit width, (¶ [0063], last sentence teaches channels per layer is a parameter of a convolutional neural network. All of ¶ [0069] teaches number of channels in a deep neural network. )

Regarding CLAIM 22, the combination of Hsu, Qiu, and Lee teaches: The method of claim 1, 
However, neither Hsu nor Lee explicitly teaches: wherein the processing elements are unfixed architectural design parameters of the programmable device.
But Qiu teaches: wherein the processing elements are unfixed architectural design parameters of the programmable device. (P. 31, col. 2, § 6.3.2, lines 1-3 and the bullet point “PE En” teach enabling a number of PEs; P. 32, col. 1, middle paragraph teaches: “Table 4 shows the generated the instructions with the example in Figure 5 (a). Instruction 1… PE En enables two PEs working in parallel.”)

	Claims 8-14 are computer readable medium claims which recite the same features as method claims 1-7. Claims 8-14 recite the limitation: A non-transitory computer readable medium comprising instructions, which when executed in a computer system, causes the computer system to carry out a method of implementing a neural network. Hsu, p. 4, § 3, “Experimental Setup” teaches this limitation. Claims 8-14 are rejected for the reasons set forth in the rejections of claims 1-7.
	Claims 15-19 are computer system claims which recite the same features as method claims 1-5. Claims 15-19 recite the limitation: a memory having program code stored therein; and a processor, configured to execute the program code, to implement a neural network. Hsu, p. 4, § 3, “Experimental Setup” teaches this limitation. Claims 15-19 are rejected for the reasons set forth in the rejections of claims 1-5.

Response to Arguments
	Examiner herein responds to the claims and remarks filed 04/11/2022 and the Advisory Action filed 03/25/2022.
Claim Rejections Under 35 U.S.C. 102 and 103 (Remarks, pp. 8-11): Applicant's arguments have been fully considered but they are not persuasive.
In the second full paragraph on p. 9, Applicant argues, “However, while Hsu discloses hyperparameters and rewards (accuracy and energy consumption), Hsu fails to teach or suggest outputting a number of layers, a number of channels per layer, a bit width, a number of processing elements, and an error tolerance attribute for as is recited by amended claim 1. Specifically, Hsu is silent with regard to disclosing a number of layers, a number of channels per layer, a bit width, a number of processing elements, and an error tolerance attribute, as is recited by amended claim 1.” 
Examiner respectfully disagrees because Hsu teaches at least part of amended claim 1. The 35 U.S.C. 103 rejection of claim 1 in this office action discusses Hsu teaching outputting weights, a number of layers, and an error tolerance attribute. 
	In the first paragraph on p. 11, Applicant argues, “Qui fails to teach or suggest that the image classification system is generated based on weights, a number of layers, a number of channels per layer, a bit width, a number of processing elements, and an error tolerance attribute, as is recited by amended claim 1. Further, while Qui discloses using an FPGA for the implementation, Qui fails to disclose that the FPGA implements a circuit generated based on weights, a number of layers, a number of channels per layer, a bit width, a number of processing elements, and an error tolerance attribute, as is recited by amended claim 1.” 
Examiner respectfully disagrees because Qiu teaches at least part of amended claim 1. The 35 U.S.C. 103 rejection of claim 1 in this office action discusses Qiu teaching outputting a bit width, a number of processing elements for the neural network; and implementing a circuit design in the programmable device, the circuit design generated based on… the bit width, the number of processing elements.
With respect to the claim 1 limitation of “a number of channels”, Applicant’s arguments have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Kobayashi (US 20180365557 A1) discloses a type of neural architecture search. Figs. 8A-8C disclose a Pareto frontier with error rate on the vertical axis and calculation amount on the horizontal axis.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Asher H. Jablon whose telephone number is (571)270-7648. The examiner can normally be reached Monday - Friday, 9:00 am - 6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Al Kawsar, can be reached on (571)270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/A.H.J./Examiner, Art Unit 2127

/ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2127