DETAILED ACTION
This action is in response to the claims filed 18 March 2019 for application 16/356,991 filed 18 March 2019.
Claims 1-20 are pending.
Claims 1-20 are rejected.
	
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-4, 10-13, 19-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Stanley et al. (A Hypercube-Based Indirect Encoding for Evolving Large-Scale Neural Networks, hereinafter referred to as "Stanley").

Regarding claim 1, Stanley teaches a method comprising: 
for each unit in a direct network, identifying a unit code (Stanley, section 3.1 – teaches defining nodes of a given ANN [direct network] given coordinates; see also Stanley, Fig. 4); 
for each weight in a set of weights of the direct network, determining a weight code (Stanley, section 3.1 – teaches defining each weight of a given ANN [direct network] using the four coordinates of the two connected nodes), the weight code based on unit codes associated with units connected by the weight (Stanley, section 3.1 – teaches defining each weight using the four coordinates of the two nodes connected by the weight; see also Stanley, Fig. 4); 
identifying a set of expected weights from an indirect network that generates the expected weights for the set of weights by applying a set of indirect parameters to the determined weight codes (Stanley, section 3.1 – teaches a CPPN [indirect network] which takes in the coordinates and determines a weight for every connection between every node; see also Stanley, Fig. 4); 
applying the set of expected weights of the direct network (Stanley, section 3.6 – teaches querying the CPPN [indirect network] for weights for each connection of the substrate) to an input to generate an output from the set of expected weights applied to the input (Stanley, section 3.6 – teaches running the substrate as an ANN in the task domain); 
identifying an error between an expected output and the output generated from the direct network (Stanley, section 3.6 – teaches running the substrate as an ANN in the task domain to ascertain fitness [generating error in the direct network]); and 
updating the set of indirect parameters based on the error (Stanley, section 3.6 – teaches generating the next generation of the CPPN according to the NEAT method [updating the parameters of the CPPN (indirect parameters)]).
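
For illustration only, the mapping of the claim 1 limitations onto a coordinate-based indirect encoding can be sketched as follows. This is a minimal sketch under stated assumptions, not Stanley's implementation: the two-input substrate, the function names, the tanh indirect network, and the keep-if-not-worse mutation step (a stand-in for the NEAT-based update) are all hypothetical.

```python
import math
import random

# Hypothetical direct network: two input units and one output unit on a plane.
# Unit codes are (x, y) coordinates (assumed values, chosen for illustration).
INPUT_CODES = [(-1.0, 0.0), (1.0, 0.0)]
OUTPUT_CODES = [(0.0, 1.0)]


def weight_code(src, dst):
    """Weight code: the four coordinates of the two connected units."""
    return src + dst  # tuple concatenation -> (x1, y1, x2, y2)


def indirect_network(params, code):
    """Toy indirect network: tanh of an affine map over the weight code."""
    bias, coeffs = params[0], params[1:]
    return math.tanh(bias + sum(c * v for c, v in zip(coeffs, code)))


def expected_weights(params):
    """Query the indirect network once per connection of the direct network."""
    return [[indirect_network(params, weight_code(s, d)) for s in INPUT_CODES]
            for d in OUTPUT_CODES]


def direct_forward(weights, x):
    """Apply the generated weights to an input (single linear layer)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def error(params, data):
    """Mean squared error between expected outputs and generated outputs."""
    total = 0.0
    for x, target in data:
        out = direct_forward(expected_weights(params), x)
        total += sum((o - t) ** 2 for o, t in zip(out, target))
    return total / len(data)


def train(data, steps=500, sigma=0.1, seed=0):
    """Update only the indirect parameters; keep a mutation if the error does not rise.

    This simple mutation loop is an assumed stand-in for an evolutionary update.
    """
    rng = random.Random(seed)
    params = [0.0] * 5  # bias + one coefficient per weight-code entry
    best = error(params, data)
    history = [best]
    for _ in range(steps):
        candidate = [p + rng.gauss(0.0, sigma) for p in params]
        cand_err = error(candidate, data)
        if cand_err <= best:
            params, best = candidate, cand_err
        history.append(best)
    return params, history
```

Calling `train([([1.0, 0.0], [0.5]), ([0.0, 1.0], [-0.5])])` returns the final indirect parameters and the error history. The direct network's weights are never stored or updated directly; they are regenerated from the weight codes on every evaluation.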

Regarding claim 2, Stanley teaches all of the limitations of the method of claim 1 as noted above. Stanley further teaches wherein determining a weight code further comprises performing a concatenation of unit codes associated with units connected by the weight
Regarding claim 3, Stanley teaches all of the limitations of the method of claim 1 as noted above. Stanley further teaches wherein unit codes are based at least in part on a structural position of the unit in the direct network (Stanley, section 3.1 – teaches the weights being a function of the position of source and target nodes and the distribution of weights on connections throughout the substrate will exhibit a pattern that is a function of the coordinate system; see also Stanley, sections 3.3-3.4).

Regarding claim 4, Stanley teaches all of the limitations of the method of claim 1 as noted above. Stanley further teaches wherein unit codes are a fixed function of the structural position of the corresponding unit (Stanley, section 3.1 – teaches the weights being a function of the position of source and target nodes and the distribution of weights on connections throughout the substrate will exhibit a pattern that is a function of the coordinate system; see also Stanley, sections 3.3-3.4).
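
As a hedged illustration of a unit code that is a fixed function of structural position, the sketch below maps a unit's layer and index to coordinates in [-1, 1]. The layout rule and the names `unit_code` and `all_unit_codes` are assumptions made for this example and are not taken from Stanley.

```python
def unit_code(layer, index, layer_sizes):
    """Map a unit's (layer, index) position to fixed (x, y) coordinates in [-1, 1]."""
    def spread(i, n):
        # Evenly space n positions over [-1, 1]; a lone position sits at 0.
        return 0.0 if n == 1 else -1.0 + 2.0 * i / (n - 1)

    x = spread(index, layer_sizes[layer])   # position within the layer
    y = spread(layer, len(layer_sizes))     # depth of the layer
    return (x, y)


def all_unit_codes(layer_sizes):
    """Unit codes for every unit; nothing here is learned or random."""
    return {(layer, i): unit_code(layer, i, layer_sizes)
            for layer, n in enumerate(layer_sizes)
            for i in range(n)}
```

Because the code depends only on position, two calls with the same layer sizes always yield the same unit codes.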

Regarding claim 10, it is the computer-readable medium embodiment of claim 1 with similar limitations to claim 1 and is rejected using the same reasoning found in claim 1. Stanley further teaches the following additional limitations:
a non-transitory, computer-readable medium comprising computer-executable instructions that, when executed by a processor, cause the processor to perform steps (Stanley, Acknowledgements, Appendix A – teaches software and software packages to perform visual discrimination and robot food gathering experiments [While the hardware is not specifically mentioned, the hardware is a necessary requirement for operating the software and the algorithms]) ...

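Regarding claim 11, the rejection of claim 10 is incorporated herein. Further, the limitations in this claim are taught by Stanley for the reasons set forth in the rejection of claim 2.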
Regarding claim 12, the rejection of claim 10 is incorporated herein. Further, the limitations in this claim are taught by Stanley for the reasons set forth in the rejection of claim 3.

Regarding claim 13, the rejection of claim 10 is incorporated herein. Further, the limitations in this claim are taught by Stanley for the reasons set forth in the rejection of claim 4.

Regarding claim 19, it is the system embodiment of claim 1 with similar limitations to claim 1 and is rejected using the same reasoning found in claim 1. Stanley further teaches the following additional limitations:
a processor; and 
a non-transitory, computer-readable medium comprising computer-executable instructions that, when executed by a processor, cause the processor to perform steps (Stanley, Acknowledgements, Appendix A – teaches software and software packages to perform visual discrimination and robot food gathering experiments [While the hardware is not specifically mentioned, the hardware is a necessary requirement for operating the software and the algorithms]) ...

Regarding claim 20, the rejection of claim 19 is incorporated herein. Further, the limitations in this claim are taught by Stanley for the reasons set forth in the rejection of claim 2.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 5, 14 are rejected under 35 U.S.C. 103 as being unpatentable over Stanley in view of Ha et al. (HyperNetworks, hereinafter referred to as “Ha”).

Regarding claim 5, Stanley teaches all of the limitations of the method of claim 1 as noted above. However, Stanley does not explicitly teach wherein unit codes are latent codes learned based in part on the identified error.
Ha teaches wherein unit codes are latent codes learned based in part on the identified error (Ha, section 3.1 – teaches learning layer embeddings z_j through training [learning layer embeddings through training means learning based on the error]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Stanley with the teachings of Ha in order to improve existing hypernetworks to increase computational speed while applying hypernetworks to RNNs and CNNs in the field of designing indirect networks, i.e., hypernetworks, to generate weights for a primary network, i.e., direct network (Ha, Abstract – “This work explores hypernetworks: an approach of using a one network, also known as a hypernetwork, to generate the weights for another network. Hypernetworks provide an 

Regarding claim 14, the rejection of claim 10 is incorporated herein. Further, the limitations in this claim are taught by Stanley in view of Ha for the reasons set forth in the rejection of claim 5.
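
To illustrate what learning latent codes from the identified error (the limitation addressed for claim 5) can look like, the sketch below holds a toy linear indirect network fixed and adjusts only the per-unit codes by gradient descent. It is a sketch under stated assumptions and not Ha's architecture: the unit names, the linear generator, and the finite-difference gradient (used in place of backpropagation to keep the example dependency-free) are all hypothetical.

```python
def generate_weight(indirect, code_src, code_dst):
    """Toy linear indirect network over the concatenated latent codes."""
    code = code_src + code_dst  # list concatenation
    return sum(a * c for a, c in zip(indirect, code))


def loss(codes, indirect, x, target):
    """Squared error of a direct network with inputs "in0", "in1" and one output "out"."""
    w = [generate_weight(indirect, codes[name], codes["out"])
         for name in ("in0", "in1")]
    y = sum(wi * xi for wi, xi in zip(w, x))
    return (y - target) ** 2


def learn_codes(codes, indirect, x, target, lr=0.02, steps=200, eps=1e-6):
    """Gradient descent on the latent codes only; the indirect parameters stay fixed."""
    codes = {k: list(v) for k, v in codes.items()}  # work on a copy
    for _ in range(steps):
        grads = {}
        for name, vec in codes.items():
            g = []
            for i in range(len(vec)):
                old = vec[i]
                vec[i] = old + eps
                up = loss(codes, indirect, x, target)
                vec[i] = old - eps
                down = loss(codes, indirect, x, target)
                vec[i] = old  # restore before moving to the next entry
                g.append((up - down) / (2 * eps))  # central difference
            grads[name] = g
        for name, vec in codes.items():
            for i in range(len(vec)):
                vec[i] -= lr * grads[name][i]
    return codes
```

The point of the sketch is only that the codes themselves are trainable quantities driven by the output error, in contrast to codes fixed by position.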

Claims 6-9, 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over Stanley in view of Pawlowski et al. (Implicit Weight Uncertainty in Neural Networks, hereinafter referred to as “Pawlowski”).

Regarding claim 6, Stanley teaches all of the limitations of the method of claim 1 as noted above. However, Stanley does not explicitly teach wherein determining a weight code further comprises performing a concatenation of unit codes associated with units connected by the weight and a global latent state variable.
Pawlowski teaches wherein determining a weight code further comprises performing a concatenation of unit codes associated with units connected by the weight and a global latent state variable (Pawlowski, section 2 - teaches employing hypernetworks that lead to a global variable model and parametrise the auxiliary variable z by combining conditioning that encodes the position of the generated weight with an auxiliary noise vector [global latent state variable]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Stanley with the teachings of Pawlowski in order to model a richer variational distribution than previous methods while providing at least comparable results to existing methods in the field of designing indirect networks, i.e., hypernetworks, to generate weights for a primary network, i.e., direct network (Pawlowski, Abstract – “We interpret HyperNetworks ... within the framework of variational inference within implicit distributions... Our method, Bayes by Hypernet, is able to model a richer variational distribution than previous methods. Experiments show that it achieves comparable predictive performance on the MNIST classification task while providing higher predictive uncertainties compared to MC-Dropout ... and regular maximum likelihood training.”).
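
A minimal sketch of the concatenation discussed for claim 6 follows, assuming the global latent state variable is a noise vector drawn once and shared by every weight code. The function names and the Gaussian noise source are hypothetical choices for this example.

```python
import random


def sample_global_state(dim, rng):
    """Draw one auxiliary noise vector (assumed standard normal) shared by all weights."""
    return tuple(rng.gauss(0.0, 1.0) for _ in range(dim))


def weight_code_with_global(src_code, dst_code, global_z):
    """Concatenate the two unit codes with the global latent state variable."""
    return tuple(src_code) + tuple(dst_code) + tuple(global_z)


def all_weight_codes(connections, global_z):
    """Build one weight code per (source code, destination code) pair."""
    return [weight_code_with_global(s, d, global_z) for s, d in connections]
```

The positional part of each code differs per connection, while the trailing entries are identical across all codes built from the same draw.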

Regarding claim 7, Stanley teaches all of the limitations of the method of claim 1 as noted above. However, Stanley does not explicitly teach wherein one or more of the unit codes, the weight codes, a global latent state variable, the indirect parameters, and the expected weights from the indirect network are probabilistic distributions.
Pawlowski teaches wherein one or more of the unit codes, the weight codes, a global latent state variable, the indirect parameters, and the expected weights from the indirect network are probabilistic distributions (Pawlowski, section 2 - teaches a generator w = G(z | θ) which models the variational distribution q(w | θ)).


Regarding claim 8, Stanley in view of Pawlowski teaches all of the limitations of the method of claim 7 as noted above. Pawlowski further teaches wherein the probabilistic distributions are Bayesian (Pawlowski, section 2 - teaches a generator w = G(z | θ) which models the variational distribution q(w | θ), which is a variational approximation of the Bayesian inference).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Stanley and Pawlowski in order to generate Bayesian distributions because it is advantageous to model a richer variational distribution than previous methods while providing at least comparable results to existing hypernetwork methods (Pawlowski, Abstract).
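
The sense in which a generator over auxiliary noise yields a distribution over weights, rather than a single weight value, can be sketched as below. The affine generator, the parameter values, and the function names are assumptions for illustration and are not Pawlowski's model.

```python
import random
import statistics


def generator(z, theta):
    """Toy generator G(z | theta): an affine map of the auxiliary noise z."""
    mean, scale = theta
    return mean + scale * z


def sample_weights(theta, n, seed=0):
    """Draw n weight samples by pushing standard normal noise through the generator."""
    rng = random.Random(seed)
    return [generator(rng.gauss(0.0, 1.0), theta) for _ in range(n)]


def summarize(samples):
    """Empirical mean and standard deviation of the sampled weights."""
    return statistics.mean(samples), statistics.stdev(samples)
```

With `theta = (0.5, 0.2)`, repeated sampling produces weights scattered around 0.5 with a spread near 0.2, so the generator parameters describe a distribution over the weight and not a point estimate.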

Regarding claim 9, Stanley teaches all of the limitations of the method of claim 1 as noted above. However, Stanley does not explicitly teach wherein the indirect network is a parametric model.
Pawlowski teaches wherein the indirect network is a parametric model (Pawlowski, section 2 - teaches the hypernetwork is a parametric model).


Regarding claim 15, the rejection of claim 10 is incorporated herein. Further, the limitations in this claim are taught by Stanley in view of Pawlowski for the reasons set forth in the rejection of claim 6.

Regarding claim 16, the rejection of claim 10 is incorporated herein. Further, the limitations in this claim are taught by Stanley in view of Pawlowski for the reasons set forth in the rejection of claim 7.

Regarding claim 17, the rejection of claim 16 is incorporated herein. Further, the limitations in this claim are taught by Stanley in view of Pawlowski for the reasons set forth in the rejection of claim 8.

Regarding claim 18, the rejection of claim 10 is incorporated herein. Further, the limitations in this claim are taught by Stanley in view of Pawlowski for the reasons set forth in the rejection of claim 9.



Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:
Krueger et al. (Bayesian Hypernetworks) teaches Bayesian hypernetworks, which are neural networks that learn to transform a simple noise distribution into a distribution over the parameters of another neural network (the "primary network").
Ranganath et al. (Hierarchical Variational Models) teaches hierarchical variational models which augment a variational approximation with a prior on its parameters, which allows it to capture complex structure for both discrete and continuous latent variables.

Any inquiry concerning this communication or earlier communication from the examiner should be directed to MARSHALL WERNER whose telephone number is (469) 295-9143. The examiner can normally be reached on Monday – Thursday 7:30 AM – 4:30 PM ET.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at (571) 272-7796. The fax number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/MARSHALL L WERNER/ Examiner, Art Unit 2125
/BRIAN M SMITH/ Primary Examiner, Art Unit 2122