The present application, filed on or after 16 March 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION

This office action is in response to Applicant’s submission filed on 27 September 2019. THIS ACTION IS NON-FINAL.

Status of Claims

Claims 1-16 are pending.
Claims 1-16 are rejected under 35 U.S.C. 101 for being directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.
Claims 1-5 and 7-16 are rejected under 35 U.S.C. 103 as unpatentable.
There is no art rejection for claim 6.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Judicial Exception
Claims 1-16 of the claimed invention are directed to a judicial exception, an abstract idea, without significantly more. 
 (Independent Claims) With regards to claim 1, the claim recites a method, which falls into one of the statutory categories.
2A – Prong 1: Claim 1, in part, recites 
“obtaining a computational graph comprising: a plurality of nodes connected by a plurality of edges, and a plurality of weightings configured to scale input data provided to nodes along edges, wherein each node is configured to: receive at least one item of input data from a preceding node connected to the node via an edge; perform an operation on the input data to provide output data, wherein each item of input data is scaled according to a weighting associated with the node and/or edge; and provide the output data to a subsequent node via an edge in the graph; wherein the computational graph defines a first model and a second model, each model being a subgraph in the computational graph that has a selection of the plurality of nodes, edges and associated weightings, wherein some of the selected nodes, edges and weightings are shared between both the first and the second model; updating the weightings of the first model based on training the first model to perform the selected task; updating the weightings of the second model based on training the second model to perform the same selected task as the first model, wherein: updating the weightings of the second model comprises updating some of the weightings updated in step which are shared between the first and second models, and updating a the shared weighting is controlled based on an indication of importance for the trained first model a node and/or edge associated with the weighting; identifying a preferred model for the neural network comprising a selection of nodes, edges and associated weightings from the computational graph, wherein the preferred model is identified based on an analysis of the first and second trained models; and providing a neural network configured to perform the selected task based on the preferred model.” (mental process), as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than reciting a computing device, nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the language about generic computer components, “obtaining”, “receive”, “perform”, “updating”, “identifying”, and “providing” in the limitation cited above could be performed by a human using pen, paper, and a calculator (e.g., a human model builder can analyze alternative models to select a model for certain data modeling tasks); see Appendix 1 to October 2019 Update: Subject Matter Eligibility Life Sciences & Data Processing Examples, Example 43, Step 2A Prong One, p.4, “Note that even if most humans would use a physical aid (e.g., pen and paper, a slide rule, or a calculator) to help them complete the recited calculation, the use of such physical aid does not negate the mental nature of this limitation”. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
2A – Prong 2: This judicial exception is not integrated into a practical application. There are no additional elements showing integration of the abstract idea into a practical application and/or providing anything significantly more than the abstract idea. Claim 1 is directed to an abstract idea.
2B Analysis:  The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  The claim is not patent eligible.
(Dependent claims) 
Claims 2-16 are dependent on claim 1 and include all the limitations of claim 1. Therefore, claims 2-16 recite the same abstract ideas. 
With regards to claims 2-16, the claims recite further limitations on transaction data analysis and handling, and do not include additional elements that are sufficient to amount to significantly more than the judicial exception, because the additional elements, when considered both individually and as an ordered combination, do not amount to significantly more than the abstract idea. The claims are not patent eligible.






Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-5 and 7-16 are rejected under 35 U.S.C. 103 as being unpatentable over Pham, et al., “Efficient Neural Architecture Search via Parameter Sharing”, 35th ICML, June 25, 2018 [hereafter Pham] in view of Duncan, et al., US-PGPUB NO.2017/0213127A1 [hereafter Duncan].

With regards to claim 1, Pham in view of Duncan teaches 
“A method for neural architecture search to provide a neural network configured to perform a selected task, the method comprising: obtaining a computational graph comprising: a plurality of nodes connected by a plurality of edges, and a plurality of weightings configured to scale input data provided to nodes along edges (Pham, FIG.1), wherein each node is configured to: receive at least one item of input data from a preceding node connected to the node via an edge; perform an operation on the input data to provide output data, wherein each item of input data is scaled according to a weighting associated with the node and/or edge; and
provide the output data to a subsequent node via an edge in the graph (Pham, p.2, 2.1 Designing Recurrent Cells); wherein the computational graph defines a first model and a second model, each model being a subgraph in the computational graph that has a selection of the plurality of nodes, edges and associated weightings, wherein some of the selected nodes, edges and weightings are shared between both the first and the second model (Pham, FIG.2; 2. Methods: ‘ENAS’s design allows parameters to be shared among all child models, i.e., architectures, in the search space’);
updating the weightings of the first model based on training the first model to perform the selected task; updating the weightings of the second model based on training the second model to perform the same selected task as the first model, wherein: updating the weightings of the second model comprises updating some of the weightings updated in step which are shared between the first and second models[, and updating a the shared weighting is controlled based on an indication of importance for the trained first model a node and/or edge associated with the weighting] (Pham, 2.2 Training ENAS and Deriving Architectures); identifying a preferred model for the neural network comprising a selection of nodes, edges and associated weightings from the computational graph, wherein the preferred model is identified based on an analysis of the first and second trained models (Pham, 2.2 Training ENAS and Deriving Architectures: ‘… We then take only the model with the highest reward to re-train from scratch …’); and providing a neural network configured to perform the selected task based on the preferred model (Pham, 3. Experiments: ‘… employing ENAS to design recurrent cells on the Penn Treebank dataset and convolutional architectures on the CIFAR-10 dataset …’)”.
Pham does not explicitly detail “updating a the shared weighting is controlled based on an indication of importance for the trained first model a node and/or edge associated with the weighting”.
However, Duncan teaches “updating a the shared weighting is controlled based on an indication of importance for the trained first model a node and/or edge associated with the weighting (Duncan, FIG.5, 15, [0562], ‘The weight defines the importance of the element’)”.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Pham and Duncan before him or her, to modify the neural architecture search process of Pham to include importance association with weights as shown in Duncan.
The motivation for doing so would have been to facilitate structure discovery (Duncan, Abstract).
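
As a non-limiting illustration of the limitation addressed above, the following Python sketch shows a weight update for a second sub-model in which weightings shared with a first sub-model are damped by an importance value recorded for the trained first model. Every name in it (controlled_update, the edge labels, and the damping rule 1 / (1 + importance)) is an assumption chosen for illustration; none is taken from Pham, Duncan, or the application.

```python
# Illustrative sketch only: names and the damping rule are hypothetical.

def controlled_update(weights, grads, importance, shared, lr=0.1):
    """Apply one gradient step to the second model's weights.

    Weights listed in `shared` are also used by the first model, so their
    step is divided by (1 + importance) to protect what the first model
    learned. Non-shared weights receive a plain gradient step.
    """
    updated = {}
    for name, w in weights.items():
        step = lr * grads[name]
        if name in shared:
            # Higher importance for the first model means a smaller update.
            step = step / (1.0 + importance[name])
        updated[name] = w - step
    return updated


# Second model uses edges e01, e12 (shared with the first model) and e13.
weights = {"e01": 1.0, "e12": 1.0, "e13": 1.0}
grads = {"e01": 0.5, "e12": 0.5, "e13": 0.5}
importance = {"e01": 9.0, "e12": 0.0}  # importance to the trained first model
shared = {"e01", "e12"}

new_weights = controlled_update(weights, grads, importance, shared)
```

In this example the highly important shared edge e01 moves by one tenth of the step applied to the unimportant shared edge e12 and to the non-shared edge e13.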

With regards to claim 2, Pham in view of Duncan teaches 
“The method of claim 1, wherein the neural network is configured for at least one of: natural language processing, image recognition, classification and/or modelling of physical systems, data processing and generation of search results (Pham, 3. Experiments)”.

With regards to claim 3, Pham in view of Duncan teaches 
“The method of claim 1, comprising determining the indication of importance for each shared weighting using a measure based on a gradient or higher order derivative of a selected loss function with respect to the weighting (Pham, 2. Methods)”.
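
One way such a gradient-based measure could be computed is sketched below, assuming the importance of each weighting is taken as the average squared gradient of the loss with respect to that weighting (an empirical, Fisher-style estimate in the spirit of Kirkpatrick, listed under Additional Relevant Art). The function names, the toy linear model, and the use of central finite differences are assumptions made to keep the sketch self-contained; they are not drawn from the cited references or the application.

```python
# Illustrative sketch only: a hypothetical gradient-based importance measure.

def squared_gradient_importance(loss_fn, weights, samples, eps=1e-6):
    """Return the mean squared gradient of loss_fn for each weight.

    Gradients are estimated with central finite differences so that the
    sketch needs no external libraries.
    """
    importance = [0.0] * len(weights)
    for x, y in samples:
        for i in range(len(weights)):
            up = list(weights)
            up[i] += eps
            down = list(weights)
            down[i] -= eps
            grad = (loss_fn(up, x, y) - loss_fn(down, x, y)) / (2 * eps)
            importance[i] += grad * grad / len(samples)
    return importance


def squared_error(w, x, y):
    """Squared error of a two-input linear model (toy loss)."""
    pred = w[0] * x[0] + w[1] * x[1]
    return (pred - y) ** 2


# The second input is always zero, so the second weight never affects the loss.
samples = [((1.0, 0.0), 2.0), ((2.0, 0.0), 4.0)]
imp = squared_gradient_importance(squared_error, [1.0, 1.0], samples)
```

Here the first weighting receives a large importance value and the second receives zero, since the loss does not depend on it for these samples.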

With regards to claim 4, Pham in view of Duncan teaches 
“The method of claim 1”.
Pham does not explicitly detail “wherein controlling the update to the shared weighting comprises scaling a magnitude of the update based on a value for the measure”.
However, Duncan teaches “wherein controlling the update to the shared weighting comprises scaling a magnitude of the update based on a value for the measure (Duncan, [1212], ‘scaling the impact of the measured commonalities by the confidence’)”.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Pham and Duncan before him or her, to modify the neural architecture search process of Pham to include scaling as shown in Duncan.
The motivation for doing so would have been to facilitate structure discovery (Duncan, Abstract).

With regards to claim 5, Pham in view of Duncan teaches 
“The method of claim 1, comprising training the second model using a training loss function which includes a component indicative of the measure for weightings in the second model (Pham, 2. Methods)”.

With regards to claim 7, Pham in view of Duncan teaches 
“The method of claim 1, comprising initializing weightings of the second model, prior to training of the second model, based on the trained weightings of the first model (Pham, 3. Experiments)”.

With regards to claim 8, Pham in view of Duncan teaches 
“The method of claim 1, wherein identifying the preferred model for the neural network comprises analyzing the first and second models using a policy based on at least one of reinforcement learning and evolutionary learning (Pham, 2. Methods)”.

With regards to claim 10, Pham in view of Duncan teaches 
“The method of claim 8, comprising operating a controller based on the policy, wherein the operating comprises use of a reward based on: a component indicative of a goodness of fit for a model, and a component indicative of complexity of a the model (Pham, 2. Methods)”.

With regards to claim 11, Pham in view of Duncan teaches 
“The method of claim 10, wherein the two components of the reward are separated and/or scaled to prevent the reward being negative (Pham, 2. Methods)”.

With regards to claim 12, Pham in view of Duncan teaches 
“The method of claim 8, comprising repeating the one or more steps for different suggested models until at least one of: a time threshold is reached, and a difference between a best performing model and a subsequently suggested model is below a threshold difference (Pham, 3. Experiments: ‘We find that using a large learning rate whilst clipping the gradient norm at a small threshold makes the updates on ω more stable’)”.

With regards to claim 13, Pham in view of Duncan teaches 
“The method of claim 12, comprising selecting the best performing suggested model from a last iteration as the model architecture for the neural network configured to perform the selected task (Pham, 3. Experiments: ‘We find that using a large learning rate whilst clipping the gradient norm at a small threshold makes the updates on ω more stable’)”.

With regards to claim 14, Pham in view of Duncan teaches 
“The method of claim 1, wherein obtaining a model comprises selecting nodes based on: the selected task, and an indication of a performance of the nodes on similar tasks to the selected task, for example, wherein the nodes comprise at least one of: neurons, and blocks of neurons (Pham, 2. Methods)”.


With regards to claim 15, Pham in view of Duncan teaches 
“The method of claim 1, comprising repeating the updating the weightings of the first model, the updating the weightings of the second model, and the identifying of the preferred model for the neural network for a selected number of times, wherein initially they are repeated without controlling the updates to weightings (Pham, 2. Methods)”.

With regards to claim 16, Pham in view of Duncan teaches 
“The method of claim 15, comprising repeating the updating the weightings of the first model, the updating the weightings of the second model, and the identifying of the preferred model for the neural network for a selected number of times initially without controlling the updates to weightings (Pham, 2. Methods)”.

Claim 9 is substantially similar to claim 1. The arguments given above for claim 1 apply, mutatis mutandis, to claim 9; the rejection of claim 1 is therefore applied to claim 9 accordingly.

Additional Relevant Art

The prior art made of record is considered pertinent to applicant’s disclosure and is recorded on Form PTO-892. Applicant is required under 37 C.F.R. § 1.111 (c) to consider these references fully when responding to this action, with particular attention paid to:
Kirkpatrick, et al., “Overcoming catastrophic forgetting in neural networks”, arXiv:1612.00796v2 [cs.LG], 25 Jan 2017 [hereafter Kirkpatrick], which shows learning weight importance for tasks.
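
For orientation, the importance-weighted penalty of Kirkpatrick is commonly written as shown below. This is offered as a summary of the reference rather than a quotation from it: L_B is the loss on the later task, θ*_A denotes the parameters after training on the earlier task, F_i is the diagonal Fisher information entry for parameter i (the importance), and λ sets how strongly the earlier task is protected.

```latex
\mathcal{L}(\theta) \;=\; \mathcal{L}_B(\theta) \;+\; \sum_i \frac{\lambda}{2}\, F_i \,\bigl(\theta_i - \theta^{*}_{A,i}\bigr)^{2}
```

Parameters with a large F_i are held close to their earlier values, while parameters with a small F_i remain free to change.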


Examiner's Note

The Examiner respectfully requests that the Applicant, in preparing responses, fully consider the entirety of the reference(s) as potentially teaching all or part of the claimed invention. It is noted, REFERENCES ARE RELEVANT AS PRIOR ART FOR ALL THEY CONTAIN. “The use of patents as references is not limited to what the patentees describe as their own inventions or to the problems with which they are concerned. They are part of the literature of the art, relevant for all they contain.” In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)). A reference may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art, including non-preferred embodiments (see MPEP 2123). The Examiner has cited particular locations in the reference(s) as applied to the claim(s) above for the convenience of the Applicant. Although the specified citations are representative of the teachings of the art and are applied to the specific limitations within the individual claim(s), typically other passages and figures will apply as well.


Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to TSU-CHANG LEE whose telephone number is 571-272-3567.  The fax number is 571-273-3567.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas, can be reached at 571-272-2589.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/TSU-CHANG LEE/
Primary Examiner, Art Unit 2128