Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 9/16/2022 has been entered.
Response to Arguments
Applicant’s arguments with respect to claims 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 6, 8, 13 and 15 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Pat. No. 11,348,032 to Van Gael et al. (hereinafter Van Gael).

Per claim 1, Van Gael discloses a method for lifelong learning (fig. 2…can continuously generate models 230 and store the models in model store 245 for use; col.3:15-38…can continuously add improvements to various machine learning models that are defined hierarchically), the method comprising: 
identifying a new task for a machine learning model to perform (fig. 4A:420…target description 420 is for generating a new machine learning model, e.g., a child model, from “a machine learning model”, e.g., parent model with corresponding parent description 410, to be applicable to an identified new task; col.2:65-col.3:18...the child model generated hierarchically from the parent model can “perform various tasks from selecting content items a user may be interested in, to detecting whether the security of an account of a user has been compromised.  For each use of machine learning model, a new model is generated…”; fig. 3B:350 and col. 8:61-col.9:43…content of description 350 in the target description 420 for the generation of a child model (the generation of the child model requiring the parent model description 410) includes specific feature selection workflows which can be construed as identifying a new task, the identification involving “evaluating and selecting a set of features of the model to use as inputs to the model in operation…such as those in social networking, data sets may include a very large number of types of values and characteristics that could be used as a feature to describe an item for a model to use in evaluating the item”), the machine learning model trained to perform an existing task (fig. 3A…child models 330-345 inherit from parent/base models, e.g., “the machine learning model”, where the parent/base models are trained to perform a task that existed before the child model; col.3:15-28…“Each of the child models is represented using a description file that describes this hierarchy by specifying which parent models the child model depends or inherits from. The description file then includes instructions for modifying the parent model when used for the child model. When generating each of the child models, a description file for the parent model, which specifies how the parent model is to be generated (e.g., parameters and process for training the parent model), is accessed and modified based on the instructions”);
adaptively training a network architecture of the machine learning model (figs. 3A-3B and figs. 4A-4B…generating child models is construed as adaptively training a network architecture of the parent model; fig. 4A…child model is generated through adaptive training from both the network architecture of the parent model/description 410, e.g., through inheritance 365, as well as additional workflows 370) to generate an adapted machine learning model (fig. 3A:330-345 and fig. 4A:460…generated target/child network model is “adapted machine learning model”, e.g., adapted from parent model) based on incorporating inherent correlations between the new task and the existing task (fig. 3A…hierarchical nature of the models has the child models incorporating inherent properties of corresponding parent models, such as feature selection properties of both the child models and corresponding parent models), wherein adaptively training the network architecture (col.9:1-4 and col.10:16-27…training of child networks is done through workflows that designate the process for training given the results from other workflows, i.e., feature selection workflow, architecture sweeping workflow, and parameter sweeping workflow) includes:
generating a plurality of child network architectures (fig. 3A:330-340…multiple child models can be generated with different network architectures between the child models and the parent model; col. 9:44-63…target descriptions for child networks have ‘architecture sweeping’ workflows, which establish each child network architecture, “Architecture sweeping workflows refer to workflows for evaluating and selecting different network model architectures. For example, various network architectures and configurations may be evaluated to determine a set of network architectures that perform well for the given model. The workflow may define the types of architectures to be evaluated along with the means for evaluating the candidate architectures.  For example, the possible architectures may include different numbers of layers, transformations between layers, connections between layers, and so forth”), wherein each of the plurality of child network architectures is expanded from a size of the network architecture (fig. 3B…child models 360 build off of the network architecture of parent models 365, construed as the child models expanding from a size of the parent models’ network architecture; col.8:37-60…each of the plurality of child models inherits characteristics from corresponding parent models and subsequently expands therefrom: “characteristics and features a model may inherit from a parent model includes an input format or structure, an output format or structure, a model structure, model training parameters (e.g., number of nodes in the model, how the nodes are connected, or number of layers in a neural network…”; col.2:1-13…child models’ network architectures are expanded from a size of the corresponding parent models’ network architecture through any update of the parent models’ network architecture: “When a child model is re-generated, the latest version of the configurations for generating the child model is used, including every modification applied to the parent model, is used to re-generate the model. For example, an engineer maintaining the parent model modifies the parent model to increase the number of layers of a neural network of the parent model. When the child model is re-generated (e.g., as part of a scheduled task that periodically regenerates the model), the modification of the parent model that increases the number of layers of the neural network is identified an automatically applied to the child model. That is, the number of layers in the neural network of the child model is also increased accordingly”); and
determining an optimal child network architecture from the plurality of child network architectures for the adapted machine learning model (figs. 2,4A:240 and col.11:56…one child network architecture can be selected based on an evaluation of multiple child network architectures to implement as the one target model, the evaluation being made based on the performance scores of the multiple child network architectures: “In one embodiment, the architecture sweep scheme describes a process for determining the viability of the various architectures being tested during the sweep, or an algorithm for determining a score for assessing the performance of the various architectures being tested during the sweep. Based on the architecture sweep, the pipeline executor 240 selects one architecture for the target model”); and
using the adapted machine learning model to perform both the existing task and the new task (col. 3:8-38…the child model, which contains improvements to execution of a parent model, can perform the functionality previously provided by the parent model on a set of features from input data in addition to identified new features from the input data; col. 4:36-48 and col. 9:5-43…example task of feeding content to users on a social networking system previously performed by the parent model can now be performed by the improved child model, which can handle/identify more features from input data and can thus feed even more content relevant to users than the parent model; col. 2:65-col.3:15…the child model can be applied to other possible new tasks other than social network content feeds, such as data security).
Per claim 6, Van Gael discloses claim 1, further disclosing the machine learning model is a compressed model (col. 9:27-43…pruning of features implemented in the parent model, thus the parent model being a compressed model).
Claims 8 and 15 are substantially similar in scope and spirit to claim 1.  Therefore, the rejection of claim 1 is applied accordingly.  Van Gael discloses an electronic device for performing the method of claim 1 (col. 13:50-62…computing device having a processor and memory).  Van Gael further discloses having a computer program product comprising a computer-readable medium containing computer program code which can be executed by a computer processor for performing the method of claim 1 (col. 13:41-51).
Claim 13 is substantially similar in scope and spirit to claim 6.  Therefore, the rejection of claim 6 is applied accordingly.
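For illustration only, the architecture sweep relied upon in the mapping of claim 1 above (candidate child architectures expanded from a parent architecture, each scored, and one selected) may be sketched as follows. This sketch is not taken from Van Gael or from the instant application; the layer-width representation, the function names, and the toy scoring function are hypothetical assumptions.

```python
from typing import Callable, List

# Hypothetical representation: an architecture is a list of hidden-layer widths.
Architecture = List[int]


def expand_children(parent: Architecture, extra_widths: List[int]) -> List[Architecture]:
    """Generate candidate child architectures, each one layer larger than the parent."""
    return [parent + [width] for width in extra_widths]


def select_best(children: List[Architecture],
                score: Callable[[Architecture], float]) -> Architecture:
    """Select the candidate with the highest score (the sweep's selection step)."""
    return max(children, key=score)


def toy_score(arch: Architecture) -> float:
    """Stand-in for a validation metric (assumption): reward total width up to
    200 units, then penalize each additional unit by 0.5."""
    total = sum(arch)
    return min(total, 200) - 0.5 * max(total - 200, 0)


parent = [64, 64]
children = expand_children(parent, [16, 72, 256])
best = select_best(children, toy_score)
```

In an actual system the scoring function would be replaced by whatever evaluation the sweep defines, for example accuracy on held-out data.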
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 5, 7, 12, 14, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over US Pat. No. 11,348,032 to Van Gael in view of US Pat. Pub. No. 2019/0188567 to Yao et al. (hereinafter Yao).
Per claim 5, Van Gael discloses claim 1.  
Van Gael does not expressly disclose, but Yao does teach: compressing the optimal child network architecture (Yao: figs. 1, 3, 6, 7…a reference neural network model 113, e.g., parent model, is used to create a sparse neural network model 116, e.g., optimal child neural network, by compression optimization/iterations 703-710) to reduce the size (Yao: ¶24…compressing reduces size of parent model and intermediate child models until an optimal child model is achieved, “a trained (e.g., pre-trained) deep neural network model having full connectivity, convolutional layers, fully connected layers, or the like between available connections and weights or parameters for each of such connections, convolutional layers, fully connected layers, or the like may be received for compression. For example, such DNNs may be characterized as dense DNNs. The compression discussed herein may include iterative pruning and splicing operations and parameter weight update operations. Such pruning operations (e.g., disconnecting an available connection at a particular iteration) may compress the DNN model by removing unimportant connections and such splicing operations (e.g., reconnecting previously disconnected available connections at a particular iteration) may provide recovery for pruned connections that are found to be important over the iterations. Such techniques provide dynamic network surgery for learning lossless highly sparse DNNs. Such techniques may be performed on the fly to compress a pre-trained (i.e., fully trained) DNN model”).
Van Gael and Yao are analogous art because they are from the same field of endeavor of creating a child model from an original parent model.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to compress the optimal child model’s corresponding network architecture disclosed by Van Gael in the manner taught by Yao, e.g., through selective pruning.
The suggestion/motivation for doing so would have been for efficient implementation of neural network models with respect to computation and memory usage, while maintaining high accuracy (Yao: ¶1-3).
Claims 12 and 19 are substantially similar in scope and spirit to claim 5.  Therefore, the rejection of claim 5 is applied accordingly.
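For illustration only, the iterative pruning and splicing described in the quoted passage of Yao ¶24 may be sketched as a mask update over connection weights. This is a simplified sketch under stated assumptions, not Yao's implementation: the two magnitude thresholds, the function name `update_mask`, and the sample weights are hypothetical.

```python
from typing import List


def update_mask(weights: List[float], mask: List[int],
                prune_below: float, splice_above: float) -> List[int]:
    """Return a new connection mask (1 = connected, 0 = disconnected).

    Assumed rule: a connection is pruned when its weight magnitude falls below
    `prune_below`, spliced (reconnected) when it rises above `splice_above`,
    and otherwise keeps its previous state.
    """
    new_mask = []
    for weight, state in zip(weights, mask):
        if abs(weight) < prune_below:
            new_mask.append(0)      # prune an unimportant connection
        elif abs(weight) > splice_above:
            new_mask.append(1)      # splice back an important connection
        else:
            new_mask.append(state)  # in between: leave unchanged
    return new_mask


weights = [0.9, 0.05, 0.4, 0.02]
mask = [1, 1, 0, 1]

# First iteration: small weights are pruned, the large masked weight is spliced.
mask_after_first = update_mask(weights, mask, prune_below=0.1, splice_above=0.3)

# Hypothetical weight update: a pruned connection becomes important again.
weights[1] = 0.5
mask_after_second = update_mask(weights, mask_after_first, prune_below=0.1, splice_above=0.3)

# Effective sparse weights: disconnected connections contribute zero.
sparse_weights = [w * m for w, m in zip(weights, mask_after_second)]
```

The sketch keeps the underlying weight of a pruned connection so that a later iteration can reconnect it, which is the recovery behavior the quoted passage attributes to splicing.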
Per claim 7, Van Gael discloses claim 1.  Van Gael further discloses adaptively training the network architecture further comprises: training the machine learning model to perform the new task using training data for the new task (Van Gael: figs. 3B, 4A and col.9:5-43…generating the child model in part with the parent model can include training the models with new candidate features identified in a feature selection workflow with corresponding training data for the new candidate features: “The process for identifying the candidate features may also include processing or modifying the training data to generate features, for example by processing the data to generate features representing latent characteristics of the data such as embeddings describing interactions between two types of objects”). 
Van Gael does not expressly disclose, but Yao does teach: compressing the optimal child network architecture of the trained machine learning model (Yao: figs. 1, 3, 6, 7…a trained reference neural network model 113, e.g., parent model, is used to create a sparse neural network model 116, e.g., optimal child neural network, by compression optimization/iterations 703-710; ¶24…compressing reduces size of parent model and intermediate child models until an optimal child model is achieved) using the training data for the new task (Yao: fig. 1:112…training data set 112 is used to train both the reference neural network model 113 and the sparse neural network model 116).
Van Gael and Yao are analogous art because they are from the same field of endeavor of creating a child model from an original parent model.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to compress the optimal child model’s corresponding network architecture derived from a trained parent model disclosed by Van Gael in the manner taught by Yao, e.g., through selective pruning.
The suggestion/motivation for doing so would have been for efficient implementation of neural network models with respect to computation and memory usage, while maintaining high accuracy (Yao: ¶1-3).
Claims 14 and 20 are substantially similar in scope and spirit to claim 7.  Therefore, the rejection of claim 7 is applied accordingly.
Allowable Subject Matter
Claims 2-4, 9-11 and 16-18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is the statement of reasons for the indication of allowable subject matter:  The prior art disclosed by the applicant and cited by the Examiner fails to teach or suggest, alone or in combination, all the limitations of the independent claims 1, 8 and 15, further including the particular notable limitation: for each of the plurality of child network architectures, the size of the network architecture is expanded using AutoML, wherein AutoML is as defined in the instant specification.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.  Patents and/or related publications are cited in the Notice of References Cited (Form PTO-892) attached to this action to further show the state of the art with respect to generating a plurality of child networks expanded from a parent network.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALAN CHEN whose telephone number is (571) 272-4143. The examiner can normally be reached M-F 10-7.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at (571) 272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ALAN CHEN/ Primary Examiner, Art Unit 2125