DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 5/25/2022 has been entered.
 
Response to Amendments
Acknowledgement is made of Applicant's claim amendments filed on 4/14/2022. The claim amendments are entered. Claims 1-3, 5-10, 12-16, and 18-23 are now pending. Claims 4, 11, and 17 have been canceled. Claims 1, 8, 15, 18, and 20 have been amended. Claims 21-23 have been newly added.

Response to Arguments
Applicant's arguments filed on 4/14/2022 have been fully considered but they are not persuasive.


Applicant argues that the cited reference Desjardins allegedly does not teach the amended claim limitations regarding generating a copy of the neural network (Applicant’s Reply pgs. 8-9). This argument is moot because Desjardins is not being used to teach this limitation. 

Applicant also argues that Chilimbi allegedly does not teach the limitation as recited in canceled claim 4, which is partly incorporated into the amended claim 1 (Applicant’s reply pgs. 9-10). This argument is moot because Chilimbi is not being used to teach this limitation, as recited in the amended claim 1.  

Applicant also argues that the dependent claims should be allowable since the independent claims are allowable because the previously cited references allegedly do not teach the various claim limitations (Applicant’s reply pg. 10). This is not persuasive because, as shown in the updated mapping below, the previously cited references in combination with the new reference Zhou teach the various claim limitations.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 2, 8, 9, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Desjardins et al. (U.S. Pat. App. Pre-Grant Pub. No. 2019/0236482, hereinafter Desjardins) in view of Zoph et al. (U.S. Pat. App. Pre-Grant Pub. No. 2019/0251439, hereinafter Zoph) and Zhou et al., “Resource-Efficient Neural Architect” (hereinafter Zhou).


Regarding claim 1, Desjardins teaches:
A system for training a neural network, the system comprising ([0031]: describing a machine learning system for training a machine learning model. Wherein the machine learning model comprises a neural network ([0033]).): 
at least one memory including a training module ([0030]-[0031]: describing a machine learning (ML) system that can be “configured to train a machine learning model”, wherein the ML system can be “implemented as computer programs on one or more computers”. Whereby computer programs can be stored via various storage devices or media ([0082] and [0088]).); 
a processor coupled to the at least one memory ([0082]-[0083] and [0087]: describing processors and hardware coupled to the storage devices for executing the programmable instruction on the storage devices.); and 
the training module configured to: receive a plurality of sequential tasks ([0031] and [0035]: describing that “[t]he machine learning system 110 is configured to train a machine learning model 110 on multiple machine learning tasks sequentially”. See also [0028]-[0029]: describing that the tasks can comprise “different supervised learning tasks” or “different reinforcement tasks”.) and the neural network to be trained on the plurality of sequential tasks ([0031], [0033], [0035], and [0045]: describing that the machine learning system comprising the neural network can be trained on multiple sequential tasks.); 
for each task in the plurality of sequential tasks (see the previous citation regarding the various sequential tasks): 
…
identify parameters in the task specific neural network … , wherein the parameters are associated with architectural weights ([0036]: describing the determination of importance weights for the machine learning model and an acceptable level of the performance via the weights, wherein “[t]he set of importance weights for a given task generally includes a respective weight for each parameter of the model 110 that represents a measure of an importance of the parameter to the model 110 achieving acceptable performance on the task”. See also [0037]-[0040], [0047]-[0051], and [0055]: describing in further detail the computation of the weights and their optimization in correlation with the parameters of the machine learning model.); 
retrain the parameters in the task specific neural network ([0037] and [0044]: describing the retraining based on the parameters of the machine learning model for the various tasks. See also [0059]-[0064]: describing in further detail the training for the respective iterations of the various tasks.) to identify a parameter from the parameters with a corresponding maximum architectural weight from the architectural weights ([0036]-[0037]: describing a determination of importance of each parameter in a plurality of parameters, wherein the importance is based on weights in the neural network, i.e. architectural weights. From the determination, the parameter with a certain importance weight can be identified, wherein the level of importance can denote a maximal weight.); and 
update the neural network with the parameter ([0034], [0036], [0048]-[0049], and [0063]-[0064]: describing that the parameters are adjusted during training and that “[t]he adjusted values of the parameters are then used as current values of the parameters in the next iteration” of training the machine learning model. Wherein the parameters comprise the identified parameter as previously described.); and 
wherein the neural network trained on the plurality of sequential tasks is a trained neural network ([0034]-[0035] and [0064]-[0065]: describing that after the machine learning model has been iteratively trained on the parameter values, training data, importance weights and the like, the result is a trained machine learning model that can optimally perform the tasks.).
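For illustration only, the “identify a parameter ... with a corresponding maximum architectural weight” and “update the neural network with the parameter” limitations mapped above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Desjardins or from the application: the dictionary representation of the parameters, the numeric values, and the names select_max_weight_parameter, update_network, importance, network, and task_network are all hypothetical.

```python
from typing import Dict


def select_max_weight_parameter(importance: Dict[str, float]) -> str:
    """Return the name of the parameter whose importance weight is largest."""
    if not importance:
        raise ValueError("importance must contain at least one parameter")
    return max(importance, key=lambda name: importance[name])


def update_network(network: Dict[str, float],
                   task_network: Dict[str, float],
                   name: str) -> Dict[str, float]:
    """Return a copy of `network` with one parameter taken from `task_network`."""
    updated = dict(network)
    updated[name] = task_network[name]
    return updated


# Hypothetical values, chosen only to make the example concrete.
importance = {"layer1.w": 0.2, "layer2.w": 0.7, "layer3.w": 0.1}
network = {"layer1.w": 1.0, "layer2.w": 1.0, "layer3.w": 1.0}
task_network = {"layer1.w": 1.1, "layer2.w": 0.4, "layer3.w": 0.9}

selected = select_max_weight_parameter(importance)
updated = update_network(network, task_network, selected)
```

In this sketch only the parameter with the largest weight is carried back into the network; the remaining parameters keep their prior values.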

While the cited reference Desjardins teaches the limitations of claim 1, it does not explicitly teach: “generate a copy of the neural network that includes a plurality of layers from the neural network that is being trained;” on lines 8-9; “generate a task specific neural network from the copy of the neural network by performing an architectural search on the plurality of layers in the copy of the neural network, wherein the architectural search identifies a plurality of candidate choices in at least one layer in the plurality of layers, wherein at least one candidate choice in the plurality of candidate choices adds a new parameter to the task specific neural network that is not in the neural network” on lines 10-15; and “that correspond to the plurality of candidate choices” on lines 16-17. Zoph discloses the claim limitations, teaching: 
“generate a copy of the neural network that includes a plurality of layers from the neural network that is being trained (Zoph [0035] and [0038]: describing that copies of the controller neural network can be generated, wherein the controller neural network comprises a plurality of layers and can be trained and replicated into a plurality of controller neural networks.);”.
“generate a task specific neural network from the copy of the neural network by performing an architectural search on the plurality of layers in the copy of the neural network, wherein the architectural search identifies a plurality of candidate choices in at least one layer in the plurality of layers” and “that correspond to the plurality of candidate choices”: describing “a neural architecture search system 100 [that] includes a controller neural network 110, a training engine 120, and a controller parameter updating engine 130” (Zoph [0026]). Wherein the search system is used “to determine an architecture for a child neural network that is configured to perform the particular task. The architecture defines the number of layers in the child neural network, the operations performed by each of the layers, and the connectivity between the layers in the child neural network, i.e., which layers receive inputs from which other layers in the child neural network.” (Zoph [0023]). Wherein the child neural network denotes a task specific neural network and the architecture of each child neural network is determined via an “output sequence generated by the controller neural network” (Zoph [0027]). Wherein the output sequences comprise details of layer architecture and parameters of the controller neural network for the child network to ensure that the child neural network and its corresponding architecture perform optimally for the particular task (Zoph [0030]-[0033] and [0067]). That is, the output sequences comprise various candidate choices with regard to the layer architecture and parameters.
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the process of training the neural network for sequential tasks in the cited reference to include the neural network copy and architectural search in Zoph. Doing so would enable “a system implemented as computer programs on one or more computers in one or more locations that determines, using a controller neural network, an architecture for a child neural network that is configured to perform a particular neural network task” (Zoph [0014]). Wherein the system is a “neural architecture search system” (Zoph [0022]).

While the cited references in combination teach the above limitations of claim 1, they do not explicitly teach: “wherein at least one candidate choice in the plurality of candidate choices adds a new parameter to the task specific neural network that is not in the neural network” on lines 13-15. Zhou teaches: a neural network architectural search policy process that searches candidate choices regarding neural network metrics such as architecture, layers, parameters, computational abilities, model sizes, etc. (Zhou Sections 1 and 3). Wherein one of the candidate choices includes adding new parameters, such as inserting new layers into a neural network with the new layers having corresponding new parameters, e.g. filter widths (Zhou Sections 4.1-4.2.2). The new layers and parameters are not originally present in the existing neural network and are added as part of the generation of the child neural networks (Zhou Section 4.3). The child neural networks are generated by adding new layers to the existing neural network, wherein those new layers and parameters were not originally present in the existing neural network (see previous citations). The child neural networks are operable on specific tasks, e.g. image classification (Zhou Section 5.1.1) or keyword spotting (Zhou Section 5.2.1). 
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the process of training the neural network for sequential tasks with the neural network copy and architectural search in the combined cited references to include the new layers and parameters in Zhou. Doing so would enable a proposal of “the Resource-Efficient Neural Architect (RENA), an efficient resource-constrained NAS [neural architectural search] using reinforcement learning with network embedding. RENA uses a policy network to process the network embeddings to generate new configurations. We demonstrate RENA on image recognition and keyword spotting (KWS) problems. RENA can find novel architectures that achieve high performance even with tight resource constraints.” (Zhou Abstract). 
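For illustration only, the notion of a candidate choice that “adds a new parameter to the task specific neural network that is not in the neural network” can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce the search procedure of Zoph or Zhou: the list-of-dictionaries network representation, the two candidate labels, and the names candidate_choices, new_layer_names, and base_network are hypothetical.

```python
import copy
from typing import Dict, List

Layer = Dict[str, object]


def candidate_choices(network: List[Layer], index: int) -> Dict[str, List[Layer]]:
    """Build two hypothetical candidates for one layer position.

    "reuse" keeps the copied layers unchanged; "insert_layer" adds a layer
    (and therefore a parameter) that the original network does not contain.
    """
    reuse = copy.deepcopy(network)
    insert_layer = copy.deepcopy(network)
    new_layer: Layer = {
        "name": f"new_after_{network[index]['name']}",
        "filter_width": 3,  # hypothetical new parameter
    }
    insert_layer.insert(index + 1, new_layer)
    return {"reuse": reuse, "insert_layer": insert_layer}


def new_layer_names(base: List[Layer], candidate: List[Layer]) -> List[object]:
    """List the layer names present in `candidate` but absent from `base`."""
    base_names = {layer["name"] for layer in base}
    return [layer["name"] for layer in candidate if layer["name"] not in base_names]


# Hypothetical two-layer network; the search operates on deep copies only.
base_network: List[Layer] = [
    {"name": "conv1", "filter_width": 5},
    {"name": "conv2", "filter_width": 3},
]
choices = candidate_choices(base_network, 0)
```

Because each candidate is built from a deep copy, the network being trained is left untouched while the candidates are compared.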

Regarding claim 2, Desjardins teaches:
The system of claim 1, wherein the processor is further configured to: perform a new task using the trained neural network ([0042]-[0044]: describing the trained machine learning model can be used to perform additional tasks B and C. Similarly, see [0052] and [0072]-[0073]: describing the trained machine learning model can be used for a second or third task.).

Regarding independent claim 8, claim 8 is substantially similar to independent claim 1 and therefore is rejected on the same grounds as claim 1. Claim 8 is a method claim that corresponds to system claim 1.

Regarding claim 9, the rejection of claim 8 is incorporated. While the cited references teach the claim limitation “for each task in the plurality of sequential tasks” as previously shown, Zoph further teaches:
“generating a task specific layer in the task specific neural network (Zoph [0033]: describing that the neural architecture system comprising a controller neural network can “output architecture data 150 that specifies the architecture of the child neural network, i.e., data specifying the layers that are part of the child neural network, the connectivity between the layers, and the operations performed by the layers”.); and 
updating the neural network with the task specific layer (Zoph [0030]-[0033] and [0035]: describing transmission of updated parameters via the central updating server to the controller neural network and consequently, the child neural networks to achieve the desired architecture in the child neural network.).”
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the process of training the neural network for sequential tasks and the new layers and parameters in the combined cited references to include the task specific layers in Zoph. Doing so would enable a neural architecture search system that “obtains training data 102 for training a neural network to perform a particular task”, wherein the architecture defines the layers and operations in the layers of the neural network (Zoph [0023]).

Regarding independent claim 15, claim 15 is substantially similar to independent claim 1 and therefore is rejected on the same grounds as claim 1. Claim 15 is a medium claim that corresponds to system claim 1. 
A mapping is shown below for the preamble of claim 15 since that differs from claim 1. Desjardins teaches:
“A non-transitory machine-readable medium having stored thereon machine-readable instructions executable to cause a machine to perform operations that train a neural network, the operations comprising ([0082]-[0083]: describing a “non-transitory storage medium” that can be executed by various hardware/processors/computing machines to implement the process. Wherein the process can include a machine learning (ML) system that can be “configured to train a machine learning model” ([0030]-[0031]).)….”

Claims 3, 5, 7, 10, 12, 14, 16, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Desjardins et al. (U.S. Pat. App. Pre-Grant Pub. No. 2019/0236482, hereinafter Desjardins), Zoph et al. (U.S. Pat. App. Pre-Grant Pub. No. 2019/0251439, hereinafter Zoph), and Zhou et al., “Resource-Efficient Neural Architect” (hereinafter Zhou) in view of Rabinowitz et al. (U.S. Pat. App. Pre-Grant Pub. No. 2017/0337464, hereinafter Rabinowitz).

Regarding claim 3, the rejection of claim 1 is incorporated. While the cited references in combination teach the claim limitation “wherein the at least one candidate choice in the plurality of candidate choices” as previously shown, they do not explicitly teach: “reuses one of parameters in the at least one layer of the neural network, wherein the one of the parameters is included in the neural network before the copy of the neural network is generated”. Rabinowitz discloses the claim limitations, teaching: that the parameters can be reused in the neural network, wherein such parameters can “be integrated at each layer of the current model” (Rabinowitz [0030]). Wherein the parameters comprise a parameter value (Rabinowitz [0052] and [0055]). And that “each neural network may be copied before fine-tuning to explicitly remember all previous tasks”, wherein the elements being copied include “learnt neural network parameters corresponding to previous tasks” (Rabinowitz [0028]). 
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the process of training the neural network for sequential tasks along with the neural network copy, architectural search, and new layers and parameters in the combined cited references to include the reuse and copying in Rabinowitz. Doing so would enable a “progressive neural network system 100 [that] may learn multiple machine learning tasks in sequence, where[in] task features are preserved so that new tasks can benefit from all previously learned features and so that the final neural network system can be evaluated on each machine learning task” (Rabinowitz [0039]). 

Regarding claim 5, the rejection of claim 1 is incorporated. Zoph further teaches:
The system of claim 1, wherein at least one candidate choice in the plurality of candidate choices adds an adaptation to one of parameters in the at least one layer of the task specific neural network (Zoph [0040] and [0042]-[0045]: describing that a child neural network can have hyperparameters defining convolutional filters with corresponding height, width, and stride values that can be added/modified as desired. Wherein the hyperparameters comprise a respective parameter, e.g. respective of the layers (Zoph [0028] and previous citation).), ….
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the process of training the neural network for sequential tasks with the new layers and parameters in the combined cited references to include the adaptation parameters in Zoph. Doing so would enable “a system implemented as computer programs on one or more computers in one or more locations that determines, using a controller neural network, an architecture for a child neural network that is configured to perform a particular neural network task” (Zoph [0014]). Wherein the system is a “neural architecture search system” (Zoph [0022]).
While the cited reference Zoph teaches the above limitations of claim 5, it does not explicitly teach: “wherein the one of parameters is included in the neural network before the copy of the neural network is generated”. Rabinowitz teaches: that “each neural network may be copied before fine-tuning to explicitly remember all previous tasks”, wherein the elements being copied include “learnt neural network parameters corresponding to previous tasks” (Rabinowitz [0028]). Wherein the parameters comprise a parameter value (Rabinowitz [0052] and [0055]).
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the adaptation parameters in the cited reference to include the copying in Rabinowitz. Doing so would enable a “progressive neural network system 100 [that] may learn multiple machine learning tasks in sequence, where[in] task features are preserved so that new tasks can benefit from all previously learned features and so that the final neural network system can be evaluated on each machine learning task” (Rabinowitz [0039]).

Regarding claim 7, the rejection of claim 1 is incorporated. While the cited references in combination teach the claim limitation “wherein the training module is further configured to retrain the task specific neural network” as shown above, Zoph further teaches: 
“by tuning one of parameters that at least one candidate choice identifies (Zoph [0034]: describing that the neural network search system can obtain a child neural network with the desired architectural characteristics by “fine-tun[ing] the parameter values”. Wherein the parameter values comprise a respective parameter (Zoph [0028]).)….” 
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the process of training the neural network for sequential tasks with the new layers and parameters in the combined cited references to include the tuning in Zoph. Doing so would enable “a system implemented as computer programs on one or more computers in one or more locations that determines, using a controller neural network, an architecture for a child neural network that is configured to perform a particular neural network task” (Zoph [0014]). Wherein the system is a “neural architecture search system” (Zoph [0022]).

While the cited reference Zoph teaches the above limitations of claim 7, it does not explicitly teach: “as one of the parameters that is reused from the neural network before the copy of the neural network is generated”. Rabinowitz discloses the claim limitations, teaching: that the parameters can be reused in the neural network, wherein such parameters can “be integrated at each layer of the current model” (Rabinowitz [0030]). And that “each neural network may be copied before fine-tuning to explicitly remember all previous tasks”, wherein the elements being copied include “learnt neural network parameters corresponding to previous tasks” (Rabinowitz [0028]). Wherein the parameters comprise a parameter value (Rabinowitz [0052] and [0055]).
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the tuning in the cited reference to include the copying in Rabinowitz. Doing so would enable a “progressive neural network system 100 [that] may learn multiple machine learning tasks in sequence, where[in] task features are preserved so that new tasks can benefit from all previously learned features and so that the final neural network system can be evaluated on each machine learning task” (Rabinowitz [0039]).

Regarding claim 10, claim 10 is substantially similar to claim 3 and therefore is rejected on the same grounds as claim 3. Claim 10 is a method claim that corresponds to system claim 3.

Regarding claim 12, claim 12 is substantially similar to claim 5 and therefore is rejected on the same grounds as claim 5. Claim 12 is a method claim that corresponds to system claim 5.

Regarding claim 14, claim 14 is substantially similar to claim 7 and therefore is rejected on the same grounds as claim 7. Claim 14 is a method claim that corresponds to system claim 7.

Regarding claim 16, claim 16 is substantially similar to claim 3 and therefore is rejected on the same grounds as claim 3. Claim 16 is a medium claim that corresponds to system claim 3.

Regarding claim 18, claim 18 is substantially similar to claim 5 and therefore is rejected on the same grounds as claim 5. Claim 18 is a medium claim that corresponds to system claim 5.

Regarding claim 20, claim 20 is substantially similar to claim 7 and therefore is rejected on the same grounds as claim 7. Claim 20 is a medium claim that corresponds to system claim 7.

Claims 6, 13, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Desjardins et al. (U.S. Pat. App. Pre-Grant Pub. No. 2019/0236482, hereinafter Desjardins), Zoph et al. (U.S. Pat. App. Pre-Grant Pub. No. 2019/0251439, hereinafter Zoph), and Zhou et al., “Resource-Efficient Neural Architect” (hereinafter Zhou) in view of Chilimbi et al. (U.S. Pat. App. Pre-Grant Pub. No. 2016/0092765, hereinafter Chilimbi) and Rabinowitz et al. (U.S. Pat. App. Pre-Grant Pub. No. 2017/0337464, hereinafter Rabinowitz).
	
Regarding claim 6, the rejection of claim 1 is incorporated. While the cited references in combination teach the claim limitation “wherein the training module is further configured to retrain the task specific neural network” as shown above, they do not explicitly teach: “by fixing one of parameters in the parameters that at least one candidate choice identifies….” Chilimbi discloses the claim limitation, teaching: that a distributed processing system (DPS) can generate a desired deep neural network model (DNN) by performing forward and backward computation in relation to weight parameters for corresponding neuron layer(s) parameters in the DNN (Chilimbi [0067]-[0068] and [0072]). Wherein “correction factors” are computed for the weight parameters that can be used to update the weights via backpropagation (Chilimbi [0073]-[0074]). The DPS being able to generate a plurality of DNNs for operation in various replica units (Chilimbi [0082]).  
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the process of training the neural network for sequential tasks along with the neural network copy, architectural search, and new layers and parameters in the combined cited references to include the retraining and fixing parameters in Chilimbi. Doing so would enable “a distributed processing system (DPS) corresponds to a set of computing units 106 that performs a graph processing task, such as training the type of DNN model 114 described in Subsection A.2. Each particular DPS embodies a resource allocation architecture….” (Chilimbi [0077]). Wherein the resource allocation architecture comprises replica units with “one or more parameter modules” (Chilimbi [0083]). 

While the cited reference Chilimbi teaches the above limitations of claim 6, it does not explicitly teach: “as one of the parameters that is reused from the neural network before the copy of the neural network is generated”. Rabinowitz discloses the claim limitations, teaching: that the parameters can be reused in the neural network, wherein such parameters can “be integrated at each layer of the current model” (Rabinowitz [0030]). And that “each neural network may be copied before fine-tuning to explicitly remember all previous tasks”, wherein the elements being copied include “learnt neural network parameters corresponding to previous tasks” (Rabinowitz [0028]). Wherein the parameters comprise a parameter value (Rabinowitz [0052] and [0055]).
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the retraining and fixing parameters in the cited reference to include the copying in Rabinowitz. Doing so would enable a “progressive neural network system 100 [that] may learn multiple machine learning tasks in sequence, where[in] task features are preserved so that new tasks can benefit from all previously learned features and so that the final neural network system can be evaluated on each machine learning task” (Rabinowitz [0039]).


Regarding claim 13, claim 13 is substantially similar to claim 6 and therefore is rejected on the same grounds as claim 6. Claim 13 is a method claim that corresponds to system claim 6.

Regarding claim 19, claim 19 is substantially similar to claim 6 and therefore is rejected on the same grounds as claim 6. Claim 19 is a medium claim that corresponds to system claim 6.

Claims 21-23 are rejected under 35 U.S.C. 103 as being unpatentable over Desjardins et al. (U.S. Pat. App. Pre-Grant Pub. No. 2019/0236482, hereinafter Desjardins), Zoph et al. (U.S. Pat. App. Pre-Grant Pub. No. 2019/0251439, hereinafter Zoph), and Zhou et al., “Resource-Efficient Neural Architect” (hereinafter Zhou) in view of Liu et al., “Recurrent Neural Network for Text Classification with Multi-Task Learning” (hereinafter Liu).

Regarding claim 21, the rejection of claim 1 is incorporated. The cited references in combination do not explicitly teach: “wherein the neural network further comprises shareable layers that are shared by the plurality of sequential tasks and at least one task-specific layer that is specific to at least one task in the plurality of sequential tasks.” Liu teaches: 
“wherein the neural network further comprises shareable layers that are shared by the plurality of sequential tasks (Liu Section 3: describing that the architecture of the various neural network models comprises shareable layers for the tasks. Wherein the tasks comprise sequential tasks (Liu Sections 2.1-2.2). The various neural network models are shown in Fig. 2.) and
at least one task-specific layer that is specific to at least one task in the plurality of sequential tasks (Liu Section 3: describing that the architecture of the various neural network models comprises separate layers or particular assigned layers for each task. Wherein the tasks comprise sequential tasks (Liu Sections 2.1-2.2). The various neural network models are shown in Fig. 2.).”
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the process of training the neural network for sequential tasks with the neural network copy, the architectural search, and the new layers and parameters in the combined cited references to include the shareable and separate layers in Liu. Doing so would enable “use [of] the multitask learning framework to jointly learn across multiple related tasks. Based on recurrent neural network, we propose three different mechanisms of sharing information to model text with task-specific and shared layers. The entire network is trained jointly on all these tasks. Experiments on four benchmark text classification tasks show that our proposed models can improve the performance of a task with the help of other related tasks.” (Liu Abstract).
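For illustration only, the arrangement of shareable layers used by every task together with at least one task-specific layer can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce any of the recurrent architectures in Liu: the arithmetic in shared_layer, the two entries in task_heads, and the names shared_layer, task_heads, and forward are hypothetical.

```python
from typing import Callable, Dict, List


def shared_layer(x: List[float]) -> List[float]:
    """Hypothetical shared transformation applied for every task."""
    return [2.0 * value for value in x]


# Hypothetical task-specific layers, one per task.
task_heads: Dict[str, Callable[[List[float]], float]] = {
    "task_a": lambda hidden: sum(hidden),
    "task_b": lambda hidden: max(hidden),
}


def forward(task: str, x: List[float]) -> float:
    """Route the input through the shared layer, then the layer for `task`."""
    return task_heads[task](shared_layer(x))


output_a = forward("task_a", [1.0, 2.0])
output_b = forward("task_b", [1.0, 2.0])
```

Both tasks pass through the same shared_layer, while the final computation depends on which task-specific entry is selected.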

Regarding claim 22, claim 22 is substantially similar to claim 21 and therefore is rejected on the same grounds as claim 21. Claim 22 is a method claim that corresponds to system claim 21.

Regarding claim 23, claim 23 is substantially similar to claim 21 and therefore is rejected on the same grounds as claim 21. Claim 23 is a medium claim that corresponds to system claim 21.

Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicant's disclosure:
Rabinovich et al. (U.S. Pat. App. Pre-Grant Pub. No. 2017/0262737): describing an “improved approach to implement structure learning of neural networks by exploiting correlations in the data/problem the networks aim to solve”. Wherein the improved approach comprises optimization of parameters in a neural network, e.g. adding new parameters such as specialist layers in the neural network. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SELENE A HAEDI whose telephone number is (571)270-5762.  The examiner can normally be reached on M-F 11 AM - 7 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, OMAR FERNANDEZ RIVAS can be reached on (571)272-2589.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SELENE A. HAEDI/Examiner, Art Unit 2128