DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The certified copy has been filed in parent Application No. 10-2019-0018400, filed on 18 Feb 2019.
Claim Objections
Claim 11 is objected to because of the following informality: the claim recites “imporances” instead of “importances.” Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claim 16 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. Claim 16 recites the limitation "and the memory" in line 1. However, it is unclear whether applicant intends to claim that the memory is part of the communication interface or that the memory is separate from the communication interface. Therefore, for the purposes of examination, the claim is interpreted as “a communication interface configured to receive the input data, wherein the communication interface comprises the memory.”
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-6, 8, and 14-21 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by U.S. Patent No. 11,210,559 (Kolouri et al.).

Regarding Claim 1 and analogous claim 15, Kolouri teaches:
1. A processor implemented neural network method, the method comprising: determining a target task with respect to input data;
(Kolouri, col 5:13-42, figs. 1-2) 
“FIG. 1 is a flowchart illustrating tasks of a method 100 of training an artificial neural network utilizing selective synaptic plasticity according to one embodiment of the present disclosure [i.e. determining a target task]. FIG. 2 depicts an example of an artificial neural network 200 undergoing training according to the method 100 illustrated in FIG. 1. In the embodiment illustrated in FIG. 2, the artificial neural network 200 includes an input layer 201 having a series of input layer neurons 202, a first hidden layer 203 having a series of first hidden layer neurons 204, a second hidden layer 205 having a series of second hidden layer neurons 206, and an output layer 207 having a series of output layer neurons 208. Moreover, each of the synapses 209, 210, 211 between the neurons in adjacent layers have an associated connection weight. Additionally, each of the neurons 202, 204, 206, 208 in the artificial neural network 200 is associated with an activation function configured to receive the inputs to the neurons 202, 204, 206, 208 as arguments to the activation function and compute an output value for the neurons 202, 204, 206, 208 based on the inputs to determine the activation states of the neurons 202, 204, 206, 208 [i.e. with respect to input data].”
(Kolouri, col 11:8-27) “In the illustrated embodiment, the autonomous system 300 includes a memory device 301 (e.g., non-volatile memory, such as read-only memory (“ROM”), programmable ROM (“PROM”), erasable programmable ROM (“EPROM”), electrically erasable programmable ROM (“EEPROM”), flash memory, etc.), a processor or a processing circuit 302, a controller 303, and at least one sensor 304. The memory device 301, the processor or processing circuit 302, the controller 303, and the at least one sensor 304 may communicate with each other over a system bus 305 [i.e. a processor implemented neural network method]. In one or more embodiments in which the autonomous system 300 is configured to control an autonomous or semi-autonomous vehicle, the sensors 304 may be any suitable type or kind of sensors configured to detect objects or situations in a path of the autonomous vehicle, such as one or more cameras, lidars, and/or radars, and the controller 303 may be connected to any suitable vehicle components for controlling the vehicle, such as brakes, the steering column, and/or the accelerator, based on the objects or situations detected by the one or more sensors 304.”

2. acquiring a second parameter that is prestored to correspond to the target task among first parameters included in a neural network for a plurality of tasks;
(Kolouri, col 6:20-30, figs. 1-2, eq. 1) In one or more embodiments, the task 130 of identifying the importance of the synapses 209, 210, 211 utilizing the Hebbian learning algorithm includes calculating a synaptic importance parameter β.sub.ji.sup.l for each of the synapses 209, 210, 211 [i.e. acquiring a second parameter that is prestored to correspond to the target task]. According to one or more embodiments of the present disclosure, the synaptic importance parameter β.sub.ji.sup.l for each of the synapses 209, 210, 211 is initialized to zero and, during the training of the artificial neural network 200 to perform the first task [i.e. target task] during task 110 [i.e. for a plurality of tasks]. 
3. adapting the neural network to the target task by setting a value of a portion of the first parameters of the neural network to a value of the second parameter;
(Kolouri, col 5:56-64, figs. 1-2) “In the embodiment illustrated in FIG. 1, the method 100 includes a task 110 of training the artificial neural network 200 to perform a first task A [i.e. to the target task] (e.g., semantic segmentation of an image of a driving scene, such as nighttime image, a daytime image, or a rainy image). The task 110 of training the artificial neural network 200 includes updating the artificial neural network 200 [i.e. adapting the neural network] via backpropagation to update the synaptic weights to minimize the loss according to a suitable loss function [i.e. by setting a value of a portion of the first parameters of the neural network to a value of the second parameter].”
4. implementing the adapted neural network with respect to the input data for the target task.
(Kolouri, col 4:58-64)
“The present disclosure is directed to various embodiments of artificial neural networks [i.e. implementing the adapted neural network] and methods of training artificial neural networks utilizing selective plasticity such that the artificial neural network can learn new tasks (e.g., road detection during nighttime) without forgetting old tasks (e.g., road detection during daytime) [i.e. with respect to the input data for the target task].”
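For illustration only, the four limitations of claim 1 can be sketched as below. The sketch reflects the claim language rather than any code disclosed in the application or in Kolouri; the parameter names, the task labels, and the toy single-neuron model are all assumptions introduced here.

```python
# Illustrative sketch only; every name and value below is hypothetical.

def adapt_network(first_params, stored_task_params, target_task):
    """Return a copy of first_params in which the values prestored for
    target_task overwrite the corresponding shared values."""
    second_params = stored_task_params[target_task]  # acquire prestored values
    adapted = dict(first_params)                     # start from all first parameters
    adapted.update(second_params)                    # overwrite only a portion of them
    return adapted


def run_network(params, inputs):
    """Toy single-neuron 'network': weighted sum of the inputs plus a bias."""
    return sum(params[f"w{i}"] * x for i, x in enumerate(inputs)) + params["b"]


# Shared parameters of a network trained for several tasks.
first_params = {"w0": 0.5, "w1": -0.2, "b": 0.1}

# Per-task values stored ahead of time (the "second parameter" of the claim).
stored_task_params = {
    "task_A": {"w1": 0.3},
    "task_B": {"w0": 1.0, "b": 0.0},
}

target_task = "task_A"                                   # determine the target task
adapted = adapt_network(first_params, stored_task_params, target_task)
output = run_network(adapted, [1.0, 2.0])                # implement the adapted network
```

Only the entries stored for the selected task are overwritten; every other shared parameter keeps its original value, and the shared parameter set itself is left unmodified.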
Regarding Claim 2 and analogous claim 17,
Kolouri teaches the method of claim 1.
Kolouri further teaches:
1. wherein the second parameter comprises at least one of a parameter corresponding to a key neuron for the target task, an index of the key neuron, a parameter corresponding to a key synapse for the target task, and an index of the key synapse. 
(Kolouri, col 5:65-67; 6:1-4)
“In the illustrated embodiment, the method 100 also includes a task 120 of calculating or determining the neurons 202, 204, 206, 208 of the artificial neural network 200 that are significant for the performance of the first task A (i.e., the task 120 includes identifying task-significant neurons 202, 204, 206, 208 in the artificial neural network 200) [i.e. wherein the second parameter comprises at least one of a parameter corresponding to a key neuron for the target task].”

Regarding Claim 3 and analogous claim 18,
Kolouri teaches the method of claim 1.
Kolouri further teaches:
1. wherein the second parameter comprises at least one of a parameter corresponding to a key filter for the target task and an index of the key filter.
(Kolouri, col 7:50-67, eq. 4)
“However, Hebbian learning of importance parameters may suffer from the problem of unbounded growth of the importance parameters. To avoid the problems of Hebbian learning, in one or more embodiments the task 130 of determining the synaptic importance utilizes Oja's learning rule (i.e., Oja's learning algorithm) to calculate the importance, γ_ji^l, of the synapse between the neurons f_j^(l−1) and f_i^l for the first task A [i.e. target task] as follows: [see eq. 4] where ε is the rate of Oja's learning rule, i and j are neurons, l is a layer of the artificial neural network, and P_c is a probability [i.e. corresponding to a key filter for the target task and an index of a key filter]. The task 130 of updating the importance parameters [i.e. wherein the second parameter comprises at least one of a parameter] via Oja's learning rule is performed in an online manner, starting from γ_ji^l = 0, during or following the task of updating the artificial neural network 200 via back-propagation during the task 110 of training the artificial neural network 200.”
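The contrast drawn in the quoted passage between unbounded Hebbian growth and Oja's rule can be illustrated with the textbook scalar form of each update. The sketch below is not the equation of Kolouri, which is not reproduced in this action; the activity values, the rate, the step count, and the function names are assumptions chosen for illustration.

```python
# Illustrative sketch only; textbook scalar updates with hypothetical values.

def hebbian_update(beta, pre, post):
    """Plain Hebbian accumulation: grows without bound for steady activity."""
    return beta + pre * post


def oja_update(gamma, pre, post, rate):
    """Textbook Oja update: the Hebbian term pre*post is offset by a decay
    term post**2 * gamma, so the value settles at pre / post."""
    return gamma + rate * (pre * post - post * post * gamma)


pre, post, rate = 0.8, 0.5, 0.1   # assumed activity levels and learning rate
beta = 0.0                        # Hebbian importance, initialized to zero
gamma = 0.0                       # Oja importance, initialized to zero

for _ in range(1000):
    beta = hebbian_update(beta, pre, post)
    gamma = oja_update(gamma, pre, post, rate)
```

With these assumed values the Hebbian quantity keeps increasing by 0.4 per step, while the Oja quantity converges toward the fixed point pre / post = 1.6.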
Regarding Claim 4 and analogous claim 19, 
Kolouri teaches the method of claim 1 and the apparatus of claim 15.
Kolouri further teaches:
1. further comprising: receiving the input data;
(Kolouri, col. 7:10-13, fig. 3)
“As illustrated in FIG. 3, the top-down signals contain the task-relevant portions of the input (i.e., the input neurons 202). [i.e. further comprising: receiving the input data]”
2. and the determining of the target task includes estimating the target task based on the input data.
(Kolouri, col. 7:4-10, fig. 3)
“The left column of images in FIG. 3 are the input images (e.g., images of handwritten numbers “5,” “8,” and “7”), the middle column of images in FIG. 3 are the attentional map generated by c-EBP during task 120 for the predicted labels (i.e., the highest activity after the softmax layer), and the right column of images in FIG. 3 are the runner-up predicted labels [i.e. and the determining of the target task includes estimating the target task based on the input data].”
Regarding Claim 5 and analogous claim 20:
Kolouri teaches the method of claim 1.
Kolouri further teaches:
1. wherein the adapting of the neural network comprises: initializing the neural network to include all of the first parameters;
(Kolouri, col. 7:47-50)
“Additionally, in one or more embodiments, the probability distribution for the output layer 207 is set to the one-hot vector of the input label, P(a_j^L(x_n)) = y_nb [i.e. wherein the adapting of the neural network comprises: initializing the neural network to include all of the first parameters].”
2. and updating, to generate the adapted neural network, the initialized neural network based on the second parameter.
(Kolouri, col. 8:10-18)
“In one or more embodiments, the weights associated with the important synapses [i.e. based on the second parameter] may not be fixed, but the important synapses may be allocated relatively less plasticity than the synapses that are not important for the performance of the first task A. In this manner, the artificial neural network 200 [i.e. the initialized neural network], following the task 140 of rigidifying the synapses associated with the important neurons, exhibits selective plasticity without catastrophic forgetting.”
Regarding Claim 6 and analogous claim 21,
Kolouri teaches the method of claim 1.
Kolouri further teaches:
1. wherein the target task corresponds to one of the plurality of tasks
(Kolouri, col. 3:41-50)
“The method may also include training the artificial neural network on a second task different than the first task. The training of the artificial neural network on the second task includes sending at least one input of the second task to an input layer of the plurality of layers, generating, at an output layer of the plurality of layers, at least one output based on the at least one input, generating a reward based on a comparison between the at least one output and a desired output, and modifying the connection weights based on the reward [i.e. wherein the target task corresponds to one of the plurality of tasks].”
Regarding Claim 8 and analogous claim 14:
Kolouri teaches the method of claim 1. 
Kolouri further teaches:
1. A non-transitory computer readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the method of claim 1.
(Kolouri, col. 10:37-45)
“In the illustrated embodiment, the autonomous system 300 includes a memory device 301 (e.g., non-volatile memory, such as read-only memory (“ROM”), programmable ROM (“PROM”), erasable programmable ROM (“EPROM”), electrically erasable programmable ROM (“EEPROM”), flash memory, etc.), a processor or a processing circuit 302, a controller 303, and at least one sensor 304. The memory device 301, the processor or processing circuit 302, the controller 303, and the at least one sensor 304 may communicate with each other over a system bus 305 [i.e. A non-transitory computer readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the method of claim 1].”
Regarding Claim 16:
Kolouri teaches the apparatus of Claim 15.
Kolouri teaches:
1.  a communication interface configured to receive the input data; 
(Kolouri, col. 6:66-67; col. 7:1-4, Fig. 3)
“FIG. 3 is a depiction of c-EBP top-down attention maps at the input layer 201 of the artificial neural network 200 [i.e. a communication interface configured to receive the input data] of the present disclosure when trained on a Modified National Institute of Standards and Technology (MNIST) handwritten digit dataset, which is a benchmark problem for optical character classification.”
2.  and the memory.
(Kolouri, col 11:8-12)
“In the illustrated embodiment, the autonomous system 300 includes a memory device 301 (e.g., non-volatile memory, such as read-only memory (“ROM”), programmable ROM (“PROM”), erasable programmable ROM (“EPROM”), electrically erasable programmable ROM (“EEPROM”), flash memory, etc.), a processor or a processing circuit 302, a controller 303, and at least one sensor 304 [i.e. and the memory].”
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 7, 9-14, and 22-38 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent No. 11,210,559 (Kolouri et al.) in view of U.S. Pre-Grant Publication No. 2019/0236482 (Desjardins et al.).
Regarding Claim 7 and analogous claim 28,
The method of Claim 1 is taught by Kolouri.
Kolouri teaches:
 1. determining one or more key parameters of the neural network for the plurality of tasks;
 (Kolouri, col. 7:15-21)
“In one or more embodiments, the task 130 [i.e. for the plurality of tasks] of identifying the importance of the synapses 209, 210, 211 utilizing the Hebbian learning algorithm includes calculating a synaptic importance parameter β_ji^l for each of the synapses 209, 210, 211 [i.e. determining one or more key parameters of the neural network].”
Kolouri does not explicitly teach:
1. obtaining an importance matrix with respect to the neural network for the plurality of tasks;
2. updating the importance matrix with respect to the determined one or more key parameters;
3.  and training the neural network for the plurality of tasks with training data and for a new task using the updated importance matrix.
Desjardins teaches: 
1. obtaining an importance matrix with respect to the neural network for the plurality of tasks;
(Desjardins, ¶0039): “In some implementations, the engine 112 can approximate the posterior distribution using an approximation method, for example, using a Fisher Information Matrix (FIM) [i.e. obtaining an importance matrix with respect to the neural network]. The engine 112 can determine an FIM of the parameters of the model 110 with respect to task A [i.e. for the plurality of tasks] in which, for each of the parameters, the respective importance weight of the parameter is a corresponding value on a diagonal of the FIM. That is, each value on the diagonal of the FIM corresponds to a different parameter of the machine learning model 110.”
2. updating the importance matrix with respect to the determined one or more key parameters;
(Desjardins, ¶0038): “For example, the engine 112 may determine a posterior distribution over possible values of the parameters of the model 110 after the model 110 has been trained on previous training data from previous machine learning task(s). For each of the parameters, the posterior distribution assigns a value to the current value of the parameter in which the value represents a probability that the current value is a correct value of the parameter [i.e. with respect to the determined one or more key parameters].”
(Desjardins, ¶0051): “In some implementations, the system can approximate the posterior distribution using an approximation method, for example, using a Fisher Information Matrix (FIM). In particular, the system determines a Fisher Information Matrix (FIM) of the parameters of the machine learning model with respect to the first machine learning task in which, for each of the parameters, the respective measure of an importance of the parameter is a corresponding value on a diagonal of the FIM. That is, each value on the diagonal of the FIM corresponds to a different parameter of the machine learning model [i.e. updating the importance matrix].”
It is noted by the examiner that ¶0051 is within the context of a second task wherein the parameters are further updated (see Desjardins, ¶0044-¶0051).
3.  and training the neural network for the plurality of tasks with training data and for a new task using the updated importance matrix.
(Desjardins, ¶0042): “To allow the model 110 to learn task B without forgetting task A, during the training of the model 110 on task B, the system 100 uses the set of importance weights 120 corresponding to task A to form a penalty term in the objective function 118 that aims to maintain an acceptable performance of task A. That is, the model 110 is trained [i.e. and training the neural network for the plurality of tasks] to determine trained parameter values 116 that optimize the objective function 118 with respect to task B and, because the objective function 118 includes the penalty term, the model 110 maintains acceptable performance on task A even after being trained on task B [i.e. with training data and for a new task using the updated importance matrix]. The process for training the model 110 and the objective function 118 are described in more detail below with reference to FIG. 2.”

One of ordinary skill in the art, before the effective filing date of the claimed invention, would have been motivated to modify Kolouri with the teaching of Desjardins in order to provide a machine learning system that performs better with respect to parameters (Desjardins, ¶0008): “Training the machine learning model on the training data may include: adjusting the first values of the parameters to optimize, more particularly aim to minimize, an objective function that includes: (i) a first term that measures a performance of the machine learning model on the second machine learning task, and (ii) a second term that imposes a penalty for parameter values deviating from the first parameter values, wherein the second term penalizes deviations from the first values more for parameters that were more important in achieving acceptable performance on the first machine learning task than for parameters that were less important in achieving acceptable performance on the first machine learning task. The second term may depend on, for each of the plurality of parameters, a product of the respective measure of importance of the parameter and a difference between the current value of the parameter and the first value of the parameter.”
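The two-term objective described in the quoted paragraph can be sketched as below. This is a minimal illustration rather than the implementation of Desjardins: the squared difference and the factor of one half are assumptions (the quoted paragraph says only that the second term depends on a product of the importance and a difference), and the function name, the `strength` argument, and all numeric values are hypothetical.

```python
# Illustrative sketch only; names, the quadratic form, and values are assumptions.

def penalized_loss(task_b_loss, params, anchor_params, importances, strength=1.0):
    """First term: loss on the new task. Second term: importance-weighted
    penalty for moving each parameter away from its value after the old task."""
    penalty = sum(
        imp * (p - a) ** 2
        for imp, p, a in zip(importances, params, anchor_params)
    )
    return task_b_loss + 0.5 * strength * penalty


anchor_params = [1.0, 1.0]   # values after training on the first task
importances = [10.0, 0.1]    # first parameter mattered far more for the first task

# Move each parameter by the same amount and compare the resulting objective.
loss_move_important = penalized_loss(0.0, [1.5, 1.0], anchor_params, importances)
loss_move_unimportant = penalized_loss(0.0, [1.0, 1.5], anchor_params, importances)
```

An equal deviation costs 1.25 when applied to the important parameter and 0.0125 when applied to the unimportant one, which is the asymmetry the quoted paragraph describes.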
Regarding Claim 9 and analogous claim 22, 
Kolouri teaches:
1. training a neural network based on first training data for a first task, the trained neural network including a plurality of parameters;
(Kolouri, col 5:56-64, figs. 1-2)
“In the embodiment illustrated in FIG. 1, the method 100 includes a task 110 of training the artificial neural network 200 to perform a first task A [i.e. training a neural network based on first training data for a first task] (e.g., semantic segmentation of an image of a driving scene, such as nighttime image, a daytime image, or a rainy image). The task 110 of training the artificial neural network 200 includes updating the artificial neural network 200 via backpropagation to update the synaptic weights to minimize the loss according to a suitable loss function.”
(Kolouri, col 6:20-22)
“In one or more embodiments, the task 130 of identifying the importance of the synapses 209, 210, 211 utilizing the Hebbian learning algorithm includes calculating a synaptic importance parameter β_ji^l for each of the synapses 209, 210, 211 [i.e. including a plurality of parameters].”
2.   extracting a second parameter from among the plurality of parameters based on determined importances of the plurality of parameters;
(Kolouri, col. 7:15-21)
“According to one or more embodiments of the present disclosure, the synaptic importance parameter β_ji^l for each of the synapses 209, 210, 211 is initialized to zero, and, during the training of the artificial neural network 200 to perform the first task during task 110 [i.e. extracting a second parameter from among the plurality of parameters based on determined importances of the plurality of parameters], for each input image x_n, the importance parameters β_ji^l of the artificial neural network 200 are updated according to Equation 3 as follows [i.e. of the plurality of parameters].”
3. updating the importances, including updating an importance of the second parameter among the determined importances;
(Kolouri, col 7:34-41)
“According to one or more embodiments of the present disclosure, the synaptic importance parameter β_ji^l for each of the synapses 209, 210, 211 is initialized to zero, and, during the training of the artificial neural network 200 to perform the first task during task 110, for each input image x_n, the importance parameters β_ji^l of the artificial neural network 200 are updated according to Equation 3 as follows [i.e. updating the importances, including updating an importance of the second parameter among the determined importances].”
Kolouri does not explicitly teach:
1. and retraining the neural network based on the updated importances and second training data for a second task.
2. storing a value of the second parameter;
Desjardins teaches:
1. and retraining the neural network based on the updated importances and second training data for a second task.
 (Desjardins, ¶0036): “The system 100 determines the information about previous tasks using an importance weight calculation engine 112 [i.e. retraining the neural network based on the updated importances]. In particular, for each task that the model 110 was previously trained on, the engine 112 determines a set of importance weights corresponding to that task [i.e. and second training data for a second task].”
2. storing a value of the second parameter;
 (Desjardins, ¶0036) “The system 100 then uses the sets of importance weights corresponding to previous tasks to train the model 110 on a new task such that the model 110 achieves an acceptable level of performance on the new task while maintaining an acceptable level of performance on the previous tasks [i.e. storing a value of the second parameter].”
The motivation for combining Kolouri and Desjardins is the same as the motivation for Claim 7.

Regarding Claim 10 and analogous claims 23, 29, and 35:
Kolouri in view of Desjardins teaches the method of claim 9.
Desjardins teaches:
1. wherein the updating of the importances comprises updating the importance of the second parameter by setting an element value of an importance matrix corresponding to the second parameter to a first logic value.
(Desjardins, ¶0038): “For example, the engine 112 may determine a posterior distribution over possible values of the parameters of the model 110 after the model 110 has been trained on previous training data from previous machine learning task(s) [i.e. updating the importances comprises updating the importance of the second parameter]. For each of the parameters, the posterior distribution assigns a value [i.e. first logic value] to the current value of the parameter [i.e. element value of an importance matrix] in which the value represents a probability that the current value is a correct value of the parameter.”
The motivation for combining Kolouri and Desjardins is the same as the motivation for Claim 7.

Regarding Claim 11 and analogous claim 24:
Kolouri in view of Desjardins teaches the method of claim 9.
Kolouri does not explicitly teach:
1. determining the importances of the plurality of parameters by calculating the importances of the plurality of parameters.
Desjardins teaches: 
1. determining the importances of the plurality of parameters by calculating the importances of the plurality of parameters.
(Desjardins, ¶0039): “In some implementations, the engine 112 can approximate the posterior distribution using an approximation method, for example, using a Fisher Information Matrix (FIM) [i.e. calculating the importances of the plurality of parameters]. The engine 112 can determine an FIM of the parameters of the model 110 with respect to task A in which, for each of the parameters, the respective importance weight of the parameter is a corresponding value on a diagonal of the FIM [i.e. determining the importances of the plurality of parameters]. That is, each value on the diagonal of the FIM corresponds to a different parameter of the machine learning model 110.”
The motivation for combining Kolouri and Desjardins is the same as the motivation for Claim 7.
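As context for the cited Fisher Information Matrix passage, one common way to estimate the diagonal of an FIM is to average the squared per-sample gradients of the log-likelihood. The quoted paragraph does not state how the diagonal is computed, so this estimator is an assumption, and the function name and gradient values below are hypothetical.

```python
# Illustrative sketch only; the estimator choice and all values are assumptions.

def diagonal_fisher(per_sample_grads):
    """Estimate the FIM diagonal as the mean squared gradient per parameter.
    Each inner list holds one sample's gradient of the log-likelihood."""
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    return [sum(g[i] ** 2 for g in per_sample_grads) / n for i in range(dim)]


# Made-up gradients for three samples and two parameters.
grads = [[0.9, 0.01], [-1.1, 0.02], [1.0, -0.01]]
importances = diagonal_fisher(grads)   # one importance value per parameter
```

Here the first parameter, whose gradients are consistently large in magnitude, receives a much larger importance value than the second.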

Regarding Claim 12 and analogous claim 25:
Kolouri in view of Desjardins teaches the method of claim 9.
Kolouri does not explicitly teach:
1. wherein the calculating of the importances comprises calculating the importances of the plurality of parameters based on a set importance matrix.
Desjardins teaches: 
(Desjardins, ¶0051): “In some implementations, the system can approximate the posterior distribution using an approximation method, for example, using a Fisher Information Matrix (FIM) [i.e. calculating the importances]. In particular, the system determines a Fisher Information Matrix (FIM) of the parameters of the machine learning model with respect to the first machine learning task in which, for each of the parameters, the respective measure of an importance of the parameter is a corresponding value on a diagonal of the FIM. That is, each value on the diagonal of the FIM corresponds to a different parameter of the machine learning model [i.e. of the plurality of parameters based on a set importance matrix].”
The motivation for combining Kolouri and Desjardins is the same as the motivation for Claim 7.

Regarding Claim 13 and analogous claims 26, 31, and 37:
Kolouri in view of Desjardins teaches the method of claim 9.
Kolouri teaches:
1. wherein the second parameter comprises at least one of a parameter corresponding to a key neuron for the target task, an index of the key neuron, a parameter corresponding to a key synapse for the target task, and an index of the key synapse. 
(Kolouri, col 5:65-67; 6:1-4)
“In the illustrated embodiment, the method 100 also includes a task 120 of calculating or determining the neurons 202, 204, 206, 208 of the artificial neural network 200 that are significant for the performance of the first task A (i.e., the task 120 includes identifying task-significant neurons 202, 204, 206, 208 in the artificial neural network 200) [i.e. wherein the second parameter comprises at least one of a parameter corresponding to a key neuron for the target task].”
Regarding Claim 27:
Kolouri teaches:
1. A processor implemented neural network method, the method comprising: obtaining first parameters of a neural network trained for a plurality of tasks, wherein the obtained first parameters of the neural network are configured to implement less than the plurality of tasks
(Kolouri, col. 6:3-8)
“In one or more embodiments, the task 120 of calculating the neurons 202, 204, 206, 208 of the artificial neural network 200 that are significant for the performance of the first task may utilize the contrastive version of the EBP algorithm (c-EBP) to make the top-down signal more task-specific [i.e. obtaining first parameters of a neural network trained for a plurality of tasks].”
(Kolouri, col. 5:65-67; col. 6:1-2)
“In the illustrated embodiment, the method 100 also includes a task 120 of calculating or determining the neurons 202, 204, 206, 208 of the artificial neural network 200 that are significant for the performance of the first task A [i.e. wherein the obtained first parameters of the neural network are configured to implement less than the plurality of tasks].”
2. implementing the adapted neural network with respect to input data for the target task.
(Kolouri, col. 5:42-50)
“Although in the illustrated embodiment the artificial neural network 200 includes two hidden layers 203, 205, in one or more embodiments, the artificial neural network 200 may include any other suitable number of hidden layers and each layer may have any suitable number of neurons depending, for instance, on the desired complexity of the task that the artificial neural network is capable of learning and performing during artificial neural network inference [i.e. implementing the adapted neural network with respect to input data for the target task].”
Kolouri does not explicitly teach:
1. adapting the neural network trained for the plurality of tasks to include all of the first parameters except for one or more parameters of the first parameters that are respectively replaced by the one or more second parameters
Desjardins teaches:
1. adapting the neural network trained for the plurality of tasks to include all of the first parameters except for one or more parameters of the first parameters that are respectively replaced by the one or more second parameters
 (Desjardins, ¶0043): “When there are more than two tasks in the sequence of machine learning tasks, e.g., when the model 110 still needs to be trained on a third task, e.g., task C, after being trained on task B [i.e. adapting the neural network trained for the plurality of tasks], after the trained parameter values 116 for task B have been determined, the machine learning system 100 provides the trained parameter values 116 [i.e. to include all of the first parameters] to the engine 112 so that the engine 112 can determine a new set of importance weights corresponding to task B [i.e. that are respectively replaced by the one or more second parameters.]”
The motivation for combining Kolouri and Desjardins is the same as the motivation for Claim 7.

Regarding Claim 30: 
Kolouri in view of Desjardins teaches the method of claim 29.
Kolouri does not explicitly teach:
1. further comprising: generating the importance matrix by calculating importances of respective parameters of the neural network trained for the plurality of tasks.
Desjardins teaches: 
1. further comprising: generating the importance matrix by calculating importances of respective parameters of the neural network trained for the plurality of tasks.
(Desjardins, ¶0013): “One way of determining, for each of the plurality of parameters, a respective measure of an importance of the parameter to the machine learning model achieving acceptable performance on the first machine learning task may include: determining a Fisher Information Matrix (FIM) of the plurality of parameters of the machine learning model with respect to the first machine learning task, in which, for each of the plurality of parameters, the respective measure of the importance of the parameter is a corresponding value on a diagonal of the FIM [i.e. generating the importance matrix by calculating importances of respective parameters of the neural network trained for the plurality of tasks].”
The motivation for combining Kolouri and Desjardins is the same as the motivation for Claim 7.
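For illustration only, the diagonal-of-the-FIM importance measure described in the quoted passage of Desjardins can be sketched as follows. This is a minimal sketch and not an implementation taken from either reference: the logistic-regression model, the use of the empirical Fisher (squared gradients at the observed labels), and the names `sigmoid`, `diagonal_fisher`, `weights`, and `samples` are all assumptions made for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def diagonal_fisher(weights, samples):
    """Empirical diagonal Fisher information for a logistic-regression model.

    Each entry is the mean squared gradient of the per-sample log-likelihood
    with respect to one parameter (hypothetical model, for illustration).
    """
    fisher = [0.0] * len(weights)
    for features, label in samples:
        p = sigmoid(sum(w * x for w, x in zip(weights, features)))
        for i, x in enumerate(features):
            # d/dw_i log p(label | features) = (label - p) * x_i
            grad = (label - p) * x
            fisher[i] += grad * grad
    return [f / len(samples) for f in fisher]


# Hypothetical data: only the first input feature ever varies.
weights = [0.0, 0.0]
samples = [([1.0, 0.0], 1), ([1.0, 0.0], 0), ([2.0, 0.0], 1)]
importances = diagonal_fisher(weights, samples)
```

Under these assumptions the first parameter receives a non-zero importance and the second, whose input is always zero, receives none.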
Regarding Claim 32 and analogous Claim 38:
Kolouri in view of Desjardins teaches the apparatus of claim 27.
Kolouri teaches:
1. wherein the second parameter comprises at least one of a parameter corresponding to a key filter for the target task and an index of the key filter.
(Kolouri, col 7:50-67, eq. 4)
“However, Hebbian learning of importance parameters may suffer from the problem of unbounded growth of the importance parameters. To avoid the problems of Hebbian learning, in one or more embodiments the task 130 of determining the synaptic importance utilizes Oja's learning rule (i.e., Oja's learning algorithm) to calculate the importance, γ.sub.ji.sup.l, of the synapse between the neurons f.sub.j.sup.(l−1) and f.sub.i.sup.l for the first task A [i.e. target task] as follows: where ∈ is the rate of Oja's learning rule, i and j are neurons, l is a layer of the artificial neural network, and P.sub.c is a probability [i.e. corresponding to a key filter for the target task and an index of a key filter]. The task 130 of updating the importance parameters [i.e. wherein the second parameter comprises at least one of a parameter] via Oja's learning rule is performed in an online manner, starting from γ.sub.ji.sup.l=0, during or following the task of updating the artificial neural network 200 via back-propagation during the task 110 of training the artificial neural network 200.”
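For illustration only, the contrast the quoted passage draws between unbounded Hebbian growth and Oja's learning rule can be sketched as follows. Equation 4 of Kolouri is not reproduced in this action, so the sketch assumes the classic form of Oja's rule applied to the two neuron probabilities; the plain Hebbian form, the function names, and the numeric values are likewise assumptions.

```python
def hebbian_update(beta, p_pre, p_post, rate):
    """Plain Hebbian step (assumed form): grows without bound while both neurons stay active."""
    return beta + rate * p_pre * p_post


def oja_update(gamma, p_pre, p_post, rate):
    """Classic Oja step (assumed form): the decay term p_post**2 * gamma bounds the growth."""
    return gamma + rate * (p_pre * p_post - p_post * p_post * gamma)


# Hypothetical values: both neurons fully active on every step.
beta = 0.0
gamma = 0.0  # online update starting from zero, as in the quoted passage
for _ in range(200):
    beta = hebbian_update(beta, 1.0, 1.0, 0.1)
    gamma = oja_update(gamma, 1.0, 1.0, 0.1)
```

With these inputs the Hebbian value keeps increasing with every step, while the Oja value settles near the fixed point `p_pre / p_post`.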
Regarding Claim 33:
 1. A processor implemented neural network method, the method comprising: obtaining first parameters of a trained neural network trained for a first task;
(Kolouri, col 5:56-64, figs. 1-2)
“In the embodiment illustrated in FIG. 1, the method 100 includes a task 110 of training the artificial neural network 200 to perform a first task A [i.e. training a neural network based on first training data for a first task] (e.g., semantic segmentation of an image of a driving scene, such as nighttime image, a daytime image, or a rainy image). The task 110 of training the artificial neural network 200 includes updating the artificial neural network 200 via backpropagation to update the synaptic weights to minimize the loss according to a suitable loss function.”
(Kolouri, col. 6:3-8)
“In one or more embodiments, the task 120 of calculating the neurons 202, 204, 206, 208 of the artificial neural network 200 that are significant for the performance of the first task may utilize the contrastive version of the EBP algorithm (c-EBP) to make the top-down signal more task-specific [i.e. obtaining first parameters].”
(Kolouri, col 11:8-27)
“In the illustrated embodiment, the autonomous system 300 includes a memory device 301 (e.g., non-volatile memory, such as read-only memory (“ROM”), programmable ROM (“PROM”), erasable programmable ROM (“EPROM”), electrically erasable programmable ROM (“EEPROM”), flash memory, etc.), a processor or a processing circuit 302, a controller 303, and at least one sensor 304. The memory device 301, the processor or processing circuit 302, the controller 303, and the at least one sensor 304 may communicate with each other over a system bus 305 [i.e. a processor implemented neural network method]. In one or more embodiments in which the autonomous system 300 is configured to control an autonomous or semi-autonomous vehicle, the sensors 304 may be any suitable type or kind of sensors configured to detect objects or situations in a path of the autonomous vehicle, such as one or more cameras, lidars, and/or radars, and the controller 303 may be connected to any suitable vehicle components for controlling the vehicle, such as brakes, the steering column, and/or the accelerator, based on the objects or situations detected by the one or more sensors 304.”
 2. obtaining one or more key parameters of the neural network;
(Kolouri, col. 7:30-34)
“In one or more embodiments, the task 130 of identifying the importance of the synapses 209, 210, 211 utilizing the Hebbian learning algorithm includes calculating a synaptic importance parameter β.sub.ji.sup.l for each of the synapses 209, 210, 211 [i.e. obtaining one or more key parameters of the neural network].”
Kolouri does not explicitly teach:
1. obtaining an importance matrix with respect to the neural network;
2. updating the importance matrix with respect to the determined one or more key parameters;
3. retraining, using a loss dependent on the updated importance matrix, the neural network with training data to have a plurality of parameters configured to implement a second task.

Desjardins teaches:
 (Desjardins, ¶0039): “In some implementations, the engine 112 can approximate the posterior distribution using an approximation method, for example, using a Fisher Information Matrix (FIM) [i.e. obtaining an importance matrix with respect to the neural network].”
 (Desjardins, ¶0038): “For example, the engine 112 may determine a posterior distribution over possible values of the parameters of the model 110 after the model 110 has been trained on previous training data from previous machine learning task(s) [i.e. updating the importances comprises updating the importance of the second parameter].”
(Desjardins, ¶0058): “The second term is a penalty term that imposes a penalty for parameter values deviating from the first parameter values [i.e. using a loss dependent]. In particular, the penalty term depends on, for each parameter i of the multiple parameters of the machine learning model [i.e. the neural network with training data to have a plurality of parameters], a product of (i) the respective measure of importance F.sub.i of the parameter to the machine learning model achieving an acceptable level of performance on the first machine learning task, and (ii) a difference between the current value of the parameter θ.sub.i and the first value of the parameter θ*.sub.A,i. The second term also depends on x, which sets how important the old task (e.g., the first machine learning task) is compared with the new one (e.g., the second machine learning task) [i.e. configured to implement a second task]. The F.sub.i values may represent neural network weight uncertainties and may be derived from the FIM diagonal values or otherwise [i.e. on the updated importance matrix].”
The motivation for combining Kolouri and Desjardins is the same as the motivation for Claim 7.
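For illustration only, the importance-weighted penalty term described in Desjardins ¶0058 can be sketched as follows. The quoted passage states only that the penalty depends on a product of the importance F.sub.i and the parameter difference; the squared difference and the factor of one half used here follow the commonly published elastic-weight-consolidation form and are assumptions, as are the function name, argument names, and numeric values.

```python
def penalized_loss(task_loss, params, anchor_params, importances, strength):
    """New-task loss plus an importance-weighted penalty for drifting from the old-task values.

    `strength` plays the role of the factor that sets how important the old
    task is relative to the new one (assumed quadratic form).
    """
    penalty = sum(
        f * (p - a) ** 2
        for f, p, a in zip(importances, params, anchor_params)
    )
    return task_loss + 0.5 * strength * penalty


# Hypothetical values: the first parameter has moved away from its old-task
# value; the second has not, so its large importance adds no penalty.
total = penalized_loss(
    task_loss=1.0,
    params=[1.0, 2.0],
    anchor_params=[0.0, 2.0],
    importances=[4.0, 10.0],
    strength=0.5,
)
```

In this sketch a parameter with high importance is penalized only to the extent that it deviates from the value learned for the first task.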
Regarding Claim 34:
Kolouri teaches:
1. acquiring one or more second parameters prestored to correspond to a target task;
(Kolouri, col 6:20-30, figs. 1-2, eq. 1)
“In one or more embodiments, the task 130 of identifying the importance of the synapses 209, 210, 211 utilizing the Hebbian learning algorithm includes calculating a synaptic importance parameter β.sub.ji.sup.l for each of the synapses 209, 210, 211 [i.e. acquiring one or more second parameters that is prestored to correspond to the target task]. According to one or more embodiments of the present disclosure, the synaptic importance parameter β.sub.ji.sup.l for each of the synapses 209, 210, 211 is initialized to zero and, during the training of the artificial neural network 200 to perform the first task during task 110 [i.e. to correspond to a target task].”
2. implementing the adapted neural network with respect to input data for the target task.
(Kolouri, col 4:58-64)
“The present disclosure is directed to various embodiments of artificial neural networks [i.e. implementing the adapted neural network] and methods of training artificial neural networks utilizing selective plasticity such that the artificial neural network can learn new tasks (e.g., road detection during nighttime) without forgetting old tasks (e.g., road detection during daytime). [i.e. with respect to the input data for the target task].”
Kolouri does not explicitly teach:
1. adapting the retrained neural network to include all of the plurality of parameters except for one or more parameters of the plurality of parameters that are respectively replaced by the one or more second parameters
Desjardins teaches: 
(Desjardins, ¶0043): “When there are more than two tasks in the sequence of machine learning tasks, e.g., when the model 110 still needs to be trained on a third task, e.g., task C, after being trained on task B [i.e. adapting the neural network trained for the plurality of tasks], after the trained parameter values 116 for task B have been determined, the machine learning system 100 provides the trained parameter values 116 [i.e. to include all of the first parameters] to the engine 112 so that the engine 112 can determine a new set of importance weights corresponding to task B [i.e. that are respectively replaced by the one or more second parameters.]”
The motivation for combining Kolouri and Desjardins is the same as the motivation for Claim 7.
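For illustration only, the claimed adapting step (keeping all of the network parameters except those respectively replaced by prestored second parameters) can be sketched as follows. This sketch reflects the claim language as read above and is not drawn from Kolouri or Desjardins; representing the parameters as a list of filters, keying the second parameters by filter index, and the names `adapt_filters`, `first_filters`, and `second_filters` are all assumptions.

```python
def adapt_filters(first_filters, second_filters):
    """Return a copy of first_filters with the indexed entries replaced.

    `second_filters` maps a filter index to its prestored replacement
    (hypothetical representation of task-specific parameters).
    """
    adapted = list(first_filters)
    for index, replacement in second_filters.items():
        if not 0 <= index < len(adapted):
            raise IndexError(f"no filter at index {index}")
        adapted[index] = replacement
    return adapted


# Hypothetical values: three filters, one of which is swapped for the target task.
first_filters = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
second_filters = {1: [9.0, 9.0]}
adapted = adapt_filters(first_filters, second_filters)
```

The original parameter list is left unchanged, so the same base network could be adapted again with a different set of prestored parameters.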
Regarding Claim 36:
Kolouri in view of Desjardins teaches the method of claim 35.
Kolouri does not explicitly teach:
1. generating the importance matrix by calculating importances of respective parameters of the neural network trained for the first task.
Desjardins teaches:
1. generating the importance matrix by calculating importances of respective parameters of the neural network trained for the first task.
 (Desjardins, ¶0013): “One way of determining, for each of the plurality of parameters, a respective measure of an importance of the parameter to the machine learning model achieving acceptable performance on the first machine learning task may include: determining a Fisher Information Matrix (FIM) of the plurality of parameters of the machine learning model with respect to the first machine learning task, [i.e. generating the importance matrix by calculating importances of respective parameters of the neural network trained for the first task].”
The motivation for combining Kolouri and Desjardins is the same as the motivation for Claim 7.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL JUSTIN BREENE whose telephone number is (571) 272-6320. The examiner can normally be reached Monday-Friday 9AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J Huntley, can be reached on 303-297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
 Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 
/PAUL J BREENE/
Examiner, Art Unit 2129

/MICHAEL J HUNTLEY/
Supervisory Patent Examiner, Art Unit 2129