DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
The following claims are pending in this Office action: 1-22.
Claims 1-22 are rejected.
Drawings
The drawings received on 02/11/2019 are accepted.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 02/11/2019 have been accepted. The submissions are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner. Initialed and dated copies of Applicant’s IDS forms 1449 are attached to the Office Action.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 14-22 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by McMahan et al. (GB2556981A; hereinafter “McMahan”).

Regarding claim 14, McMahan teaches a device comprising: an artificial intelligence (AI) chip (Para 0065: “The one or more processors 212 can include any suitable processing device, such as a microprocessor, microcontroller, integrated circuit, logic device, or other suitable processing device.” Para 0069: “Any number of client devices 230 can be connected to the server 210 over the network 242… Similar to the server 210, a client device 230 can include one or more processor(s) 232 and a memory 234.” Para 0055: “Server 104 can be configured to access machine learning model 106, and to provide model 106 to a plurality of client devices 102. Model 106 can be, for instance, a linear regression model, logistic regression model, a support vector machine model, a neural network.” Server provides ML models to the clients. Each processor has ICs performing AI tasks which makes them AI chips per the definition of AI chip provided in spec para 0014: “The term "AI chip" refers to a hardware- or software-based device that is capable of performing functions of an AI logic circuit. An AI chip can be a physical IC.”) 
and a processing device containing programming instructions that, when executed, will cause the processing device to: (Para 0065: “The one or more memory devices 214 can store information accessible by the one or more processors 212, including computer readable instructions 216 that can be executed by the one or more processors 212”) (i) access a dataset (Fig. 3 shows that data can be accessed by a client device and a server.)

(ii) receive an initial artificial intelligence (AI) model from a host device (Fig. 3 shows model is received from server to the client in the first iteration.) 
(iii) update the initial AI model by updating a subset of parameters of the initial AI model (Fig. 4 shows that the initial ML model is updated based on updated values of parameters.)
(iv) load the initial AI model into the AI chip (Fig. 3 shows that the model is sent to the client device so that it can be loaded.) to determine a first performance value of the initial AI model based on the dataset (Fig. 6 shows the accuracy of models at various iterations. Para 0115: “Note the x-axis is on a log scale for top-right plot. Over 70% accuracy was achieved with fewer than 100MB of communication.”)
(v) determine a first probability that a current AI model should be replaced by the initial AI model (Para 0046: “For example, the weights can be probabilistically quantized.” Para 0047 shows the probability of a compressed update. Setting J = 1 gives the probability corresponding to the first round or iteration, i.e., the first probability, which also indicates whether the desired update has been achieved so that the model will not be updated further.) (In Fig. 6, model performance in terms of accuracy is plotted for each iteration. Before starting a new iteration, a performance value of the current AI model can be compared with that of the initial AI model.)
(vi) determine, based on the first probability, whether to replace the current AI model with the initial AI model (In Fig. 6, model performance in terms of accuracy is plotted for each iteration. Before starting a new iteration, a performance value of the current AI model can be compared with that of the initial AI model. The probabilistic update method is also discussed in Para 0047 and in the limitation above.)
(vii) if it is determined that the current AI model be replaced with the initial AI model, replace the current AI model with the initial AI model (Para 0071: “For example, the local updater can perform one or more training techniques such as, for example, backwards propagation of errors to re-train or otherwise update the model based on the locally stored training data.” Based on the performance comparison in the previous limitation, a model can be updated.)
(viii) transmit the current AI model and the first performance value of the initial AI model to the host device (Fig. 3 shows that the current ML model is transmitted to the host; the model will have its associated accuracy as calculated in the previous limitation.).

Regarding claim 15, McMahan teaches the device of claim 14.
McMahan also teaches further comprising additional programming instructions configured to cause the processing device to repeat steps (iii-vii) for a number of iterations (Para 0022: “Performing a plurality of rounds of the above actions iteratively improves the global model based on training data stored at the client devices.” Para 0090: “Any number of iterations of local and global updates can be performed.”).

Regarding claim 16, McMahan teaches the device of claim 14.
McMahan also teaches wherein programming instructions for loading the initial AI model into the AI chip comprise programming instructions to load the subset of parameters of the initial AI model into the AI chip (Para 0065: “The one or more processors 212 can include any suitable processing device, such as a microprocessor, microcontroller, integrated circuit, logic device, or other suitable processing device.” Para 0071: “The instructions 236 can include instructions for implementing a local updater configured to determine one or more local updates according to example aspects of the present disclosure.” Fig. 2 shows multiple client devices that contain processors and instructions for local updater which will include the instructions to load AI models to AI chip at local client device.).

Regarding claim 17, McMahan teaches the device of claim 14.
McMahan also teaches wherein the subset of parameters of the initial AI model include: weights of a convolution layer of a CNN model or a group of parameters of the CNN model selected from one of: kernels, scalars, and bias values of one or more convolution layers of the CNN model (Para 0031: “For example, a neural network with five layers can have five model parameter matrices W that respectively represent the parameters of such layers, and five update matrices Hi that respectively represent updates to the parameters of such layers. In many instances, W and Hi are 2D matrices… However, the kernel of a convolutional layer is a 4D tensor of the shape #input x width x height x #output. In such case, W can be reshaped from the kernel to be of shape (#input x width x height) x #output.” Kernels are included in the subset of parameters, as shown. Para 0052: “Randomly rotating h before the quantization can more evenly distribute the scalar values across the intervals. In the decoding phase, the server can perform the inverse rotation before aggregating all the updates.” Scalar values are also part of the parameter subset when the model is updated. Para 0031: “In implementations in which deep learning is performed, a separate matrix W can be used to represent the parameters of each layer. Thus, each reference herein to a matrix (e.g., a model parameter matrix Wf or an update matrix Hi) should be understood as not being limited to a single matrix but instead refer to one or more matrices that may be used to represent a given model or update to such model.” As noted, a CNN, which is a type of deep neural network, is used in McMahan. The parameters of each layer are updated in the model update step. Bias is one of the parameters of each layer, which implies that bias values are included in this process.).
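The reshaping quoted from Para 0031 can be illustrated with a short sketch. It is illustrative only and is not taken from McMahan or from the application; the function name and the example dimensions are hypothetical.

```python
from typing import Tuple


def reshaped_kernel_shape(kernel_shape: Tuple[int, int, int, int]) -> Tuple[int, int]:
    """Return the 2D shape obtained by flattening a 4D convolution kernel.

    The kernel is assumed to be laid out as
    (#input, width, height, #output), as in the quoted passage.
    The result is ((#input * width * height), #output).
    """
    n_input, width, height, n_output = kernel_shape
    return (n_input * width * height, n_output)


# Hypothetical example: 3 input channels, a 5 x 5 window, 64 output channels.
print(reshaped_kernel_shape((3, 5, 5, 64)))
```

Under this layout, each column of the reshaped matrix holds the weights feeding one output channel.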

Regarding claim 18, McMahan teaches the device of claim 14.
McMahan also teaches wherein programming instructions for updating the initial AI model comprise programming instructions configured to: determine a second probability of updating the subset of parameters of the initial AI model (Para 0046: “For example, the weights can be probabilistically quantized.” Para 0047 shows the probability of a compressed update. Setting J = 1 gives the probability corresponding to the first round or iteration, i.e., the first probability, which also indicates whether the desired update has been achieved so that the model will not be updated further.)
(Fig. 6 shows model accuracy for each iteration or communication round. Based on that parameters will be updated. If accuracy is low, amplitude of change of parameters can be high. The amplitude will be low once convergence is being achieved as shown in Fig. 6.)
determine, based on the second probability, whether to update the subset of parameters of the initial AI model (Para 0047: After each iteration, the probability of a compression update can be found for the next round, which indicates whether the parameter subset of the initial or previous ML model needs to be updated.)
if it is determined that the subset of parameters of the initial AI model be updated, update the subset of parameters of the initial AI model by changing the subset of parameters of the initial AI model by the amplitude of change; otherwise, do not update the subset of parameters of the initial AI model (Para 0047: Based on the probability, if the parameter subset needs to be updated, it can be updated as described in Para 0004: “The method includes training, by the client computing device, the machine learned model based at least in part on a local dataset to obtain an update matrix that is descriptive of updated values for the set of parameters of the machine-learned model.”).

Regarding claim 19, McMahan teaches the device of claim 14.
McMahan also teaches wherein programming instructions for determining the first probability comprise programming instructions configured to determine the first probability based on a closeness of the first performance value of the initial AI model relative to the second performance value of the current AI model (Para 0005: “The client device includes at least one processor and at least one non-transitory computer-readable medium…The operations include training the machine-learned model based at least in part on a local dataset to obtain an update matrix that is descriptive of updated values for the set of parameters of the machine-learned model.” The probability of a compression update is discussed above with respect to claims 14 and 18 and in Para 0047, where the probability of updates can be found with respect to the performances of the models.).

Regarding claim 20, McMahan teaches the device of claim 14.
McMahan also teaches wherein programming instructions for determining whether to replace the current AI model with the initial AI model comprise programming instructions configured to (Para 0006: “…present disclosure is directed to at least one nontransitory computer-readable medium that stores instructions that, when executed by a client computing device, cause the client computing device to perform operations.”) if the first probability has a value of one, determine that the current AI model be replaced by the initial AI model (Para 0047: Using the compression update probability method, if probability is calculated as 1 (means 100%), a determination can be made to update the model.)
if the first probability has a value of less than one: generate a random value (Para 0096: “In some of such implementations, the method 400 can further include, prior to training the model at 404: generating the first matrix based at least in part on a seed and a pseudo-random number generator.” A random number generator is used to generate a random number.)
(Para 0107: “In some of such implementations, generating the parameter mask can include generating the parameter mask based at least in part on a seed and a pseudo-random number generator” The generated random number can be compared with the probability calculated in the previous limitation, and the comparison can be used to decide whether the parameters of the initial model should be updated (in other words, whether the initial AI model is to replace the current AI model)).
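The decision logic discussed for claim 20 can be sketched as follows. This is a minimal illustration, not the applicant's or McMahan's implementation. The name should_replace is hypothetical, and the rule applied when the probability is below one (replace when the random value is smaller than the probability) is an assumption, since the text above states only that the random value "can be compared with" the probability.

```python
import random


def should_replace(probability: float, rng: random.Random) -> bool:
    """Decide whether the current model is replaced by the initial model.

    A probability of one always replaces. Otherwise a random value in
    [0, 1) is drawn and compared with the probability (the direction of
    the comparison is an assumption made for this sketch).
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability >= 1.0:
        return True
    random_value = rng.random()  # pseudo-random value in [0, 1)
    return random_value < probability


rng = random.Random(0)  # seeded generator, so runs are repeatable
print(should_replace(1.0, rng))  # always True
print(should_replace(0.0, rng))  # always False
```

Seeding the generator mirrors the seed and pseudo-random number generator mentioned in the quoted paragraphs, and makes the decision reproducible.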

Regarding claim 21, McMahan teaches the device of claim 14.
McMahan also teaches wherein the host device is configured to: receive the current AI model and the first performance value of the initial AI model from the processing device (Fig. 3 shows current ML model is received by the host, the model will have its associated accuracy (or performance) as calculated.)
receive trained AI models from additional processing devices (Fig. 2 shows multiple client devices and Fig. 3 shows trained ML model is received from each client device by the server.)
obtain a global AI model based on the current AI model received from the processing device and the trained AI models from the additional processing devices (Fig. 2 shows multiple client devices and Fig. 3 shows trained ML model is received from each client device by the server. Fig. 3 also shows global model is received by client devices from server.)
cause the global AI model to be loaded into a physical AI chip (Para 0066: “The instructions 216 can be any set of instructions that when executed by the one or more processors 212, cause the one or more processors 212 to perform operations. For instance, the instructions 216 can be executed by the one or more processors 212 to implement a global updater 220.” Para 0065: “The one or more processors 212 can include any suitable processing device, such as a microprocessor, microcontroller, integrated circuit, logic device, or other suitable processing device.”).

Regarding claim 22, McMahan teaches the device of claim 21.
McMahan also teaches wherein the physical AI chip is coupled to a sensor (Para 0022: “In some implementations, systems implementing federated learning can perform the following actions… and the server redistributes the global model to all the clients. Performing a plurality of rounds of the above actions iteratively improves the global model based on training data stored at the client devices.” Also, Fig. 3 shows that, after completion of an iteration, the model is loaded onto or sent to the local computing device containing ICs or an AI chip.)
and configured to: receive data captured from the sensor (Para 0064: “The server 210 can also include a network interface used to communicate with one or more client devices 230 over the network 242. The network interface can include any suitable components for interfacing with one more networks, including for example, transmitters, receivers, ports, controllers, antennas, or other suitable components.” “The client device 230 of Figure 2 can include various input/output devices for providing and receiving information from a user, such as a touch screen, touch pad, data entry keys, speakers, and/or a microphone suitable for voice recognition.” AI chip can receive data using network interface or via I/O devices similar to the arrangement provided in the spec in para 0071: “The hardware may also include a user interface sensor 745 that allows for receipt of data from input devices 750 such as a keyboard, a mouse, a joystick, a touchscreen, a remote control, a pointing device, a video input device, and/or an audio input device, such as a microphone.”)
perform an AI task based on the captured data and the global AI model in the physical AI chip (Fig. 3 shows the process of performing AI task and generating global AI model on a computing device.).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-13 are rejected under 35 U.S.C. 103 as being unpatentable over McMahan et al. (GB2556981A; hereinafter “McMahan”) in view of Shao et al. (A performance analysis of convolutional neural network models in sar target recognition; hereinafter “Shao”).

Regarding claim 1, McMahan teaches a system comprising: a plurality of artificial intelligence (AI) chips (Para 0065: “The one or more processors 212 can include any suitable processing device, such as a microprocessor, microcontroller, integrated circuit, logic device, or other suitable processing device.” Para 0069: “Any number of client devices 230 can be connected to the server 210 over the network 242… Similar to the server 210, a client device 230 can include one or more processor(s) 232 and a memory 234.” Para 0055: “Server 104 can be configured to access machine learning model 106, and to provide model 106 to a plurality of client devices 102. Model 106 can be, for instance, a linear regression model, logistic regression model, a support vector machine model, a neural network.” Server provides ML models to the clients. Each processor has ICs performing AI tasks which makes them AI chips per the definition of AI chip provided in spec para 0014: “The term "AI chip" refers to a hardware- or software-based device that is capable of performing functions of an AI logic circuit. An AI chip can be a physical IC.”)
and a processing device communicatively coupled to the plurality of AI chips and configured to: (i) transmit a respective initial AI model to each of the plurality of AI chips (Para 0055: “Server 104 can be configured to access machine learning model 106, and to provide model 106 to a plurality of client devices 102. Model 106 can be, for instance, a linear regression model, logistic regression model, a support vector machine model, a neural network (e.g. convolutional neural network, recurrent neural network, etc.), or other suitable model.”)
(ii) receive a respective AI model … from each of the plurality of AI chips (Fig. 3 shows that the AI model is sent from the client's processor to the host or server.) wherein the respective AI model is updated based on the respective initial AI model by one of a plurality of subsets of parameters of the respective initial AI model (Para 0004: “The method includes training, by the client computing device, the machine learned model based at least in part on a local dataset to obtain an update matrix that is descriptive of updated values for the set of parameters of the machine-learned model.” Para 0043: “One way of encoding the updates is by sampling only a random subset of the parameters described by an update.” Subsets of parameters are used to update each individual AI model.)
… from the plurality of AI chips (Para 0022: “the updated models or model updates are sent by each client to the server.” AI models from clients (which contains processors with ICs or AI chips as discussed in previous limitations) are received by the server.)
and (iv) determine a global AI model based on the optimal AI model (Para 0081: “At (308), method (300) can include determining, by the server, a global model based at least in part on the received local model.” Fig. 3 contains determining a global model based on the local optimal model provided to the server.)
McMahan does not explicitly teach and an associated performance value of the respective AI model; (iii) determine an optimal AI model that has a best performance value among the performance values associated with the respective AI models
Shao, however, teaches and an associated performance value of the respective AI model (Page 4: Table 1 shows model accuracy for multiple CNN models.)
(iii) determine an optimal AI model that has a best performance value among the performance values associated with the respective AI models (“Table 1 shows the final accuracy on the test set of all models, except for LeNet, each model has reached more than 99%. The recognition accuracy of SE-ResNet18 have reached 100%.” Model with the best accuracy can be called optimal model and can be transmitted to server.)
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to combine the global model determination method of McMahan with (McMahan, Para 0021).
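The selection described in limitation (iii) amounts to taking the maximum over the reported performance values. The sketch below is illustrative only; the chip identifiers and accuracy figures are hypothetical and are not taken from Shao's Table 1 or from McMahan.

```python
from typing import Dict


def select_optimal_model(performance_by_chip: Dict[str, float]) -> str:
    """Return the identifier of the model with the best performance value.

    'Best' is assumed here to mean the highest accuracy.
    """
    if not performance_by_chip:
        raise ValueError("at least one performance value is required")
    return max(performance_by_chip, key=performance_by_chip.get)


# Hypothetical accuracies reported by three AI chips.
reports = {"chip_a": 0.91, "chip_b": 0.97, "chip_c": 0.88}
print(select_optimal_model(reports))
```

For a metric where lower is better, such as a loss value, min would be used in place of max.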

Regarding claim 2, McMahan and Shao teach the system of claim 1.
McMahan also teaches wherein the processing device is further configured to repeat steps (i)-(iv) for multiple iterations (Para 0090: “Any number of iterations of local and global updates can be performed. That is, method (300) can be performed iteratively to update the global model based on locally stored training data over time.”)
wherein: a number of subsets in the plurality of subsets of parameters equals a number of iterations in the multiple iterations (Para 0085: “In some implementations, the local update can be determined based at least in part using one or more stochastic updates or iterations. For instance, the client device may randomly sample a partition of data examples stored on the client device to determine the local update.” Para 0043: “One way of encoding the updates is by sampling only a random subset of the parameters described by an update.” For each update or iteration a random set of subsets are selected. In other words, each time when iteration is performed a subset is selected for update. This way, total number of iterations equals total number of sampling of subsets performed.)
in each of the multiple iterations: the respective AI model is updated by a respective subset of parameters based on the respective initial AI model (Para 0085: “In some implementations, the local update can be determined based at least in part using one or more stochastic updates or iterations.” Para 0043: “One way of encoding the updates is by sampling only a random subset of the parameters described by an update.” Para 0090: “Any number of iterations of local and global updates can be performed. That is, method (300) can be performed iteratively to update the global model based on locally stored training data over time.” A subset of parameters are used to update AI model at each iteration).
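The per-iteration sampling relied on above (Para 0043) can be sketched as drawing one random subset of parameter indices per iteration. This is a sketch under stated assumptions rather than McMahan's implementation: the function name, the sampling fraction, and the use of one seed per iteration are hypothetical.

```python
import random
from typing import List


def sample_parameter_subset(num_params: int, fraction: float, seed: int) -> List[int]:
    """Return a sorted random subset of parameter indices.

    A seeded pseudo-random generator is used so that the same seed
    always reproduces the same subset.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must lie in (0, 1]")
    rng = random.Random(seed)
    subset_size = max(1, round(num_params * fraction))
    return sorted(rng.sample(range(num_params), subset_size))


# One subset per iteration, so the number of subsets equals the number of iterations.
num_iterations = 4
subsets = [sample_parameter_subset(10, 0.3, seed=i) for i in range(num_iterations)]
print(len(subsets) == num_iterations)
```

Because one subset is drawn in each iteration, the count of subsets matches the count of iterations by construction, which is the correspondence the rejection relies on.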

Regarding claim 3, McMahan and Shao teach the system of claim 2.
McMahan also teaches wherein a subset of the plurality of subsets of parameters of the respective initial AI model include weights of a respective convolution layer of a CNN model (Para 0046: “Another way of encoding the updates is by quantizing the weights. For example, the weights can be probabilistically quantized.” Para 0113: “Table I provides low rank and sampling parameters for the example CIFAR experiments. The Sampling Probabilities column gives the fraction of elements uploaded for the two convolutional layers and the two fully-connected layers, respectively.” Local AI models are CNN models, in which subset of parameters includes weights of the layer. The subset of parameters are used to update the model as we learnt in claim 2.).

Regarding claim 4, McMahan and Shao teach the system of claim 2.
McMahan also teaches wherein a subset of the plurality of subsets of parameters of the respective initial AI model include a respective group of parameters of a CNN model selected from one of: kernels, scalars, and bias values of one or more convolution layers of the CNN model (Para 0031: “For example, a neural network with five layers can have five model parameter matrices W that respectively represent the parameters of such layers, and five update matrices Hi that respectively represent updates to the parameters of such layers. In many instances, W and Hi are 2D matrices… However, the kernel of a convolutional layer is a 4D tensor of the shape #input x width x height x #output. In such case, W can be reshaped from the kernel to be of shape (#input x width x height) x #output.” Kernels are included in the subset of parameters, as shown. Para 0052: “Randomly rotating h before the quantization can more evenly distribute the scalar values across the intervals. In the decoding phase, the server can perform the inverse rotation before aggregating all the updates.” Scalar values are also part of the parameter subset when the model is updated. Para 0031: “In implementations in which deep learning is performed, a separate matrix W can be used to represent the parameters of each layer. Thus, each reference herein to a matrix (e.g., a model parameter matrix Wf or an update matrix Hi) should be understood as not being limited to a single matrix but instead refer to one or more matrices that may be used to represent a given model or update to such model.” As noted, a CNN, which is a type of deep neural network, is used in McMahan. The parameters of each layer are updated in the model update step. Bias is one of the parameters of each layer, which implies that bias values are included in this process.).

Regarding claim 5, McMahan and Shao teach the system of claim 2.
McMahan also teaches wherein the processing device is further configured to, at each of the multiple iterations, generate the respective initial AI model for at least one of the plurality of AI chips based on a respective previous initial AI model for that AI chip that is generated at a preceding iteration (Para 0022: “In some implementations, systems implementing federated learning can perform the following actions in each of a plurality of rounds of model optimization: a subset of clients are selected; each client in the subset updates the model based on their local data; the updated models or model updates are sent by each client to the server; the server aggregates the updates (e.g, by averaging the updates) and improves the global model; and the server redistributes the global model to all the clients. Performing a plurality of rounds of the above actions iteratively improves the global model based on training data stored at the client devices.” Model is distributed to all clients at the end of iteration. This model will be used by clients, and the model update process will be again repeated on the model received from previous iteration.) 
and velocity of AI model for that AI chip (Para 0085: “In particular, the local update may be determined using stochastic model descent techniques to determine a direction in which to adjust one or more parameters of the loss function.” Para 0079: “At (302), method (300) can include determining, by a client device, a local model… In some implementations, structured update, sketched update, or other techniques can be used at (302) to render the learned local model or local update communication efficient.” The velocity is defined in Spec Para 0035 as “The new velocity may also be determined based on the closeness of the current initial AI model for the client device relative to the local optimal AI model for that client device.” In McMahan the model is updated in the direction of the loss function to ensure convergence is achieved, which in other words updated in the direction of optimality.).
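The round structure quoted from Para 0022 (clients compute updates, the server averages them and improves the global model, and the result is redistributed for the next round) can be sketched as below. This is a simplified illustration, not McMahan's implementation: models are plain lists of floats, client selection is omitted, and the two client functions are hypothetical stand-ins for local training.

```python
from typing import Callable, List

Model = List[float]


def average_updates(updates: List[Model]) -> Model:
    """Element-wise mean of the updates sent by the clients."""
    count = len(updates)
    return [sum(values) / count for values in zip(*updates)]


def run_round(global_model: Model, clients: List[Callable[[Model], Model]]) -> Model:
    """One round: each client computes an update, the server averages and applies it."""
    updates = [client(global_model) for client in clients]
    mean_update = average_updates(updates)
    return [w + u for w, u in zip(global_model, mean_update)]


# Hypothetical clients that each return a fixed update vector.
clients = [lambda model: [1.0, 2.0], lambda model: [3.0, 4.0]]

model = [0.0, 0.0]
for _ in range(2):  # the model produced by one round is the starting model of the next
    model = run_round(model, clients)
print(model)
```

The loop shows the dependency at issue in claim 5: the model each client starts from in a given round is the one generated at the preceding round.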

Regarding claim 6, McMahan and Shao teach the system of claim 5.
McMahan also teaches wherein the velocity of AI model for the AI chip is based on at least one of (1) a closeness of the respective previous initial AI model relative to the optimal AI model (Para 0085: “In particular, the local update may be determined using stochastic model descent techniques to determine a direction in which to adjust one or more parameters of the loss function.” Para 0079: “At (302), method (300) can include determining, by a client device, a local model… In some implementations, structured update, sketched update, or other techniques can be used at (302) to render the learned local model or local update communication efficient.” The parameters update will be done in the direction of model optimality by minimizing the loss function. The higher the loss function the higher will be the velocity.)
and (2) a closeness of the respective previous initial AI model relative to the global AI model (As seen in Fig. 3, the global model is received by the client from the server, which defines the updates to be performed on the local machine learning model at each client. The goal is to train or tune the local AI model relative to the global model.).

Regarding claim 7, McMahan and Shao teach the system of claim 2.
McMahan also teaches wherein the processing device is further configured to, upon a completion of the multiple iterations, cause the global AI model to be loaded into a physical AI chip coupled to a sensor (Para 0022: “In some implementations, systems implementing federated learning can perform the following actions… and the server redistributes the global model to all the clients. Performing a plurality of rounds of the above actions iteratively improves the global model based on training data stored at the client devices.” Also, Fig. 3 shows that, after completion of an iteration, the model is loaded onto or sent to the local computing device containing ICs or an AI chip.)
(Para 0064: “The server 210 can also include a network interface used to communicate with one or more client devices 230 over the network 242. The network interface can include any suitable components for interfacing with one more networks, including for example, transmitters, receivers, ports, controllers, antennas, or other suitable components.” “The client device 230 of Figure 2 can include various input/output devices for providing and receiving information from a user, such as a touch screen, touch pad, data entry keys, speakers, and/or a microphone suitable for voice recognition.” AI chip can receive data using network interface or via I/O devices similar to the arrangement provided in the spec in para 0071: “The hardware may also include a user interface sensor 745 that allows for receipt of data from input devices 750 such as a keyboard, a mouse, a joystick, a touchscreen, a remote control, a pointing device, a video input device, and/or an audio input device, such as a microphone.”)
and perform an AI task based on the captured data and the global AI model in the physical AI chip (Fig. 3 shows the process of performing AI task and generating global AI model on a computing device).

Regarding claims 8 and 9, they are substantially similar to claims 1 and 2, and are rejected in the same manner, with the same art and reasoning applying.

Regarding claim 10, McMahan and Shao teach the method of claim 9.
(Para 0031: “For example, a neural network with five layers can have five model parameter matrices W that respectively represent the parameters of such layers, and five update matrices Hi that respectively represent updates to the parameters of such layers. In many instances, W and Hi are 2D matrices… However, the kernel of a convolutional layer is a 4D tensor of the shape #input x width x height x #output. In such case, W can be reshaped from the kernel to be of shape (#input x width x height) x #output.”)
or a respective group of parameters of a CNN model selected from one of: kernels, scalars, and bias values of one or more convolution layers of the CNN model (Para 0031: “For example, a neural network with five layers can have five model parameter matrices W that respectively represent the parameters of such layers, and five update matrices Hi that respectively represent updates to the parameters of such layers. In many instances, W and Hi are 2D matrices… However, the kernel of a convolutional layer is a 4D tensor of the shape #input x width x height x #output. In such case, W can be reshaped from the kernel to be of shape (#input x width x height) x #output.” Kernels are included in the subset of parameters, as shown. Para 0052: “Randomly rotating h before the quantization can more evenly distribute the scalar values across the intervals. In the decoding phase, the server can perform the inverse rotation before aggregating all the updates.” Scalar values are also part of the parameter subset when the model is updated. Para 0031: “In implementations in which deep learning is performed, a separate matrix W can be used to represent the parameters of each layer. Thus, each reference herein to a matrix (e.g., a model parameter matrix Wf or an update matrix Hi) should be understood as not being limited to a single matrix but instead refer to one or more matrices that may be used to represent a given model or update to such model.” As noted, a CNN, which is a type of deep neural network, is used in McMahan. The parameters of each layer are updated in the model update step. Bias is one of the parameters of each layer, which implies that bias values are included in this process.).

Regarding claims 11, 12, and 13, they are substantially similar to claims 5, 6, and 7, and are rejected in the same manner, with the same art and reasoning applying.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to QAMAR IQBAL, whose telephone number is 571-272-2563. The examiner can normally be reached on M-F 10-6pm (EST).
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov, can be reached on 571-270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 

/Q.I/ 
Examiner 
Art unit 2123
03/31/2021

/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123