DETAILED ACTION
This action is in response to the claims filed 01/27/2022 for application 16/147,939. Claims 1-4, 8, 9, and 13-15 have been amended and claim 5 is canceled. Thus, claims 1-4 and 6-19 are currently pending. This action is made NON-FINAL.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Specification
The disclosure is objected to because of the following informalities: The specification recites, in multiple instances, "local leaning" which should read "local learning" to correspond to the claim amendments.  
Appropriate correction is required.
Claim Objections
Claim 1 is objected to because of the following informalities:  The claim recites "1." twice in the first line of the claim.  Appropriate correction is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-3 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over Visser et al. ("US 20170270406 A1", hereinafter "Visser") in view of Ma et al. ("US 20180075344 A1", hereinafter "Ma").

Regarding claim 1, Visser teaches A local learning system in a local artificial intelligence (AI) device, comprising: 
at least one data source (“Local devices may also be configured to track sources (e.g., one or more speakers or other source of a sound captured by a microphone).” [¶0038]); 
a data collector connected to the at least one data source (“Modern digital devices acquire a variety of sensor data and are able to communicate with a remote computing device, such as a cloud-based computing system or processor (which may be referred to as the “cloud”), for data analytics.” [¶0036]), and used to collect input data (“In circumstances where it is desirable to send a stream of data from devices with multiple inputs (e.g., cameras, microphones, or video feeds) to a server for processing, the sensor allocation and space-time information may be relevant” [¶0037]); 
a training data generator connected to the data collector, and used to analyze the input data to produce paired examples for supervised learning (“Deep Convolutional Network (DCN) may be trained with supervised learning. During training, a DCN may be presented with an image, such as a cropped image of a speed limit sign 326, and a “forward pass” may then be computed to produce an output 322. The output 322 may be a vector of values corresponding to features such as “sign,” “60,” and “100.” The network designer may want the DCN to output a high score for some of the neurons in the output feature vector, for example the ones corresponding to “sign” and “60” as shown in the output 322 for a network 300 that has been trained.” [¶0052]), or unlabeled data for unsupervised learning (note: under BRI, the claim recites “or” thus this limitation is not required, however Visser does disclose unsupervised learning in [¶0056]); and 
a local learning engine connected to the training data generator, and including a local neural network, wherein the local neural network is trained by the paired examples (“In an aspect of the present disclosure, a method of training a device specific cloud-based audio processor is presented. The method includes receiving sensor data captured from multiple sensors at a local device and receiving spatial information labels computed on the local device using local configuration information. The spatial information labels are associated with the captured sensor data. The method also includes training lower layers of a first neural network based on the spatial information labels and sensor data. Additionally, the method includes incorporating the trained lower layers into a second, larger neural network for audio classification. The method further includes retraining the second neural network using the trained lower layers of the first neural network.” [¶0010]) or the unlabeled data in a training phase (note: under BRI, the claim recites “or” thus this limitation is not required, however Visser does disclose unsupervised learning in [¶0056] which implies unlabeled data.), and 
makes inference in an inference phase (“After learning, the DCN may be presented with new images 326 and a forward pass through the network may yield an output 322 that may be considered an inference or a prediction of the DCN.” [¶0055]); 
However Visser fails to explicitly teach
wherein the local learning engine is designed in a way that the inference phase is not interrupted during the training phase.
Ma teaches wherein the local learning engine is designed in a way that the inference phase is not interrupted during the training phase (“Notice that, there are distinguishes between online learning and on-chip learning. The Online Learning refers to the NN can continuously perform inferences, classifications or other NN tasks while simultaneously perform learning and updating the synaptic weights (predictors) without stopping the NN functions or switching to another mode, such as backprop mode.” [¶0041; note: Without stopping would be equivalent to “not interrupted”. See further ¶0046-¶0048 discloses online learning implementation.])
Visser and Ma are both in the same field of endeavor of machine learning. Visser discloses cloud-based processing using local devices with sensory inputs. Ma discloses neural network hardware accelerator architectures. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Visser’s teachings by implementing Ma’s online learning method. One would have been motivated to make this modification in order to improve the performance and efficiency of the neural network. [¶0007, Ma]

Regarding claim 2, Visser/Ma teaches The local learning system in the local AI device as claimed in claim 1, where Visser teaches wherein the local learning system is trained in the local AI device, wherein the local AI device comprises a smartphone, tablet, smart-TV, telephone, computer, home entertainment or wearable device (“Many smart phones, tablets, and other portable multimedia devices have multiple sensors (e.g., multiple microphones, multiple cameras, etc.) As such, a local device may, for example, encode sound in different formats (e.g., 5.1 format, 7.1 format, and stereo) because the placement of sensors on the local device is known.” [¶0038]).

Regarding claim 3, Visser/Ma teaches The local learning system in the local AI device as claimed in claim 1, where Visser teaches wherein the local learning engine allows inputting a single training data point in sequence or data points in parallel (“A recurrent architecture may be helpful in recognizing patterns that span more than one of the input data chunks that are delivered to the neural network in a sequence.” [¶0049; note: under BRI, the claim recites “or”, thus the examiner has provided a citation corresponding to inputting a single training data point in sequence.]).

Regarding claim 6, Visser/Ma teaches The local learning system in the local AI device as claimed in claim 1, where Visser teaches wherein the local AI device is a smartphone (“The local device 602 may comprise a multimedia device such as a mobile phone (e.g., smartphone), a camera, audio device or the like.” [¶0068]), the at least one data source includes a primary microphone and a secondary microphone (“The DSP may, in some aspects, be coupled to or included within one or more sensors. The sensors, which may for instance comprise audio sensors (e.g., microphones), visual sensors (e.g., cameras) and/or other types of sensors, may detect environmental conditions.” [¶0068; one or more microphones implies a first and second microphone.]), and the training data generator produces data pairs from at least one of the primary microphone or the secondary microphone (“The local device 602 may gather sensor information, which may include the raw sensor data from each of the sensors and related information (e.g., timestamp and location), and produce a package of information. The package may include, for example, raw sensor data, labels, user device identification and other information. The labels may be based on information available only to the local device 602, such as microphone location, speed, and device location. In some aspects, the labels may be based on device geometry, separated beamformed streams, device identification and/or the like.” [¶0069]).

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Visser in view of Ma and further in view of Williamson ("Gaussian ARTMAP: A Neural Network for Fast Incremental Learning of Noisy Multidimensional Maps", hereinafter "Williamson").

Regarding claim 4, Visser/Ma teaches The local learning system in the local AI device as claimed in claim 1, however fails to explicitly teach wherein the local learning engine employs an incremental learning mechanism.
Williamson teaches wherein the local learning engine employs an incremental learning mechanism (“Gaussian ARTMAP is essentially an incremental learning Gaussian classifier in which each output class is determined during training to correspond to any number of sources of Gaussianly distributed data.” [pg. 884, Gaussian ARTMAP, ¶5]).
Visser, Ma, and Williamson are in the same field of endeavor of machine learning. Visser discloses cloud-based processing using local devices with sensory inputs. Ma discloses neural network hardware accelerator architectures. Williamson discloses an incremental learning method of noisy multidimensional maps. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Visser’s/Ma’s teachings by implementing the incremental learning method taught by Williamson. One would have been motivated to make this modification in order to improve the future prediction of an input data sample without any prior knowledge. [pg. 881, §1. Introduction, ¶1, Williamson]

Claims 7 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Visser in view of Ma and further in view of López-Espejo et al. ("Deep Neural Network-Based Noise Estimation for Robust ASR in Dual-Microphone Smartphones", hereinafter "López-Espejo").

Regarding claim 7, Visser/Ma teaches The local learning system in the local AI device as claimed in claim 6, however fails to explicitly teach wherein the data pairs imply a clean sound and a noisy sound.
López-Espejo teaches wherein the data pairs imply a clean sound and a noisy sound (“In AURORA2-2C, the multi-style training dataset is created from the training clean utterances of Aurora-2 and consists of dual-channel utterances contaminated with the types of noise in test set A at the signal-to-noise ratios (SNRs) of 5 dB, 10 dB, 15 dB and 20 dB, along with the clean condition. Noisy utterances are compensated with VTS using the corresponding noise estimation algorithm prior to training the multi-style acoustic models.” [pg. 122, § 3.1 Recognition Framework, ¶4]).
Visser, Ma, and López-Espejo are all in the same field of endeavor of machine learning. Visser discloses cloud-based processing using local devices with sensory inputs. Ma discloses neural network hardware accelerator architectures. López-Espejo discloses deep neural network-based noise estimation in dual-microphone smartphones. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Visser’s/Ma’s teachings with the noise estimation method as taught by López-Espejo. One would have been motivated to make this modification in order to perform noise estimation in dual-microphone smartphones. [Abstract, López-Espejo]

Regarding claim 8, Visser/Ma/López-Espejo teaches The local learning system in the local AI device as claimed in claim 7, where Visser teaches wherein the local learning engine is trained by stochastic gradient descent with the data pairs (“In practice, the error gradient of weights may be calculated over a small number of examples, so that the calculated gradient approximates the true error gradient. This approximation method may be referred to as stochastic gradient descent. Stochastic gradient descent may be repeated until the achievable error rate of the entire system has stopped decreasing or until the error rate has reached a target level.” [¶0054]), 
López-Espejo teaches so as to perform sound enhancement by identifying and further filtering out the noise from the noisy sound (“These vectors are comprised of M frequency components where M is the total number of filterbank channels. Moreover, the subscript indicates the channel to which each vector belongs. Our DNN works on a frame-by-frame basis so that it gives a noise frame estimate at each time t from an input consisting of the dual-channel noisy speech observation at time t along with its temporal context.” [pg. 120, § 2.1 Dual-Channel Noise Estimation Based on DNN, ¶2; See further: “Training input data consisted of a mixture of samples contaminated with the noises of test set A at the SNRs of −5 dB, 0 dB, 5 dB, 10 dB, 15 dB and 20 dB. Thus, the noise types of test set B are useful to test the generalization capability of the DNN to unseen noise conditions during training.” [pg. 123, § 3.2 DNN Setup, ¶1]]).
Visser, Ma, and López-Espejo are all in the same field of endeavor of machine learning. Visser discloses cloud-based processing using local devices with sensory inputs. Ma discloses neural network hardware accelerator architectures. López-Espejo discloses deep neural network-based noise estimation in dual-microphone smartphones. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Visser’s/Ma’s teachings with the noise estimation method as taught by López-Espejo. One would have been motivated to make this modification in order to perform noise estimation in dual-microphone smartphones. [Abstract, López-Espejo]
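Illustrative note: the stochastic gradient descent procedure quoted from Visser [¶0054], applied to noisy/clean data pairs, can be sketched as follows. This is a minimal sketch only. The scalar gain model, the data values, the batch size, the learning rate, and all names are assumptions made for illustration and are not taken from Visser or López-Espejo.

```python
import random

# Hypothetical (noisy, clean) data pairs; in this toy data clean = 0.5 * noisy.
PAIRS = [(1.0, 0.5), (2.0, 1.0), (-1.0, -0.5), (0.5, 0.25)]


def train_gain(pairs, steps=300, batch_size=2, lr=0.05, seed=0):
    """Fit a single gain w so that w * noisy approximates clean.

    Each step estimates the squared-error gradient from a small random
    batch of pairs, which approximates the gradient over all pairs.
    """
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        batch = rng.sample(pairs, batch_size)
        # Gradient of mean((w * x - y) ** 2) with respect to w over the batch.
        grad = sum(2.0 * (w * x - y) * x for x, y in batch) / batch_size
        w -= lr * grad
    return w


learned_gain = train_gain(PAIRS)
```

With this toy data the learned gain settles near 0.5, the factor that maps each noisy value to its clean counterpart.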

Claims 9-12, 15, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Visser in view of Hu et al. ("Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures", hereinafter "Hu").

Regarding claim 9, Visser teaches A local learning system in a local artificial intelligence (AI) device, comprising: 
at least one data source (“Local devices may also be configured to track sources (e.g., one or more speakers or other source of a sound captured by a microphone).” [¶0038]); 
a data collector connected to the at least one data source (“Modern digital devices acquire a variety of sensor data and are able to communicate with a remote computing device, such as a cloud-based computing system or processor (which may be referred to as the “cloud”), for data analytics.” [¶0036]), and used to collect input data (“In circumstances where it is desirable to send a stream of data from devices with multiple inputs (e.g., cameras, microphones, or video feeds) to a server for processing, the sensor allocation and space-time information may be relevant” [¶0037]); 
a data generator connected to the data collector, and used to analyze the input data (“Deep Convolutional Network (DCN) may be trained with supervised learning. During training, a DCN may be presented with an image, such as a cropped image of a speed limit sign 326, and a “forward pass” may then be computed to produce an output 322. The output 322 may be a vector of values corresponding to features such as “sign,” “60,” and “100.” The network designer may want the DCN to output a high score for some of the neurons in the output feature vector, for example the ones corresponding to “sign” and “60” as shown in the output 322 for a network 300 that has been trained.” [¶0052]); and 
a local engine connected to the data generator, and including a local neural network (“In an aspect of the present disclosure, a method of training a device specific cloud-based audio processor is presented. The method includes receiving sensor data captured from multiple sensors at a local device and receiving spatial information labels computed on the local device using local configuration information. The spatial information labels are associated with the captured sensor data. The method also includes training lower layers of a first neural network based on the spatial information labels and sensor data. Additionally, the method includes incorporating the trained lower layers into a second, larger neural network for audio classification. The method further includes retraining the second neural network using the trained lower layers of the first neural network.” [¶0010]), and makes inference with the input data in an inference phase (“After learning, the DCN may be presented with new images 326 and a forward pass through the network may yield an output 322 that may be considered an inference or a prediction of the DCN.” [¶0055]), 
wherein each of the neurons of the local neural network is used for classifying features (“A connection from a neuron in a given layer to a neuron in a lower layer is called a feedback (or top-down) connection. A network with many feedback connections may be helpful when the recognition of a high-level concept may aid in discriminating the particular low-level features of an input.” [¶0049; See further “The instructions loaded into the general-purpose processor 102 may further comprise code for receiving classification results from the cloud and for performing tasks based on the classification results.” [¶0043]]).
However Visser fails to explicitly teach wherein the local neural network is a pruned neural network that some neurons or some links thereof are pruned by a neuron statistic engine
Hu teaches wherein the local neural network is a pruned neural network that some neurons or some links thereof are pruned by a neuron statistic engine (“In this paper, we introduce network trimming which iteratively optimizes the network by pruning unimportant neurons based on analysis of their outputs on a large dataset. Our algorithm is inspired by an observation that the outputs of a significant portion of neurons in a large network are mostly zero, regardless of what inputs the network received” [Abstract, Hu; See further: “Starting from an empirically designed network, our algorithm first identifies redundant weak neurons by analyzing their activations on a large validation dataset. Then those weak neurons are pruned while others are kept to initialize a new model. Finally, the new model is retrained or fine-tuned depending on the performance drop. The retrained new model can maintain the same or achieve higher performance with smaller number of neurons. This process can be carried out iteratively until a satisfying model is produced.” [pg. 2, top para; Examiner is interpreting Hu’s algorithm to be equivalent to a “neuron statistic engine”. The algorithm identifies weak neurons to be pruned.]])
Visser and Hu are both in the same field of endeavor of machine learning. Visser discloses cloud-based processing using local devices with sensory inputs. Hu teaches a network trimming method by neuron pruning. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Visser’s teachings by Hu’s network pruning method. One would have been motivated to make this modification in order to remove unimportant neurons in order to reduce computational and memory costs. [Abstract, Hu]

Regarding claim 10, Visser/Hu teaches The local learning system in the local AI device as claimed in claim 9, where Hu teaches wherein the neuron statistic engine is designed to compute and store activity statistics for each neuron at an application phase (“We use the definition of APoZ to evaluate the importance of each neuron in a network. To validate our observation that the outputs of some neurons in a large network are mostly zero, we calculate the APoZ of each neuron and find that there are 631 neurons in the VGG-16 network which have APoZ larger than 90%.” [pg. 3, ¶2; See further: “We have presented Network Trimming to prune redundant neurons based on the statistics of neurons’ activations” [pg. 8, § 6. Conclusion, ¶1; The algorithm disclosed by Hu computes APoZ (Average Percentage of Zeros) to measure the activity of a neuron and thus corresponds to computing neuron statistics at an application phase. See further: “We define Average Percentage of Zeros (APoZ) to measure the percentage of zero activations of a neuron after the ReLU mapping” [pg. 2, § 3.1. Zero Activations in VGG-16, ¶1; See also §4.1.2: implies these statistics are stored from previous iterations.]]]).
Visser and Hu are both in the same field of endeavor of machine learning. Visser discloses cloud-based processing using local devices with sensory inputs. Hu teaches a network trimming method by neuron pruning. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Visser’s teachings by Hu’s network pruning method. One would have been motivated to make this modification in order to remove unimportant neurons in order to reduce computational and memory costs. [Abstract, Hu]
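Illustrative note: the APoZ statistic quoted from Hu (the percentage of zero activations of a neuron after the ReLU mapping) can be sketched as follows. This is a minimal sketch under stated assumptions. The function names, the neuron names, and the sample values are hypothetical and are not taken from Hu.

```python
def relu(x):
    """ReLU mapping: negative values become zero."""
    return x if x > 0.0 else 0.0


def apoz(activations):
    """Fraction of a neuron's post-ReLU outputs that are exactly zero."""
    if not activations:
        raise ValueError("no activations recorded")
    zeros = sum(1 for a in activations if a == 0.0)
    return zeros / len(activations)


# Hypothetical pre-activation values for two neurons over five inputs.
pre_activations = {
    "neuron_a": [-1.0, -0.5, 0.2, -3.0, -0.1],
    "neuron_b": [0.7, 1.2, -0.4, 0.9, 2.0],
}

# Compute and store one statistic per neuron.
stored_stats = {
    name: apoz([relu(v) for v in values])
    for name, values in pre_activations.items()
}
```

In this toy data neuron_a is zero on four of five inputs and neuron_b on one of five, so neuron_a has the higher APoZ.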

Regarding claim 11, Visser/Hu teaches The local learning system in the local AI device as claimed in claim 10, where Hu teaches wherein the activity statistics include a histogram, a mean, or a variance of neuron's input and/or output (“To better understand the behavior of zero activations in a network, we compute the mean APoZ (Table 1) of all neurons in each layer (except for the last one) of the VGG-16 network.” [pg. 3, ¶3; see also Fig. 1/Fig. 2, which disclose a histogram.]).
Visser and Hu are both in the same field of endeavor of machine learning. Visser discloses cloud-based processing using local devices with sensory inputs. Hu teaches a network trimming method by neuron pruning. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Visser’s teachings by Hu’s network pruning method. One would have been motivated to make this modification in order to remove unimportant neurons in order to reduce computational and memory costs. [Abstract, Hu]
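Illustrative note: the activity statistics named in claim 11 (a histogram, a mean, or a variance of a neuron's output) can be sketched as follows. This is a minimal sketch only. The bin edges, the sample outputs, and all names are assumptions made for illustration and are not taken from the application or from Hu.

```python
import statistics


def activity_statistics(outputs, bin_edges):
    """Return a histogram, mean, and population variance of neuron outputs.

    Bins are half-open [edge_i, edge_i+1), except the last bin, which
    also includes its upper edge. Values outside the edges are ignored.
    """
    counts = [0] * (len(bin_edges) - 1)
    for v in outputs:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            in_bin = bin_edges[i] <= v < bin_edges[i + 1]
            if in_bin or (is_last and v == bin_edges[-1]):
                counts[i] += 1
                break
    return {
        "histogram": counts,
        "mean": statistics.mean(outputs),
        "variance": statistics.pvariance(outputs),
    }


# Hypothetical outputs of one neuron over six inputs.
neuron_outputs = [0.0, 0.0, 0.5, 1.0, 1.5, 2.0]
neuron_stats = activity_statistics(neuron_outputs, [0.0, 1.0, 2.0])
```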

Regarding claim 12, Visser/Hu teaches The local learning system in the local AI device as claimed in claim 9, where Hu teaches wherein the neuron statistic engine deactivates neurons with small output values (“Since a neural network has a multiplication-addition-activation computation process, a neuron which has its outputs mostly zeros will have very little contribution to the output of subsequent layers, as well as to the final results. Thus, we can remove those neurons without harming too much to the overall accuracy of the network. In this way, we can find the optimal number of neurons for each layer and thus obtain a better network without redesign and extensive human labor” [pg. 3, ¶3; Examiner is interpreting removing to correspond to deactivating.]).
Visser and Hu are both in the same field of endeavor of machine learning. Visser discloses cloud-based processing using local devices with sensory inputs. Hu teaches a network trimming method by neuron pruning. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Visser’s teachings by Hu’s network pruning method. One would have been motivated to make this modification in order to remove unimportant neurons in order to reduce computational and memory costs. [Abstract, Hu]
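Illustrative note: removing neurons whose outputs are mostly zero, together with their connections, can be sketched as follows. This is a minimal sketch under stated assumptions. The 0.9 cutoff is a hypothetical criterion prompted by the 90% APoZ (Average Percentage of Zeros) figure quoted from Hu, and the weight layout and all names are likewise assumptions made for illustration.

```python
def prune_layer(weights_in, weights_out, apoz_scores, threshold=0.9):
    """Remove neurons whose APoZ exceeds `threshold`.

    weights_in:  one row of incoming weights per neuron in this layer.
    weights_out: one row per next-layer neuron, one column per neuron
                 in this layer.
    Returns the pruned weights and the indices of the kept neurons.
    """
    keep = [i for i, score in enumerate(apoz_scores) if score <= threshold]
    # Drop the incoming connections of each removed neuron.
    pruned_in = [weights_in[i] for i in keep]
    # Drop the outgoing connections of each removed neuron.
    pruned_out = [[row[i] for i in keep] for row in weights_out]
    return pruned_in, pruned_out, keep


# Hypothetical layer with three neurons, two inputs, and two outputs.
layer_in = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
layer_out = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
scores = [0.95, 0.10, 0.92]

new_in, new_out, kept = prune_layer(layer_in, layer_out, scores)
```

Only the second neuron survives in this toy example, so the layer shrinks from three neurons to one and the next layer loses the two corresponding input columns.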

Regarding claim 15, Visser/Hu teaches The local learning system in the local AI device as claimed in claim 9, where Hu teaches wherein the neuron statistic engine prunes the local neural network by an aggressive pruning without verification or a defensive pruning with verification, wherein the aggressive pruning means to directly prune the neurons that satisfy a pruning/merging criteria (“Since a neural network has a multiplication-addition-activation computation process, a neuron which has its outputs mostly zeros will have very little contribution to the output of subsequent layers, as well as to the final results. Thus, we can remove those neurons without harming too much to the overall accuracy of the network. In this way, we can find the optimal number of neurons for each layer and thus obtain a better network without redesign and extensive human labor” [pg. 3, ¶3; See further: “Neurons with high APoZ are pruned according to certain criteria. The connections to and from the neuron are removed accordingly when a neuron is pruned (see Figure 4 5).” [pg. 3, § 3.2. Network Trimming and Retraining, ¶2]]).
Visser and Hu are both in the same field of endeavor of machine learning. Visser discloses cloud-based processing using local devices with sensory inputs. Hu teaches a network trimming method by neuron pruning. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Visser’s teachings by Hu’s network pruning method. One would have been motivated to make this modification in order to remove unimportant neurons in order to reduce computational and memory costs. [Abstract, Hu]

Regarding claim 16, Visser/Hu teaches The local learning system in the local AI device as claimed in claim 9, where Hu teaches wherein the pruned neural network in the local AI device is derived by pruning an original neural network possessing model generality (“In this paper, we introduce network trimming which iteratively optimizes the network by pruning unimportant neurons based on analysis of their outputs on a large dataset. Our algorithm is inspired by an observation that the outputs of a significant portion of neurons in a large network are mostly zero, regardless of what inputs the network received” [Abstract, Hu]).
Visser and Hu are both in the same field of endeavor of machine learning. Visser discloses cloud-based processing using local devices with sensory inputs. Hu teaches a network trimming method by neuron pruning. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Visser’s teachings by Hu’s network pruning method. One would have been motivated to make this modification in order to remove unimportant neurons in order to reduce computational and memory costs. [Abstract, Hu]

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Visser in view of Hu and further in view of Scardapane et al. ("Group sparse regularization for deep neural networks", hereinafter "Scardapane").

Regarding claim 13, Visser/Hu teaches The local learning system in the local AI device as claimed in claim 9, however fails to explicitly teach wherein the neuron statistic engine replaces a part of neurons respectively with simple bias units.
Scardapane teaches wherein the neuron statistic engine replaces a part of neurons respectively with simple bias units (“We note that having a separate group for every bias is not the unique choice. We can consider having a single bias unit for every layer feeding every neuron in that layer. In this case, we would have a single bias group per layer, corresponding to keeping or deleting every bias in it.” [pg. 83, § 3.1 Formulation of the algorithm, ¶3]).
Visser, Hu and Scardapane are all in the same field of endeavor of machine learning. Visser discloses cloud-based processing using local devices with sensory inputs. Hu teaches a network trimming method by neuron pruning. Scardapane discloses sparse regularization for deep neural networks. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Visser’s/Hu’s teachings by replacing a part of neurons with a respective bias unit. One would have been motivated to make this modification in order to simultaneously carry out pruning and feature selection while optimizing the weights of a neural network. [Conclusion, Scardapane]
Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Visser in view of Hu and further in view of Mariet et al. ("Diversity Networks: Neural Network Compression Using Determinantal Point Processes", hereinafter "Mariet").

Regarding claim 14, Visser/Hu teaches The local learning system in the local AI device as claimed in claim 9, however fails to explicitly teach wherein the neuron statistic engine merges neurons with same histogram.
Mariet teaches wherein the neuron statistic engine merges neurons with same histogram (“Divnet leverages similarities between the behaviors of neurons in a layer to detect redundant parameters and merge them, thereby enforcing neuronal diversity within each hidden layer. Using Divnet, large, redundant networks can be shrunk to much smaller structures without impacting their performance and without requiring further training.” [pg. 8, § 4 Future Work and Conclusion, ¶1; note: merging similar neurons would imply that they would have the same histogram.])
Visser, Hu, and Mariet are all in the same field of endeavor of machine learning. Visser discloses cloud-based processing using local devices with sensory inputs. Hu teaches a network trimming method by neuron pruning. Mariet teaches diversity networks by merging similar neurons. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Visser’s/Hu’s teachings by merging similar neurons as taught by Mariet. One would have been motivated to make this modification in order to merge redundant neurons which leads to smaller network sizes. [Abstract, Mariet]
Claims 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Visser in view of Hu and further in view of Gouttaya et al. ("Improving the Proactive Recommendation in Smart Home Environments: An Approach Based on Case Based Reasoning and BP-Neural Network", hereinafter "Gouttaya").

Regarding claim 17, Visser/Hu teaches The local learning system in the local AI device as claimed in claim 9, where Visser teaches wherein the neuron statistic engine is connected to the local neural network (“Modern digital devices acquire a variety of sensor data and are able to communicate with a remote computing device, such as a cloud-based computing system or processor (which may be referred to as the “cloud”), for data analytics.” [¶0036; note: Visser teaches a cloud-based computing system, thus connecting the system/algorithm of Hu would be easily implemented.]), 
However Visser/Hu fails to explicitly teach and includes a plurality of profiles, wherein a model structure of the local neural network is decided based on a selected profile from the profiles.
Gouttaya teaches and includes a plurality of profiles (“Several hybrid approaches are based on Collaborative Filtering, but also maintain a content-based profile for each user.” [pg. 30, § C. Hybrid filtering systems, ¶1]), wherein a model structure of the local neural network is decided based on a selected profile from the profiles (“The application can be capable to automatically launch, for the user, the appropriate services according to his current context in any context of use and disables other unused services.  The information context in smart home environment can be defined using several parameters. In this study, we define the context of the user profile by: User Context = {P1= Time, P2= Localization, P3= temperature} P1: represents the current time of the day {1-6, 6-8, 8-10, 10-12,12-13, 13-14, 14-16, 16-18, 18-20, 20-22, 22-01} P2: indicates the current location of the user {kitchen, living room, bedroom, dining room} P3 indicates the daytime temperatures {cold, warm, hot} We define a smart home service by two parameters: The service identifier and the state of the service (ON/OFF). Where: Service-id: represents the name of a specific smart service: {Light, TV, air-conditioner, etc…} State represents the state of the service: {ON, OFF} The proposed system activates or deactivates the appropriate candidate solutions (TV, radio, air-conditioner, stores, and lighting) according to a given context.” [pg. 34, § V. Experimentation, ¶1]).
Visser, Hu, and Gouttaya are all in the same field of endeavor of machine learning. Visser discloses cloud-based processing using local devices with sensory inputs. Hu teaches a network trimming method by neuron pruning. Mariet teaches diversity networks by merging similar neurons. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Visser’s/Hu’s teachings by using the profiles as taught by Gouttaya. One would have been motivated to make this modification in order to present the most relevant services to the user in response to any significant change to their context/profile. [Abstract, Gouttaya] 

Regarding claim 18, Visser/Hu/Gouttaya teaches The local learning system in the local AI device as claimed in claim 17, where Gouttaya teaches wherein the profiles imply different users, scenes, or computing resources (“The application can be capable to automatically launch, for the user, the appropriate services according to his current context in any context of use and disables other unused services.  The information context in smart home environment can be defined using several parameters. In this study, we define the context of the user profile by: User Context = {P1= Time, P2= Localization, P3= temperature}” [pg. 34, § V. Experimentation, ¶1]).
Visser, Hu, and Gouttaya are all in the same field of endeavor of machine learning. Visser discloses cloud-based processing using local devices with sensory inputs. Hu teaches a network trimming method by neuron pruning. Mariet teaches diversity networks by merging similar neurons. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Visser’s/Hu’s teachings by using the profiles as taught by Gouttaya. One would have been motivated to make this modification in order to present the most relevant services to the user in response to any significant change to their context/profile. [Abstract, Gouttaya] 

Regarding claim 19, Visser/Hu/Gouttaya teaches The local learning system in the local AI device as claimed in claim 17, where Visser teaches further comprising a classification engine (“The method further includes predicting an audio event classification based on the sensor data without retraining the neural network.” [¶0011]) connected to the neuron statistic engine (“Modern digital devices acquire a variety of sensor data and are able to communicate with a remote computing device, such as a cloud-based computing system or processor (which may be referred to as the “cloud”), for data analytics.” [¶0036; note: Visser teaches a cloud-based computing system, thus connecting the system/algorithm of Hu would be easily implemented.]), and 
Gouttaya teaches designed to classify the raw input(s) to select a suitable profile for the local neural network (“Our approach aims to integrate to the pervasive recommender systems, the ability to predict the most useful services to be offered to the user in future contextual situations and this by exploiting the context history. This latter is extracted from past interactions between the user and Pervasive recommender systems.” [pg. 31, § IV. The Proposed Approach, ¶1]).
Visser, Hu, and Gouttaya are all in the same field of endeavor of machine learning. Visser discloses cloud-based processing using local devices with sensory inputs. Hu teaches a network trimming method by neuron pruning. Mariet teaches diversity networks by merging similar neurons. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Visser’s/Hu’s teachings by using the profiles as taught by Gouttaya. One would have been motivated to make this modification in order to present the most relevant services to the user in response to any significant change to their context/profile. [Abstract, Gouttaya] 

Response to Arguments
Applicant’s arguments with respect to claim 1 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Please see the updated 103 rejection with the newly presented art of Visser and Ma for claim 1.

Applicant’s arguments with respect to claim 9, in particular the limitation of “wherein each of the neurons of the local neural network is used for classifying features,” have been considered but are moot because the amended limitation is now taught by the newly presented art of Visser.


Applicant’s arguments with respect to the rejections of the dependent claims have been fully considered but they are not persuasive as they rely upon the allowability of the independent claims.


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Babaeizadeh et al. ("NoiseOut: A Simple Way to Prune Neural Networks") discloses merging neurons with similar correlation for pruning neural networks.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL H HOANG whose telephone number is (571)272-8491. The examiner can normally be reached Mon-Fri 8:30AM-4:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/M.H.H./Examiner, Art Unit 2122

/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122