Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Applicant’s Application filed on 05/10/2019 has been reviewed.
Claims 1-20 have been examined.
Notice of Pre-AIA or AIA Status
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
(Step 1) Claims 1 and 11 recite a method and an electronic apparatus, respectively, which fall within eligible statutory categories under 35 U.S.C. 101.
(Step 2A1 - Judicial Exception?) The limitation of claim 1 “obtaining first multiplicative variables for input elements of the recurrent neural network; obtaining second multiplicative variables for an input neuron and a hidden neuron of the recurrent neural network; obtaining a mean and a variance for weights of the recurrent neural network, the first multiplicative variables, and the second multiplicative variables; and performing sparsification for the recurrent neural network based on the mean and the variance”, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind and/or by manual processes. Further, the steps are directed toward data manipulation using a mathematical process, such as “…performing sparsification for the recurrent neural network based on the mean and the variance…”. That is, nothing in the claim element precludes the step from practically being performed in the mind and/or as a mathematical process of manipulating data. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, manually, and/or as a mathematical process, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
The dependent claims 2-10 incorporate the limitations of claim 1, from which they depend, and therefore recite the same abstract mental process.
(Step 2A2 - Integrate Into a Practical Application?) This judicial exception is not integrated into a practical application. In particular, claim 1 does not recite (i) an improvement to the functionality of a computer or other technology or technical field; (ii) use of a “particular machine” to apply or use the judicial exception; (iii) a particular transformation of an article to a different thing or state; or (iv) any other meaningful limitation. The claims are directed to an abstract idea. The dependent claims 2-10 merely further refine the abstract idea or add insignificant extra-solution activity. Thus, these dependent claims merely narrow the abstract idea by adding additional steps, which is insufficient to integrate the abstract idea into a practical application.
(Step 2B - Inventive Concept?) The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the claims do not add a specific limitation, or combination of limitations, other than well-understood, routine, conventional activity recited at a high level of generality. Because the claims are directed toward data recognition and learning that are, in general, routine and conventional, the claims have not been transformed into a patent-eligible application of an abstract idea. The claims are therefore not patent eligible.
Claims 11-20, other than using generic processing elements such as “memory…processor”, recite substantially similar limitations and are likewise rejected.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by U.S. Patent No. 10305766 to Zhang et al. (hereinafter “Zhang”).
As to claim 1, Zhang teaches a method for compressing a recurrent neural network, the method comprising (col. 3 ln. 33-64, col. 35 ln. 8-49, computer implemented method and apparatus comprising processor and non-transitory computer readable storage medium): 
obtaining first multiplicative variables for input elements of the recurrent neural network (Fig. 7B, col. 17 ln. 66-col. 18 ln. 50, obtaining variables, i.e. “With further reference to FIG. 7A, the method 700 may begin with the processing logic receiving, from a first wireless receiver (RX) located in a building, data that includes first channel properties in a first communication link between a wireless transmitter (TX) and the first wireless RX that existed during a time period (710). The method 700 may continue with the processing logic receiving second channel properties in a second communication link between the wireless TX and the second wireless RX that existed during the time period (715). In one embodiment, channel properties represent wireless signal propagation characteristics of wireless signals being transmitted within the building. The method 700 may continue with the processing logic applying, in an input layer of a neural network, a separability function to the data to separate the first channel properties associated with a first group of frequencies from the second channel properties associated with a second group of frequencies, to generate a first input variable and a second input variable, respectively, to a multi-layered neural network (720). In one embodiment, the separability function is selected to optimize data separability of the first channel properties from the second channel properties, and according to the groups of frequencies for each of the first channel properties and the second channel properties”); 
obtaining second multiplicative variables for an input neuron and a hidden neuron of the recurrent neural network (col. 10 ln. 23-39, col. 15 ln. 60-col. 16 ln. 64, obtaining data from hidden network, i.e. “With further reference to FIG. 7A, the method 700 may begin with the processing logic receiving, from a first wireless receiver (RX) located in a building, data that includes first channel properties in a first communication link between a wireless transmitter (TX) and the first wireless RX that existed during a time period (710). The method 700 may continue with the processing logic receiving second channel properties in a second communication link between the wireless TX and the second wireless RX that existed during the time period (715). In one embodiment, channel properties represent wireless signal propagation characteristics of wireless signals being transmitted within the building. The method 700 may continue with the processing logic applying, in an input layer of a neural network, a separability function to the data to separate the first channel properties associated with a first group of frequencies from the second channel properties associated with a second group of frequencies, to generate a first input variable and a second input variable, respectively, to a multi-layered neural network (720). In one embodiment, the separability function is selected to optimize data separability of the first channel properties from the second channel properties, and according to the groups of frequencies for each of the first channel properties and the second channel properties”); 
obtaining a mean and a variance for weights of the recurrent neural network, the first multiplicative variables, and the second multiplicative variables (Fig. 6, 7A, 12-14, col. 24 ln. 9-33, col. 31 ln. 11-col. 32 ln. 3, obtaining variance for weights, i.e. “The statistical parameters may include, related to the labeled training data generated at block 1212, at least one of a Fast Fourier Transform (FFT) value, a maximum value, a minimum value, a mean value, a variance value, an entropy value, a mean cross rate value, a skewness value, or a kurtosis value.”); and 
performing sparsification for the recurrent neural network based on the mean and the variance (Fig. 12, col. 24 ln. 66-col. 25 ln. 46, performing sparsification in the neural network, i.e. “The method 1200 may continue with the processing logic sequentially feeding the input vectors to the LSTM neural network model 1240, as previously trained during the training stage 1201. The method 1200 may continue with the processing logic classifying the channel properties data represented by the statistical parameter values in the input vectors for the sampling time period as present or not present (1278). The classifying may take the form of a series of predictive outputs, e.g., a predictive output for each set of the discrete samples of the channel properties data. Note that the output of the LSTM neural network model 1240 may feed back into the LSTM neural network model 1240 so that current output states may be defined by both input state values and past state values,… In various embodiments, the method 1200 may also include statistical processing of the channel properties data to generate the statistical parameter, or feature, values for each set of discrete samples that may be further pre-processed before being input into the LSTM neural network model for either learning (in the training stage 1201) or classification (in the classification stage 1251). Specifically, the statistical parameter values may be useable as feature values to define the LSTM neural network model 1240. By using the values of one or more statistical features, the classifier of the LSTM neural network model 1240 may improve classification accuracy. Each statistical parameter value may be representative of the propagation characteristics of a set of discrete samples of the channel properties data. 
In various embodiments, the statistical parameter values may come from, related to respective sets of discrete samples of the channel properties data, one or more of a FFT value, a maximum value, a minimum value, a mean value, a variance value, an entropy value, a mean cross rate value, a skewness value, or a kurtosis value.”).
As to claim 2, Zhang teaches the method as claimed in claim 1, wherein the performing of the sparsification includes: calculating an associated value for performing the sparsification based on the mean and the variance for weights of the recurrent neural network, the first multiplicative variables, and the second multiplicative variables; and setting a weight, a first multiplicative variable, or a second multiplicative variable in which the associated value is smaller than a predetermined value to zero (col. 27 ln. 33-53, using predetermined value, i.e. “FIG. 14 is a second block diagram of the LSTM architecture 1300 for presence detection according to one embodiment. The LSTM architecture 1300 may further include an output layer 1330 that includes a fully-connected layer 1334 and a software max layer 1338. In various embodiments, the fully-connected layer 1334 may generate a predictive output that is an analog signal (or other comparable value in software) based on the learned state from hidden state values 1320 of the LSTM layer. A predictive output, which may be a value between zero (“0”) and one (“1”), may be generated by the fully-connected layer 1224 for each set of discrete samples of the channel properties data, for example. The software max layer 1338 may include a threshold detector that detects the analog signal as a zero or one based on a comparison to some threshold value (e.g., over 0.50 is a one) and generates a detection decision based on a combination of the predictive outputs over the sampling period. This allows the LSTM architecture 1300 to turn continuously-varying numbers into one of a number of finite classes (e.g., predictive outputs), and make a detection decision based on a combination of prediction outputs.”).
As to claim 3, Zhang teaches the method as claimed in claim 2, wherein the associated value is a ratio of square of mean to variance (col. 8 ln. 49-col. 9 ln. 7, using associated value, example of one of cost function is mean-squared error, i.e. “One example of a cost function is the mean-squared error. In another embodiment, other cost functions may be used. For example, if the softmax classifier is used (where the output are finite classes (e.g. either 1 or 0), the logarithmic cost function performs better than mean-square cost function, as the mean-squared error cost function is not always convex function in all optimization problems. For example, the logarithmic cost function may be expressed as C=−(1−y) log(1−f(x))+y log(f(x)). The mean-squared error cost function attempts to minimize the average squared error between the network's output, f(x), and the target value y over all the example pairs. Minimizing this cost using gradient descent for the class of neural networks is called multilayer perceptrons (MLP), which produces the backpropagation algorithm for training neural networks. Backpropagation is a method to calculate the gradient of the loss function (produces the cost associated with a given state) with respect to the weights in the NN. A perceptron algorithm is an algorithm for supervised learning of binary classifiers, e.g., functions that can decide whether an input, represented by a vector of numbers (e.g., frequencies in this case), belong to some specific class or not. Accordingly, the computing device 150 may apply a perceptron algorithm to the training data 160 to train the NN 158 and create additional pre-trained classifiers, and the training may be updated periodically based on updates to the training data 160 over time.”).
As to claim 4, Zhang teaches the method as claimed in claim 2, wherein the predetermined value is 0.05 (col. 27 ln. 33-53, using predetermined value, i.e. “FIG. 14 is a second block diagram of the LSTM architecture 1300 for presence detection according to one embodiment. The LSTM architecture 1300 may further include an output layer 1330 that includes a fully-connected layer 1334 and a software max layer 1338. In various embodiments, the fully-connected layer 1334 may generate a predictive output that is an analog signal (or other comparable value in software) based on the learned state from hidden state values 1320 of the LSTM layer. A predictive output, which may be a value between zero (“0”) and one (“1”), may be generated by the fully-connected layer 1224 for each set of discrete samples of the channel properties data, for example. The software max layer 1338 may include a threshold detector that detects the analog signal as a zero or one based on a comparison to some threshold value (e.g., over 0.50 is a one) and generates a detection decision based on a combination of the predictive outputs over the sampling period. This allows the LSTM architecture 1300 to turn continuously-varying numbers into one of a number of finite classes (e.g., predictive outputs), and make a detection decision based on a combination of prediction outputs.”).
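For reference, the thresholding recited in claims 2-4 can be illustrated with a short sketch. It follows the claim language only (an associated value equal to the square of the mean divided by the variance, compared against a predetermined value of 0.05); it is not Applicant's implementation and is not taken from Zhang. The function name `sparsify_by_ratio`, the array arguments, and the sample values are hypothetical, and strictly positive variances are assumed.

```python
import numpy as np

def sparsify_by_ratio(mean, variance, threshold=0.05):
    """Zero every element whose (mean^2 / variance) is below `threshold`.

    `mean` and `variance` hold the per-element mean and variance of a weight
    or multiplicative variable. Variances are assumed strictly positive.
    The default threshold of 0.05 is the value recited in claim 4.
    """
    mean = np.asarray(mean, dtype=float)
    variance = np.asarray(variance, dtype=float)
    ratio = mean ** 2 / variance      # "associated value" as recited in claim 3
    keep = ratio >= threshold         # claim 2: values smaller than the threshold are zeroed
    return np.where(keep, mean, 0.0), keep

# Hypothetical values: only the first element has a ratio of at least 0.05.
sparse, keep = sparsify_by_ratio([0.5, 0.01, -0.2], [0.1, 0.1, 1.0])
```

In this sketch the surviving elements keep their mean value; how retained elements are represented after sparsification is a design choice the claim language does not fix.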
As to claim 5, Zhang teaches the method as claimed in claim 1, further comprising: based on the recurrent neural network being included a gated structure, obtaining third multiplicative variables for preactivation of gates to make gates and information flow elements of a recurrent layer of the recurrent neural network constant, wherein the obtaining of the mean and the variance includes obtaining a mean and a variance for the weights of the recurrent neural network, the first multiplicative variables, the second multiplicative variables, and the third multiplicative variables such as statistical parameter values (Fig. 12, col. 24 ln. 66-col. 25 ln. 46, col. 31 ln. 34-col. 32 ln. 3, recurrent neural network implementation with multiple variables, i.e. “The method 1200 may continue with the processing logic sequentially feeding the input vectors to the LSTM neural network model 1240, as previously trained during the training stage 1201. The method 1200 may continue with the processing logic classifying the channel properties data represented by the statistical parameter values in the input vectors for the sampling time period as present or not present (1278). The classifying may take the form of a series of predictive outputs, e.g., a predictive output for each set of the discrete samples of the channel properties data. Note that the output of the LSTM neural network model 1240 may feed back into the LSTM neural network model 1240 so that current output states may be defined by both input state values and past state values,… In various embodiments, the method 1200 may also include statistical processing of the channel properties data to generate the statistical parameter, or feature, values for each set of discrete samples that may be further pre-processed before being input into the LSTM neural network model for either learning (in the training stage 1201) or classification (in the classification stage 1251). 
Specifically, the statistical parameter values may be useable as feature values to define the LSTM neural network model 1240. By using the values of one or more statistical features, the classifier of the LSTM neural network model 1240 may improve classification accuracy. Each statistical parameter value may be representative of the propagation characteristics of a set of discrete samples of the channel properties data. In various embodiments, the statistical parameter values may come from, related to respective sets of discrete samples of the channel properties data, one or more of a FFT value, a maximum value, a minimum value, a mean value, a variance value, an entropy value, a mean cross rate value, a skewness value, or a kurtosis value.”).
As to claim 6, Zhang teaches the method as claimed in claim 5, wherein the gated structure is implemented by a long-short term memory (LSTM) layer (Fig. 12, col. 24 ln. 66-col. 25 ln. 46, LSTM implementation, i.e. “The method 1200 may continue with the processing logic sequentially feeding the input vectors to the LSTM neural network model 1240, as previously trained during the training stage 1201. The method 1200 may continue with the processing logic classifying the channel properties data represented by the statistical parameter values in the input vectors for the sampling time period as present or not present (1278). The classifying may take the form of a series of predictive outputs, e.g., a predictive output for each set of the discrete samples of the channel properties data. Note that the output of the LSTM neural network model 1240 may feed back into the LSTM neural network model 1240 so that current output states may be defined by both input state values and past state values,… In various embodiments, the method 1200 may also include statistical processing of the channel properties data to generate the statistical parameter, or feature, values for each set of discrete samples that may be further pre-processed before being input into the LSTM neural network model for either learning (in the training stage 1201) or classification (in the classification stage 1251). Specifically, the statistical parameter values may be useable as feature values to define the LSTM neural network model 1240. By using the values of one or more statistical features, the classifier of the LSTM neural network model 1240 may improve classification accuracy. Each statistical parameter value may be representative of the propagation characteristics of a set of discrete samples of the channel properties data. 
In various embodiments, the statistical parameter values may come from, related to respective sets of discrete samples of the channel properties data, one or more of a FFT value, a maximum value, a minimum value, a mean value, a variance value, an entropy value, a mean cross rate value, a skewness value, or a kurtosis value.”).
As to claim 7, Zhang teaches the method as claimed in claim 1, wherein the obtaining of the mean and the variance includes: initializing the mean and the variance for the weights, a first group variable, and a second group variable; and obtaining a mean and a variance for the weights, the first group variable and the second group variable by optimizing objectives associated with the mean and the variance of the weights, the first group variable, and the second group variable  (Fig. 6, 7A, 12-14, col. 24 ln. 9-33, col. 31 ln. 11-col. 32 ln. 3, partition data processing using recurrent neural network using statistical parameters, including using mean and variance value, i.e. “The statistical parameters may include, related to the labeled training data generated at block 1212, at least one of a Fast Fourier Transform (FFT) value, a maximum value, a minimum value, a mean value, a variance value, an entropy value, a mean cross rate value, a skewness value, or a kurtosis value.”).
As to claim 8, Zhang teaches the method as claimed in claim 7, wherein the obtaining of the mean and the variance further includes: selecting a mini batch of the objectives (Fig. 21, col. 32 ln. 15-col. 33 ln. 10); generating the weights, the first group variable, and the second group variable from approximated posterior distribution; forward passing the recurrent neural network by using the mini batch based on the generated weights, first group variable, and second group variable; calculating the objectives and calculating gradients for the objectives; and obtaining the mean and the variance for the weights, the first group variable, and the second group variable based on the calculated gradients (Fig. 6, 7A, 12-14, col. 24 ln. 9-33, col. 31 ln. 11-col. 32 ln. 3, partition data processing using recurrent neural network using statistical parameters, including using mean and variance value, i.e. “The statistical parameters may include, related to the labeled training data generated at block 1212, at least one of a Fast Fourier Transform (FFT) value, a maximum value, a minimum value, a mean value, a variance value, an entropy value, a mean cross rate value, a skewness value, or a kurtosis value.”).
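For reference, the sequence of steps recited in claim 8 (select a mini batch, generate weights from an approximated posterior, forward pass, calculate objectives and gradients, update the mean and the variance) can be sketched as follows. This is an illustration under stated assumptions, not Applicant's implementation and not anything disclosed in Zhang: a single linear layer stands in for the recurrent neural network, the posterior is assumed to be a factorized Gaussian sampled as mean plus standard deviation times noise, the objective is a plain squared error with no regularization term, and the group variables are omitted. The name `variational_step` and all parameters are hypothetical.

```python
import numpy as np

def variational_step(mu, log_sigma, X, y, batch_size, lr, rng):
    """One illustrative update of the posterior mean and log standard deviation."""
    # 1. Select a mini batch.
    idx = rng.choice(len(X), size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]

    # 2. Generate weights from the assumed Gaussian posterior N(mu, sigma^2).
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal(mu.shape)
    w = mu + sigma * eps

    # 3. Forward pass with the sampled weights (linear stand-in for the RNN).
    residual = Xb @ w - yb

    # 4. Objective (mean squared error) and its gradients.
    loss = np.mean(residual ** 2)
    grad_w = 2.0 * Xb.T @ residual / batch_size
    grad_mu = grad_w                        # dw/dmu = 1
    grad_log_sigma = grad_w * eps * sigma   # dw/dlog_sigma = eps * sigma

    # 5. Update the mean and the (log) standard deviation by gradient descent.
    mu = mu - lr * grad_mu
    log_sigma = log_sigma - lr * grad_log_sigma
    return mu, log_sigma, loss

# Hypothetical usage with synthetic data.
rng = np.random.default_rng(0)
X = rng.standard_normal((32, 3))
y = X @ np.array([1.0, -2.0, 0.5])
mu, log_sigma, loss = variational_step(
    np.zeros(3), np.full(3, -3.0), X, y, batch_size=8, lr=0.01, rng=rng
)
variance = np.exp(2.0 * log_sigma)  # variance recovered from the log standard deviation
```

Parameterizing the standard deviation through its logarithm keeps the variance positive after every update, which is one common choice and is assumed here rather than taken from the claims.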
As to claim 9, Zhang teaches the method as claimed in claim 8, wherein the weights are generated by the mini batch, and wherein the first group variable and the second group variable are generated separately from the objectives (Fig. 6, 7A, 12-14, col. 17 ln. 66-col. 18 ln. 50, col. 24 ln. 9-33, col. 31 ln. 11-col. 32 ln. 3, partition data processing using recurrent neural network using statistical parameters, including using mean and variance value, i.e. “The method 700 may continue with the processing logic applying, in an input layer of a neural network, a separability function to the data to separate the first channel properties associated with a first group of frequencies from the second channel properties associated with a second group of frequencies, to generate a first input variable and a second input variable, respectively, to a multi-layered neural network (720). In one embodiment, the separability function is selected to optimize data separability of the first channel properties from the second channel properties, and according to the groups of frequencies for each of the first channel properties and the second channel properties….The statistical parameters may include, related to the labeled training data generated at block 1212, at least one of a Fast Fourier Transform (FFT) value, a maximum value, a minimum value, a mean value, a variance value, an entropy value, a mean cross rate value, a skewness value, or a kurtosis value.”).
As to claim 10, Zhang teaches the method as claimed in claim 1, wherein the input elements are vocabularies or words (col. 9 ln. 44-60, “…if training a neural network using RSSI data as the training data 160 …when a user utters a wake-up word, the wireless device 200 may tag the RSSI from the built-in WiFi® of the RF modules 286, and use that as the training data for the maximum likelihood for the RSSI”).
Claims 11-20 are essentially the same as claims 1-10, except that they set forth the claimed invention as an electronic apparatus rather than a method, and are rejected for the same reasons as applied hereinabove.
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANHTAI V TRAN, whose telephone number is (571) 270-5129. The examiner can normally be reached Monday through Thursday from 8:00 AM to 4:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fred Ehichioya, can be reached at (571) 272-4034. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ANHTAI V TRAN/
Primary Examiner, Art Unit 2168