Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claim 9 is objected to because of the following informality. The limitation “receiving a test data set, the text data set including audio data with unseen noise” refers to “text data” where the context makes clear that “test data” is intended. For examination purposes, “text” will be read as “test” in this claim.
Appropriate correction is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.


Claims 1, 2, 3, 4, 12, 13, 14, 15 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over El-Yaniv (US 20170286830 A1), in further view of Courbariaux (Courbariaux, Matthieu, Yoshua Bengio, and Jean-Pierre David. "BinaryConnect: Training deep neural networks with binary weights during propagations." Advances in Neural Information Processing Systems 28 (2015)).

With respect to claims 1, 12 and 19, El-Yaniv teaches a method/data storage device/computer-readable storage that stores instructions for improved real-time audio processing ([0040] The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention):
constructing a deep neural network model, including a plurality of at least one-bit neurons, configured to output a predicted label of audio data ([0047] Reference is now made to FIG. 1 which is a method for training a neural network having neurons with quantized activation functions for calculating quantized activation value connected by connections with quantized weight functions for calculating quantized weights, optionally binary, weights and referred to herein as a quantized neural network (QNN) for inference or otherwise analyzing new data, according to some embodiments of the present invention. The training is done at a training phase such that at run-time both the quantized activation functions and the quantized weight functions are set to provide quantized values for computing parameter gradients, and [0112] Now, as shown at 602, new data element is received. The new data element may be image data, video data, textual content, audio data, genetic data, and medical data such as outputs of image modality, for example CT, MRI, PET-CT and/or medical test outcomes, such as blood test, blood pressure. The data may be normalized and/or canonized), the plurality of at least one-bit neurons arranged in a plurality of layers, including at least one hidden layer, and being connected by a plurality of connections, each connection having at least a one-bit weight ([0082] Optionally, the quantized weight functions and the quantized activation functions are sign functions. For hidden units, namely neurons at the layers which are not the input layer or the output layer, sign function nonlinearity is used to obtain quantized activation values such as binary activations, and for connection weight values the following two are combined:),
[[wherein one or both of the plurality of at least one-bit neurons and the plurality of connections have a reduced bit precision ]];
receiving a training data set, the training data set including audio data ([0064] As shown at 102, a training set is received, for instance designated by using the GUI 207, uploaded linked or otherwise selected. The training set may be image data, video data, speech data, textual content, genetic data, medical data such as outputs of image modality, for example CT, MRI, PET-CT and/or medical test outcomes, such as blood test, blood pressure.); 
training the deep neural network model using the training data set ([0064] During the training, parameter gradients such as weight gradients are computed based on outputs of the quantized connection weight functions and quantized activation functions for forward passes and backward passes (i.e. backpropagation actions).); and 
outputting a trained deep neural network model configured to output a predicted label of real-time audio data ([0096] Now, as shown at 104, the trained neural network is outputted, for instance as an inference object for inference. The inference object may be a script or a code or instructions to update an inference object or code).
El-Yaniv fails to explicitly disclose, but Courbariaux teaches, wherein one or both of the plurality of at least one-bit neurons and the plurality of connections have a reduced bit precision (Section 5, ll. 6-10: With the deterministic version of BinaryConnect the impact at test time could be even more important, getting rid of the multiplications altogether and reducing by a factor of at least 16 (from 16 bits single-float precision to single bit precision) the memory requirement of deep networks [reducing from 16-bit to single-bit precision], which has an impact on the memory to computation bandwidth and on the size of the models that can be run.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify El-Yaniv in view of Courbariaux so that one or both of the plurality of at least one-bit neurons and the plurality of connections have a reduced bit precision, in order to take advantage of the reduced computational bandwidth and the reduced size of the models that can be run (Courbariaux, Section 5, Conclusion, ll. 6-13).



With respect to claims 2 and 13, El-Yaniv teaches wherein the deep neural network comprises one of one hidden layer, two hidden layers, and three hidden layers ([0082] Optionally, the quantized weight functions and the quantized activation functions are sign functions. For hidden units [implies at least one hidden layer], namely neurons at the layers which are not the input layer or the output layer, sign function nonlinearity is used to obtain quantized activation values such as binary activations, and for connection weight values the following two are combined:).
With respect to claims 3 and 14, El-Yaniv fails to explicitly disclose, but Courbariaux teaches, wherein the connections include one-bit weights (Section 2.2, ll. 1-2: The binarization operation transforms the real-valued weights into the two possible values. A very straightforward binarization operation would be based on the sign function:).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify El-Yaniv in view of Courbariaux so that one or both of the plurality of at least one-bit neurons and the plurality of connections have a reduced bit precision, in order to take advantage of the reduced computational bandwidth and the reduced size of the models that can be run (Courbariaux, Section 5, Conclusion, ll. 6-13).
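
For illustration only, the sign-based binarization quoted from Courbariaux Section 2.2, and the factor-of-16 memory reduction quoted from Section 5, can be sketched as follows. This is a minimal sketch and not code from either reference: the function names `binarize` and `pack_bits`, the sample weight values, and the choice to store +1 as bit 1 and -1 as bit 0 are assumptions made for the example.

```python
def binarize(weights):
    """Deterministic sign binarization: +1 if w >= 0, otherwise -1."""
    return [1 if w >= 0 else -1 for w in weights]


def pack_bits(binary_weights):
    """Pack +1/-1 weights into bytes, one bit per weight (+1 -> 1, -1 -> 0).

    The bit layout (least significant bit first) is an assumption of this sketch.
    """
    packed = bytearray((len(binary_weights) + 7) // 8)
    for i, w in enumerate(binary_weights):
        if w == 1:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


# Hypothetical real-valued weights, chosen only for the example.
weights = [0.37, -1.2, 0.0, -0.05, 2.4, -0.9, 0.11, -3.0]

binary = binarize(weights)   # one-bit weights as +1 / -1
packed = pack_bits(binary)   # 8 weights stored in a single byte

bytes_at_16_bit = len(weights) * 2        # 16 bits = 2 bytes per weight
reduction = bytes_at_16_bit // len(packed)

print(binary)
print(len(packed), "byte(s) instead of", bytes_at_16_bit, "-> factor", reduction)
```

Under these assumptions, eight weights that would occupy 16 bytes at 16-bit precision occupy one byte after binarization.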

With respect to claims 4 and 15, El-Yaniv teaches wherein the plurality of at least one-bit neurons are one of one-bit neurons and two-bit neurons ([0047] Reference is now made to FIG. 1 which is a method for training a neural network having neurons with quantized activation functions for calculating quantized activation value connected by connections with quantized weight functions for calculating quantized weights, optionally binary [one-bit neurons]).

Claims 5, 6 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over El-Yaniv and Courbariaux as applied to claims 1, 5 and 15 respectively, in further view of DeFelice (US 20190236148 A1).

With respect to claims 5 and 16, El-Yaniv and Courbariaux fail to explicitly disclose, but DeFelice teaches, wherein the at least one hidden layer includes less neurons than an input layer of the deep neural network and an output layer of the deep neural network ([0087] For example, a typical VAE may have 1024 input nodes, then 512, 256, 128, 64, 32, and 16 in the hidden layers [hidden layers have fewer neurons] represented by nodes 1014. A final reduction leads to the compressed data representation 1030, which may have, for example, only 8 nodes in one embodiment. … and the output layer 1024 having an equal number of outputs to the input).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify El-Yaniv and Courbariaux in view of DeFelice so that the at least one hidden layer includes less neurons than an input layer of the deep neural network and an output layer of the deep neural network, in order to improve the correlation of different inputs to outputs (DeFelice, [0083]).

With respect to claim 6, El-Yaniv and Courbariaux fail to explicitly disclose, but DeFelice teaches, wherein the at least one hidden layer may include one of 32, 128, and 512 neurons ([0087] For example, a typical VAE may have 1024 input nodes, then 512, 256, 128, 64, 32, and 16 in the hidden layers [hidden layers have fewer neurons] represented by nodes 1014. A final reduction leads to the compressed data representation 1030, which may have, for example, only 8 nodes in one embodiment).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify El-Yaniv and Courbariaux in view of DeFelice so that the at least one hidden layer includes less neurons than an input layer of the deep neural network and an output layer of the deep neural network, in order to improve the correlation of different inputs to outputs (DeFelice, [0083]).
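
For illustration only, the layer sizes quoted from DeFelice [0087] can be compared against the limitations of claims 5, 6 and 16 with a short check. This is a sketch under stated assumptions: the variable names are hypothetical, and the output size is set equal to the input size based on the quoted statement that the output layer has "an equal number of outputs to the input".

```python
# Layer sizes as quoted from DeFelice [0087].
input_nodes = 1024
hidden_layers = [512, 256, 128, 64, 32, 16]
output_nodes = input_nodes  # assumption: output layer equals the input in size

# Claims 5 and 16: a hidden layer with fewer neurons than both the input
# layer and the output layer.
all_narrower = all(h < input_nodes and h < output_nodes for h in hidden_layers)

# Claim 6: a hidden layer with one of 32, 128, or 512 neurons.
claimed_sizes = [32, 128, 512]
matching = [size for size in claimed_sizes if size in hidden_layers]

print("every hidden layer is narrower than input and output:", all_narrower)
print("claimed sizes present among the hidden layers:", matching)
```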

Claims 7 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over El-Yaniv and Courbariaux as applied to claim 1, in further view of Chamberlain (US 20070299659 A1).
With respect to claims 7 and 17, El-Yaniv and Courbariaux fail to explicitly disclose, but Chamberlain teaches, wherein the received training data set includes audio data that has been previously scaled for precision ([0086] The codebook was trained using training data scaled by multiple levels to prevent sensitivity to speech input level. During the codebook training process, a new block of four energy values is created for every new frame so that energy transitions are represented in each of the four possible locations within the block. The resulting codebook is searched resulting in a codebook vector that minimizes mean squared error.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify El-Yaniv and Courbariaux in view of Chamberlain so that the received training data set includes audio data that has been previously scaled for precision, in order to improve speech quality (Chamberlain, [0061]).

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over El-Yaniv and Courbariaux as applied to claim 1, in further view of Fukuda (US 20190012594 A1).
With respect to claim 9, El-Yaniv and Courbariaux fail to explicitly disclose, but Fukuda teaches, receiving a test data set, the test data set including audio data with unseen noise ([0084] The test speech data including test data sets labeled with “clean” and “noisy” in the Aurora-4 were used.); and
evaluating the trained deep neural network using the received test data set ([0084] In the examples and the comparative examples, after the training of the neural network was completed, the neural network from the input layer to the output layer was stored as the acoustic model. The test speech data including test data sets labeled with “clean” and “noisy” in the Aurora-4 were used. ASR accuracy of the obtained speech recognition models was evaluated for the examples and the comparative examples by using several test data sets. WER (Word Error Rate) was utilized as ASR accuracy metric.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify El-Yaniv and Courbariaux in view of Fukuda in order to receive a test data set, the test data set including audio data with unseen noise, so as to improve recognition accuracy (Fukuda, [0087]).
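
For illustration only, the WER (Word Error Rate) metric named in Fukuda [0084] is conventionally computed as the word-level edit distance between a reference transcript and a recognized transcript, divided by the number of reference words. The sketch below shows that conventional definition and is not taken from Fukuda: the function name `word_error_rate` and the sample sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one deleted word ("the") and one inserted word ("now").
reference = "turn the volume down"
recognized = "turn volume down now"
wer = word_error_rate(reference, recognized)
print(f"WER = {wer:.2f}")
```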

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over El-Yaniv and Courbariaux as applied to claim 1, in further view of Yao (US 20200380357 A1).
With respect to claim 10, El-Yaniv and Courbariaux fail to explicitly disclose, but Yao teaches, wherein the steps are iteratively or concurrently performed to output a plurality of trained deep neural network models, each configured to output a predicted label of real-time audio data, wherein each constructed deep neural network comprises at least one of a different number of neurons, a different number of neuron bits, a different number of layers, and a different number of weight bits ([0178] Deep Neural Networks (DNNs) have demonstrated results in a variety of artificial intelligence fields, e.g., in computer vision using deep Convolutional Neural Networks (CNNs) and in speech recognition using deep Recurrent Neural Networks (RNNs) and [0182] In a second aspect the weight partition operation uses a proven measure to divide the weights in each layer of a pre-trained full-precision DNN model into two disjoint groups which play complementary roles in our INQ. The weights in the first group are quantized by a novel variable-length encoding method, forming a low-precision base for the original model. The weights in the other group are re-trained while keeping the quantized weights fixed, compensating for the accuracy loss resulted from the quantization).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify El-Yaniv and Courbariaux in view of Yao so that the steps are iteratively or concurrently performed to output a plurality of trained deep neural network models, each configured to output a predicted label of real-time audio data, wherein each constructed deep neural network comprises at least one of a different number of neurons, a different number of neuron bits, a different number of layers, and a different number of weight bits, in order to improve training speed (Yao, [0135]).

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over El-Yaniv and Courbariaux as applied to claim 1, in further view of Stoltze (US 20190313187 A1).
With respect to claim 11, El-Yaniv and Courbariaux fail to explicitly disclose, but Stoltze teaches, receiving audio data, wherein audio data includes one or more audio frames represented in one of a time domain or a conversion domain ([0025] The neural network 370 in FIG. 3 is a mathematical model made up of a known number of nodes and layers used to determine whether a current audio frame is human voice or not.);
processing the audio data in segments using the trained deep neural network, wherein each segment includes one or more frames of the audio data ([0025] The neural network 370 in FIG. 3 is a mathematical model made up of a known number of nodes and layers used to determine whether a current audio frame is human voice or not… The input to the neural network 370 comprises of features 310 to 365 extracted from the audio frame and the output of the final layer are two values representing human voice and noise probabilities).  
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify El-Yaniv and Courbariaux in view of Stoltze such that audio data includes one or more audio frames represented in one of a time domain or a conversion domain, in order to improve the quality of audio signals captured by microphones (Stoltze, [0003]).


Allowable Subject Matter
Claims 8, 18 and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all the limitations of the base claim and any intervening claims.
Claims 8, 18 and 20 recite “assigning a first representative bit for each input value of audio data of the training data set based on a sign of each input value; calculating an average distance from a predetermined reference value; assigning a second representative bit for each input value of audio data of the training data set based on the calculated average distance; calculating a second average distance based on the calculated average distance; and assigning, for each input value, an approximate value as a reference of each section of the assigned first representative bit and second representative bit based on the second average distance.” The closest teachings come from Tseng (US 20140112493 A1), which teaches “[0043] Because digital audio input S is quantized as N-bit numeric values, including a 1-bit sign, a J-bit MSB part and a K-bit LSB part, wherein N=J+K,” and Kim (US 20180374477 A1), which teaches “As will be described in more detail below in reference to FIG. 5, acoustic model 412 may be trained by a machine learning module (e.g., machine learning module 564) that uses a speech simulation module (e.g., speech simulation module 565) to create simulated audible sounds from a simulated user at one or more distances to determine acoustic features 415 that are associated with reference “Distance (M)” (e.g., an average of the one or more distances used by the speech simulation module).” Therefore, Tseng teaches assigning a bit based on the sign, and Kim teaches finding an average distance from a predetermined value. However, neither Tseng, Kim, nor any other cited reference teaches assigning a first bit based on sign, calculating an average distance based on each input value compared to a reference, assigning a second bit based on a second average distance that is a function of the first calculated average distance, and assigning, for each input, an approximate value as a reference.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ATHAR N PASHA, whose telephone number is (408)918-7675. The examiner can normally be reached Monday-Thursday and alternate Fridays, 7:30-4:30 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.   Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ATHAR N PASHA/Examiner, Art Unit 2657     

                                                                                                                                                                                                  
/DANIEL C WASHBURN/Supervisory Patent Examiner, Art Unit 2657