DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1, 3, 7, 9, 13, 15, 19-25, and 27 are amended.
Claims 1-30 are pending.

Allowable Subject Matter
Claims 6, 12, 18, 24, and 30 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claim(s) 1, 7, 13, 19 and 25 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Korjani (US Patent #10614827).

Regarding Claim 1, Korjani discloses a processor, comprising:
one or more circuits to use one or more neural networks to identify a noise signal in one or more speech signals (Korjani's abstract discloses that the neural network is trained to isolate various types of noise from the user speech in the speech data and then subtract the noise from the speech data, thus leaving only the user speech free of noise. This filtering is dynamically performed on a frame-by-frame basis from each frame of the speech data, thereby making it possible to specifically identify and remove different types and levels of noise in each frame. Col. 1, lines 55-56 disclose that the neural network is trained to isolate [identify] various types of noise from the user speech in the speech data. Col. 1, lines 65-67 disclose that the neural network is trained only to isolate noise, and the noise profile that is estimated for each frame consists of the noise estimate alone. Col. 2, lines 28-30 disclose that the DNN is trained to recognize and isolate the noise in any speech data so that the noise can be dynamically removed in a later stage of processing. Col. 2, lines 41-44 disclose that the DNN is first trained to recognize noise data in a first stage shown at the left and then utilized to isolate noise in speech data in a second stage shown at the right).

Claims 7, 13, 19 and 25 are rejected for the same reasons as set forth in Claim 1.

Claim(s) 1, 7, 13, 19 and 25 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kristjansson (US #2017/0092268).

Regarding Claim 1, Kristjansson (Fig. 1) discloses a processor, comprising:
one or more circuits to use one or more neural networks to identify a noise signal in one or more speech signals (Kristjansson's abstract discloses a way to improve the noise robustness of a speech recognition system by providing additional input to a neural network speech classifier. The additional information characterizes the noise environment of the speech. The speech separation system employs models for the speech and for the distractor or noise. The neural network is used to identify the most likely combinations of speech and noise. ¶0032 discloses that, to obtain the noise environment information 124, the feature vectors 112, or other feature vectors derived from the audio signal 112, can be analyzed by a noise environment module. The result of processing with the noise environment module can be a noise-vector).

Claims 7, 13, 19 and 25 are rejected for the same reasons as set forth in Claim 1.

Claim(s) 1, 7, 13, 19 and 25 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Tommy (US #2021/0118462).

Regarding Claim 1, Tommy (abstract: identify a plurality of noises; Figs. 1, 3) discloses a processor, comprising:
one or more circuits to use one or more neural networks to identify a noise signal in one or more speech signals (Tommy Figs. 1, 3: steps 108, 304: noise ...).

Claims 7, 13, 19 and 25 are rejected for the same reasons as set forth in Claim 1.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2-3, 8-9, 14-15, 20-21, and 26-27 are rejected under 35 U.S.C. 103 as being unpatentable over Korjani (US Patent #10614827) in view of Lee et al. (US PGPUB #2018/0190268).

Regarding Claim 2, Korjani discloses the processor of claim 1,
wherein the one or more circuits are further to extract one or more features from the one or more speech signals (Korjani col. 1, lines 45-50 discloses a feature extraction ...).
Korjani may not explicitly disclose wherein the one or more circuits are further to generate an audio spectrogram corresponding to one or more features extracted from the one or more speech signals.
However, Lee (Figs. 2, 9) teaches wherein the one or more circuits are further to generate an audio spectrogram corresponding to one or more features extracted from the one or more speech signals (Lee ¶0065 discloses the speech recognizing apparatus 110 obtains or generates a spectrogram from/of the speech signal and extracts a frequency feature of the speech signal from the spectrogram).
Korjani and Lee are analogous art as they pertain to enhancing speech using neural networks. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the speech enhancement system (as taught by Korjani) to generate a spectrogram of the speech signal and extract a frequency feature of the speech signal from the spectrogram (as taught by Lee, ¶0065) in order to provide speech recognition that serves the desire for convenience (Lee, ¶0003).
Regarding Claim 3, Korjani in view of Lee discloses the processor of claim 2. Korjani may not explicitly disclose wherein the one or more circuits are further to provide the audio spectrogram as input to the one or more neural networks, wherein the one or more neural networks generate an audio mask corresponding to the noise signal identified in the one or more speech signals.
However, Lee (Figs. 2, 9) teaches wherein the one or more circuits are further to provide the audio spectrogram as input to the one or more neural networks (Lee ¶0115 discloses Fig. 9 operation 910 the speech recognizing apparatus obtains a spectrogram of a speech frame. The speech recognizing apparatus generates the spectrogram by converting a speech signal to a signal of a frequency area through Fourier transform and extracts a feature of the speech signal from the spectrogram. Fig. 1, for example, can be applicable to the extracting of the features of the speech signal from the spectrogram. In operation 920, the speech recognizing apparatus determines one or more attention weights to be applied to, or with respect to, a speech frame),
wherein the one or more neural networks generate an audio mask corresponding to the noise signal identified in the one or more speech signals (Lee ¶0069 discloses the speech recognizing model implemented by the speech recognizing apparatus 110 configured as the neural network can dynamically implement spectral masking by receiving a feedback on a result calculated by the neural network at the previous time. When the spectral masking is performed, feature values for each frequency band can selectively not be used in full as originally determined/captured, but rather, a result of a respective adjusting of the magnitudes of all or select feature values for all or select frequency bands, e.g., according to the dynamically implemented spectral masking, can be used for or within speech recognition. Also, for example, such a spectral masking ...).
Korjani and Lee are analogous art as they pertain to enhancing speech using neural networks. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the speech enhancement system (as taught by Korjani) to generate a spectrogram of the speech signal and extract a frequency feature of the speech signal from the spectrogram (as taught by Lee, ¶0065) in order to provide speech recognition that serves the desire for convenience (Lee, ¶0003).

Claims 8-9, 14-15, 20-21, and 26-27 are rejected for the same reasons as set forth in Claims 2-3.
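
For illustration only, the limitations discussed for Claims 2-3 (generating an audio spectrogram from a speech signal and applying an audio mask to it) can be pictured with the following minimal Python sketch. It is not the implementation of Korjani, Lee, or the present application. The frame length, hop size, test tones, and all function names are assumptions chosen for the example, and the mask is written by hand where the claims recite a mask generated by one or more neural networks.

```python
import cmath
import math

def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames of frame_len samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2 of a Hann-windowed frame (naive DFT)."""
    n = len(frame)
    windowed = [x * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / n))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2 + 1)]

def spectrogram(samples, frame_len=32, hop=16):
    """Time-by-frequency magnitude spectrogram: one row per frame."""
    return [magnitude_spectrum(f) for f in frame_signal(samples, frame_len, hop)]

def apply_mask(spec, mask):
    """Multiply each time-frequency bin by its mask value (0 removes, 1 keeps)."""
    return [[m * s for s, m in zip(row, mask_row)]
            for row, mask_row in zip(spec, mask)]

# Hypothetical input: a tone at bin 4 stands in for speech, a weaker tone
# at bin 12 stands in for noise.
signal = [math.sin(2 * math.pi * 4 * n / 32) + 0.5 * math.sin(2 * math.pi * 12 * n / 32)
          for n in range(128)]
spec = spectrogram(signal)

# Hand-written stand-in for a network-generated mask: suppress bins 10..14.
mask = [[0.0 if 10 <= k <= 14 else 1.0 for k in range(len(row))] for row in spec]
masked = apply_mask(spec, mask)

# Strongest bin in the first frame (the "speech" tone).
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

In this sketch the mask has the same time-by-frequency shape as the spectrogram, so suppressing a noise component reduces to an element-wise multiplication.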

Claims 4, 10, 16, 22, and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Korjani (US Patent #10614827) in view of Lee et al. (US PGPUB #2018/0190268) further in view of Calle et al. (US PGPUB #2018/0358003).

Regarding Claim 4, Korjani in view of Lee discloses the processor of claim 3. Korjani may not explicitly disclose wherein the one or more neural networks include two parallel paths for determining patterns in the audio spectrogram, the two parallel paths including a first path with a sequence of convolutional layers and a second path with one or more gated recurrent unit (GRU) layers.
However, Lee (Figs. 2, 9) teaches wherein the one or more neural networks include two parallel paths for determining patterns in the audio spectrogram (Lee ¶0114 discloses operations of Fig. 9 can be performed in parallel or simultaneously).
Korjani and Lee are analogous art as they pertain to enhancing speech using neural networks. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the speech enhancement system (as taught by Korjani) to generate a spectrogram of the speech signal and extract a frequency feature of the speech signal from the spectrogram (as taught by Lee, ¶0065) in order to provide speech recognition that serves the desire for convenience (Lee, ¶0003).
And Calle teaches wherein the one or more neural networks include two parallel paths for determining patterns in the audio spectrogram (Calle Fig. 2: spectrogram as input to CNN, RNN; ¶0037 discloses RNNs can come in a variety of forms including GRU. The exemplary deep convolutional network 200 also includes multiple convolution blocks (e.g., C1 and C2). Each of the convolution blocks can be configured with a convolution layer [CONV], a normalization layer [LNorm], and a pooling layer [MAX POOL]. The convolution layers can include one or more convolutional filters. ¶0038 discloses the parallel filter banks, for example, of a deep convolutional network can be loaded on a CPU or GPU of an SOC, optionally based on an Advanced RISC Machine [ARM] instruction set, to achieve high performance and low power consumption),
the two parallel paths including a first path with a sequence of convolutional layers and a second path with one or more gated recurrent unit (GRU) layers (Calle ¶0037 discloses Fig. 2 shows the exemplary deep convolutional network 200 includes a preprocessing block. The preprocessing block has a waveform input. The preprocessing block includes a spectrogram block, convolutional neural network [CNN] block, recurrent neural network [RNN] block, and a decoding block. RNNs can come in a variety of forms including generic RNN, LSTM, and GRU, which can be designed with stable memory allowing association over long input sequences of indefinite lengths. ¶0047 discloses the ASR can contain various RNN layers including bi-direction RNN. Examples of specialized RNNs include LSTM [long short-term memory] units and GRU [gated recurrent units], which can further be configured to process incoming data front-to-back, or in the case of buffered data, both front-to-back and back-to-front, creating a so called bidirectional RNN networks that is known to improve accuracy).
Korjani, Lee, and Calle are analogous art as they pertain to enhancing speech using neural networks. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Korjani in view of Lee in light of the teachings of Calle to include any number of convolutional blocks [in parallel] in the deep convolutional network (as taught by Calle, ¶0037) to improve speech communication and speech interface quality using neural networks (Calle, ¶0001).

Claims 10, 16, 22, and 28 are rejected for the same reasons as set forth in Claim 4.

Claims 5, 11, 17, 23, and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Korjani (US Patent #10614827) in view of Lee et al. (US PGPUB #2018/0190268) further in view of Calle et al. (US PGPUB #2018/0358003) and further in view of Kim et al. (US #2021/0350796).

Regarding Claim 5, Korjani in view of Lee and Calle discloses the processor of claim 4, but may not explicitly disclose wherein the one or more circuits are further to concatenate the patterns determined by the two parallel paths and process those concatenated patterns using a sequence of GRU layers to identify important noise patterns in the one or more speech signals for use in generating the audio mask.
However, Kim (Figs. 1-5) teaches wherein the one or more circuits are further to concatenate the patterns determined by the two parallel paths and process those concatenated patterns using a sequence of GRU layers to identify important noise patterns in the one or more speech signals for use in generating the audio mask (Kim ¶0026 discloses [Figs. 1, 3] the first level of context aggregation in a densely connected convolutional and recurrent network [DCCRN] is done by a dilated 1D convolutional network component, with a DenseNet architecture [20], to extract the target speech from the noisy mixture in the time domain. It is followed by a compact gated recurrent unit [GRU] component [21] to further leverage the contextual information in the "many-to-one" fashion. ¶0027 discloses the speech processing represents that the hybrid architecture of dilated convolution neural network [CNN] and GRU in DCCRN consistently helps outperform the CNN variations with only one level of context aggregation. ¶0029 discloses here, the densely connected hybrid network can be a network in which a CNN and a recurrent neural network [RNN] are combined. ¶0030 discloses the densely connected hybrid network can include a plurality of dense blocks. And, each of the dense blocks can be composed of a plurality ...).
Korjani, Lee, Calle, and Kim are analogous art as they pertain to enhancing speech using neural networks. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Korjani in view of Lee and Calle in light of the teachings of Kim for each of the plurality of dense blocks included in the densely connected hybrid network to be represented as a repetitive neural network due to having the same convolutional layers with each other (as taught by Kim, ¶0033) since the responsiveness of a practical RNN system, such as LSTM, comes at the cost of increased model parameters, which are neither as easy to train nor resource-efficient (Kim, ¶0008).

Claims 11, 17, 23, and 29 are rejected for the same reasons as set forth in Claim 5.
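
For illustration only, the arrangement recited in Claims 4-5 (a convolutional path and a GRU path operating in parallel on the same spectrogram, with their outputs concatenated and processed by a sequence of GRU layers) can be pictured with the following minimal Python sketch. It is not the architecture of Calle, Kim, or the present application. All layer sizes, kernels, and names are assumptions, the weights are random and untrained, biases are omitted, and the final layer that would turn the features into an audio mask is left out.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(w, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

class GRU:
    """One GRU layer with untrained random weights (hypothetical, no biases)."""
    def __init__(self, rng, in_dim, hid_dim):
        self.hid_dim = hid_dim
        # Input (w) and recurrent (u) weights for update gate z, reset gate r,
        # and candidate state n.
        self.w = {g: rand_matrix(rng, hid_dim, in_dim) for g in "zrn"}
        self.u = {g: rand_matrix(rng, hid_dim, hid_dim) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.w["z"], x), matvec(self.u["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.w["r"], x), matvec(self.u["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.w["n"], x), matvec(self.u["n"], rh))]
        # Blend the previous state with the candidate state.
        return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def run(self, frames):
        """Process frames in time order and return the hidden state per frame."""
        h = [0.0] * self.hid_dim
        out = []
        for x in frames:
            h = self.step(x, h)
            out.append(h)
        return out

def conv1d_time(frames, kernel):
    """Zero-padded sliding-window filter along time for each frequency bin, then ReLU."""
    t, f = len(frames), len(frames[0])
    half = len(kernel) // 2
    out = []
    for i in range(t):
        row = []
        for k in range(f):
            acc = 0.0
            for j, w in enumerate(kernel):
                idx = i + j - half
                if 0 <= idx < t:
                    acc += w * frames[idx][k]
            row.append(max(0.0, acc))
        out.append(row)
    return out

rng = random.Random(0)
T, F, H = 6, 4, 3  # frames, frequency bins, GRU-path hidden size (all assumed)
spec = [[rng.random() for _ in range(F)] for _ in range(T)]  # placeholder spectrogram

# Path 1: a sequence of two convolutional layers.
conv_out = conv1d_time(conv1d_time(spec, [0.25, 0.5, 0.25]), [-0.5, 1.0, -0.5])
# Path 2: a GRU layer over the same spectrogram.
gru_out = GRU(rng, F, H).run(spec)
# Concatenate the two paths frame by frame (F + H features per frame).
merged = [c + g for c, g in zip(conv_out, gru_out)]
# Sequence of GRU layers over the concatenated features.
features = merged
for layer in [GRU(rng, F + H, 5), GRU(rng, 5, F)]:
    features = layer.run(features)
```

Both paths keep one output row per spectrogram frame, which is what allows their features to be concatenated frame by frame before the final GRU layers.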

Response to Arguments
Applicant’s arguments with respect to claim(s) 1-30 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.


Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YOGESHKUMAR G PATEL whose telephone number is (571)272-3957. The examiner can normally be reached 7:30 AM-4 PM PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc Nguyen can be reached on 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.






/YOGESHKUMAR PATEL/Primary Examiner, Art Unit 2651