DETAILED ACTION
This communication is in response to the Amendments and Arguments filed on 06/16/2022.
Claims 1-18 are pending and have been examined.
All previous objections/rejections not mentioned in this Office Action have been withdrawn by the examiner. 
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments with respect to claim(s) 1 and 9 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Please see new mappings with respect to cited art Moritz for further detail.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 4, 6, 9, 12, 14, 17, and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (U.S. PG Pub No. 2017/0061978), hereinafter Wang, in view of Wang et al. (U.S. PG Pub No. 2015/0066499), hereinafter Wang 2, and further in view of Moritz et al. (“Integration of Optimized Modulation Filter Sets Into Deep Neural Networks for Automatic Speech Recognition”, IEEE, 2016), hereinafter Moritz.

Regarding claims 1 and 9, Wang teaches
(claim 1) A signal processing method for generating stimulation signals for a hearing implant implanted in a patient, the method comprising (a method to improve speech intelligibility for a hearing aid [0005]):
(claim 9) A signal processing system for generating stimulation signals for a hearing implant implanted in a patient, the system comprising (a system to improve speech intelligibility for a hearing aid [0005]):
(claim 9) an audio scene classifier comprising a multi-layer neural network... (a DNN classifies whether different sub-bands are either speech- or noise-dominant based on extracted features from an input audio signal, i.e. an audio scene classifier [0028-9],[0032], where the DNN has more than one layer, i.e. a multi-layer neural network [0031]):
classifying an audio input signal from an audio scene with a multi-layer neural network (a DNN classifies whether different sub-bands are either speech- or noise-dominant based on extracted features from an input audio signal, i.e. classifying an audio input signal from an audio scene [0028-9],[0032], where the DNN has more than one layer, i.e. a multi-layer neural network [0031]), the classifying comprising:
a) pre-processing the audio input signal ... using initial classification parameters to produce an initial signal classification (features are extracted from an input audio signal, i.e. pre-processing the audio input signal, including spectral patterns, amplitude, MFCCs, and spectral perceptual linear predictions, i.e. using initial classification parameters, on a per-frame basis, and the extracted features are concatenated to form a vector representing the temporal, spectral, and cepstral characteristics at each frame, i.e. produce an initial signal classification [0028]), and 
b) processing the initial signal classification with a scene classifier neural network using scene classification parameters to produce an audio scene classification output (the extracted features are inputs, i.e. processing the initial signal classification, into a single DNN, i.e. with a scene classifier neural network, with weights resulting from training, i.e. using scene classification parameters, to classify whether different sub-bands are either speech- or noise-dominant, i.e. produce an audio scene classification output [0029],[0032]), 
wherein the initial classification parameters reflect neural network training based on a first set of initial audio training data, and the scene classification parameters reflect neural network training on a second set of classification audio training data separate and different from --other sets of-- training data,... (the DNN classifier is trained using extracted features that are taken from noisy environments, i.e. neural network training on a second set of classification audio training data separate and different from other sets of training data, which results in weights of the DNN, i.e. scene classification parameters [0032]);
processing the audio input signal and the audio scene classification output with a hearing implant signal processor for generating the stimulation signals (the outputs of the DNN classifier constitute an estimated IBM, i.e. the audio scene classification output, which is combined with the speech, i.e. processing the audio input signal, to produce a resynthesized, enhanced speech signal, which is then played, i.e. generating the stimulation signals [0033], such as in a cochlear implant hearing aid system capable of signal processing, i.e. a hearing implant signal processor [0020],[0037]).  
While Wang provides the extraction of features from an input audio signal, Wang does not specifically teach that the extraction is performed by a trained neural network, and thus does not teach
a pre-processing neural network;
wherein the initial classification parameters reflect neural network training based on a first set of initial audio training data,... second set of classification audio training data separate and different from the first set of initial audio training data, wherein the pre-processing neural network optimizes meta-parameters without explicit training of weights via back propagation, and wherein the classifier neural network includes a back propagation procedure.
Wang 2, however, teaches a pre-processing neural network (received sound is divided into time-frequency units, and a deep neural network extracts the features of the time-frequency unit, i.e. pre-processing neural network [0010:1-12]);
wherein the initial classification parameters reflect neural network training based on a first set of initial audio training data,... second set of classification audio training data separate and different from the first set of initial audio training data (the deep neural network is trained with a plurality of training noises, i.e. reflect neural network training based on a first set of initial audio training data [0010], where the training sounds include utterances of speech and background noises [0027]), ...wherein the classifier neural network includes a back propagation procedure (the DNN classifier, i.e. classifier neural network, is trained via a backpropagation algorithm, i.e. includes a back propagation procedure [0031-2]).
Where Wang teaches that the DNN classifier is trained using extracted features, i.e. second set of classification audio training data separate and different from the first set of initial audio training data [0032].
Wang and Wang 2 are analogous art because they are from a similar field of endeavor in processing sounds for use in hearing aids and cochlear implants. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the extraction of features from an input audio signal teachings of Wang with the extraction being performed by a deep neural network, with the DNN classifier further being trained using backpropagation as taught by Wang 2. It would have been obvious to combine the references to enable the separation of speech from background noises (Wang 2 [0010]).
While Wang in view of Wang 2 provides a trained feature extraction neural network, Wang in view of Wang 2 does not specifically teach that the feature extraction neural network is trained without the use of backpropagation, and thus does not teach
wherein the pre-processing neural network optimizes meta-parameters without explicit training of weights via back propagation.
Moritz, however, teaches wherein the pre-processing neural network optimizes meta-parameters without explicit training of weights via back propagation (a feature extraction method using a DNN, i.e. pre-processing neural network, where an amplitude modulation filter bank is fused with the DNN, and the AMFB establishes a relationship between the center frequency and bandwidth of the filters, i.e. meta-parameters, that can be adjusted to the data during training of the DNN, i.e. optimizes meta-parameters, including transforming with the weights and biases of the first layer of the DNN, where training can be performed similarly to greedy layer-wise supervised training or stochastic gradient descent, i.e. without explicit training of weights via back propagation (Sec.I, para. 11-14),(Sec.IIC, para. 1, lines 1-3), (Sec.IIIB(3), lines 1-5),(Sec.IIIE, para. 1 and 3)).
Wang, Wang 2, and Moritz are analogous art because they are from a similar field of endeavor in processing human speech. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the trained feature extraction neural network teachings of Wang, as modified by Wang 2, with a DNN trained without backpropagation that is fused with an AMFB as taught by Moritz. It would have been obvious to combine the references to utilize an AMFB to provide substantial improvements to other existing feature extraction methods (Moritz (Sec.I, para. 12)).

Regarding claims 4 and 12, Wang in view of Wang 2 and Moritz teaches claims 1 and 9, and Wang further teaches
the pre-processing neural network includes an envelope processing block configured for calculating sub-band signal envelopes for the audio input signal (to extract features for each frame, i.e. pre-processing, of the received speech, i.e. audio input signal, the speech signal is divided into frames, and is processed to produce a signal representing the envelope of the signal, i.e. envelope processing block configured for calculating ... signal envelopes [0043],[0049], and where a T-F unit is a signal within a frame and sub-band that is later classified by the DNN utilizing the extracted features, i.e. sub-band signal envelopes [0023:6-10],[0041],[0043]).
And Wang 2 teaches that the feature extraction is performed by a DNN [0010:1-12].
Where the motivation to combine is the same as previously presented.

Regarding claims 6 and 14, Wang in view of Wang 2 and Moritz teaches claims 1 and 9, and Wang further teaches 
the initial signal classification is a multi-dimensional feature vector (features are extracted from an input audio signal including spectral patterns, amplitude, MFCCs, and spectral perceptual linear predictions on a per-frame basis, and the extracted features are concatenated to form a 177-dimensional vector, i.e. multi-dimensional feature vector, representing the temporal, spectral, and cepstral characteristics at each frame, i.e. initial signal classification [0028],[0052]).  

Regarding claims 17 and 18, Wang in view of Wang 2 and Moritz teaches claims 1 and 9, and Moritz further teaches
the meta-parameters include filter bandwidths (an amplitude modulation filter bank is fused with the DNN, and the AMFB establishes a relationship between the center frequency and bandwidth of the filters, i.e. meta-parameters include filter bandwidths, that can be adjusted to the data during training of the DNN (Sec.I, para. 13-14),(Sec.IIC, para. 1, lines 1-3),(Sec.IIIE, para. 1 and 3)).  
Where the motivation to combine is the same as previously presented.

Claim(s) 2, 5, 7, 10, 13, and 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang, in view of Wang 2, in view of Moritz, and further in view of Zhao et al. (“Recurrent Convolutional Neural Network for Speech Processing”, IEEE, 2017), hereinafter Zhao.

Regarding claims 2 and 10, Wang in view of Wang 2 and Moritz teaches claims 1 and 9.
While Wang in view of Wang 2 and Moritz provides the use of a deep neural network to extract features from a signal, Wang in view of Wang 2 and Moritz does not specifically teach that the network includes recurrent convolutional layers, and thus does not teach
the pre-processing neural network includes successive recurrent convolutional layers.  
Zhao, however, teaches the pre-processing neural network includes successive recurrent convolutional layers (an RCNN is used for feature extraction in speech, i.e. pre-processing neural network (Sec. 4, para. 1), where the core model inside the RCNN is the recurrent convolutional layer, and several RCLs are stacked to construct a deep RCNN, i.e. successive recurrent convolutional layers (sec. 3, para. 1,6)).  
Wang, Wang 2, Moritz, and Zhao are analogous art because they are from a similar field of endeavor in processing speech information. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the use of a deep neural network to extract features from a signal teachings of Wang, as modified by Wang 2 and Moritz, with the use of a deep RCNN constructed of stacked RCLs as taught by Zhao. It would have been obvious to combine the references to enable a model to efficiently capture both temporal and frequency dependence in speech by combining the merits of a recurrent neural network and a convolutional neural network (Zhao (Abstract)).

Regarding claims 5 and 13, Wang in view of Wang 2 and Moritz teaches claims 1 and 9, and Wang further teaches
the pre-processing neural network ...configured for signal decimation within the pre-processing neural network (to extract features, the speech signal is full-wave rectified and then decimated, i.e. pre-processing includes signal decimation within the pre-processing [0049]).  
Where Wang 2 specifically teaches that the feature extraction is performed by a neural network [0010:1-12]. 
While Wang in view of Wang 2 and Moritz provides signal decimation for feature extraction by a neural network, Wang in view of Wang 2 and Moritz does not specifically teach the use of a pooling layer, and thus does not teach
the pre-processing neural network includes a pooling layer configured for signal decimation within the pre-processing neural network.  
Zhao, however, teaches the pre-processing neural network includes a pooling layer...(the deep RCNN used for feature extraction, i.e. pre-processing neural network, can be constructed by stacking several RCLs and interleaving pooling layers, i.e. includes a pooling layer (sec.3 para.6),(sec.4 para.1)).  
Wang, Wang 2, Moritz, and Zhao are analogous art because they are from a similar field of endeavor in processing speech information. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the use of signal decimation for feature extraction by a neural network teachings of Wang, as modified by Wang 2 and Moritz, with the use of a deep RCNN constructed of stacked RCLs and interleaved pooling layers as taught by Zhao. It would have been obvious to combine the references to enable a model to efficiently capture both temporal and frequency dependence in speech by combining the merits of a recurrent neural network and a convolutional neural network (Zhao (Abstract)).

Regarding claims 7 and 15, Wang in view of Wang 2 and Moritz teaches claims 1 and 9.
While Wang in view of Wang 2 and Moritz provides the use of a DNN to classify audio as either signal or noise, Wang in view of Wang 2 and Moritz does not specifically teach whether the DNN layers are fully connected, and thus does not teach
the scene classifier neural network comprises a fully connected neural network layer.  
Zhao, however, teaches the scene classifier neural network comprises a fully connected neural network layer (performance of a CNN can be boosted by adding several fully connected layers, i.e. neural network comprises a fully connected neural network layer (sec.3 para.6)).  
Where Wang specifically teaches that the DNN is used to classify the audio [0028-9],[0032].
Wang, Wang 2, Moritz, and Zhao are analogous art because they are from a similar field of endeavor in processing speech information. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the use of a DNN to classify audio as either signal or noise teachings of Wang, as modified by Wang 2 and Moritz, with the use of fully connected layers into a CNN as taught by Zhao. It would have been obvious to combine the references to boost the performance of a convolutional neural network (Zhao (sec.3 para.6)).

Claim(s) 3 and 11 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang, in view of Wang 2, in view of Moritz, in view of Zhao, and further in view of Phan et al. (“Improved Audio Scene Classification Based on Label-Tree Embeddings and Convolutional Neural Networks”, IEEE, 2017), hereinafter Phan.

Regarding claims 3 and 11, Wang in view of Wang 2, Moritz, and Zhao teaches claims 2 and 10, and Zhao further teaches
the recurrent convolutional layers are implemented as ... filter banks (the RCNN is constructed of stacked RCLs, i.e. recurrent convolutional layers (sec.3 para 6),(sec.4 para. 1), and used for feature extraction, which begins with the use of 40 filter-banks, i.e. implemented as filter banks (sec 4.1.2 para. 1)).   
While Wang in view of Wang 2, Moritz, and Zhao provides the use of an RCL and filter banks, Wang in view of Wang 2, Moritz, and Zhao does not specifically teach that the filter banks are recursive, and thus does not teach
recursive filter banks.
Phan, however, teaches recursive filter banks (amplitude modulation spectrum features are obtained by two-stage recursive filter banks, where the CNN is used for feature extraction (sec. I para 4),(sec. II para 2)). 
Wang, Wang 2, Moritz, Zhao, and Phan are analogous art because they are from a similar field of endeavor in processing speech information. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the use of an RCL and filter banks teachings of Wang, as modified by Wang 2, Moritz, and Zhao, with the specific use of recursive filter banks as taught by Phan. It would have been obvious to combine the references to enable proper representation of audio features in an acoustic scene for scene recognition (Phan (sec.I para.2)).

Claim(s) 8 and 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang, in view of Wang 2, in view of Moritz, and further in view of Phan.

Regarding claims 8 and 16, Wang in view of Wang 2 and Moritz teaches claims 1 and 9.
While Wang in view of Wang 2 and Moritz provides the use of a DNN to classify audio as either signal or noise, Wang in view of Wang 2 and Moritz does not specifically teach that the classifier includes a linear discriminant analysis classifier, and thus does not teach
the scene classifier neural network comprises a linear discriminant analysis (LDA) classifier.  
Phan, however, teaches the scene classifier neural network comprises a linear discriminant analysis (LDA) classifier (after the feature extraction step, the classification is accomplished by classifiers such as Linear Discriminant Analysis (sec.II para 2)).  
Where Wang specifically teaches that the DNN is used to classify the audio [0028-9],[0032].
Wang, Wang 2, Moritz, and Phan are analogous art because they are from a similar field of endeavor in classifying audio scenes. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the use of a DNN to classify audio as either signal or noise teachings of Wang, as modified by Wang 2 and Moritz, with the specific use of an LDA classifier as taught by Phan. It would have been obvious to combine the references to enable proper representation of audio features in an acoustic scene for scene recognition (Phan (sec.I para.2)).
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICOLE A K SCHMIEDER whose telephone number is (571)270-1474. The examiner can normally be reached 8:00 - 5:00 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached at (571) 272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/NICOLE A K SCHMIEDER/ Examiner, Art Unit 2659
/PIERRE LOUIS DESIR/ Supervisory Patent Examiner, Art Unit 2659