DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 01/25/2021 and 04/08/2022 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Drawings
The drawings are objected to because of the following informalities:
Fig. 1 - element 107 is not in the specification
Fig. 7 - element 707 is not in the specification
Fig. 8 - element 803 is not in the specification
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Claim Objections
Claims 1 and 9 are objected to because of the following informalities: In limitations a)/c) the claims describe the pre-processing neural network as producing “an initial signal classification”; however, in limitations b)/d) the claims say that “the initial scene classification” is processed. In the interest of compact prosecution, the Examiner is interpreting “the initial scene classification” to be --the initial signal classification--. It is suggested that “the initial scene classification” be amended to --the initial signal classification--. Appropriate correction is required.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 4, 6, 9, 12, and 14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (U.S. PG Pub No. 2017/0061978), hereinafter Wang, in view of Wang et al. (U.S. PG Pub No. 2015/0066499), hereinafter Wang 2.

Regarding claims 1 and 9, Wang teaches
(claim 1) A signal processing method for generating stimulation signals for a hearing implant implanted in a patient, the method comprising (a method to improve speech intelligibility for a hearing aid [0005]):
(claim 9) A signal processing system for generating stimulation signals for a hearing implant implanted in a patient, the system comprising (a system to improve speech intelligibility for a hearing aid [0005]):
(claim 9) an audio scene classifier comprising a multi-layer neural network... (a DNN classifies whether different sub-bands are either speech- or noise-dominant based on extracted features from an input audio signal, i.e. an audio scene classifier [0028-9],[0032], where the DNN has more than one layer, i.e. a multi-layer neural network [0031]):

classifying an audio input signal from an audio scene with a multi-layer neural network (a DNN classifies whether different sub-bands are either speech- or noise-dominant based on extracted features from an input audio signal, i.e. classifying an audio input signal from an audio scene [0028-9],[0032], where the DNN has more than one layer, i.e. a multi-layer neural network [0031]), the classifying comprising:
a) pre-processing the audio input signal ... using initial classification parameters to produce an initial signal classification (features are extracted from an input audio signal, i.e. pre-processing the audio input signal, including spectral patterns, amplitude, MFCCs, and spectral perceptual linear predictions, i.e. using initial classification parameters, on a per-frame basis, and the extracted features are concatenated to form a vector representing the temporal, spectral, and cepstral characteristics at each frame, i.e. produce an initial signal classification [0028]), and 
b) processing the initial --signal-- classification with a scene classifier neural network using scene classification parameters to produce an audio scene classification output (the extracted features are inputs, i.e. processing the initial signal classification, into a single DNN, i.e. with a scene classifier neural network, with weights resulting from training, i.e. using scene classification parameters, to classify whether different sub-bands are either speech- or noise-dominant, i.e. produce an audio scene classification output [0029],[0032]), 
wherein the initial classification parameters reflect neural network training based on a first set of initial audio training data, and the scene classification parameters reflect neural network training on a second set of classification audio training data separate and different from --other sets of-- training data (the DNN classifier is trained using extracted features that are taken from noisy environments, i.e. neural network training on a second set of classification audio training data separate and different from other sets of training data, which results in weights of the DNN, i.e. scene classification parameters [0032]);
processing the audio input signal and the audio scene classification output with a hearing implant signal processor for generating the stimulation signals (the outputs of the DNN classifier constitute an estimated IBM, i.e. the audio scene classification output, which is combined with the speech, i.e. processing the audio input signal, to produce a resynthesized, enhanced speech signal, which is then played, i.e. generating the stimulation signals [0033], such as in a cochlear implant hearing aid system capable of signal processing, i.e. a hearing implant signal processor [0020],[0037]).  
While Wang provides the extraction of features from an input audio signal, Wang does not specifically teach that the extraction is performed by a trained neural network, and thus does not teach
a pre-processing neural network;
wherein the initial classification parameters reflect neural network training based on a first set of initial audio training data,... second set of classification audio training data separate and different from the first set of initial audio training data.
Wang 2, however, teaches a pre-processing neural network (received sound is divided into time-frequency units, and a deep neural network extracts the features of the time-frequency unit, i.e. pre-processing neural network [0010:1-12]);
wherein the initial classification parameters reflect neural network training based on a first set of initial audio training data,... second set of classification audio training data separate and different from the first set of initial audio training data (the deep neural network is trained with a plurality of training noises, i.e. reflect neural network training based on a first set of initial audio training data [0010], where the training sounds include utterances of speech and background noises [0027]).
Where Wang teaches that the DNN classifier is trained using extracted features, i.e. second set of classification audio training data separate and different from the first set of initial audio training data [0032].
Wang and Wang 2 are analogous art because they are from a similar field of endeavor in processing sounds for use in hearing aids and cochlear implants. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the extraction of features from an input audio signal, as taught by Wang, so that the extraction is performed by a deep neural network as taught by Wang 2. The motivation to do so would have been to achieve a predictable result of enabling the separation of speech from background noises (Wang 2 [0010]).

Regarding claims 4 and 12, Wang in view of Wang 2 teaches claims 1 and 9, and Wang further teaches
the pre-processing neural network includes an envelope processing block configured for calculating sub-band signal envelopes for the audio input signal (to extract features for each frame, i.e. pre-processing, of the received speech, i.e. audio input signal, the speech signal is divided into frames, and is processed to produce a signal representing the envelope of the signal, i.e. envelope processing block configured for calculating ... signal envelopes [0043],[0049], and where a T-F unit is a signal within a frame and sub-band that is later classified by the DNN utilizing the extracted features, i.e. sub-band signal envelopes [0023:6-10],[0041],[0043]).
And Wang 2 teaches that the feature extraction is performed by a DNN [0010:1-12].
Where the motivation to combine is the same as previously presented.

Regarding claims 6 and 14, Wang in view of Wang 2 teaches claims 1 and 9, and Wang further teaches
the initial signal classification is a multi-dimensional feature vector (features are extracted from an input audio signal including spectral patterns, amplitude, MFCCs, and spectral perceptual linear predictions on a per-frame basis, and the extracted features are concatenated to form a 177-dimensional vector, i.e. multi-dimensional feature vector, representing the temporal, spectral, and cepstral characteristics at each frame, i.e. initial signal classification [0028],[0052]).  

Claim(s) 2, 5, 7, 10, 13, and 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang, in view of Wang 2, and further in view of Zhao et al. (“Recurrent Convolutional Neural Network for Speech Processing”, IEEE, 2017), hereinafter Zhao.

Regarding claims 2 and 10, Wang in view of Wang 2 teaches claims 1 and 9.
While Wang in view of Wang 2 provides the use of a deep neural network to extract features from a signal, Wang in view of Wang 2 does not specifically teach that the network includes recurrent convolutional layers, and thus does not teach
the pre-processing neural network includes successive recurrent convolutional layers.  
Zhao, however, teaches the pre-processing neural network includes successive recurrent convolutional layers (an RCNN is used for feature extraction in speech, i.e. pre-processing neural network (Sec. 4, para. 1), where the core model inside the RCNN is the recurrent convolutional layer, and several RCLs are stacked to construct a deep RCNN, i.e. successive recurrent convolutional layers (sec. 3, para. 1,6)).  
Wang, Wang 2, and Zhao are analogous art because they are from a similar field of endeavor in processing speech information. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the use of a deep neural network to extract features from a signal, as taught by Wang and modified by Wang 2, with the use of a deep RCNN constructed of stacked RCLs as taught by Zhao. The motivation to do so would have been to achieve a predictable result of enabling a model to efficiently capture both temporal and frequency dependence in speech by combining the merits of a recurrent neural network and a convolutional neural network (Zhao (Abstract)).

Regarding claims 5 and 13, Wang in view of Wang 2 teaches claims 1 and 9, and Wang further teaches
the pre-processing neural network ...configured for signal decimation within the pre-processing neural network (to extract features, the speech signal is full-wave rectified and then decimated, i.e. pre-processing includes signal decimation within the pre-processing [0049]).  
Where Wang 2 specifically teaches that the feature extraction is performed by a neural network [0010:1-12]. 
While Wang in view of Wang 2 provides signal decimation for feature extraction by a neural network, Wang in view of Wang 2 does not specifically teach the use of a pooling layer, and thus does not teach
the pre-processing neural network includes a pooling layer configured for signal decimation within the pre-processing neural network.  
Zhao, however, teaches the pre-processing neural network includes a pooling layer...(the deep RCNN used for feature extraction, i.e. pre-processing neural network, can be constructed by stacking several RCLs and interleaving pooling layers, i.e. includes a pooling layer (sec.3 para.6),(sec.4 para.1)).  
Wang, Wang 2, and Zhao are analogous art because they are from a similar field of endeavor in processing speech information. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the use of signal decimation for feature extraction by a neural network, as taught by Wang and modified by Wang 2, with the use of a deep RCNN constructed of stacked RCLs and interleaved pooling layers as taught by Zhao. The motivation to do so would have been to achieve a predictable result of enabling a model to efficiently capture both temporal and frequency dependence in speech by combining the merits of a recurrent neural network and a convolutional neural network (Zhao (Abstract)).

Regarding claims 7 and 15, Wang in view of Wang 2 teaches claims 1 and 9.
While Wang in view of Wang 2 provides the use of a DNN to classify audio as either signal or noise, Wang in view of Wang 2 does not specifically teach whether the DNN layers are fully connected, and thus does not teach
the scene classifier neural network comprises a fully connected neural network layer.  
Zhao, however, teaches the scene classifier neural network comprises a fully connected neural network layer (performance of a CNN can be boosted by adding several fully connected layers, i.e. neural network comprises a fully connected neural network layer (sec.3 para.6)).  
Where Wang specifically teaches that the DNN is used to classify the audio [0028-9],[0032].
Wang, Wang 2, and Zhao are analogous art because they are from a similar field of endeavor in processing speech information. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the use of a DNN to classify audio as either signal or noise, as taught by Wang and modified by Wang 2, by adding fully connected layers to a CNN as taught by Zhao. The motivation to do so would have been to achieve a predictable result of boosting the performance of a convolutional neural network (Zhao (sec.3 para.6)).
Claim(s) 3 and 11 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang, in view of Wang 2, in view of Zhao, and further in view of Phan et al. (“Improved Audio Scene Classification Based on Label-Tree Embeddings and Convolutional Neural Networks”, IEEE, 2017), hereinafter Phan.

Regarding claims 3 and 11, Wang in view of Wang 2 and Zhao teaches claims 2 and 10, and Zhao further teaches
the recurrent convolutional layers are implemented as ... filter banks (the RCNN is constructed of stacked RCLs, i.e. recurrent convolutional layers (sec.3 para 6),(sec.4 para. 1), and used for feature extraction, which begins with the use of 40 filter-banks, i.e. implemented as filter banks (sec 4.1.2 para. 1)).   
While Wang in view of Wang 2 and Zhao provides the use of an RCL and filter banks, Wang in view of Wang 2 and Zhao does not specifically teach that the filter banks are recursive, and thus does not teach
recursive filter banks.
Phan, however, teaches recursive filter banks (amplitude modulation spectrum features are obtained by two-stage recursive filter banks, where the CNN is used for feature extraction (sec. I para 4),(sec. II para 2)). 
Wang, Wang 2, Zhao, and Phan are analogous art because they are from a similar field of endeavor in processing speech information. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the use of an RCL and filter banks, as taught by Wang and modified by Wang 2 and Zhao, with the specific use of recursive filter banks as taught by Phan. The motivation to do so would have been to achieve a predictable result of properly representing audio features in an acoustic scene for scene recognition (Phan (sec.I para.2)).

Claim(s) 8 and 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang, in view of Wang 2, and further in view of Phan.

Regarding claims 8 and 16, Wang in view of Wang 2 teaches claims 1 and 9.
While Wang in view of Wang 2 provides the use of a DNN to classify audio as either signal or noise, Wang in view of Wang 2 does not specifically teach that the classifier includes a linear discriminant analysis classifier, and thus does not teach
the scene classifier neural network comprises a linear discriminant analysis (LDA) classifier.  
Phan, however, teaches the scene classifier neural network comprises a linear discriminant analysis (LDA) classifier (after the feature extraction step, the classification is accomplished by classifiers such as Linear Discriminant Analysis (sec.II para 2)).  
Where Wang specifically teaches that the DNN is used to classify the audio [0028-9],[0032].
Wang, Wang 2, and Phan are analogous art because they are from a similar field of endeavor in classifying audio scenes. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the use of a DNN to classify audio as either signal or noise, as taught by Wang and modified by Wang 2, with the specific use of an LDA classifier as taught by Phan. The motivation to do so would have been to achieve a predictable result of properly representing audio features in an acoustic scene for scene recognition (Phan (sec.I para.2)).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: 
Aboulnasr et al. (U.S. PG Pub No. 2011/0123056): Adaptive classification system with feature extraction for hearing aids.
	
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICOLE A K SCHMIEDER whose telephone number is (571)270-1474. The examiner can normally be reached 8:00 - 5:00 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached at (571) 272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/NICOLE A K SCHMIEDER/Examiner, Art Unit 2659 

/PIERRE LOUIS DESIR/Supervisory Patent Examiner, Art Unit 2659