DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments with respect to the claim(s) have been considered but are moot because the amendments necessitate new grounds of rejection.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 3, 5-9, 11, 13-15, 17, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Zad Issa et al. (hereinafter Zad, U.S. Patent Application Publication 2016/0005422) in view of Korjani et al. (US 10,614,827).
Regarding Claim 1, Zad discloses:
A method (e.g. operation of the architecture of Fig. 2 and the components of computing device of Fig. 3), comprising:
receiving a signal that includes noise (e.g. noisy signal input to window 202; Fig. 2; note processor 304 receives a noisy signal… the noisy signal includes a speech signal and a noise signal; para 13);
generating an estimate of a noise signal included in the received signal (e.g. block 208 the noise is estimated and tracked from the transformed noisy signal, and provides [“generates”] a noise estimate; para 25; a noise level is estimated based on the noise model associated with the identified user environment; para 13; and feature vectors (e.g., determined from the transformed noisy signal); para 36), 
wherein the estimate of the noise included in the received signal is generated by a model built using machine learning (the system uses a machine learning algorithm to characterize the user environments… enabling more accurate results in the estimation of the noise level; para 36; note also that the processor estimates a noise level “based on the noise model associated with the current user environment”; para 33. These citations teach the limitation “the estimate of the noise included in the received signal is generated by a model”. Zad further teaches that the noise model is “built using machine learning”: the classification data for each user environment includes a noise model associated therewith; para 31; and machine learning algorithms are used to characterize the user environment and to improve the quality of the classification data 310, which includes the noise model as noted above; para 36. Claim 14 of Zad also teaches using machine learning to define at least a portion of the classification data. Because the classification data includes the noise model, as noted in para 31, Zad teaches that the noise model is “built using machine learning”); 
using the estimate of the noise signal included in the received signal to remove at least part of the noise from the received signal (see also Fig. 2, where noise removal 210 is performed in response to noise estimation to generate an enhanced signal; e.g. the noise estimate provided by 208 is then used by block 210 for noise elimination, reduction or removal; para 25; see also para 34: because the noise reduction component 326, like the estimation component 324, considers the identified current user environment parameters (e.g., the noise model), the noise is removed or at least reduced from the noisy signal without affecting the subjective quality of the enhanced signal).
Zad does not explicitly detail the specifics of the training of the model: wherein the model is initially trained by comparing a sample of the noise to a training phase estimated noise signal generated by the model.
The analogous art Korjani teaches that a neural network is trained to isolate various types of noise from the user speech in the speech data and then subtract the noise from the speech data, thus leaving only the user speech free of noise (abstract). Korjani teaches wherein the model is initially trained by comparing a sample of the noise to a training phase estimated noise signal generated by the model (in Korjani, DNN 120 of Fig. 1 is trained to model and recognize noise in speech data; see Fig. 1; Col. 2, line 57 - Col. 3, line 5. During the training, the estimate of the noise profile 130 is represented in terms of spectral coefficients in the preferred embodiment. In parallel, the feature extractor 110 provides the spectral coefficients of the training noise 112 to a DNN training module 140. The training module 140 compares the training noise 112 with the estimate of the noise 130, estimates an error between the two profiles 112, 130, and then generates or otherwise modifies one or more link weights and/or thresholds 142 in the DNN 120 in order to minimize the error between the training noise and noise estimate. The process of training the DNN 120 is repeated on multiple samples/frames of noise 104 and speech 102 until the error observed is below a predetermined threshold. Examiner notes that this training phase teaches the claimed limitation, especially in light of Applicant’s disclosure and Figure 3A, which shows comparison of the estimated noise signal x_estimated generated by a model in generator 302 to a sample of the noise x).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to apply the features of Korjani to the system of Zad, to provide robust techniques to accurately represent and filter noise from speech data that operates independent of the amount of training speech data or language (Col. 1, lines 37-40 of Korjani).
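For illustration only, the training loop described in the cited passage of Korjani (compare the training noise with the noise estimate, compute an error, adjust the weights, and repeat until the error falls below a threshold) can be sketched as follows. This is a minimal sketch and not the implementation of Korjani or of Applicant: a single linear layer stands in for DNN 120, mean squared error stands in for the unspecified error measure, and the function name, learning rate, tolerance, and iteration cap are hypothetical.

```python
import numpy as np

def train_noise_estimator(mixed, noise, lr=0.1, tol=1e-4, max_iters=500):
    """Fit weights so that mixed @ weights approximates the training noise.

    mixed: (frames, coeffs) features of speech mixed with noise (assumed input).
    noise: (frames, coeffs) spectral coefficients of the training noise.
    A linear map is an assumed stand-in for the neural network.
    """
    weights = np.zeros((mixed.shape[1], noise.shape[1]))
    error = float("inf")
    for _ in range(max_iters):
        estimate = mixed @ weights            # training phase noise estimate
        residual = estimate - noise           # compare estimate to noise sample
        error = float(np.mean(residual ** 2))
        if error < tol:                       # stop once below the threshold
            break
        grad = 2.0 * mixed.T @ residual / residual.size
        weights -= lr * grad                  # adjust weights to reduce error
    return weights, error
```

The returned error is the last value measured by the comparison step; a linear map generally cannot separate speech from noise exactly, so the tolerance may never be reached and the iteration cap ends the loop.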

Regarding Claim 3, in addition to the elements stated above regarding claim 1, the combination further discloses:
Wherein the using the estimate of the noise signal included in the received signal to remove at least part of the noise from the received signal comprises using the estimate of the noise signal included in the received signal in a noise cancellation process (e.g. the noise estimate provided by 208 is then used by block 210 for noise elimination, reduction or removal; para 25 of Zad; further note the noise is removed or reduced from the noisy signal; para 34 of Zad).
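As a hedged illustration of how a noise estimate can be used to remove noise from a received signal, the sketch below applies magnitude spectral subtraction. This technique is an assumption chosen for illustration; the cited paragraphs of Zad are not quoted here as naming it, and the function name and the floor parameter are hypothetical.

```python
import numpy as np

def subtract_noise(noisy_mag, noise_mag, floor=0.0):
    """Subtract an estimated noise magnitude spectrum from a noisy one.

    Values are clamped at `floor` so the result is never negative
    when the estimate exceeds the observed magnitude.
    """
    noisy_mag = np.asarray(noisy_mag, dtype=float)
    noise_mag = np.asarray(noise_mag, dtype=float)
    return np.maximum(noisy_mag - noise_mag, floor)
```

For example, a noisy spectrum of [1.0, 0.5, 2.0] with a noise estimate of [0.25, 1.0, 0.5] yields [0.75, 0.0, 1.5]; the second bin is clamped because the estimate exceeds the observed magnitude.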

Regarding Claim 5, in addition to the elements stated above regarding claim 1, the combination further discloses:
Wherein the training phase estimated noise signal generated by the model is generated in response to the model receiving a training phase received signal that includes noise (Col. 2, lines 48-51 of Korjani, the training speech in database 102 and noise data in database 104 are combined by a mixer 106 and the acoustic features extracted 108 before being provided as input into the DNN 120).

Regarding Claim 6, in addition to the elements stated above regarding claim 1, the combination further discloses:
wherein the initial training of the model further comprises: adjusting the model based on the comparison of the sample of the noise to the training phase estimated noise signal generated by the model (Col. 2, line 57 - Col. 3, line 5 of Korjani; the training module 140 compares the training noise 112 with the estimate of the noise 130, estimates an error between the two profiles 112, 130, and then generates or otherwise modifies one or more link weights and/or thresholds 142 in the DNN 120 in order to minimize the error between the training noise and noise estimate. The process of training the DNN 120 is repeated on multiple samples/frames of noise 104 and speech 102 until the error observed is below a predetermined threshold).

Regarding Claim 7, Zad discloses:
A system (e.g. architecture of Fig. 2 and the components of computing device of Fig. 3), comprising:
a first apparatus that carries a signal that includes noise (e.g. note communication between the computing device and other devices; para 38; and further note that the computing device represents a group of processing units or other computing devices… and the use of, for example, Bluetooth communication… noting any device with a microphone in a home setting to capture voice; para 26); and
a processor based apparatus in communication with the first apparatus (e.g. at least one processor; para 27, and note again that computing device represent a group of processing units or other computing devices; para 26);
wherein the processor based apparatus is configured to execute steps comprising:
receiving the signal that includes the noise (e.g. noisy signal input to window 202; Fig. 2; note processor 304 receives a noisy signal… the noisy signal includes a speech signal and a noise signal; para 13);
generating an estimate of a noise signal included in the received signal (e.g. block 208 the noise is estimated and tracked from the transformed noisy signal, and provides [“generates”] a noise estimate; para 25; a noise level is estimated based on the identified user environment; para 13; and feature vectors (e.g., determined from the transformed noisy signal); para 36), wherein the estimate of the noise signal included in the received signal is generated by a model built using machine learning (e.g. the system uses a machine learning algorithm to characterize the user environments… enabling more accurate results in the estimation of the noise level; para 36; and note processor estimates a noise level based on the noise model associated with the current user environment; para 33); 
using the estimate of the noise signal included in the received signal to remove at least part of the noise from the received signal (see also Fig. 2, where noise removal 210 is performed in response to noise estimation to generate an enhanced signal; e.g. the noise estimate provided by 208 is then used by block 210 for noise elimination, reduction or removal; para 25; see also para 34: because the noise reduction component 326, like the estimation component 324, considers the identified current user environment parameters (e.g., the noise model), the noise is removed or at least reduced from the noisy signal without affecting the subjective quality of the enhanced signal).
Zad does not explicitly detail the specifics of the training of the model: wherein the model is initially trained by comparing a sample of the noise to a training phase estimated noise signal generated by the model.
The analogous art Korjani teaches that a neural network is trained to isolate various types of noise from the user speech in the speech data and then subtract the noise from the speech data, thus leaving only the user speech free of noise (abstract). Korjani teaches wherein the model is initially trained by comparing a sample of the noise to a training phase estimated noise signal generated by the model (in Korjani, DNN 120 of Fig. 1 is trained to model and recognize noise in speech data; see Fig. 1; Col. 2, line 57 - Col. 3, line 5. During the training, the estimate of the noise profile 130 is represented in terms of spectral coefficients in the preferred embodiment. In parallel, the feature extractor 110 provides the spectral coefficients of the training noise 112 to a DNN training module 140. The training module 140 compares the training noise 112 with the estimate of the noise 130, estimates an error between the two profiles 112, 130, and then generates or otherwise modifies one or more link weights and/or thresholds 142 in the DNN 120 in order to minimize the error between the training noise and noise estimate. The process of training the DNN 120 is repeated on multiple samples/frames of noise 104 and speech 102 until the error observed is below a predetermined threshold. Examiner notes that this training phase teaches the claimed limitation, especially in light of Applicant’s disclosure and Figure 3A, which shows comparison of the estimated noise signal x_estimated generated by a model in generator 302 to a sample of the noise x).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to apply the features of Korjani to the system of Zad, to provide robust techniques to accurately represent and filter noise from speech data that operates independent of the amount of training speech data or language (Col. 1, lines 37-40 of Korjani).

Regarding Claim 8, in addition to the elements stated above regarding claim 7, Zad further discloses:
wherein the first apparatus comprises a microphone (e.g. the computing device represents a group of processing units or other computing devices… and the use of, for example, Bluetooth communication… noting any device with a microphone in a home setting to capture voice; para 26).

Regarding Claim 9, in addition to the elements stated above regarding claim 7, Zad further discloses:
wherein the first apparatus comprises an apparatus used for tracking a tangible object (e.g. the computing device represents a group of processing units or other computing devices… and the use of, for example, Bluetooth communication; para 26; and further, environment classifier 214 includes data from a gyroscope 218; data from the gyroscope provides a state of the computing device, stationary or moving; para 18).

Claims 11 and 17 are rejected under the same grounds as claim 3 above.

Claims 13 and 19 are rejected under the same grounds as claim 5 above.

Claims 14 and 20 are rejected under the same grounds as claim 6 above.

Claim 15 is rejected under the same grounds as claims 1 and 7 above.

Claims 2, 10 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Zad Issa et al. (hereinafter Zad, U.S. Patent Application Publication 2016/0005422) in view of Korjani et al. (US 10,614,827) and Yamasaki et al. (hereinafter Yam, U.S. Patent Application Publication 2020/0320992).

Regarding Claim 2, in addition to the elements stated above regarding claim 1, Zad in view of Korjani fails to explicitly disclose:
wherein the model built using machine learning comprises a model built using a generative adversarial network (GAN).
Zad details the use of machine learning in estimating the noise level and subsequently removing the noise. Zad describes these features in a flexible, non-limiting manner, noting that aspects of the disclosure are operable with any form of machine learning algorithm; para 36. Nevertheless, Zad does not explicitly disclose using a generative adversarial network (GAN).
In a related field of endeavor (e.g. active noise cancellation), Yam details features of a similarly trained model for generating improved acoustic results by using a generative adversarial network to generate audio filters and/or active noise cancellation parameters; para 33.
Modifying Zad in view of Korjani to include the features of Yam further discloses:
wherein the model built using machine learning comprises a model built using a generative adversarial network (GAN) (e.g. Zad’s flexible, non-limiting machine learning, now incorporating the features of Yam’s generative adversarial network to generate the necessary filters/parameters for the active noise cancellation; para 33).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to apply the features of Yam to the system of Zad and Korjani. Doing so would provide improvements to Zad’s noise reduction by providing a cost-effective solution that has been tuned or trained using the AI-based modules described; para 23 of Yam; note further advantageous features such as providing voice commands (mirroring that of Zad) with significantly reduced distortions; para 49 of Yam.

Claims 10 and 16 are rejected under the same grounds as claim 2 above.

Claims 4, 12 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Zad Issa et al. (hereinafter Zad, U.S. Patent Application Publication 2016/0005422) in view of Korjani et al. (US 10,614,827) and Visser et al. (hereinafter Vis, U.S. Patent Application Publication 2018/0233127).

Regarding Claim 4, in addition to the elements stated above regarding claim 1, Zad in view of Korjani fails to explicitly disclose:
Using the estimate of the noise signal included in the received signal in an acoustic echo cancellation (AEC) process.
Zad details and recognizes multiple types of environmental noise, for example the noise model may describe a car, pub, cafe, pink noise, clean speech, etc; para 11. Zad describes these types of noise in a fairly flexible and non-limiting manner.  Nevertheless, Zad in view of Korjani is not explicit in terms of using an acoustic echo cancellation (AEC) process.
In a related field of endeavor (e.g. enhanced speech generation), Vis details a system that performs model based filtering (similar to the model based noise removal of Zad) that further includes features configured to reduce or eliminate reverberation or echo from an input audio signal; paras 38, 49. Also similar to Zad (machine learning), Vis details using neural networks to filter and generate the improved audio signal; para 51.
Modifying Zad in view of Korjani to include the features of Vis further discloses:
wherein the using the estimate of the noise signal included in the received signal to remove at least part of the noise from the received signal comprises:
using the estimate of the noise signal included in the received signal in an acoustic echo cancellation (AEC) process (e.g. Zad’s noise models now describing the echo/reverberant type noise detailed by Vis in paras 38 and 49 and subsequent filtering for improvement).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to apply the features of Vis to the system of Zad in view of Korjani. Doing so would provide improvements to Zad’s noise reduction by including a common type of environmental noise to be removed by the flexible model type removal of Zad. Further inclusion of these features would adapt Zad to additionally provide improved quality of enhanced speech, improving the user experience (para 24 of Vis) by improving the intelligibility of the enhanced speech signal (para 56 of Vis).

Claims 12 and 18 are rejected under the same grounds as claim 4 above.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to THOMAS H MAUNG whose telephone number is (571)270-5690.  The examiner can normally be reached on Monday-Friday, 9am-6pm, EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vivian Chin, can be reached on 1-(571) 272-7848.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/THOMAS H MAUNG/Primary Examiner, Art Unit 2654