DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-26 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-18 of U.S. Patent No. 10,811,030 (hereinafter ‘030). 
Although the claims at issue are not identical, they are not patentably distinct from each other:

Regarding Claims 1-3 (drawn to a method):
Current Application
Claim 1:

A method comprising:

with a processor, receiving noisy speech data; 

with the processor, using a trained mixed non-negative matrix factorization (NMF) dictionary that comprises a trained noise NMF dictionary and a trained speech NMF dictionary to remove noise components from the noisy speech data to produce enhanced speech data; and 













with the processor, instructing communications circuitry to send the enhanced speech data to at least one of: a speaker configured to produce sound corresponding to the enhanced speech data, and a speech processing software application.
‘030
Claim 1:

A method comprising: 

with a processor, receiving noisy speech data; 

with the processor, using a trained mixed non-negative matrix factorization (NMF) dictionary that comprises a trained noise NMF dictionary and a trained speech NMF dictionary to remove noise components from the noisy speech data to produce enhanced speech data by: 

generating a NMF representation of the noisy speech data using the trained NMF dictionary; 

generating a mask based on only the NMF representation, wherein the noisy speech data represents only digitized sound signals, and wherein the NMF representation represents only the noisy speech data; and 

applying the mask to the noisy speech data to remove the noise components from the noisy speech data to produce at least one speech component of the noisy speech data; and 

with the processor, instructing communications circuitry to send the enhanced speech data to a speaker configured to produce sound corresponding to the enhanced speech data.
Current Application
Claim 2:

The method of claim 1, wherein using the trained mixed NMF dictionary to remove the noise components from the noisy speech data to produce the enhanced speech data comprises: 






with the processor, generating a NMF representation of the noisy speech data using the trained mixed NMF dictionary; and with the processor, 





applying a mask to the noisy speech data to remove the noise components from the noisy speech data to produce at least one speech component of the noisy speech data.
‘030
Claim 1:

A method comprising: 

with a processor, receiving noisy speech data; 

with the processor, using a trained mixed non-negative matrix factorization (NMF) dictionary that comprises a trained noise NMF dictionary and a trained speech NMF dictionary to remove noise components from the noisy speech data to produce enhanced speech data by: 

generating a NMF representation of the noisy speech data using the trained NMF dictionary; 

generating a mask based on only the NMF representation, wherein the noisy speech data represents only digitized sound signals, and wherein the NMF representation represents only the noisy speech data; and 

applying the mask to the noisy speech data to remove the noise components from the noisy speech data to produce at least one speech component of the noisy speech data; and 

with the processor, instructing communications circuitry to send the enhanced speech data to a speaker configured to produce sound corresponding to the enhanced speech data.
Current Application
Claim 3:

The method of claim 2, wherein using the trained mixed NMF to remove the noise components from the noisy speech data to produce the enhanced speech data further comprises: 









with the processor, generating the mask based on the NMF representation.
‘030
Claim 1:

A method comprising: 

with a processor, receiving noisy speech data; 

with the processor, using a trained mixed non-negative matrix factorization (NMF) dictionary that comprises a trained noise NMF dictionary and a trained speech NMF dictionary to remove noise components from the noisy speech data to produce enhanced speech data by: 

generating a NMF representation of the noisy speech data using the trained NMF dictionary; 

generating a mask based on only the NMF representation, wherein the noisy speech data represents only digitized sound signals, and wherein the NMF representation represents only the noisy speech data; and 

applying the mask to the noisy speech data to remove the noise components from the noisy speech data to produce at least one speech component of the noisy speech data; and 

with the processor, instructing communications circuitry to send the enhanced speech data to a speaker configured to produce sound corresponding to the enhanced speech data.



Regarding Claims 8, 10, and 11 (drawn to a system):
Current Application
Claim 8:

A system comprising: 

an audio signal input device coupled to a signal processing module to communicate noisy speech data to the signal processing module; and 

the signal processing module comprising a processing unit and a memory, the memory having a set of instructions stored thereon which, when executed by the processing unit, cause the signal processing module to: 

receive the noisy speech data from the an audio signal input device; 

transform the noisy speech data into enhanced speech data by suppressing noise from the noisy speech data using a trained mixed non-negative matrix factorization (NMF) dictionary that comprises a trained noise NMF dictionary and a trained speech NMF dictionary; and 













transmit the enhanced speech data to an audio output module.
‘030
Claim 6:

 A system comprising: 

an audio signal input device coupled to a signal processing module to communicate noisy speech data to the signal processing module; and 

the signal processing module comprising a processing unit and a memory, the memory having a set of instructions stored thereon which, when executed by the processing unit, cause the signal processing module to: 


receive the noisy speech data from the an audio signal input device; 

transform the noisy speech data into enhanced speech data via suppressing noise from the noisy speech data by: 

generating a non-negative matrix factorization (NMF) representation of the noisy speech data using a trained mixed NMF dictionary that comprises a trained noise NMF dictionary and a trained speech NMF dictionary; 

generating a mask based on only the NMF representation, wherein the noisy speech data represents only digitized sound signals, and wherein the NMF representation represents only the noisy speech data; and 

applying the mask to the noisy speech data to remove the noise components from the noisy speech data to produce at least one speech component of the noisy speech data, the enhanced speech data comprising the at least one speech component; and 

transmit the enhanced speech data to an audio output module.
Current Application
Claim 10:

The system of claim 9, wherein, when executed by the processing unit, the set of instructions further cause the signal processing module to: 














generate a NMF representation of the noisy speech data using the trained NMF dictionary.

‘030
Claim 6:

 A system comprising: 

an audio signal input device coupled to a signal processing module to communicate noisy speech data to the signal processing module; and 

the signal processing module comprising a processing unit and a memory, the memory having a set of instructions stored thereon which, when executed by the processing unit, cause the signal processing module to: 

receive the noisy speech data from the an audio signal input device; 

transform the noisy speech data into enhanced speech data via suppressing noise from the noisy speech data by: 

generating a non-negative matrix factorization (NMF) representation of the noisy speech data using a trained mixed NMF dictionary that comprises a trained noise NMF dictionary and a trained speech NMF dictionary; 

generating a mask based on only the NMF representation, wherein the noisy speech data represents only digitized sound signals, and wherein the NMF representation represents only the noisy speech data; and 

applying the mask to the noisy speech data to remove the noise components from the noisy speech data to produce at least one speech component of the noisy speech data, the enhanced speech data comprising the at least one speech component; and 

transmit the enhanced speech data to an audio output module.
Current Application
Claim 11:

The system of claim 10, wherein, when executed by the processing unit, the set of instructions further cause the signal processing module to: 



















generate a mask based on the NMF representation; and 



apply the mask to the noisy speech data to suppress the noise from the noisy speech data to produce a speech component
‘030
Claim 6:

 A system comprising: 

an audio signal input device coupled to a signal processing module to communicate noisy speech data to the signal processing module; and 

the signal processing module comprising a processing unit and a memory, the memory having a set of instructions stored thereon which, when executed by the processing unit, cause the signal processing module to: 

receive the noisy speech data from the an audio signal input device; 

transform the noisy speech data into enhanced speech data via suppressing noise from the noisy speech data by: 

generating a non-negative matrix factorization (NMF) representation of the noisy speech data using a trained mixed NMF dictionary that comprises a trained noise NMF dictionary and a trained speech NMF dictionary; 

generating a mask based on only the NMF representation, wherein the noisy speech data represents only digitized sound signals, and wherein the NMF representation represents only the noisy speech data; and 

applying the mask to the noisy speech data to remove the noise components from the noisy speech data to produce at least one speech component of the noisy speech data, the enhanced speech data comprising the at least one speech component; and 

transmit the enhanced speech data to an audio output module.



Regarding Claims 15-17 (drawn to a signal processing module):
Current Application
Claim 15:

A signal processing module comprising: 

communications circuitry configured to receive noisy speech data from an external device; and 

a processing unit configured to use a trained mixed non-negative matrix factorization (NMF) dictionary that comprises a trained noise NMF dictionary and a trained speech NMF dictionary to remove noise from the noisy speech data to produce enhanced speech data, 












wherein the communications circuitry is further configured to transmit the enhanced speech data to the external device.
‘030
Claim 11:

A signal processing module comprising: 

communications circuitry configured to receive noisy speech data from an external device; and 

a processing unit configured to: use a trained mixed NMF dictionary that comprises a trained noise NMF dictionary and a trained speech NMF dictionary to remove noise from the noisy speech data to produce enhanced speech data by: 

generating a NMF representation of the noisy speech data using the trained NMF dictionary; 

generating a mask based on only the NMF representation, wherein the noisy speech data represents only digitized sound signals, and wherein the NMF representation represents only the noisy speech data; and 

applying the mask to the noisy speech data to remove the noise components from the noisy speech data to produce at least one speech component of the noisy speech data, 

wherein the communications circuitry is further configured to transmit the enhanced speech data to the external device.
Current Application
Claim 16:

The signal processing module of claim 15, wherein the processing unit is further configured to 









generate a NMF representation of the noisy speech data using the trained NMF dictionary.
‘030
Claim 11:

A signal processing module comprising: 

communications circuitry configured to receive noisy speech data from an external device; and 

a processing unit configured to: use a trained mixed NMF dictionary that comprises a trained noise NMF dictionary and a trained speech NMF dictionary to remove noise from the noisy speech data to produce enhanced speech data by: 

generating a NMF representation of the noisy speech data using the trained NMF dictionary; 

generating a mask based on only the NMF representation, wherein the noisy speech data represents only digitized sound signals, and wherein the NMF representation represents only the noisy speech data; and 

applying the mask to the noisy speech data to remove the noise components from the noisy speech data to produce at least one speech component of the noisy speech data, 

wherein the communications circuitry is further configured to transmit the enhanced speech data to the external device.
Current Application
Claim 17:

The signal processing module of claim 16 wherein the processing unit is further configured to 












generate a mask based on the NMF representation and 



to apply the mask to the noisy speech data to remove the noise from the noisy speech data to produce a speech component.
‘030
Claim 11:

A signal processing module comprising: 

communications circuitry configured to receive noisy speech data from an external device; and 

a processing unit configured to: use a trained mixed NMF dictionary that comprises a trained noise NMF dictionary and a trained speech NMF dictionary to remove noise from the noisy speech data to produce enhanced speech data by: 

generating a NMF representation of the noisy speech data using the trained NMF dictionary; 

generating a mask based on only the NMF representation, wherein the noisy speech data represents only digitized sound signals, and wherein the NMF representation represents only the noisy speech data; and 

applying the mask to the noisy speech data to remove the noise components from the noisy speech data to produce at least one speech component of the noisy speech data, 

wherein the communications circuitry is further configured to transmit the enhanced speech data to the external device.



Regarding Claims 21-23 (drawn to a method):
Current Application
Claim 21:

A method comprising steps of: 

generating, by a first processor, a trained speech non-negative matrix factorization (NMF) dictionary by: 

receiving a set of training speech samples corresponding to human speech; 





training a speech NMF dictionary by creating dictionary entries based on the set of training speech samples to produce a trained speech NMF dictionary; generating, by the first processor, a trained noise NMF dictionary by: 

receiving noise samples corresponding to noise; 





training a noise NMF dictionary by creating dictionary entries based on the noise samples to produce a trained noise NMF dictionary; and 


combining, by the first processor, the trained speech NMF dictionary with the trained noise NMF dictionary to generate a trained mixed NMF dictionary
‘030
Claim 15:

A method comprising steps of: 

generating, by a first processor, a trained mixed NMF dictionary by: 

receiving speech samples corresponding to human speech; 

performing, upon receiving the speech samples, frequency domain transformation of the speech samples to generate frequency domain speech samples; 

training, upon generating the frequency domain speech samples, a speech NMF dictionary by creating dictionary entries based on the frequency domain speech samples to produce a trained speech NMF dictionary; 


receiving noise samples corresponding to noise; 

performing, upon receiving the noise samples, frequency domain transformation of the noise samples to generate frequency domain noise samples; 

training, upon generating the frequency domain noise samples, a noise NMF dictionary by creating dictionary entries based on the frequency domain noise samples to produce a trained noise NMF dictionary; 

combining the trained speech NMF dictionary with the trained noise NMF dictionary to generate the trained mixed NMF dictionary; 

storing, by the first processor upon generating the trained mixed NMF dictionary, the trained mixed NMF dictionary on a memory device; 

receiving, by a second processor coupled to the memory device, noisy speech data; and 

generating, by the second processor upon receiving the noisy speech data, enhanced speech data from the noisy speech data based on the trained mixed NMF dictionary
Current Application
Claim 22:

The method of claim 21, further comprising steps of: 
































storing, by the first processor upon generating the trained mixed NMF dictionary, the trained mixed NMF dictionary on a memory device.
‘030
Claim 15:

A method comprising steps of: 

generating, by a first processor, a trained mixed NMF dictionary by: 

receiving speech samples corresponding to human speech; 

performing, upon receiving the speech samples, frequency domain transformation of the speech samples to generate frequency domain speech samples; 

training, upon generating the frequency domain speech samples, a speech NMF dictionary by creating dictionary entries based on the frequency domain speech samples to produce a trained speech NMF dictionary; 


receiving noise samples corresponding to noise; 

performing, upon receiving the noise samples, frequency domain transformation of the noise samples to generate frequency domain noise samples; 

training, upon generating the frequency domain noise samples, a noise NMF dictionary by creating dictionary entries based on the frequency domain noise samples to produce a trained noise NMF dictionary; 

combining the trained speech NMF dictionary with the trained noise NMF dictionary to generate the trained mixed NMF dictionary; 

storing, by the first processor upon generating the trained mixed NMF dictionary, the trained mixed NMF dictionary on a memory device; 

receiving, by a second processor coupled to the memory device, noisy speech data; and 

generating, by the second processor upon receiving the noisy speech data, enhanced speech data from the noisy speech data based on the trained mixed NMF dictionary
Current Application
Claim 23:

The method of claim 21, further comprising steps of: 




































receiving, by a second processor coupled to the memory device, noisy speech data; and 

generating, by the second processor upon receiving the noisy speech data, enhanced speech data from the noisy speech data based on the trained mixed NMF dictionary.
‘030
Claim 15:

A method comprising steps of: 

generating, by a first processor, a trained mixed NMF dictionary by: 

receiving speech samples corresponding to human speech; 

performing, upon receiving the speech samples, frequency domain transformation of the speech samples to generate frequency domain speech samples; 

training, upon generating the frequency domain speech samples, a speech NMF dictionary by creating dictionary entries based on the frequency domain speech samples to produce a trained speech NMF dictionary; 


receiving noise samples corresponding to noise; 

performing, upon receiving the noise samples, frequency domain transformation of the noise samples to generate frequency domain noise samples; 

training, upon generating the frequency domain noise samples, a noise NMF dictionary by creating dictionary entries based on the frequency domain noise samples to produce a trained noise NMF dictionary; 

combining the trained speech NMF dictionary with the trained noise NMF dictionary to generate the trained mixed NMF dictionary; 

storing, by the first processor upon generating the trained mixed NMF dictionary, the trained mixed NMF dictionary on a memory device; 

receiving, by a second processor coupled to the memory device, noisy speech data; and 

generating, by the second processor upon receiving the noisy speech data, enhanced speech data from the noisy speech data based on the trained mixed NMF dictionary



As shown in the tables above, all the elements of application claims 1-3, 8, 10, 11, 15-17, and 21-23 are found in patent claims 1, 6, 11, and 15. The difference between application claims 1-3, 8, 10, 11, 15-17, and 21-23 and patent claims 1, 6, 11, and 15 is that the patent claims include more elements and are thus more specific. Thus, the invention of claims 1, 6, 11, and 15 of the patent is in effect a “species” of the “generic” invention of application claims 1-3, 8, 10, 11, 15-17, and 21-23. It has been held that the generic invention is “anticipated” by the “species”. See In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993).
Since application claims 1-3, 8, 10, 11, 15-17, and 21-23 are anticipated by claims 1, 6, 11, and 15 of the patent, they are not patentably distinct from claims 1, 6, 11, and 15 of the patent.
Claim 4 of the current application corresponds to claim 2 of U.S. Patent No. 10,811,030. 
Claim 5 of the current application corresponds to claim 3 of U.S. Patent No. 10,811,030. 
Claim 6 of the current application corresponds to claim 4 of U.S. Patent No. 10,811,030. 
Claim 7 of the current application corresponds to claim 5 of U.S. Patent No. 10,811,030. 
Claim 9 of the current application corresponds to claim 7 of U.S. Patent No. 10,811,030. 
Claim 12 of the current application corresponds to claim 8 of U.S. Patent No. 10,811,030. 
Claim 13 of the current application corresponds to claim 9 of U.S. Patent No. 10,811,030. 
Claim 14 of the current application corresponds to claim 10 of U.S. Patent No. 10,811,030. 
Claim 18 of the current application corresponds to claim 12 of U.S. Patent No. 10,811,030. 
Claim 19 of the current application corresponds to claim 13 of U.S. Patent No. 10,811,030. 
Claim 20 of the current application corresponds to claim 14 of U.S. Patent No. 10,811,030. 
Claim 24 of the current application corresponds to claim 16 of U.S. Patent No. 10,811,030. 
Claim 25 of the current application corresponds to claim 17 of U.S. Patent No. 10,811,030. 
Claim 26 of the current application corresponds to claim 18 of U.S. Patent No. 10,811,030. 
Claim Objections
Claim 21 is objected to because of the following informalities: Claim 21, line 12, recites “NMF dictionary; and”. It appears to the examiner that the claim should recite “NMF dictionary.” Appropriate correction is required.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5-10, 14-16, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Guo et al. (US 10,013,975) in view of Tashev et al. (US 10,276,179).
Regarding Claim 1, Guo et al. teaches a method comprising: with a processor, receiving noisy speech data (The electronic device 102 may obtain a noisy speech signal 104) (col. 5, lines 24-54); and with the processor, instructing communications circuitry to send the enhanced speech data to at least one of: a speaker configured to produce sound corresponding to the enhanced speech data, and a speech processing software application (the electronic device 102 may playback the reconstructed speech signal 124 or the residual noise-suppressed speech signal 118. This may be accomplished by providing the signal to one or more speakers) (col. 7, line 56-col. 8, line 12).
Guo et al fails to teach with the processor, using a trained mixed non-negative matrix factorization (NMF) dictionary that comprises a trained noise NMF dictionary and a trained speech NMF dictionary to remove noise components from the noisy speech data to produce enhanced speech data.  
Tashev teaches with the processor, using a trained mixed non-negative matrix factorization (NMF) dictionary that comprises a trained noise NMF dictionary and a trained speech NMF dictionary to remove noise components from the noisy speech data to produce enhanced speech data (The NMFSE system separates speech signals xs(t) from noisy recordings y(t) in a single channel. The NMFSE system uses clean speech data to train ND equisized low-order dictionaries 101 (i.e., small number of atoms or basis functions), Wsi, i=1, . . . , ND, offline with distinct random initializations. To enhance a noisy signal y(t) 102 (noisy speech), the NMFSE system uses T time windows to compute 103 the MxT dimensional short-time Fourier transform ("STFT"), Y(t,f) (FD noisy speech representation), of the noisy signal y(t)) (col. 3, line 62-col. 4, line 10).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Guo with the teachings of Tashev to reduce computational costs by using a mixed NMF dictionary for speech enhancement of noisy speech data.
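For illustration only, the claim 1 limitation of a trained mixed NMF dictionary comprising a trained speech NMF dictionary and a trained noise NMF dictionary can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Guo, Tashev, or the application: the function names, the multiplicative-update training rule, the atom counts, and the random toy spectrograms are all hypothetical.

```python
import numpy as np

def train_nmf_dictionary(V, n_atoms, n_iter=200, seed=0, eps=1e-9):
    """Learn a non-negative dictionary W (freq x atoms) with V ~= W @ H.

    Assumption: standard multiplicative updates for the Euclidean cost.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], n_atoms)) + eps
    H = rng.random((n_atoms, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    # Normalise each atom (column) to unit sum.
    W /= W.sum(axis=0, keepdims=True) + eps
    return W

def nmf_activations(V, W, n_iter=200, seed=0, eps=1e-9):
    """Compute activations H for a fixed dictionary W so that V ~= W @ H."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

# Hypothetical toy magnitude spectrograms (16 frequency bins).
rng = np.random.default_rng(1)
speech_train = rng.random((16, 40))
noise_train = rng.random((16, 40))

W_speech = train_nmf_dictionary(speech_train, n_atoms=4)
W_noise = train_nmf_dictionary(noise_train, n_atoms=3, seed=2)

# The "mixed" dictionary is the column-wise concatenation of the two.
W_mixed = np.hstack([W_speech, W_noise])

# NMF representation of noisy speech with respect to the mixed dictionary.
noisy = rng.random((16, 10))
H = nmf_activations(noisy, W_mixed)
```

In this sketch the first four rows of H weight the speech atoms and the last three weight the noise atoms, which is what allows the two contributions to be separated afterwards.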
Regarding Claim 5, Guo et al teaches a method, wherein the noisy speech data is generated by a microphone of an external device (the electronic device 102 may capture a noisy speech signal 104 using one or more microphones) (col. 5, lines 24-54).
Regarding Claim 6, Guo et al. teaches a method, wherein the speaker is part of the external device, wherein the external device is selected from the group consisting of an assistive listening device and a hearing aid (the electronic device 102 may receive the noisy speech signal 104 from another device (e.g., a wireless headset, another device, etc.)) (col. 5, lines 24-54).
Regarding Claim 7, Guo et al. teaches a method, wherein instructing communications circuitry to send the enhanced speech data to a speaker comprises: with the processor, instructing a first transceiver of the communications circuitry to wirelessly transmit the enhanced speech data to a second transceiver of the external device (the electronic device 102 may receive the noisy speech signal 104 from another device (e.g., a wireless headset, another device, etc.)) (col. 5, lines 24-54).
Claims 8 and 15 are rejected for the same reason as claim 1.
Regarding Claim 9, Guo et al teaches a system, wherein the audio output module comprises: the audio signal input device, which comprises at least one microphone (the electronic device 102 may capture a noisy speech signal 104 using one or more microphones) (col. 5, lines 24-54); and a transceiver configured to transmit the noisy speech data to the signal processing module (The noisy speech signal 104 (or a signal based on the noisy speech signal 104) may be provided to the real-time noise reference determination module 106) (col. 5, lines 24-54), and to receive the enhanced speech data (the electronic device 102 may encode, transmit, store and/or play back the reconstructed speech signal 124 and/or the residual noise-suppressed speech signal 118) (col. 7, line 56-col. 8, line 12).
Regarding Claim 10, Guo et al. teaches a system, wherein, when executed by the processing unit, the set of instructions further cause the signal processing module to: generate a NMF representation of the noisy speech data using the trained NMF dictionary (In some configurations, the low-rank speech dictionary may be learned through NMF-based speech dictionary learning. For example, obtaining the first speech dictionary 114 may include initializing one or more activation coefficients and/or speech basis functions and updating parameters until convergence) (col. 6, lines 36-55).
Regarding Claim 14, Guo et al teaches a system, wherein the audio output module further comprises: an output device coupled to the transceiver (the electronic device 102 may encode, transmit, store and/or play back the reconstructed speech signal 124 and/or the residual noise-suppressed speech signal 118) (col. 7, line 65-col. 8, line 12); an additional processing unit coupled to the output device (The electronic device includes a processor and memory in electronic communication with the processor) (col. 2, lines 35-51); and an additional memory having an additional set of instructions stored therein which, when executed by the additional processing unit (The electronic device includes a processor and memory in electronic communication with the processor) (col. 2, lines 35-51), cause the output device to receive the enhanced speech signals and produce audible sound based on the enhanced speech signals (the electronic device 102 may encode, transmit, store and/or play back the reconstructed speech signal 124 and/or the residual noise-suppressed speech signal 118) (col. 7, line 65-col. 8, line 12).
Claim 16 is rejected for the same reason as claim 10.
Claim 19 is rejected for the same reason as claim 7.
Claims 2, 4, 11-13, 17, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Guo et al. and Tashev et al. as applied to claims 1, 8, and 15 above, and further in view of Wingate et al. (US 2016/0071526).
Regarding Claim 2, Guo et al. fails to teach a method, wherein using the trained mixed NMF dictionary to remove the noise components from the noisy speech data to produce the enhanced speech data comprises: with the processor, generating a NMF representation of the noisy speech data using the trained mixed NMF dictionary.
Tashev et al. teaches a method, wherein using the trained mixed NMF dictionary to remove the noise components from the noisy speech data to produce the enhanced speech data comprises: with the processor, generating a NMF representation of the noisy speech data using the trained mixed NMF dictionary (The NMFSE system separates speech signals xs(t) from noisy recordings y(t) in a single channel. The NMFSE system uses clean speech data to train ND equisized low-order dictionaries 101 (i.e., small number of atoms or basis functions), Wsi, i=1, . . . , ND, offline with distinct random initializations. To enhance a noisy signal y(t) 102 ( noisy speech), the NMFSE system uses T time windows to compute 103 the MxT dimensional short-time Fourier transform ("STFT"), Y(t,f) (FD noisy speech representation), of the noisy signal y(t)) (col. 3, line 62-col. 4, line 10).
Guo et al. and Tashev et al. fail to teach, with the processor, applying a mask to the noisy speech data to remove the noise components from the noisy speech data to produce at least one speech component of the noisy speech data.
Wingate et al. teaches, with the processor, applying a mask to the noisy speech data to remove the noise components from the noisy speech data to produce at least one speech component of the noisy speech data (This mask is then used in step 934 to perform signal separation in the frequency domain producing [tilde over (X)](f,n), which is then passed to a spectral inversion stage 936 in which the time signal [tilde over (x)](t) is determined for example using an inverse transform) (page 18, paragraph [0212]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Guo and Tashev with the teachings of Wingate to improve the accuracy and efficiency of the audio signals.
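For illustration only, the step of generating an NMF representation of noisy speech data from an already trained dictionary may be sketched as follows. This sketch is not taken from Guo, Tashev, Wingate, or the claims. The function name nmf_activations, the choice of a Frobenius-norm multiplicative update, and the parameters n_iter, eps, and seed are hypothetical and were chosen only for the example.

```python
import numpy as np

def nmf_activations(V, W, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative activations H such that V is approximated by W @ H.

    V : (n_freq, n_frames) non-negative magnitude spectrogram of the noisy speech.
    W : (n_freq, n_atoms) non-negative dictionary, assumed to be trained already
        and held fixed here.
    """
    rng = np.random.default_rng(seed)
    # Random positive start; any strictly positive initialization would do.
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative update for the Frobenius-norm objective ||V - W H||^2.
        # Only H is updated, so the dictionary W is left unchanged.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H
```

Under these assumptions, the columns of H indicate how strongly each dictionary atom is active in each time frame, and W @ H approximates the noisy spectrogram V.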
Regarding Claim 4, Guo et al. and Tashev et al. fail to teach a method, wherein using the trained mixed NMF dictionary to remove the noise components from the noisy speech data to produce the enhanced speech data further comprises: with the processor, performing a first domain transform on the noisy speech data to transform the noisy speech data from a time domain to a frequency domain; and with the processor, performing a second domain transform on the at least one speech component to transform the at least one speech component from the frequency domain to the time domain to produce the enhanced speech data.
Wingate et al. teaches a method, wherein using the trained mixed NMF dictionary to remove the noise components from the noisy speech data to produce the enhanced speech data further comprises: with the processor, performing a first domain transform on the noisy speech data to transform the noisy speech data from a time domain to a frequency domain (transformation function, such as e.g. Fast Fourier Transform (FFT), is applied transforming the waveform multiplied by the window function from a time domain to a frequency domain) (page 11, paragraph [0134]); and with the processor, performing a second domain transform on the at least one speech component to transform the at least one speech component from the frequency domain to the time domain to produce the enhanced speech data (an inverse transformation function (e.g. inverse STFT) may be applied to obtain the desired separated signal of interest in the time domain) (page 11, paragraph [0135]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Guo and Tashev with the teachings of Wingate to improve the accuracy and efficiency of the audio signals.
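For illustration only, a first domain transform from the time domain to the frequency domain, followed by a second domain transform back to the time domain, may be sketched as follows. This sketch is not taken from Wingate or from the claims. The function names stft and istft, the Hann window, the frame length of 256 samples, and the hop of 128 samples are hypothetical choices made only for the example.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Windowed FFT of overlapping frames: time domain to frequency domain."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)  # shape (n_frames, frame_len // 2 + 1)

def istft(X, frame_len=256, hop=128):
    """Inverse FFT with weighted overlap-add: frequency domain to time domain."""
    window = np.hanning(frame_len)
    n_frames = X.shape[0]
    out = np.zeros((n_frames - 1) * hop + frame_len)
    norm = np.zeros_like(out)
    frames = np.fft.irfft(X, n=frame_len, axis=1)
    for i in range(n_frames):
        sl = slice(i * hop, i * hop + frame_len)
        out[sl] += frames[i] * window   # synthesis window
        norm[sl] += window ** 2         # accumulated window energy
    # Divide by the accumulated window energy; the floor avoids division by zero
    # at the first and last samples, where the Hann window is exactly zero.
    return out / np.maximum(norm, 1e-12)
```

In this sketch, any frequency-domain processing, such as multiplying by a mask, would be applied to the output of stft before it is passed to istft. With no processing in between, the round trip returns the input signal except at the first and last samples, where the window weight is zero.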
Regarding Claim 11, Guo et al. and Tashev et al. fail to teach a system, wherein, when executed by the processing unit, the set of instructions further cause the signal processing module to: generate a mask based on the NMF representation; and apply the mask to the noisy speech data to suppress the noise from the noisy speech data to produce a speech component.
Wingate et al. teaches a system, wherein, when executed by the processing unit, the set of instructions further cause the signal processing module to: generate a mask based on the NMF representation (one approach to representing the hidden multiple source structure is using a non-negative matrix factorization (NMF) approach, and, more generally, a non-negative tensor (i.e., three or more dimensional) factorization (NTF) approach) (page 18, paragraph [0217]); and apply the mask to the noisy speech data to suppress the noise from the noisy speech data to produce a speech component (This mask is then used in step 934 to perform signal separation in the frequency domain producing [tilde over (X)](f,n), which is then passed to a spectral inversion stage 936 in which the time signal [tilde over (x)](t) is determined for example using an inverse transform) (page 18, paragraph [0212]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Guo and Tashev with the teachings of Wingate to improve the accuracy and efficiency of the audio signals.
Regarding Claim 12, Guo et al. and Tashev et al. fail to teach a system, wherein the mask is a soft mask, and wherein to apply the mask, the signal processing module uses a filter bank to separate the noisy speech data into a plurality of frequency band components, and then multiplies each of the plurality of frequency band components by a respective value of an array of values between 0 and 1, wherein a given value of the array of values by which a given frequency band component of the plurality of frequency band components is multiplied is determined based on a ratio of noise to speech for the given frequency band component.
Wingate et al. teaches a system, wherein the mask is a soft mask, and wherein to apply the mask, the signal processing module uses a filter bank to separate the noisy speech data into a plurality of frequency band components, and then multiplies each of the plurality of frequency band components by a respective value of an array of values between 0 and 1, wherein a given value of the array of values by which a given frequency band component of the plurality of frequency band components is multiplied is determined based on a ratio of noise to speech for the given frequency band component (For each source s, the quantities M.sub.s(f,n) may be viewed as soft masks because their value in each time-frequency bin is a number between zero and one, inclusive. In other implementations, one may modify the mask, such as by applying a threshold to it to produce a hard mask, which only takes values zero and one, and typically has the effect of increasing perceived separation but may also cause artifacts. In some embodiments, masks may be modified by other nonlinearities. In some embodiments, the values of a soft or a hard mask may be softened by reducing their range from [0,1] to some smaller subset, e.g. [0.1, 0.9], to have the effect of decreasing artifacts at the expense of decreased perceived separation) (page 24, paragraph [0298]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Guo and Tashev with the teachings of Wingate to improve the accuracy and efficiency of the audio signals.
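For illustration only, a soft mask with values between 0 and 1, a thresholded hard mask, and a mask whose range is reduced to [0.1, 0.9] may be sketched as follows. This sketch is not taken from Wingate or from the claims. The function names, the ratio form speech / (speech + noise), the 0.5 threshold, and the linear rescaling used to narrow the range are hypothetical choices, and the per-band speech and noise power estimates are assumed to be supplied by an earlier stage.

```python
import numpy as np

def soft_mask(speech_power, noise_power, eps=1e-12):
    """Per-band gain in [0, 1]; equals 1 / (1 + noise/speech) up to eps."""
    return speech_power / (speech_power + noise_power + eps)

def harden(mask, threshold=0.5):
    """Threshold a soft mask so that it takes only the values 0 and 1."""
    return (mask >= threshold).astype(float)

def soften(mask, low=0.1, high=0.9):
    """Map a mask from the range [0, 1] linearly onto [low, high]."""
    return low + (high - low) * mask

def apply_mask(band_components, mask):
    """Multiply each frequency band component by its respective mask value."""
    return band_components * mask
```

Under these assumptions, a band dominated by noise receives a gain near 0 and a band dominated by speech receives a gain near 1, so that multiplying the band components by the mask attenuates the noisy bands.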
Regarding Claim 13, Guo et al. and Tashev et al. fail to teach a system, wherein, when executed by the processing unit, the set of instructions further cause the signal processing module to: apply a Fourier transform to the noisy speech data to transform the noisy speech data from a time domain to a frequency domain; and to apply an inverse Fourier transform to the speech component to transform the speech component from the frequency domain to the time domain to produce the enhanced speech data.
Wingate et al. teaches a system, wherein, when executed by the processing unit, the set of instructions further cause the signal processing module to: apply a Fourier transform to the noisy speech data to transform the noisy speech data from a time domain to a frequency domain (transformation function, such as e.g. Fast Fourier Transform (FFT), is applied transforming the waveform multiplied by the window function from a time domain to a frequency domain) (page 11, paragraph [0134]); and to apply an inverse Fourier transform to the speech component to transform the speech component from the frequency domain to the time domain to produce the enhanced speech data (an inverse transformation function (e.g. inverse STFT) may be applied to obtain the desired separated signal of interest in the time domain) (page 11, paragraph [0135]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Guo and Tashev with the teachings of Wingate to improve the accuracy and efficiency of the audio signals.
Claim 17 is rejected for the same reason as claim 2.
Claim 18 is rejected for the same reason as claim 4.
Regarding Claim 20, Guo et al. and Tashev et al. fail to teach a signal processing module, wherein the processing unit is configured to produce the enhanced speech data from the noisy speech data in less than 10 milliseconds.
Wingate et al. teaches a signal processing module, wherein the processing unit is configured to produce the enhanced speech data from the noisy speech data in less than 10 milliseconds (The information generated in step 920 is then used in a signal separation step 930 to produce one or more separated time signals [tilde over (x)](t), thereby separating the audio mixture received in step 910 into component sources) (page 17, paragraph [0204]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Guo and Tashev with the teachings of Wingate to improve the accuracy and efficiency of the audio signals.
Allowable Subject Matter
Claims 1-20 would be allowable if rewritten or amended to overcome the rejections under non-statutory double patenting, set forth in this Office Action.
Claim 3 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.  Specifically, the prior art fails to disclose “wherein using the trained mixed NMF to remove the noise components from the noisy speech data to produce the enhanced speech data further comprises: with the processor, generating the mask based on the NMF representation” as recited in claim 3.
Claims 21-26 would be allowable if rewritten or amended to overcome the rejections under non-statutory double patenting, set forth in this Office Action.
The following is a statement of reasons for the indication of allowable subject matter: Claim 21 of the current application teaches similar subject matter as the prior art of Guo et al. (US 10,013,975) and Tashev et al. (US 10,276,179). However, the prior art fails to disclose “combining, by the first processor, the trained speech NMF dictionary with the trained noise NMF dictionary to generate a trained mixed NMF dictionary” as recited in claim 21.
Claims 22-26 would be allowed for being dependent on an allowable base claim if the double patenting rejection is overcome.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 03/21/2022 was filed in compliance with the provisions of 37 CFR 1.97 and 1.98.  Accordingly, the information disclosure statement is being considered by the examiner.
Cited Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Le Magoarou et al. (US 9,734,842) discloses separation of speech and background from an audio mixture by using a speech example.
Boulanger-Lewandowski et al. (US 9,721,202) discloses non-negative matrix factorization regularized by recurrent neural networks for audio processing.
Le Roux et al. (US 9,679,559) discloses source signal separation by discriminatively-trained non-negative matrix factorization.
Zhang et al. (US 8,874,441) discloses noise suppression using multiple sensors of a communication device.
Wilson et al. (US 8,015,003) discloses denoising acoustic signals using constrained non-negative matrix factorization.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SATWANT K SINGH whose telephone number is (571)272-7468. The examiner can normally be reached Monday thru Friday 8:30 AM to 5:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mohammad H. Ghayour, can be reached at (571)272-3021. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SATWANT K SINGH/Primary Examiner, Art Unit 2672