Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Specification

The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 2, 7-9, and 14-16 are rejected under 35 U.S.C. 103 as being unpatentable over LeRoux et al (20190318725) in view of Dua et al (20200019863).

As per claim 1, LeRoux et al (20190318725) teaches a multi-person speech separation method for a terminal (as speech separation – para 005, in an environment of multiple speakers – para 0019), comprising: 
extracting a hybrid speech feature from a hybrid speech signal requiring separation, N human voices being mixed in the hybrid speech signal, N being a positive integer greater than or equal to 2 (as performing speaker signal separation – para 0009, based on a t-f unit deep clustering – para 0009);
extracting a masking coefficient of the hybrid speech feature (using masking – para 0009; para 0073);
and performing a speech separation on the masking matrix corresponding to the N human voices and the hybrid speech signal.
	
LeRoux et al (20190318725) teaches the above claim limitations, but does not explicitly teach using a generative adversarial network (GAN) in generating the masking matrix. Dua et al (20200019863) teaches the use of a generative adversarial network in natural language processing (para 0022), to be used on a sentence, paragraph, characters, words, phrases, or any other portion of the natural language content (para 0022). Therefore, it would have been obvious to one of ordinary skill in the art of natural language processing in the area of encoder/decoder networks to incorporate the GAN of Dua et al (20200019863) to operate on the masking of the speech data features of LeRoux et al (20190318725), because it would advantageously arrive at an updated trained generator more quickly, as well as optimize numerical values without having to reconstruct the word sequence signals (Dua et al (20200019863), para 0025).
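For illustration only, the masking step discussed above can be sketched as an element-wise product between a mixture spectrogram and one mask per speaker. This sketch is not taken from either cited reference or from the application; the function name `apply_masks`, the toy spectrogram, and the mask values are all hypothetical, and the masks are simply assumed to be given rather than produced by any network.

```python
from typing import List

Matrix = List[List[float]]  # rows = time frames, columns = frequency bins


def apply_masks(mixture: Matrix, masks: List[Matrix]) -> List[Matrix]:
    """Multiply a mixture spectrogram element-wise by each speaker's mask."""
    separated = []
    for mask in masks:
        # Every mask must cover the same time-frequency units as the mixture.
        if len(mask) != len(mixture) or any(
            len(m_row) != len(x_row) for m_row, x_row in zip(mask, mixture)
        ):
            raise ValueError("mask and mixture must have the same shape")
        separated.append(
            [[m * x for m, x in zip(m_row, x_row)]
             for m_row, x_row in zip(mask, mixture)]
        )
    return separated


# Hypothetical 2-frame, 3-bin magnitude spectrogram of a two-speaker mixture.
mixture = [[1.0, 2.0, 4.0], [3.0, 0.0, 2.0]]
# Hypothetical masks for N = 2 speakers; they sum to 1 in every T-F unit.
mask_a = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.25]]
mask_b = [[1.0 - v for v in row] for row in mask_a]

speaker_a, speaker_b = apply_masks(mixture, [mask_a, mask_b])
```

Because the two hypothetical masks sum to 1 in each time-frequency unit, the two separated estimates add back up to the mixture.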

As per claim 2, the combination of LeRoux et al (20190318725) in view of Dua et al (20200019863) teaches the method according to claim 1, wherein before the extracting a hybrid speech feature from a hybrid speech signal requiring separation, the method further comprises:
obtaining a hybrid speech sample and a clean speech sample from a sample database; extracting a hybrid speech sample feature from the hybrid speech sample (as using clean single-speaker speech with reference labels, and performing a loss function of a weighted combination of the recognition – para 0176);
extracting a masking coefficient of the hybrid speech sample feature by using the generative network model, to obtain a sample masking matrix corresponding to the N human voices (as performing the mask calculation – para 0073; using a network model – para 0073, 0075);
LeRoux et al (20190318725), fig. 9a, outputs 953, 954; performing alternate training on the generative network model and the adversarial network model (Dua et al (20200019863), para 0022) by using the separated speech sample, the hybrid speech sample, and the clean speech sample (as using a mix of the separated, hybrid, and clean speech samples – LeRoux et al (20190318725), para 0176).
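As a rough illustration of what "alternate training" of a generative model and an adversarial model generally means, the toy sketch below updates one model while holding the other fixed, then swaps. It is an assumption-laden example and does not reproduce the method of the application, LeRoux et al, or Dua et al: the samples are single scalars, the generator is a single hypothetical weight `w` that produces a mask, the adversarial model is a logistic scorer with hypothetical weights `a` and `b`, and the losses are the standard GAN log losses.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def discriminator_step(p: dict, mixed: float, clean: float, lr: float = 0.1) -> dict:
    """Update the adversarial weights a, b; the generator weight w is frozen."""
    separated = sigmoid(p["w"]) * mixed           # generator output = mask * mixture
    d_clean = sigmoid(p["a"] * clean + p["b"])    # score for the clean sample
    d_sep = sigmoid(p["a"] * separated + p["b"])  # score for the separated sample
    # Gradients of -log D(clean) - log(1 - D(separated)) with respect to a and b.
    grad_a = (d_clean - 1.0) * clean + d_sep * separated
    grad_b = (d_clean - 1.0) + d_sep
    return {"w": p["w"], "a": p["a"] - lr * grad_a, "b": p["b"] - lr * grad_b}


def generator_step(p: dict, mixed: float, lr: float = 0.1) -> dict:
    """Update the generator weight w; the adversarial weights a, b are frozen."""
    mask = sigmoid(p["w"])
    d_sep = sigmoid(p["a"] * mask * mixed + p["b"])
    # Gradient of -log D(separated) with respect to w (chain rule through the mask).
    grad_w = (d_sep - 1.0) * p["a"] * mixed * mask * (1.0 - mask)
    return {"w": p["w"] - lr * grad_w, "a": p["a"], "b": p["b"]}


# Hypothetical starting weights and one hybrid/clean sample pair.
params = {"w": 0.0, "a": 0.1, "b": 0.0}
mixed_sample, clean_sample = 3.0, 1.0

after_d = discriminator_step(params, mixed_sample, clean_sample)  # phase 1
after_g = generator_step(after_d, mixed_sample)                   # phase 2
```

Each phase touches only its own model's weights, which is the property that makes the training "alternate"; a full training run would repeat the two phases over many samples.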

As per claim 7, the combination of LeRoux et al (20190318725) in view of Dua et al (20200019863) teaches the method according to claim 1, wherein the extracting a hybrid speech feature from a hybrid speech signal requiring separation comprises: extracting a time domain feature or a frequency domain feature of a single-channel speech signal from the hybrid speech signal (as extracting a time-frequency domain speech signal – para 0073, with separating speakers after the speaker separation network – fig. 9a, subblock 942);
extracting a time domain feature or a frequency domain feature of a multi-channel speech signal from the hybrid speech signal; extracting a single-channel speech feature from the hybrid speech signal; or extracting a correlated feature among a plurality of channels from the hybrid speech signal (the examiner notes that the claim language is in the alternative, "or"; LeRoux et al (20190318725) teaches a time-frequency domain feature of the multichannel speech signal – para 0073, 0075).

Claims 8, 9, and 14 are apparatus claims that perform the method steps of claims 1, 2, and 7 above. As such, claims 8, 9, and 14 are similar in scope and content to claims 1, 2, and 7 above and are rejected under similar rationale as presented against claims 1, 2, and 7 above. Furthermore, LeRoux et al (20190318725) teaches a system including a processor and memories (para 0220) to perform the steps.

Claims 15 and 16 are non-transitory computer readable medium claims that perform the method steps contained in claims 1, 2, and 7 above. As such, claims 15 and 16 are similar in scope and content to claims 1, 2, and 7 above and are therefore rejected under similar rationale as presented against claims 1, 2, and 7 above. Furthermore, LeRoux et al (20190318725) teaches a storage device storing instructions (para 0088) to perform the steps.

Allowable Subject Matter

Claims 3-6, 10-13, and 17-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

The following is a statement of reasons for the indication of allowable subject matter: As per claims 3-6, 10-13, and 17-20, the claim limitations toward using the GAN with the separated, hybrid, and clean speech samples to calculate a loss function amongst the three types of samples, and continual fixing of the network involving this loss function, are not explicitly taught by the prior art of record. A representative piece of prior art, LeRoux et al (20190318725), teaches a CTC loss function across reference labels from the dataset, single speaker speech, and clean speech (see para 0066), but not at the location as detailed in the claims above.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.  Please see related prior art listed below, with detailed explanations:
The following references teach the concept of source separation using networks and clean speech:
Deligne et al (20040111260), para 0043, 0044, 0046, 0056
Mao et al (20060239471), para 0105, 0106, 0107, and 0144
Gomez (20150088497), para 0046, 0051, 006, and 0071
Chen et al (20190139563), para 0019, 0104, 0105

The following references teach the concept of GAN natural language processing:
Wang (20180204121), para 0003, 0052, 0057, 0060
Gao et al (20190114348), para 0017, 0037, 0100

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Opsasnick, telephone number (571)272-7623, who is available Monday-Friday, 9am-5pm. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Mr. Richemond Dorvil, can be reached at (571)272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Michael N Opsasnick/
Primary Examiner, Art Unit 2658
03/12/2022