Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-13, 15-25, and 27-37 of U.S. Patent No. 10,614,826. Although the claims at issue are not identical, they are not patentably distinct from each other because, although the claims of the ’826 patent include extra steps directed to the watermarking aspect of the speech signal, these elements are not necessary to realize the functionality of the instant claims. See the table below.

Application No. 16/994,432 (instant claims, listed first) | U.S. Patent No. 10,614,826 (reference claims, listed second)
1. A method of watermarking speech data, the method comprising: generating, using a generator, speech data including a watermark, wherein the generator is trained to generate speech data including the watermark, the training comprising: generating first speech data and/or second speech data from the generator, the first speech data and the second speech data each configured to represent speech, the first speech data and the second speech data each including a candidate watermark; producing an inconsistency message as a function of at least one difference between the first speech data and at least authentic speech data; transforming the first speech data and/or the second speech data, including the candidate watermark, using a watermark robustness module to produce transformed speech data including a transformed candidate watermark, the 
2. The method of claim 1, wherein the authentic speech data represents (1) a particular target speaker relative to a plurality of different speakers, or (2) human speech. 
3. The method of claim 1, wherein the training further comprises: generating second speech data configured to represent speech as a function of the inconsistency message and the watermark-detectability message. 
4. The method of claim 3, further comprising: transforming the second speech data using the watermark robustness module to produce transformed second speech data; and producing a second watermark-detectability 
5. The method of claim 4, further comprising: repeating the steps of: generating speech data, transforming the speech data using the watermark robustness module to produce transformed speech data, and producing the watermark-detectability message, to produce a robust watermark, the robust watermark configured such that it is embedded in the speech data to produce watermarked speech data, the watermarked speech data configured to represent authentic speech and to include a detectable robust watermark when transformed by the watermark robustness module. 
6. The method of claim 1, wherein the watermark robustness module transforms the speech data by performing a mathematical 
7. The method of claim 1, wherein the inconsistency message is produced by a discriminative neural network, the first speech data is generated by a generative neural network, the watermark-detectability message is generated by a second discriminative neural network 
8. The method of watermarking speech data of claim 1, wherein the training further comprises: repeating one or more steps of: generating updated speech data configured to represent human speech as a function of the inconsistency message and the watermark-detectability message, producing an updated inconsistency message relating to at least one difference between the updated speech data and realistic human speech, transforming the updated speech data to produce an updated transformed candidate watermark, detecting 
9. The method of watermarking speech data of claim 1, further comprising: transforming the updated speech data to produce an updated transformed candidate watermark using a plurality of various different transformations. 




11. The system as defined by claim 10, wherein the synthetic speech in the target voice including the watermark cannot be detected as synthetic by the discriminative neural network, the discriminative neural network having access to speech data from a 
12. The system as defined by claim 10, wherein the generative neural network is paired to the watermark network, such that the watermark is configured to be detected by the watermark network that trained the generative neural network. 
13. The system as defined by claim 10, further comprising a watermark robustness module configured to transform the speech data to produce transformed speech data, and producing an inconsistency message as a function of the transformed speech data. 
14. A system for training machine learning to produce a speech watermark, the system comprising: a watermark robustness module configured to (1) receive first speech data that represents realistic speech, the first speech data generated by a generative machine learning system, and (2) transform the first 
15. The system of claim 14, further comprising: a generative neural network configured to generate the first speech data that represents human speech. 
16. The system of claim 14, further comprising: a discriminative neural network configured to receive the first speech data and produce an inconsistency message relating to at least one difference between the first speech data and realistic human speech; 
17. The system of claim 14, further comprising: a vector space having a plurality 
18. The system of claim 14, further comprising: the watermark machine learning system configured to embed a watermark in received speech data to produce watermarked speech data, wherein the watermarked speech data represents realistic speech and to include a watermark detectable by the watermark machine learning system when the speech data is transformed by the watermark robustness module. 





19. A computer program product for use on a computer system for training a system to 
20. The computer program product of claim 19, further comprising: program code for generating second speech data configured to represent realistic human speech as a function of the inconsistency message and the watermark-detectability message, the second speech data configured to include a watermark and to represent realistic human speech. 

 

 
3. The method as defined by claim 1, wherein the second candidate speech segment provides a higher probability that of being from the target voice than the first candidate speech segment. 
 
4. The method as defined by claim 1, further comprising transforming the source speech data to into the target timbre. 
 
5. The method as defined by claim 1, wherein the target timbre data is obtained from an audio input in the target voice. 
 
6. The method as defined by claim 1, wherein the machine learning system is a neural network. 


 
8. The method as defined by claim 7, further comprising: adjusting a representation of the first candidate voice relative to representations of the plurality of voices in the vector space to reflect the second candidate voice as a function of the inconsistency message. 
 
9. The method as defined by claim 1, wherein the inconsistency message is produced when the discriminative neural network has less than a 95 percent confidence interval that the first candidate voice is the target voice. 
 


 
11. The method as defined by claim 1, wherein the plurality of voices are in a vector space. 

12. The method as defined by claim 1, wherein the target timbre data is filtered by a temporal receptive field. 
 
13. The method as defined by claim 1, further comprising using the generative machine learning system to produce a final candidate speech segment in a final candidate voice, as a function of a null inconsistency message, the final candidate speech segment mimicking the first speech segment in the target timbre. 
 

 

16. A system for training a speech conversion system, the system comprising: source speech data that represents a first speech segment of a source voice; target timbre data that relates to a target voice; a generative machine learning system configured to produce first candidate speech data that represents a first candidate speech segment in a first candidate voice as a function of the source speech data and the target timbre data; a discriminative machine learning system configured to: compare the first candidate speech data to the target timbre data with reference to timbre data of a plurality of different voices, and determine whether there is at least one inconsistency between the first candidate speech data and the target timbre data with reference to the timbre data of the plurality of 

17. The system as defined by claim 16, wherein the generative machine learning system is configured to produce a second candidate speech segment as a function of the inconsistency message. 
 
18. The system as defined by claim 16, wherein the machine learning system is a neural network. 
 
19. The system as defined by claim 16, further comprising: a vector space configured to map a representation of the plurality of voices, including the candidate voice, as a 
 

20. The system as defined by claim 19, wherein a voice feature extractor is configured to adjust a representation of the candidate voice relative to representations of the plurality of voices in the vector space to update and reflect the second candidate voice as a function of the inconsistency message. 
 
21. The system as defined by claim 16, wherein the candidate voice is distinguished from the target voice when the discriminative neural network has less than a 95 percent confidence interval. 
 
22. The system as defined by claim 16, wherein the discriminative machine learning system is configured to determine the identity of the speaker of the candidate voice by 
 
23. The system as defined by claim 16, further comprising a vector space configured to contain a plurality of voices. 
 
24. The system as defined by claim 16, wherein the generative machine learning system is configured to produce a final candidate speech segment in a final candidate voice, as a function of a null inconsistency message, the final candidate speech segment mimicking the first speech segment as the target voice. 
25. The system as defined by claim 16, wherein the target timbre data is filtered by a temporal receptive field. 
 
27. The system as defined by claim 16, wherein the source speech data is from a source audio input. 
 

28. A computer program product for use on a computer system for training a speech conversion system using source speech data that represents a speech segment from a source voice for conversion into an output voice having a target voice timbre, the computer program product comprising a tangible, non-transient computer usable medium having computer readable program code thereon, the computer readable program code comprising: program code for causing a generative machine learning system to produce first candidate speech data that represents a first candidate speech segment in a first candidate voice as a function of the source speech data and target timbre data; program code for causing a discriminative machine learning system to compare the first candidate speech data to the target timbre data with reference to the timbre data of the plurality of different voices; program code for causing the discriminative machine learning 
 


 

30. The computer program product as defined by claim 28, wherein the machine learning system is a neural network. 
 
31. The computer program product as defined by claim 28, further comprising: program code for mapping a representation of each of the plurality of voices and the candidate voice in a vector space as a function of the timbre data from each voice. 
 

32. The computer program product as defined by claim 31, further comprising: program code for adjusting the representation of the candidate voice relative to at least one representation of the plurality of voices in the 
 

33. The computer program product as defined by claim 28, further comprising: program code for assigning a speaker identity to the candidate voice by comparing the candidate voice to the plurality of voices. 
 

34. The computer program product as defined by claim 28, further comprising: program code for filtering an inputted target audio using a temporal receptive field to produce the timbre data. 
 

36. The computer program product as defined by claim 28, further comprising: program code for converting the source speech data that represents the speech segment from the 
 

37. The computer program product as defined by claim 36, further comprising: program code for adding a watermark to the transformed speech segment.


Allowable Subject Matter

Claims 1-20 are allowed over the prior art of record.

The following is an examiner’s statement of reasons for allowance:

As per the independent claims, the prior art of record, individually or in combination, does not explicitly teach generating first speech data and/or second speech data from the generator, the first speech data and the second speech data each configured to represent speech, the first speech data and the second speech data each including a candidate watermark; producing an inconsistency message as a function of at least one difference between the first speech data and . 
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”


Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.  Please see related art listed on the PTO-892 form.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Opsasnick, telephone number (571)272-7623, who is available Monday-Friday, 9am-5pm. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Mr. Richemond Dorvil, can be reached at (571)272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Michael N Opsasnick/
Primary Examiner, Art Unit 2658
01/27/2022