DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Specification
The disclosure is objected to because of the following informalities: para [0032] refers to “Gaussian noise 204” in line 8 and line 11, which should instead read “Gaussian noise 208”.  
Appropriate correction is required.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 7, 12, 21, 27, and 32-34 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Arik et al. (US 2019/0355347 A1), hereinafter referred to as Arik.

Examiner notes that the independent claims recite the language “to help” which is considered intended use, and is not given patentable weight. In the interest of compact prosecution, Examiner has mapped the claims as if the limitations had patentable weight, but recommends changing the language to “configured to” or a similar amendment.

Regarding claim 1, Arik teaches:
A processor, comprising: 
one or more arithmetic logic units (ALUs) to help synthesize a second audio signal based, at least in part, on one or more neural networks trained using one or more characteristics of a first audio signal (para [0141], where a CPU is used, and Fig. 3-4, para [0082], [0087], where a CNN is trained using an input spectrogram with multiple frequency channels, and the trained CNN is used to generate a synthesized waveform from an input spectrogram).  

Regarding claim 7, Arik teaches:
A system, comprising: 
one or more processors to help synthesize a second audio signal based, at least in part, on one or more neural networks trained using one or more characteristics of a first audio signal (para [0141], where a CPU is used, and Fig. 3-4, para [0082], [0087], where a CNN is trained using an input spectrogram with multiple frequency channels, and the trained CNN is used to generate a synthesized waveform from an input spectrogram); and 
one or more memories to store parameters associated with the one or more neural networks (para [0141], where memory is used).  

Regarding claim 12, Arik teaches:
The system of claim 7, wherein the first audio signal is human speech (para [0093], where recordings of speakers are used).  

Regarding claim 21, Arik teaches:
A processor, comprising: 
one or more arithmetic logic units (ALUs) to help train one or more neural networks to synthesize a second audio signal based, at least in part, on one or more characteristics of a first audio signal (para [0141], where a CPU is used, and Fig. 3-4, para [0082], [0087], where a CNN is trained using an input spectrogram with multiple frequency channels, and the trained CNN is used to generate a synthesized waveform from an input spectrogram).  

Regarding claim 27, Arik teaches:
A system, comprising: 
one or more processors to help train one or more neural networks to synthesize a second audio signal based, at least in part, on one or more characteristics of a first audio signal (para [0141], where a CPU is used, and Fig. 3-4, para [0082], [0087], where a CNN is trained using an input spectrogram with multiple frequency channels, and the trained CNN is used to generate a synthesized waveform from an input spectrogram); and 
one or more memories to store parameters associated with the one or more neural networks (para [0141], where memory is used).  

Regarding claim 32, Arik teaches:
The system of claim 27, wherein the second audio signal encodes synthesized human speech (para [0100], where the synthesized output is speech).  

Regarding claim 33, Arik teaches:
The system of claim 27, wherein the digital representation of the first audio signal is a digital recording of human speech (para [0093], where recordings of speakers are used).  

Regarding claim 34, Arik teaches:
A computer-implemented method, comprising: 
training one or more neural networks to synthesize a second audio signal based, at least in part, on one or more characteristics of a first audio signal (Fig. 3-4, para [0082], [0087], where a CNN is trained using an input spectrogram with multiple frequency channels, and the trained CNN is used to generate a synthesized waveform from an input spectrogram).  

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 14 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Arik, in view of Bastyr (US 2020/0388267 A1).

Regarding claim 14, Arik teaches:
A speech synthesis system comprising: 
one or more processors to help synthesize a second audio signal based, at least in part, on one or more neural networks trained using one or more characteristics of a first audio signal (para [0141], where a CPU is used, and Fig. 3-4, para [0082], [0087], where a CNN is trained using an input spectrogram with multiple frequency channels, and the trained CNN is used to generate a synthesized waveform from an input spectrogram); 
one or more memories to store parameters associated with the one or more neural networks (para [0141], where memory is used); and
Arik does not teach:
one or more audio output devices to play the second audio signal.  
Bastyr teaches:
one or more audio output devices to play the second audio signal (Fig. 1 element 124, para [0020], where a speaker outputs audio signals).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Arik by using the speakers of Bastyr (Bastyr para [0020]) in the system of Arik (Arik Fig. 19) in order to help reduce the level of unwanted noise from specific acoustic bands (Bastyr para [0003]).

Regarding claim 20, Arik in view of Bastyr teaches:
The speech synthesis system of claim 14, wherein the speech synthesis system comprises a vehicle (Bastyr para [0020], where the system is used in a vehicle).  

Claims 6, 13, 26, and 39 are rejected under 35 U.S.C. 103 as being unpatentable over Arik, in view of Bach et al. (US 2018/0018553 A1), hereinafter referred to as Bach.

Regarding claim 6, Arik teaches:
The processor of claim 1
Arik does not teach:
wherein the one or more neural networks are to be trained in a first direction and to generate inferences in a second direction.
Bach teaches:
wherein the one or more neural networks are to be trained in a first direction and to generate inferences in a second direction (Fig. 1b, 2b, para [0226], where the network is trained in one direction and can operate in both directions).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Arik by using the training of Bach (Bach para [0221]) on the network of Arik (Arik para [0082], [0087]) in order to determine relevance scores of feature samples rather than input variables (Bach para [0226]).

Regarding claim 13, Arik teaches:
The system of claim 7
Arik does not teach:
wherein the one or more neural networks are to be trained in a first direction and to generate inferences in a second direction.
Bach teaches:
wherein the one or more neural networks are to be trained in a first direction and to generate inferences in a second direction (Fig. 1b, 2b, para [0226], where the network is trained in one direction and can operate in both directions).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Arik by using the training of Bach (Bach para [0221]) on the network of Arik (Arik para [0082], [0087]) in order to determine relevance scores of feature samples rather than input variables (Bach para [0226]).

Regarding claim 26, Arik teaches:
The processor of claim 21
Arik does not teach:
wherein the one or more neural networks are to be trained in a first direction and to inference in a second direction.
Bach teaches:
wherein the one or more neural networks are to be trained in a first direction and to inference in a second direction (Fig. 1b, 2b, para [0226], where the network is trained in one direction and can operate in both directions).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Arik by using the training of Bach (Bach para [0221]) on the network of Arik (Arik para [0082], [0087]) in order to determine relevance scores of feature samples rather than input variables (Bach para [0226]).

Regarding claim 39, Arik teaches:
The method of claim 34,
Arik does not teach:
wherein the one or more neural networks are to be trained in a first direction and to inference in a second direction.
Bach teaches:
wherein the one or more neural networks are to be trained in a first direction and to inference in a second direction (Fig. 1b, 2b, para [0226], where the network is trained in one direction and can operate in both directions).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Arik by using the training of Bach (Bach para [0221]) on the network of Arik (Arik para [0082], [0087]) in order to determine relevance scores of feature samples rather than input variables (Bach para [0226]).

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Arik, in view of Bastyr, and further in view of Bach.

Regarding claim 19, Arik in view of Bastyr teaches:
The speech synthesis system of claim 14
Arik in view of Bastyr does not teach:
wherein the one or more neural networks generates Gaussian values in a first direction and synthesizes audio signals in a second direction.
Bach teaches:
wherein the one or more neural networks generates Gaussian values in a first direction and synthesizes audio signals in a second direction (Fig. 1b, 2b, para [0226], where the network is trained in one direction and can operate in both directions).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Arik in view of Bastyr by using the training of Bach (Bach para [0221]) on the network of Arik in view of Bastyr (Arik para [0082], [0087]) in order to determine relevance scores of feature samples rather than input variables (Bach para [0226]).

Claim 31 is rejected under 35 U.S.C. 103 as being unpatentable over Arik, in view of Ehteshami Bejnordi et al. (US 2020/0372361 A1), hereinafter referred to as Bejnordi, and further in view of Bach.

Regarding claim 31, Arik teaches:
The system of claim 27
Arik does not teach:
wherein the one or more neural networks generates Gaussian values in a first direction and synthesizes audio signals in a second direction.  
Bejnordi teaches:
wherein the one or more neural networks generates Gaussian values in a first direction (para [0068], where the output of the neural network follows a Gaussian distribution).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Arik by using the training of Bejnordi (Bejnordi para [0068]) on the network of Arik (Arik para [0082], [0087]) in order to forgo performing batch normalization operations without any loss in performance or accuracy of output (Bejnordi para [0068]).
Bach teaches:
synthesizes audio signals in a second direction (Fig. 1b, 2b, para [0226], where the network can operate in both directions).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Arik in view of Bejnordi by using the training of Bach (Bach para [0221]) on the network of Arik in view of Bejnordi (Arik para [0082], [0087]) in order to determine relevance scores of feature samples rather than input variables (Bach para [0226]).

Allowable Subject Matter
Claims 2-5, 8-11, 15-18, 22-25, 28-30, 35-38, and 40 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter: The closest prior art of Arik, Bach, Bastyr, and Bejnordi does not teach all the limitations of the claims. Specifically, none of the cited prior art teaches the generation of Gaussian values based on both the audio signal and the converted compact representation, and training the neural networks using the Gaussian values, in combination with the other limitations. While Bejnordi does teach training the neural networks using the Gaussian values, these Gaussian values are not generated based on an input audio and an input compact representation. Hence, none of the cited prior art, either alone or in combination, teaches the combination of limitations found in the noted claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. US 2021/0375248 A1 para [0036] teaches use of a neural network in sound synthesis.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRYAN S BLANKENAGEL whose telephone number is (571)270-0685. The examiner can normally be reached 8:00am-5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at 571-272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BRYAN S BLANKENAGEL/Primary Examiner, Art Unit 2658