DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-7 and 12-18 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Chang (US PG Pub 20210166705).

	As per claims 1 and 12, Chang discloses:
	A computer-implemented method and system comprising:
	a non-transitory machine-readable memory configured to store machine-readable instructions for one or more neural networks (Chang; p. 0100 - The methods according to the above-described example embodiments may be recorded in non-transitory computer-readable media including program instructions to implement various operations of the above-described example embodiments); and
	a computer comprising a processor (Chang; p. 0098 – a processor) configured to:
	obtaining, by a computer, a plurality of training audio signals including one or more lower-bandwidth audio signals having a first bandwidth and one or more corresponding higher-bandwidth audio signals having a second bandwidth, wherein the first bandwidth is comparatively lower than the second bandwidth (Chang; Fig. 1, item 110; p. 0037 - extracting feature vectors from a narrowband (NB) signal and a wideband (WB) signal of a speech);
	training, by the computer, a bandwidth expander comprising a set of one or more neural network layers of a neural network, the bandwidth expander trained by applying the neural network on the plurality of training audio signals (Chang; Fig. 1, item 130; p. 0037 - training a DNN classification model that discriminates the estimated feature vector of the wideband signal from the actually extracted feature vector of the wideband signal and feature vector of the narrowband signal);
	receiving, by the computer, an inbound audio signal having the first bandwidth (Chang; Fig. 4, item 401; p. 0070-0072 – narrowband signal input for bandwidth extension); and
	generating, by the computer, an estimated inbound audio signal having the second bandwidth by applying the bandwidth expander of the neural network on the inbound audio signal (Chang; Fig. 4, item 403; p. 0070-0072 – generating wideband signal 403 using a DNN generation model 410 and a narrowband signal 401 as an input).
	As per claims 2 and 13, Chang discloses:
	The method and system according to claims 1 and 12, wherein obtaining the plurality of training audio signals includes: generating, by the computer, a lower-bandwidth audio signal having the first bandwidth by executing a codec program on a higher-bandwidth audio signal having the second bandwidth (Chang; p. 0050 - the narrowband signal may be generated by down-sampling the wideband signal. To apply a performance degradation in an actual communication environment, the narrowband signal may be modified using a narrowband codec, for example, an adaptive multi-rate (AMR) or an adaptive multi-rate narrowband (AMR-NB)).

	As per claims 3 and 14, Chang discloses:
	The method and system according to claims 1 and 12, wherein obtaining the plurality of training audio signals includes: generating, by the computer, a simulated lower-bandwidth audio signal having a type of degradation by executing an augmentation operation for the type of degradation on a lower-bandwidth audio signal, the plurality of training audio signals further comprising the simulated lower-bandwidth audio signal, and wherein the inbound audio signal has the type of degradation, whereby the estimated inbound audio signal generated by the computer is an enhanced inbound audio signal having comparatively less of the type of degradation (Chang; p. 0060-0065 - The feature vector extractor 210 may extract feature vectors from a narrowband signal and a wideband signal of a speech. The narrowband signal may be generated by down-sampling the wideband signal and may degrade the performance using a narrowband codec to apply a performance degradation by a codec in an actual communication environment. For example, the narrowband signal may be modified using the narrowband codec, for example, an AMR or an AMR-NB to apply the performance degradation in the actual communication environment).
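	For illustration only, the down-sampling step described in the cited paragraphs of Chang (a narrowband signal derived from the wideband signal) might be sketched as follows. The helper name `make_narrowband`, the 16 kHz and 8 kHz sampling rates, and the windowed-sinc filter are assumptions made for this sketch and are not taken from Chang; the AMR or AMR-NB codec pass is not reproduced.

```python
import numpy as np

def make_narrowband(wideband: np.ndarray, factor: int = 2, taps: int = 63) -> np.ndarray:
    """Low-pass filter and decimate a wideband signal (hypothetical helper)."""
    # Windowed-sinc low-pass filter with its cutoff at the new Nyquist frequency.
    n = np.arange(taps) - (taps - 1) / 2
    h = np.sinc(n / factor) * np.hamming(taps)
    h /= h.sum()  # normalize to unity gain at DC
    filtered = np.convolve(wideband, h, mode="same")
    return filtered[::factor]  # keep every `factor`-th sample

# Assumed example: one second of a 440 Hz tone sampled at 16 kHz.
fs_wb = 16000
t = np.arange(fs_wb) / fs_wb
wideband = np.sin(2 * np.pi * 440 * t)
narrowband = make_narrowband(wideband)  # effective sampling rate of 8 kHz
```

	The pair (`narrowband`, `wideband`) corresponds to one lower-bandwidth and higher-bandwidth training pair; content above the new Nyquist frequency is attenuated before decimation so that it does not alias into the narrowband signal.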

	As per claims 4 and 15, Chang discloses:
	The method and system according to claims 1 and 12, further comprising: extracting, by the computer, one or more features from each of the training audio signals, wherein the computer applies the neural network on the one or more features of the training audio signals (Chang; Fig. 1, item 110; p. 0037 - extracting feature vectors from a narrowband (NB) signal and a wideband (WB) signal of a speech); and extracting, by the computer, the one or more features from the inbound audio signal, wherein the computer applies the neural network on the one or more features of the inbound audio signal (Chang; Fig. 4, item 401; p. 0070-0072 – a DNN generation model 410 configured to estimate a feature vector 402 of a wideband signal using a feature vector 401 of a narrowband signal as an input may be trained).

	As per claims 5 and 16, Chang discloses:
	The method and system according to claims 1 and 12, wherein at least one higher-bandwidth audio signal of the plurality of training signals originated via a channel configured for the second bandwidth (Chang; Fig. 1, item 110; p. 0037 - extracting feature vectors from a narrowband (NB) signal and a wideband (WB) signal of a speech).

	As per claims 6 and 17, Chang discloses:
	The method and system according to claims 1 and 12, wherein the computer generates the estimated inbound audio signal, in response to the computer determining that the inbound audio signal originated via a channel configured for the first bandwidth (Chang; p. 0050 - the narrowband signal may be generated by down-sampling the wideband signal. To apply a performance degradation in an actual communication environment, the narrowband signal may be modified using a narrowband codec, for example, an adaptive multi-rate (AMR) or an adaptive multi-rate narrowband (AMR-NB)).

	As per claims 7 and 18, Chang discloses:
	The method and system according to claims 1 and 12, wherein training further comprises performing, by the computer, a loss function of the neural network according to a training estimated audio signal outputted by the neural network for a training audio signal, the loss function instructing the computer to update one or more hyperparameters of one or more layers of the bandwidth expander (Chang; p. 0078-0082 - A cost function (loss function) may be designed to determine a classification result D(x) of an actual wideband signal x as 1 and to determine a classification result D(G(z)) of an estimated wideband signal G(z) as 0 according to the following Equation 1).
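	For illustration only, a cost function of the kind the cited passage describes (driving D(x) toward 1 for an actual wideband signal and D(G(z)) toward 0 for an estimated wideband signal) might be sketched as follows. Chang's Equation 1 is not reproduced in this action; the binary cross-entropy form and the name `discriminator_cost` are assumptions made for this sketch.

```python
import numpy as np

def discriminator_cost(d_real: np.ndarray, d_fake: np.ndarray, eps: float = 1e-12) -> float:
    """Binary cross-entropy with target 1 for D(x) and target 0 for D(G(z)).

    Assumed form; not taken from Chang's Equation 1.
    """
    # Clip to avoid log(0) when the classifier output saturates.
    d_real = np.clip(d_real, eps, 1.0 - eps)
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return float(-(np.log(d_real).mean() + np.log(1.0 - d_fake).mean()))

# Hypothetical classifier outputs for actual (x) and estimated (G(z)) signals.
good = discriminator_cost(np.array([0.9, 0.95]), np.array([0.1, 0.05]))
poor = discriminator_cost(np.array([0.5, 0.4]), np.array([0.5, 0.6]))
```

	Under this assumed form the cost is lower when the classifier separates actual from estimated wideband signals (`good`) than when it does not (`poor`).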

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 8-11 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Chang in view of Heigold (US PG Pub 20170069327).

	As per claims 8 and 19, Chang discloses:
	The method and system according to claims 1 and 12, upon which claims 8 and 19 depend.
	Chang, however, fails to disclose training, by the computer, a speaker recognizer comprising a second set of one or more neural network layers by applying the speaker recognizer on a plurality of second training audio signals comprising one or more clean audio signals and simulated audio signals; extracting, by the computer, an enrollee voiceprint for an enrollee by applying the speaker recognizer on one or more enrollee audio signals of the enrollee; extracting, by the computer, an inbound voiceprint for an inbound speaker by applying the neural network architecture to the estimated inbound audio signal; and generating, by the computer, a likelihood score based upon the inbound voiceprint and the enrollee voiceprint, the likelihood score indicating a likelihood that the inbound speaker is the enrollee.
	Heigold does teach training, by the computer, a speaker recognizer comprising a second set of one or more neural network layers by applying the speaker recognizer on a plurality of second training audio signals comprising one or more clean audio signals and simulated audio signals (Heigold; p. 0051 - At stage (B), the neural network 140 may be trained in a manner that parallels the enrolment and verification of users at a client device. Accordingly, the computing system 120 can select in each training sample 122 a set of simulated enrollment utterances 122b and a simulated verification utterance 122a. The simulated enrollment utterances 122b may all be utterances of the same training speaker, such that a simulated speaker model can be determined for each training sample 122. The simulated verification utterance 122a may be an utterance of the same speaker as the speaker of the simulated enrollment utterances 122b, or may be an utterance of a different speaker. The training samples 122 can then be provided to the neural network 140, and a classification can be made based on outputs of the neural network 140 as to whether the simulated verification utterance 122a was spoken by the same speaker as the speaker of the simulated enrollment utterances 122b, or by a different speaker from the speaker of the simulated enrollment utterances 122b. The neural network 140 can then be updated based on whether the speaker determination was correct);
	extracting, by the computer, an enrollee voiceprint for an enrollee by applying the speaker recognizer on one or more enrollee audio signals of the enrollee (Heigold; p. 0050 - Each training speaker may speak a predetermined utterance to a computing device, and the computing device may record an audio signal that includes the utterance. For example, each training speaker may be prompted to speak the training phrase "Hello Phone." In some implementations, each training speaker may be prompted to speak the same training phrase multiple times. The recorded audio signal of each training speaker may be transmitted to the computing system 120, and the computing system 120 may collect the recorded audio signals from many different computing devices and many different training speakers);
	extracting, by the computer, an inbound voiceprint for an inbound speaker by applying the neural network architecture to the estimated inbound audio signal (Heigold; p. 0050 - Each training speaker may speak a predetermined utterance to a computing device, and the computing device may record an audio signal that includes the utterance. For example, each training speaker may be prompted to speak the training phrase "Hello Phone." In some implementations, each training speaker may be prompted to speak the same training phrase multiple times. The recorded audio signal of each training speaker may be transmitted to the computing system 120, and the computing system 120 may collect the recorded audio signals from many different computing devices and many different training speakers. In some implementations, the neural network 140 may be optimized for text-dependent speaker verification, in that a user's identity may be verified based on characteristics of the user's voice determined from an utterance of the pre-defined training phrase. In such implementations, the neural network 140 may be trained on utterances that all, or substantially all, include the pre-defined training phrase. In other implementations, the neural network 140 may be trained to allow for text-independent speaker verification, in that a user's identity may be verified based on characteristics of the user's voice determined from an utterance of a wide variety of words or phrases, which may not be pre-defined); and
	generating, by the computer, a likelihood score based upon the inbound voiceprint and the enrollee voiceprint, the likelihood score indicating a likelihood that the inbound speaker is the enrollee (Heigold; p. 0072 - For example, if the training sample were labeled as truly having non-matching speakers, incorrectly classified the training sample as having matching speakers, then the neural network 206 may be automatically adjusted to correct the error. More generally, the neural network 206 may be optimized so as to maximize the similarity score for matching speakers samples or to optimize a score output by the logistic regression, and the neural network 206 may also be optimized so as to minimize the similarity score for non-matching speakers samples or to optimize the score output by the logistic regression).
	Therefore, it would have been obvious to one of ordinary skill in the art to modify the method of Chang to include training, by the computer, a speaker recognizer comprising a second set of one or more neural network layers by applying the speaker recognizer on a plurality of second training audio signals comprising one or more clean audio signals and simulated audio signals; extracting, by the computer, an enrollee voiceprint for an enrollee by applying the speaker recognizer on one or more enrollee audio signals of the enrollee; extracting, by the computer, an inbound voiceprint for an inbound speaker by applying the neural network architecture to the estimated inbound audio signal; and generating, by the computer, a likelihood score based upon the inbound voiceprint and the enrollee voiceprint, the likelihood score indicating a likelihood that the inbound speaker is the enrollee, as disclosed by Heigold, in order to allow a user to “enroll” with the device by providing to the device one or more samples of speech spoken by the user, from which a speaker model representing the user's voice is determined. Subsequent speech samples received at the device may then be processed and evaluated with respect to the speaker model to verify a user's identity (Heigold; p. 0002).
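	For illustration only, a likelihood score computed from an inbound voiceprint and an enrollee voiceprint might be sketched as follows. The cited passage of Heigold refers to a similarity score and a score output by a logistic regression; the cosine-similarity form, the function name `likelihood_score`, and the `weight` and `bias` parameters are assumptions made for this sketch and are not taken from either reference.

```python
import numpy as np

def likelihood_score(inbound_vp: np.ndarray, enrollee_vp: np.ndarray,
                     weight: float = 1.0, bias: float = 0.0) -> float:
    """Map the cosine similarity of two voiceprints to a score in (0, 1).

    `weight` and `bias` are hypothetical logistic-regression parameters.
    """
    cosine = float(np.dot(inbound_vp, enrollee_vp)
                   / (np.linalg.norm(inbound_vp) * np.linalg.norm(enrollee_vp)))
    return 1.0 / (1.0 + np.exp(-(weight * cosine + bias)))

# Hypothetical three-dimensional voiceprints.
enrollee = np.array([1.0, 0.0, 0.0])
same_speaker = likelihood_score(np.array([2.0, 0.0, 0.0]), enrollee)
other_speaker = likelihood_score(np.array([0.0, 1.0, 0.0]), enrollee)
```

	Under these assumptions a voiceprint pointing in the same direction as the enrollee voiceprint (`same_speaker`) scores higher than an orthogonal one (`other_speaker`).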

	As per claim 9, Chang in view of Heigold discloses:
	The method according to claim 8, further comprising executing, by the computer, one or more data augmentation operations on at least one of a second training audio signal and an enrollee audio signal (Chang; p. 0060-0065 - The feature vector extractor 210 may extract feature vectors from a narrowband signal and a wideband signal of a speech. The narrowband signal may be generated by down-sampling the wideband signal and may degrade the performance using a narrowband codec to apply a performance degradation by a codec in an actual communication environment. For example, the narrowband signal may be modified using the narrowband codec, for example, an AMR or an AMR-NB to apply the performance degradation in the actual communication environment).

	As per claim 10, Chang in view of Heigold discloses:
	The method according to claim 9, wherein executing the one or more data augmentation operations includes applying the bandwidth expander on the at least one of the second training audio signal and the enrollee audio signal (Chang; Fig. 4, item 403; p. 0070-0072 – generating wideband signal 403 using a DNN generation model 410 and a narrowband signal 401 as an input).

	As per claims 11 and 20, Chang in view of Heigold discloses:
	The method and system according to claims 8 and 19, further comprising generating, by the computer, an estimated enrollee audio signal for the one or more enrollee audio signals by applying the bandwidth expander on an enrollee audio signal having the first bandwidth and originated via a channel configured for the first bandwidth (Chang; Fig. 4, item 403; p. 0070-0072 – generating wideband signal 403 using a DNN generation model 410 and a narrowband signal 401 as an input).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The prior art made of record and not relied upon includes:
	Schmidt (US PG Pub 20200243102) discloses an apparatus for generating a bandwidth enhanced audio signal from an input audio signal having an input audio signal frequency range, the apparatus including: a raw signal generator configured for generating a raw signal having an enhancement frequency range, wherein the enhancement frequency range is not included in the input audio signal frequency range; a neural network processor configured for generating a parametric representation for the enhancement frequency range using the input audio frequency range of the input audio signal and a trained neural network; and a raw signal processor for processing the raw signal using the parametric representation for the enhancement frequency range to obtain a processed raw signal having frequency components in the enhancement frequency range, wherein the processed raw signal or the processed raw signal and the input audio signal frequency range of the input audio signal represent the bandwidth enhanced audio signal (Schmidt; Abstract).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Rodrigo A Chavez whose telephone number is (571)270-0139. The examiner can normally be reached Monday - Friday 9-6 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/RODRIGO A CHAVEZ/Examiner, Art Unit 2658

/RICHEMOND DORVIL/Supervisory Patent Examiner, Art Unit 2658