DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This Office action is sent in response to Applicant’s communication received on 8/28/2019 for application number 16553997. The Office hereby acknowledges receipt of the following placed of record in the file: Specification, Abstract, Oath/Declaration, and claims.

Claims 1-20 are presented for examination. 
Examiner’s Remark 
Claims 15-20 are not rejected under 35 U.S.C. 101, because Para 0090 and Para 0101 of the specification state: “A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.”

Information Disclosure Statement
The information disclosure statement submitted on 8/28/2019 was filed before the mailing date of the first Office action. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
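A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.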




Claims 1-2, 5, 10-11, 13, and 15-17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Piero (WO 2019191251).

Regarding claim 1, Piero teaches a system (Fig 7), comprising: a memory that stores computer executable components; and a processor, operably coupled to the memory, and that executes the computer executable components stored in the memory (Fig 5A, 5B, memory and processor), wherein the computer executable components comprise: a speech analysis component that determines a condition of an origin of an audio signal (extract feature 408, Para 0072; wherein the extracted feature is a condition of an origin. The extracted features ‘V’ may be based on differences in speech properties such as differences in pitch, amplitude, duration, etc. between synthetic speech data and recorded reference speech data. The synthetic speech data ‘D’ and reference speech data ‘R’ (e.g., natural speech) may be aligned in time for facilitating with a feature extraction step or steps. These extracted features ‘V’ may include, but are not limited to, Fundamental Frequency (F0), LF (Liljencrants-Fant model) features representing the source signal (e.g., vocal folds’ behavior), parametric representation of the spectrum (such as Cepstral Coefficients), linguistic features representing the context, linguistic features related to the context, and a difference signal between the recorded reference speech and synthesized speech. In an example, the difference signal that may be modeled is a source signal, and not the parameter space. This difference signal may be modeled in a space of vector quantized excitation vectors that may be built in the training mode.
In an example embodiment, where the system 100 is the parametric text-to-speech synthesis system, the extracted features ‘V’ may particularly include a sequence of excitation vectors, corresponding to the differences between the synthetic speech data ‘D’ (e.g., SPSS) and the recorded reference speech data ‘R’ (e.g., natural speech signal), for the first input text ‘TG, Para 0048) based on a difference between a first feature of the audio signal and a second feature of a synthesized reference audio signal (extract at least one feature based on the speech and the synthetic data, Para 0072, Fig 7; wherein the feature can be pitch, duration, amplitude, etc., Para 0046).
Regarding claim 2, Piero, as above in claim 1, further teaches: a synthetic speech component that generates the synthesized reference audio signal, wherein the audio signal and the synthesized reference audio signal express a mutual sentence structure (synthetic speech of the same signal, Fig 7).



Regarding claim 5, Piero, as above in claim 1, teaches a feature component that extracts a first vector from the audio signal that characterizes the first feature and extracts a second vector from the synthesized reference audio signal that characterizes the second feature (comparison based on the feature, Para 0075-0076).

Regarding claim 11, arguments analogous to those for claim 2 are applicable.



Regarding claim 10, arguments analogous to those for claim 1 are applicable. In addition, Piero teaches a computer-implemented method that performs the steps of claim 1 (Abstract).
Regarding claim 15, arguments analogous to those for claim 1 are applicable. In addition, Piero teaches a computer program product for characterizing speech, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform the functions of claim 1 (Abstract, Fig 5A-5B).
Regarding claim 13, arguments analogous to those for claim 5 are applicable.
Regarding claim 16, arguments analogous to those for claim 2 are applicable.

Regarding claim 17, arguments analogous to those for claim 5 are applicable.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 3-4 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Piero (WO 2019191251) in view of Mahyar (US Pat. No. 10930263).

Regarding claim 3, Piero, as above in claim 2, does not explicitly teach a speech content component that analyzes the audio signal using a machine learning model to determine a sentence structure expressed by the audio signal, and wherein the synthetic speech component generates the synthesized reference audio signal to match the sentence structure.
However, Mahyar teaches a speech content component that analyzes the audio signal using a machine learning model to determine a sentence structure expressed by the audio signal (words, textual granularity using a learned model, Fig 4, Fig 1, Col 6, lines 20-42), and wherein the synthetic speech component generates the synthesized reference audio signal to match the sentence structure (optimize to match the recorded speech, Fig 4, Col 6, lines 40-67).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Piero to include the concept of Mahyar in order to improve the technology of synthesized voices (Col 1, lines 45-67, Mahyar).

Regarding claim 4, Piero, as above in claim 1, does not explicitly teach wherein the synthesized reference audio signal is comprised within a plurality of synthesized reference audio signals generated by the synthetic speech component, and wherein the plurality of synthesized reference audio signals express the mutual sentence structure.
However, Mahyar teaches wherein the synthesized reference audio signal is comprised within a plurality of synthesized reference audio signals generated by the synthetic speech component, and wherein the plurality of synthesized reference audio signals express the mutual sentence structure (Following training, to synthesize multiple predicted audio waveforms (each one also referred to as “predicted 105 speaking in the target language, each of these neural networks is provided a representation of text input (e.g., the movie dialogue “What does it mean to be Samurai? . . . to master the way of the sword” as illustrated in text 140), and each neural network predicts as outputs an audio waveform (or parameters for generating an audio waveform) corresponding to the text input being spoken by the speaker 105 in the target language, Col 2, lines 25-50).

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Piero to include the concept of Mahyar in order to improve the technology of synthesized voices (Col 1, lines 45-67, Mahyar).

Regarding claim 12, arguments analogous to those for claim 3 are applicable.

Claims 6-8, 14, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Piero (WO 2019191251) in view of Chae (US 20200058290).
Regarding claim 6, Piero, as above in claim 5, teaches a differential component that determines the difference between the first feature and the second feature based on the first vector and the second vector (difference based on amplitude, duration, etc., Para 0004, 0075).
Piero does not explicitly teach wherein the difference correlates to a speech pattern associated with the origin.
However, Chae teaches wherein the difference correlates to a speech pattern associated with the origin (extract synthesis analysis, Fig 8; wherein the syntax analysis includes the accent, stress in speech, etc., Para 0254, Para 0185).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Piero to include the concept of Chae in order to further optimize the system (Fig 8-Fig 9, Para 0300, Chae).
Regarding claim 7, Piero modified by Chae, as above in claim 6, teaches a classification component that generates a machine learning model that classifies the speech pattern to determine the condition (machine learning to determine the syntax, Para 0280, 0318, Chae).
Regarding claim 8, Piero modified by Chae, as above in claim 7, teaches wherein the machine learning model utilizes a neural network model (deep learning, Para 0280, Fig 2, Fig 10, Fig 11, Chae).


 
Regarding claim 14, Piero, as above in claim 13, teaches further comprising: determining, by the system, the difference between the first feature and the second feature based on the first vector and the second vector (The training mode may include steps for aligning recorded reference speech data (e.g., natural speech) with synthetic speech, and during the same training mode, features may be extracted (as described above) where the extracted features may include a sequence of excitation vectors corresponding to difference between the synthetic speech data and the recorded reference speech data. Forming, using vectors, a vector quantized space that may be built during training mode for modelling extracted features (difference signal) to generate speech gap filling model, Para 0027, 0048, 0075).
Piero does not explicitly teach wherein the difference correlates to a speech pattern associated with the origin; and generating, by the system, a neural network model that classifies the speech pattern to determine the condition.
However, Chae teaches wherein the difference correlates to a speech pattern associated with the origin (machine learning to determine the syntax, Para 0280, 0318, Chae); and generating, by the system, a neural network model that classifies the speech pattern to determine the condition (machine learning to determine the syntax, Para 0280, 0318, Chae).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Piero to include the concept of Chae in order to further optimize the system (Fig 8-Fig 9, Para 0300, Chae).


Regarding claim 20, Piero modified by Chae, as above in claim 19, teaches wherein the processor utilizes a cloud computing environment to generate the machine learning model (Para 0006-0007, 0088-0092, Chae).


Regarding claim 18, arguments analogous to those for claim 6 are applicable.

Regarding claim 19, arguments analogous to those for claim 7 are applicable.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Piero (WO 2019191251) in view of Sabrina (“Vocal caricatures reveal signatures of speaker identity”).

Regarding claim 9, Piero, as above in claim 1, teaches features such as amplitude, pitch, and duration, but does not explicitly teach wherein the condition comprises a member selected from a group consisting of an identity, an emotional state, an accent, an age, and a health status.
However, Sabrina teaches that pitch, amplitude, and duration determine an identity, an emotional state, an accent, an age, and a health status (pitch determines the identity of the speaker, under “Acoustic spaces of similarity and identity,” left col, page 3; “Although studies that focused on prosodic aspects were inconclusive some temporal properties as pitch f0(t), sound intensity I(t) and duration D(t) have been shown to be cues for differentiating voices,” under “Data Analysis,” right col, page 6).

(Abstract, Sabrina) 
Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to RICHA MISHRA whose telephone number is (571)272-5357.  The examiner can normally be reached on M-T 7AM - 5:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Benny Tieu, can be reached on (571)272-7490. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.