DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 06/10/2022 has been entered.
 
Response to Arguments
Applicant’s arguments with respect to claims 1-29 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Objections
Claim 19 is objected to because of the following informalities: line 1 reads “claim 16 the”, which should read “claim 16 wherein the”. Appropriate correction is required.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-29 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. Using the subject matter eligibility test from page 74621 of the Federal Register Notice titled “2014 Interim Guidance on Patent Subject Matter Eligibility,” a two-step process is performed.
Under Step 1, the claims are analyzed to determine if the claim is directed to a process, machine, article of manufacture, or composition of matter. In this case, claims 1-15 and 29 are directed to a method, which is a process, and claims 16-28 are directed to a system, which is a machine or article of manufacture.
Step 2A (part 1 of the Mayo test), using the guidance from pages 50-57 of the Federal Register Vol. 84 No. 4 from Monday, January 7, 2019, requires applying a two-prong inquiry. In Prong One, examiners evaluate whether the claim recites a judicial exception, determining if the claim is directed to a law of nature, a natural phenomenon, or an abstract idea. In this case, claim 1 recites computing, which is a mathematical calculation.
In Prong Two, examiners evaluate whether the judicial exception is integrated into a practical application that imposes a meaningful limit on the judicial exception. In this case, no additional limitations to the abstract idea are provided.
Step 2B (part 2 of the Mayo test) requires analyzing the claims to determine if they recite additional elements that amount to significantly more than the judicial exception. In this case, the claims do not include additional elements that are sufficient to amount to significantly more than the abstract idea itself.  

Regarding claims 1 and 16, receiving audio is a mental process, and computing coefficients and determining a measurement are mathematical calculations, which are abstract ideas, without integration into a practical application and without significantly more.

Regarding claims 2-3, 5, 7, 15, 17-18, 20, and 29, the limitations are further clarifications of the above abstract ideas, without integration into a practical application and without significantly more.

Regarding claims 4 and 19, computation using a neural network is a mathematical calculation, which is an abstract idea without integration into a practical application and without significantly more.

Regarding claims 6 and 21, estimating a glottal state is a mathematical calculation, which is an abstract idea without integration into a practical application and without significantly more.

Regarding claims 8-9 and 22-23, displaying an image and playing an audio recording are considered extra-solution activity that qualifies neither as a practical application nor as significantly more.

Regarding claims 10 and 24, associating variables with an utterance is a mental process, which is an abstract idea without integration into a practical application and without significantly more.

Regarding claims 11-13 and 25-27, computing functions, generating a matrix, and generating an eigenspectrum and determining a measurement are mathematical calculations, which are abstract ideas without integration into a practical application and without significantly more.

Regarding claims 14 and 28, computing changes is a mathematical calculation, which is an abstract idea without integration into a practical application and without significantly more.

The limitations of the claims, taken alone, do not amount to significantly more than the above-identified judicial exception (the abstract idea). Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements individually. Applicable case law cited in the Federal Register includes, but is not limited to: Alice Corp., 134 S. Ct. at 2355-56; Digitech Image Tech., LLC v. Electronics for Imaging, Inc., 758 F.3d 1344 (Fed. Cir. 2014); and Benson, 409 U.S. at 63.

See "Preliminary Examination Instructions in view of the Supreme Court Decision in Alice Corporation Pty. Ltd. v. CLS Bank International, et al.," dated June 25, 2014, and the Federal Register notice titled "2014 Interim Guidance on Patent Subject Matter Eligibility" (79 FR 74618).

	
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 10-13, 15-19, 24-27, and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Quatieri et al. (US 2015/0112232 A1), hereinafter referred to as Quatieri232, in view of Quatieri et al. (US 2017/0053665 A1), hereinafter referred to as Quatieri665.

Regarding claim 1, Quatieri232 teaches:
A method for measuring neuromotor coordination from speech: 
receiving an audio recording that includes spoken speech (Fig. 8A element 102, para [0065], where input speech is received); 
determining using a computing device time varying feature coefficients from at least a portion of the speech in the audio recording, the feature coefficients representing at least one characteristic of the at least a portion of the speech in the audio recording (para [0020], where features from the speech including MFCCs, formant frequencies, prosodic characteristics, etc. are measured); 
determining using a computing device, from the feature coefficients, one or more time varying vocal tract variables (para [0064], [0070], where the feature domains or channels of formant frequencies and Delta MFCC are the variables computed from the features, which capture vocal tract shape and dynamics); and 
determining using a computing device a measurement of a disorder based at least in part on a degree of correlation between at least two of the vocal tract variables (para [0064], [0070], where auto and cross correlations among channels of measurement domains are the basis for key depression features).  
Quatieri232 does not teach:
determining using a computing device, from the feature coefficients, one or more time varying vocal tract variables, each of said vocal tract variables being an output of a machine learning system estimating a position or movement of a corresponding  articulator of a vocal tract, said machine learning system being trained on data including measurements of positions or movement of articulators of vocal tracts;
Quatieri665 teaches:
determining using a computing device, from the feature coefficients, one or more time varying vocal tract variables (para [0032], where the DIVA model takes as inputs speech formants and computes parameters including articulatory commands), each of said vocal tract variables being an output of a machine learning system estimating a position or movement of a corresponding  articulator of a vocal tract (para [0050], where articulatory states include positions and velocities of articulators, which are output from the model), said machine learning system being trained on data including measurements of positions or movement of articulators of vocal tracts (para [0024], where articulator positions are measured, and para [0021], where the model learns using the measured signals);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Quatieri232 by using the DIVA model of Quatieri665 (Quatieri665 para [0032]) in the determination of the vocal tract changes of Quatieri232 (Quatieri232 para [0070]), as it would allow building a model to match a specific speaker instead of a generic speaker, while also allowing specificity across disorders (Quatieri665 para [0069]).

Regarding claim 2, Quatieri232 in view of Quatieri665 teaches:
The method of claim 1 wherein the feature coefficients represent characteristics of an audio power spectrum of the portion of the speech (Quatieri232 para [0043], where the MFCCs are coefficients that represent the short-term power spectrum).  

Regarding claim 3, Quatieri232 in view of Quatieri665 teaches:
The method of claim 2 wherein the feature coefficients comprise cepstral coefficients (Quatieri232 para [0043], where the MFCCs are mel-frequency cepstral coefficients that represent the short-term power spectrum).
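For illustration only, the relationship between cepstral coefficients and the short-term power spectrum noted for claims 2 and 3 can be sketched as follows. This is a minimal sketch of a plain real cepstrum, assuming NumPy is available; it omits the mel filterbank and discrete cosine transform used for MFCCs and is not taken from Quatieri232. The function name real_cepstrum, the Hamming window, the default of 13 coefficients, and the 1e-12 floor are all assumptions made for the example.

```python
import numpy as np


def real_cepstrum(frame, num_coeffs=13):
    """Return the first num_coeffs real-cepstrum values of one speech frame.

    Hypothetical helper for illustration; not the MFCC pipeline of any reference.
    """
    frame = np.asarray(frame, dtype=float)
    # Taper the frame to reduce spectral leakage (window choice is an assumption).
    windowed = frame * np.hamming(len(frame))
    # Short-term power spectrum of the frame.
    power = np.abs(np.fft.rfft(windowed)) ** 2
    # Log compression; the small floor avoids log(0) and is an assumption.
    log_power = np.log(power + 1e-12)
    # The inverse transform of the log power spectrum gives the cepstrum.
    cepstrum = np.fft.irfft(log_power)
    return cepstrum[:num_coeffs]
```

Calling real_cepstrum on successive frames of a recording would yield a time-varying sequence of coefficients, each of which summarizes the shape of that frame's power spectrum.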

Regarding claim 4, Quatieri232 in view of Quatieri665 teaches:
The method of claim 1
wherein determining the vocal tract variables comprises providing the feature coefficients as inputs to a neural network, and using the neural network to compute the vocal tract variables from the feature coefficients (Quatieri665 para [0021], [0032], where DIVA is a neural network model, which receives feature inputs and computes the vocal tract variables).

Regarding claim 10, Quatieri232 in view of Quatieri665 teaches:
The method of claim 1 further comprising associating the vocal tract variables with an utterance within the audio recording (Quatieri232 para [0033], where the speech-related variables are related to the user's speech).  

Regarding claim 11, Quatieri232 in view of Quatieri665 teaches:
The method of claim 1 wherein determining the measurement of the disorder comprises computing time correlation dependent functions of the at least one vocal tract variables (Quatieri232 para [0032], [0041], where a channel-delay correlation matrix of the channels is used, or alternatively, using the variables of para [0034-39]).  

Regarding claim 12, Quatieri232 in view of Quatieri665 teaches:
The method of claim 11 wherein computing the time correlation dependent functions comprises generating a channel-delay correlation matrix of the vocal tract variables and/or feature coefficients (Quatieri232 para [0032], [0041], where a channel-delay correlation matrix of the channels is used, or alternatively, using the variables of para [0034-39]).  

Regarding claim 13, Quatieri232 in view of Quatieri665 teaches:
The method of claim 12 further comprising generating an eigenspectrum of the channel-delay correlation matrix, and determining the measurement of the disorder comprises identifying eigenvalues within the eigenspectrum that have magnitudes indicating depressed speech (Quatieri232 para [0041], where the eigenvalues are constructed from the matrix, and para [0084], where the matrix eigenspectra are the eigenvalues, and where the eigenvalues are used to differentiate healthy and depressed subjects and estimate depression levels).  
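For illustration only, the channel-delay correlation matrix and eigenspectrum recited in claims 12 and 13 can be sketched as follows. This is a minimal sketch, assuming NumPy is available and that each channel is a row of equally sampled values; it is not taken from Quatieri232. The function names, the num_delays and delay_step parameters, and the choice of Pearson correlation are assumptions made for the example.

```python
import numpy as np


def channel_delay_correlation_matrix(channels, num_delays, delay_step=1):
    """Correlate every channel at every delay with every other channel and delay.

    channels: array of shape (num_channels, num_frames).
    Hypothetical helper for illustration only.
    """
    x = np.asarray(channels, dtype=float)
    num_channels, num_frames = x.shape
    # Frames that remain once the largest delay has been applied.
    length = num_frames - (num_delays - 1) * delay_step
    if length < 2:
        raise ValueError("not enough frames for the requested delays")
    # Stack one delayed copy of each channel per row (time-delay embedding).
    rows = [
        x[c, d * delay_step : d * delay_step + length]
        for c in range(num_channels)
        for d in range(num_delays)
    ]
    # Pearson correlation between all rows; a constant row would give NaN.
    return np.corrcoef(np.vstack(rows))


def eigenspectrum(matrix):
    """Eigenvalues of a symmetric matrix, ordered from largest to smallest."""
    return np.linalg.eigvalsh(matrix)[::-1]
```

With 3 channels and 4 delays the matrix is 12 by 12, and the rank-ordered eigenvalues sum to 12 because the diagonal of a correlation matrix is all ones. How eigenvalue magnitudes are mapped to a disorder measurement is outside the scope of this sketch.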

Regarding claim 15, Quatieri232 in view of Quatieri665 teaches:
The method of claim 1 wherein determining the measurement of the disorder further includes a time delay correlation of the vocal tract variables (Quatieri232 Fig. 7, para [0031], where a time delay correlation of the channels is performed).  

Regarding claim 16, Quatieri232 teaches:
A machine-implemented system for measuring neuromotor coordination from speech, the system comprising: 
a receiver to receive an audio recording that includes speech (Fig. 8A element 102, para [0065], where input speech is received); 
a feature extractor configured to compute feature coefficients from at least a portion of the spoken speech in the audio recording, the feature coefficients representing at least one characteristic of the at least a portion of the speech in the audio recording (para [0020], where features from the speech including MFCCs, formant frequencies, prosodic characteristics, etc. are measured);  
a vocal tract variable generator configured to estimate from the feature coefficients one or more time varying vocal tract variables (para [0064], [0070], where the feature domains or channels of formant frequencies and Delta MFCC are the variables computed from the features, which capture vocal tract shape and dynamics); and 
a disorder identification module configured to determine a measurement of a disorder based at least in part on a degree of correlation between at least two of the vocal tract variables (para [0064], [0070], where auto and cross correlations among channels of measurement domains are the basis for key depression features).  
Quatieri232 does not teach:
a vocal tract variable generator comprising a machine learning system trained on data including measurements of positions or movement of articulators of vocal tracts and configured to estimate from the feature coefficients one or more time varying vocal tract variables, each of said vocal tract variables estimating a position or movement of a corresponding articulator of a vocal tract;
Quatieri665 teaches:
a vocal tract variable generator comprising a machine learning system trained on data including measurements of positions or movement of articulators of vocal tracts (para [0024], where articulator positions are measured, and para [0021], where the model learns using the measured signals) and configured to estimate from the feature coefficients one or more time varying vocal tract variables (para [0032], where the DIVA model takes as inputs speech formants and computes parameters including articulatory commands), each of said vocal tract variables estimating a position or movement of a corresponding articulator of a vocal tract (para [0050], where articulatory states include positions and velocities of articulators, which are output from the model);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Quatieri232 by using the DIVA model of Quatieri665 (Quatieri665 para [0032]) in the determination of the vocal tract changes of Quatieri232 (Quatieri232 para [0070]), as it would allow building a model to match a specific speaker instead of a generic speaker, while also allowing specificity across disorders (Quatieri665 para [0069]).

Regarding claim 17, Quatieri232 in view of Quatieri665 teaches:
The system of claim 16 wherein the feature coefficients represent an audio power spectrum of the portion of the spoken speech (Quatieri232 para [0043], where the MFCCs are coefficients that represent the short-term power spectrum).  

Regarding claim 18, Quatieri232 in view of Quatieri665 teaches:
The system of claim 17 wherein the feature coefficients are cepstral coefficients (Quatieri232 para [0043], where the MFCCs are mel-frequency cepstral coefficients that represent the short-term power spectrum).

Regarding claim 19, Quatieri232 in view of Quatieri665 teaches:
The system of claim 16 the machine learning system comprises a neural network that computes the vocal tract variables from the feature coefficients (Quatieri665 para [0021], [0032], where DIVA is a neural network model, which receives feature inputs and computes the vocal tract variables).  

Regarding claim 24, Quatieri232 in view of Quatieri665 teaches:
The system of claim 16 further comprising a time delay correlation module that associates the vocal tract variables with an utterance within the audio recording (Quatieri232 para [0033], where the speech-related variables are related to the user's speech).  

Regarding claim 25, Quatieri232 in view of Quatieri665 teaches:
The system of claim 16 further comprises a time delay correlation module that computes time correlation dependent functions of the at least one vocal trace variables (Quatieri232 para [0032], [0041], where a channel-delay correlation matrix of the channels is used, or alternatively, using the variables of para [0034-39]).  

Regarding claim 26, Quatieri232 in view of Quatieri665 teaches:
The system of claim 25 wherein the time delay correlation module computes the time correlation dependent functions by generating a channel-delay correlation matrix of the vocal tract variables and/or feature coefficients (Quatieri232 para [0032], [0041], where a channel-delay correlation matrix of the channels is used, or alternatively, using the variables of para [0034-39]).  

Regarding claim 27, Quatieri232 in view of Quatieri665 teaches:
The system of claim 26 further the time delay correlation module generates an eigenspectrum of the channel-delay correlation matrix (Quatieri232 para [0041], where the eigenvalues are constructed from the matrix, and para [0084], where the matrix eigenspectra are the eigenvalues); and 
the disorder identification module determining the measurement of the disorder comprises identifying eigenvalues within the eigenspectrum that have magnitudes indicating depressed speech (Quatieri232 para [0041], where the eigenvalues are constructed from the matrix, and para [0084], where the matrix eigenspectra are the eigenvalues, and where the eigenvalues are used to differentiate healthy and depressed subjects and estimate depression levels).  

Regarding claim 29, Quatieri232 in view of Quatieri665 teaches:
The method of claim 1 
wherein the vocal tract variables are selected from a group consisting of: a place of articulation; a manner of articulation; constriction and/or location of lips, tongue tip, tongue body, velum, or glottis; and features and/or position of nasal cavity, buccal cavity, nostrils, epiglottis, trachea, and hard palate (Quatieri665 para [0050], where the variables are positions and velocities of articulators such as tongue, jaw, lips, and larynx).  

Claims 5 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Quatieri232, in view of Quatieri665, and further in view of Seneviratne et al. (Seneviratne, N., Sivaraman, G., & Espy-Wilson, C. Y. (2019). Multi-Corpus Acoustic-to-Articulatory Speech Inversion. In Interspeech (pp. 859-863).), hereinafter referred to as Seneviratne.

Regarding claim 5, Quatieri232 in view of Quatieri665 teaches:
The method of claim 4 
Quatieri232 in view of Quatieri665 does not teach:
wherein the neural network comprises stored parameters determined using the Wisconsin X-Ray Microbeam database representing vocal tract variables associated with audio data
Seneviratne teaches:
wherein the neural network comprises stored parameters determined using the Wisconsin X-Ray Microbeam database representing vocal tract variables associated with audio data (Seneviratne pages 860-861 Section 3.2, where a FF-DNN receives MFCCs as input and outputs Tract Variables using XRMB data for training).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Quatieri232 in view of Quatieri665 by using the neural network of Seneviratne (Seneviratne pages 860-861, section 3.2) in the estimation of Quatieri232 in view of Quatieri665 (Quatieri232 para [0057], [0062]) in order to achieve a relatively speaker independent representation of speech articulation and characterize salient features of the vocal tract area function (Seneviratne page 860 section 2.1).

Regarding claim 20, Quatieri232 in view of Quatieri665 teaches:
The system of claim 19 
Quatieri232 in view of Quatieri665 does not teach:
wherein the neural network comprises stored parameters using the Wisconsin X-Ray Microbeam database representing vocal tract variables associated with audio data
Seneviratne teaches:
wherein the neural network comprises stored parameters using the Wisconsin X-Ray Microbeam database representing vocal tract variables associated with audio data (Seneviratne pages 860-861 Section 3.2, where a FF-DNN receives MFCCs as input and outputs Tract Variables using XRMB data for training).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Quatieri232 in view of Quatieri665 by using the neural network of Seneviratne (Seneviratne pages 860-861, section 3.2) in the estimation of Quatieri232 in view of Quatieri665 (Quatieri232 para [0057], [0062]) in order to achieve a relatively speaker independent representation of speech articulation and characterize salient features of the vocal tract area function (Seneviratne page 860 section 2.1).

Claims 6-7 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Quatieri232, in view of Quatieri665, and further in view of Lester et al. (US 2019/0371354 A1), hereinafter referred to as Lester.

Regarding claim 6, Quatieri232 in view of Quatieri665 teaches:
The method of claim 1
Quatieri232 in view of Quatieri665 does not teach:
wherein determining the vocal tract variables includes estimating a glottal state.
Lester teaches:
wherein determining the vocal tract variables includes estimating a glottal state (para [0054], where glottal closure features are determined from the determined fundamental period corresponding to the input audio signal).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Quatieri232 in view of Quatieri665 by using the glottal state of Lester (Lester para [0054]) as one of the vocal tract variables of Quatieri232 in view of Quatieri665 (Quatieri232 para [0064], [0070]) in order to break up the signal into segments and create a reduced audio signal without altering the pitch (Lester para [0055]).

Regarding claim 7, Quatieri232 in view of Quatieri665 and Lester teaches:
The method of claim 6 wherein estimating the glottal state comprises calculating the glottal state from acoustic measurements of the audio signal (Lester para [0054], where glottal closure features are determined from the determined fundamental period corresponding to the input audio signal).  

Regarding claim 21, Quatieri232 in view of Quatieri665 teaches:
The system of claim 16
Quatieri232 in view of Quatieri665 does not teach:
further comprising a glottal estimator configured to generate vocal tract variables includes estimating a glottal state.
Lester teaches:
further comprising a glottal estimator configured to generate vocal tract variables includes estimating a glottal state (para [0054], where glottal closure features are determined from the determined fundamental period corresponding to the input audio signal).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Quatieri232 in view of Quatieri665 by using the glottal state of Lester (Lester para [0054]) as one of the vocal tract variables of Quatieri232 in view of Quatieri665 (Quatieri232 para [0064], [0070]) in order to break up the signal into segments and create a reduced audio signal without altering the pitch (Lester para [0055]).

Claims 8-9 and 22-23 are rejected under 35 U.S.C. 103 as being unpatentable over Quatieri232, in view of Quatieri665, and further in view of Norsworthy (US 2015/0127352 A1).

Regarding claim 8, Quatieri232 in view of Quatieri665 teaches:
The method of claim 1
Quatieri232 in view of Quatieri665 does not teach:
further comprising displaying an image of a vocal tract on a display device.
Norsworthy teaches:
further comprising displaying an image of a vocal tract on a display device (Fig. 8, para [0042], where an animated vocal tract is shown on the device).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Quatieri232 in view of Quatieri665 by using the animation of Norsworthy (Norsworthy para [0042]) to display the vocal tract of Quatieri232 in view of Quatieri665 (Quatieri232 para [0064], [0070]) in order to promote literacy (Norsworthy para [0042]).

Regarding claim 9, Quatieri232 in view of Quatieri665 and Norsworthy teaches:
The method of claim 8 further comprising playing the audio recording and simultaneously animating the image of the vocal tract to display at least one of a constriction location and a degree of articulators along the vocal tract of the speaker represented by the time varying vocal tract variables (Norsworthy Fig. 8, para [0042], where the vocal tract is animated while synchronized with the audio, which displays the constriction location and degree of articulators).  

Regarding claim 22, Quatieri232 in view of Quatieri665 teaches:
The system of claim 16
Quatieri232 in view of Quatieri665 does not teach:
further comprising a display interface configured to display an image of a vocal tract.
Norsworthy teaches:
further comprising a display interface configured to display an image of a vocal tract (Fig. 8, para [0042], where an animated vocal tract is shown on the device).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Quatieri232 in view of Quatieri665 by using the animation of Norsworthy (Norsworthy para [0042]) to display the vocal tract of Quatieri232 in view of Quatieri665 (Quatieri232 para [0064], [0070]) in order to promote literacy (Norsworthy para [0042]).

Regarding claim 23, Quatieri232 in view of Quatieri665 and Norsworthy teaches:
The system of claim 22 wherein the display interface is configured to play the audio recording and simultaneously animate the image of the vocal tract to display the constriction location and degree of articulators along the vocal tract of the speaker (Norsworthy Fig. 8, para [0042], where the vocal tract is animated while synchronized with the audio, which displays the constriction location and degree of articulators).  

Allowable Subject Matter
Claims 14 and 28 would be allowable if rewritten to overcome the rejection under 35 U.S.C. 101 set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter: The closest prior art of record (Quatieri232, Quatieri665, Seneviratne, Lester, and Norsworthy) does not teach the limitations of these claims. Specifically, none of the cited prior art teaches determining the measurement of the disorder by computing changes in articulator kinematics as determined through phasing of coupled oscillatory models of articulatory gestures derived from vocal tract variables. Hence, none of the cited prior art, either alone or in combination, teaches the combination of limitations of the claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. US 2022/0208173 A1 para [0152] teaches a bLSTM learning mappings between articulators and acoustic parameters; US 2019/0333505 A1 para [0057] teaches training a model to decode feature data into speech articulator movements.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRYAN S BLANKENAGEL whose telephone number is (571)270-0685. The examiner can normally be reached 8:00am-5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at 571-272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BRYAN S BLANKENAGEL/Primary Examiner, Art Unit 2658