DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Response to Amendment
This office action is responsive to applicant’s remarks received on May 16, 2022. Claims 1, 2, 4, 6-18, 20, 21 and newly added Claim 22 are pending.


Response to Arguments
Applicant’s arguments with respect to the amended claims filed on May 16, 2022 have been fully considered but they are not persuasive.

A:  Applicant’s Remarks
For applicant’s remarks, see “Applicant Arguments/Remarks Made in an Amendment” filed on May 16, 2022.

A:  Examiner’s Response
The gist of Applicant’s arguments is that the cited references, either alone or in combination, do not teach, disclose or suggest evaluating the audio input from the user for speech errors using at least a sensitivity level, wherein evaluating the audio input from the user for speech errors comprises evaluating the audio input from the user for at least one speech error from a group consisting of substitution, insertion, hesitation, omission, correction, and repetition.
Examiner understands Applicant’s arguments but respectfully disagrees. Doyle ‘710, at Column 15, lines 15-55, teaches evaluating the audio input from the user for speech errors using at least a sensitivity level, wherein evaluating the audio input from the user for speech errors comprises evaluating the audio input from the user for at least one speech error from a group consisting of substitution, insertion, hesitation, omission, correction, and repetition.
Referring to Fig. 7A, at step 710, application software 222 causes the system to analyze one or more entries in error log 440 to determine whether any of the entries recorded in error log 440 are due to acoustic similarity. For example, if at step 710 the system detects an IGFA type error that is due to acoustic similarity, then the system at step 715 determines if there are any acoustically different synonyms included in the grammar for the improperly recognized user utterance. If no acoustically different synonyms are included in recognition grammar, then at step 717 the system adds to the grammar an acoustically different synonym for the utterance. Thereafter, at step 719, the system removes the acoustically similar term or phrase from the grammar's vocabulary. As such, the system replaces the improperly recognized term or phrase included in the vocabulary with one that cannot be easily confused with the user utterance. For example, if "read it" is being confused with "delete it" because the two phrases are acoustically similar, then the system would, for example, remove "delete it" from the grammar's vocabulary and substitute it with "get rid of it". The phrase "get rid of it" is not acoustically similar to "read it" and therefore cannot be as easily confused by the system. 
Thus, the cited references, either alone or in combination, teach, disclose or suggest the Applicant’s claimed invention. As a result, it is submitted that the present application is not in condition for allowance.
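For illustration only, the grammar update quoted above from Fig. 7A (steps 715 through 719) can be sketched as follows. This is a minimal sketch and not a characterization of how Doyle ‘710 is actually implemented: the function name `replace_confusable_phrase`, the representation of the grammar as a set of phrases, and the `synonyms` lookup table are all assumptions introduced here.

```python
from typing import Dict, List, Set


def replace_confusable_phrase(
    grammar: Set[str],
    confusable: str,
    synonyms: Dict[str, List[str]],
) -> Set[str]:
    """Return a new grammar in which `confusable` is replaced by a synonym.

    `synonyms` is a hypothetical table of acoustically different synonyms;
    the quoted passage does not say where such synonyms come from.
    """
    candidates = synonyms.get(confusable, [])
    if not candidates:
        # No synonym is known, so the grammar is left unchanged.
        return set(grammar)

    updated = set(grammar)
    # Step 715: is an acoustically different synonym already in the grammar?
    if not any(candidate in updated for candidate in candidates):
        # Step 717: add an acoustically different synonym.
        updated.add(candidates[0])
    # Step 719: remove the acoustically similar phrase.
    updated.discard(confusable)
    return updated


grammar = {"read it", "delete it"}
synonyms = {"delete it": ["get rid of it"]}
updated_grammar = replace_confusable_phrase(grammar, "delete it", synonyms)
```

With the "read it" / "delete it" example from the quoted passage, the updated grammar contains "read it" and "get rid of it" and no longer contains "delete it".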

NOTE:
The Examiner has tried to interpret the claims, as best the Examiner can ascertain, to develop an appropriate prior art rejection in the best interests of compact prosecution. If any interpretation of the Examiner's is considered incorrect or off-base, the Examiner invites the Applicant to show the portions of the Applicant's specification which give a more proper interpretation of the claimed subject matter.
Moreover, should any questions arise in connection with this application or should the Applicant believe that a telephone conference with the Examiner would be helpful in resolving any remaining issues pertaining to this application the Examiner respectfully requests that he be contacted at the number indicated below.


Claim Rejections - 35 USC § 103
1.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
2.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

3.	Claims 1, 2, 4, 6-18 & 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over Doyle (US 7,668,710 B2 hereinafter, Doyle ‘710).
Regarding claim 12; Doyle ‘710 discloses a system (Fig. 9A, System 910) 
comprising: 
a processing system (Fig. 9A, CPU 901);
a storage system (Fig. 9A, Memory 902);
and instructions stored on the storage system that when executed by the processing system (i.e. Computer software 920 is, typically, stored in storage media 906 and is loaded into memory 902 prior to execution. Computer software 920 may comprise system software 921 and application software 222. System software 921 includes control software such as an operating system that controls the low-level operations of computing system 910. Column 22, lines 50-62);
direct the processing system to at least: receive audio input from a user comprising one or more spoken words (Fig. 3, Step 310 i.e. At step 310, a human operator retrieves recorded information about a call by selecting an individual call from list box 210. Column 10, lines 11-25);
evaluate the audio input from the user for speech errors using at least a sensitivity level, wherein evaluating the audio input from the user for speech errors comprises evaluating the audio input from the user for at least one speech error from a group consisting of substitution, insertion, hesitation, omission, correction, and repetition (i.e. Referring to FIG. 7A, at step 710, application software 222 causes the system to analyze one or more entries in error log 440 to determine whether any of the entries recorded in error log 440 are due to acoustic similarity. For example, in accordance with one embodiment, if at step 710 the system detects an IGFA type error that is due to acoustic similarity, then the system at step 715 determines if there are any acoustically different synonyms included in the grammar for the improperly recognized user utterance. If no acoustically different synonyms are included in recognition grammar, then at step 717 the system adds to the grammar an acoustically different synonym for the utterance. Thereafter, at step 719, the system removes the acoustically similar term or phrase from the grammar's vocabulary. As such, the system replaces the improperly recognized term or phrase included in the vocabulary with one that cannot be easily confused with the user utterance. For example, if "read it" is being confused with "delete it" because the two phrases are acoustically similar, then the system would, for example, remove "delete it" from the grammar's vocabulary and substitute it with "get rid of it". The phrase "get rid of it" is not acoustically similar to "read it" and therefore cannot be as easily confused by the system. Column 15, lines 15-55);
determine speech errors for the audio input using at least a sensitivity level (Fig. 5, Step 510 i.e. application software 222 is utilized to improve the quality of voice recognition services provided by voice recognition gateway 100. To accomplish this, at step 510 in Fig. 5, application software 222 causes the system to analyze system logs, such as call log 136, transcription log 430, and error log 440, to determine the accuracy or efficiency level of the system. System settings, such as the confidence threshold, for example, may be also monitored to determine types and sources of error, and system recognition accuracy, for example. Column 11, lines 45-65);
determine whether an amount and type of speech errors in the audio input requires adjustment to the sensitivity level (i.e. Using the information recorded in system logs, the system determines and assigns an accuracy and efficiency level to the existing voice recognition system at step 520. Based on the calculated accuracy and efficiency levels, at step 530, the system determines if the voice recognition system needs to be adjusted to produce better results. Column 12, lines 19-34);
adjust the sensitivity level to a second sensitivity level based on the amount and type of the speech errors in the audio input (i.e. The voice recognition system settings, including the confidence threshold, are adjusted so that the number of false accepts and false reject type errors are minimized and the number of out-of-grammar correct rejects are maximized. The system calculates the recognition accuracy rate based on the number of correct accepts and correct rejects recorded in the system logs. By analyzing the accuracy results, the system may adjust the confidence threshold to improve recognition accuracy. Column 18, lines 20-39);
the second sensitivity level being different than the sensitivity level (i.e. The system confidence threshold is set so that out-of-grammar correct reject rate is at least 50% and the in-grammar false reject rate is equal to twice the in-grammar false accept rate. Column 18, lines 20-39);
and re-evaluate the speech errors for the audio input using at least the second sensitivity level (Fig. 5, Steps 550-560 i.e. Once the system has determined the sources of error, then at step 550 the system hypothesizes solutions that can resolve system inefficiencies and prevent certain errors from occurring. Exemplifying methods for hypothesizing solutions for various sources of error are illustrated in FIGS. 7A through 7B. Once the system has hypothesized one or more solutions, the system at step 560 reconfigures the voice recognition system based on one or more hypothesized solutions. The system may, for example, add or delete certain phonetic definitions or pronunciations to the recognition grammar or modify threshold settings depending on the types of errors detected and the solutions hypothesized. Column 12, lines 49-60).
Doyle ‘710 teaches most of the subject matter described above, except that Doyle ‘710 describes the input as a user’s utterance, whereas Applicant’s application uses an audio input. For example, the Abstract of Applicant’s application teaches that the sensitivity feature can receive audio input comprising one or more spoken words. Doyle ‘710 teaches at the Abstract that the voice recognition information comprises a recognized voice command associated with the user utterance and a reference to an audio file that includes the user utterance. Accordingly, Examiner equates the user’s utterance of Doyle ‘710 to the audio input of Applicant’s application. The recognized voice command associated with the user utterance in Doyle ‘710 could have been substituted for the audio input of Applicant’s application, and the results would have been predictable. The suggestion/motivation for doing so is that, with advancements in communications technology, it would make it easier for information to be accessed using voice recognition systems that translate voice commands into system commands for data retrieval from an electronic system. Therefore, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention. Thus, it would have been obvious for Doyle ‘710 to teach the stated limitation.
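For illustration only, the threshold criterion quoted above from Column 18 (out-of-grammar correct reject rate of at least 50%, and in-grammar false reject rate equal to twice the in-grammar false accept rate) can be sketched as a selection over per-threshold rates. This is a minimal sketch under stated assumptions: the `Rates` record, the `pick_threshold` function, the choice of the candidate closest to the two-to-one ratio, and every number in `rows` are hypothetical and are not taken from Doyle ‘710 or its Appendixes.

```python
from typing import NamedTuple, Optional, Sequence


class Rates(NamedTuple):
    threshold: int   # candidate confidence threshold
    igfa: float      # in-grammar false accept rate
    igfr: float      # in-grammar false reject rate
    oog_cr: float    # out-of-grammar correct reject rate


def pick_threshold(rows: Sequence[Rates], min_oog_cr: float = 0.5) -> Optional[Rates]:
    """Pick the row whose IGFR is closest to twice its IGFA,
    among rows whose out-of-grammar correct reject rate is high enough."""
    eligible = [row for row in rows if row.oog_cr >= min_oog_cr]
    if not eligible:
        return None
    return min(eligible, key=lambda row: abs(row.igfr - 2 * row.igfa))


# Made-up rates for four candidate thresholds (illustrative values only).
rows = [
    Rates(threshold=30, igfa=0.08, igfr=0.02, oog_cr=0.40),
    Rates(threshold=40, igfa=0.05, igfr=0.06, oog_cr=0.55),
    Rates(threshold=50, igfa=0.03, igfr=0.06, oog_cr=0.70),
    Rates(threshold=60, igfa=0.01, igfr=0.12, oog_cr=0.85),
]
chosen = pick_threshold(rows)
```

With these made-up rates, the row for threshold 30 is excluded by the 50% floor, and the row for threshold 50 is chosen because its false reject rate is exactly twice its false accept rate.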

Regarding claim 13; Doyle ‘710 discloses wherein the second sensitivity level is lower than the sensitivity level (See Appendixes A & B i.e. Appendixes A & B show a list that reflects the recognition accuracy level of the system at various confidence thresholds. Columns 23-24).

Regarding claim 14; Doyle ‘710 discloses wherein the second sensitivity level is higher than the sensitivity level (See Appendixes A & B i.e. Appendixes A & B show a list that reflects the recognition accuracy level of the system at various confidence thresholds. Columns 23-24).

Regarding claim 15; Doyle ‘710 discloses wherein the instructions to evaluate the audio input from the user for speech errors using at least the sensitivity level direct the processing system to: obtain a speech score for each spoken word of the audio input, wherein the speech score comprises at least one score from the group consisting of a repetition score, an insertion score, a substitution score, an omission score, and a hesitation score (i.e. A modified version of LDA can be used to find the closest alignment of two phoneme strings, generating an acoustic similarity score in the process. The score in the upper right corner represents the "cost of matching" the two words. In this example, the cost is 4. It can be noted that "cl" represents an unvoiced or silent phoneme. The confidence threshold is a value that is used to set the acceptance level of utterances and define the degree of acoustic similarity required for recognition. Recognition accuracy is highest when the acoustic model of an utterance exactly matches that of a term or phrase included in the recognition grammar. For example, if "read it" is being confused with "delete it" because the two phrases are acoustically similar, then the system would, for example, remove "delete it" from the grammar's vocabulary and substitute it with "get rid of it". The phrase "get rid of it" is not acoustically similar to "read it" and therefore cannot be as easily confused by the system. Column 6, lines 1-40; Column 17, line 40 thru Column 18, line 29 & Column 15, lines 49-55);
and apply the sensitivity level to the obtained speech score to determine the speech errors for the audio input (i.e. The voice recognition system settings, including the confidence threshold, are adjusted so that the number of false accepts and false reject type errors are minimized and the number of out-of-grammar correct rejects are maximized. The system calculates the recognition accuracy rate based on the number of correct accepts and correct rejects recorded in the system logs. By analyzing the accuracy results, the system may adjust the confidence threshold to improve recognition accuracy. Column 18, lines 20-39).
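For illustration only, the "cost of matching" two phoneme strings referred to in the quoted passage can be sketched with the standard unit-cost edit distance. The passage refers to a modified version of LDA whose details are not quoted here, so the sketch below substitutes the plain, unmodified algorithm; the function name `alignment_cost` and the example inputs are assumptions.

```python
from typing import Sequence


def alignment_cost(reference: Sequence[str], recognized: Sequence[str]) -> int:
    """Unit-cost edit distance between two symbol sequences.

    Each substitution, insertion, or deletion costs 1. Symbols may be
    characters or phoneme labels; only equality between symbols is used.
    """
    # previous[j] holds the cost of aligning the reference prefix processed
    # so far against the first j recognized symbols.
    previous = list(range(len(recognized) + 1))
    for i, ref_symbol in enumerate(reference, start=1):
        current = [i]
        for j, rec_symbol in enumerate(recognized, start=1):
            substitution = previous[j - 1] + (ref_symbol != rec_symbol)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


# A well-known character-level example of the same computation.
cost = alignment_cost("kitten", "sitting")
```

A lower cost indicates closer similarity between the two sequences; a cost of 0 means the sequences are identical. Passing lists of phoneme labels instead of strings gives a phoneme-level cost in the same way.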

Regarding claim 16; Doyle ‘710 discloses wherein the instructions to apply the sensitivity level to the obtained speech score to determine the speech errors direct the processing system to: for each word of the one or more spoken words in the audio input, determine if the speech score is above a threshold value, the threshold value being set by the sensitivity level (i.e. Another source of error may be a confidence threshold level that is too high or too low. In some embodiments a source of error is determined to be a high confidence threshold, if a high rate of IGFR type errors are detected. Column 3, line 66 thru Column 4, line 16);
and if the speech score is above the threshold value, flagging the word of the one or more spoken words in the audio input as having a speech error (i.e. In a voice recognition system with high confidence threshold setting, even a slight difference in acoustic similarity can cause the voice recognition system to reject a user utterance it otherwise should have accepted, leading to an IGFR. Column 3, line 66 thru Column 4, line 16).
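For illustration only, the claim 16 limitation (a per-word speech score compared against a threshold value set by the sensitivity level, with words above the threshold flagged) can be sketched as follows. This is a minimal sketch of the claim language, not of Doyle ‘710: the sensitivity names, the threshold values in `THRESHOLD_BY_SENSITIVITY`, the function name `flag_words`, and the example scores are all hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical mapping: a higher sensitivity sets a lower threshold,
# so that more words are flagged.
THRESHOLD_BY_SENSITIVITY = {"low": 0.8, "medium": 0.5, "high": 0.2}


def flag_words(scored_words: Sequence[Tuple[str, float]], sensitivity: str) -> List[str]:
    """Return the words whose speech score is above the threshold
    that the given sensitivity level sets."""
    threshold = THRESHOLD_BY_SENSITIVITY[sensitivity]
    return [word for word, score in scored_words if score > threshold]


# Made-up per-word scores (illustrative values only).
scored = [("the", 0.3), ("cat", 0.6), ("sat", 0.9)]
flagged_low = flag_words(scored, "low")
flagged_high = flag_words(scored, "high")
```

With these made-up scores, only "sat" is flagged at the low sensitivity level, while all three words are flagged at the high sensitivity level, which shows how re-evaluating the same input at a second sensitivity level can change the set of flagged words.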

Regarding claim 1; Claim 1 contains substantially the same subject matter as claim 12. Therefore, claim 1 is rejected on the same grounds as claim 12.

Regarding claim 2; Claim 2 contains substantially the same subject matter as claim 13. Therefore, claim 2 is rejected on the same grounds as claim 13.

Regarding claim 4; Claim 4 contains substantially the same subject matter as claim 14. Therefore, claim 4 is rejected on the same grounds as claim 14.

Regarding claim 6; Claim 6 contains substantially the same subject matter as claim 15. Therefore, claim 6 is rejected on the same grounds as claim 15.

Regarding claim 7; Claim 7 contains substantially the same subject matter as claim 15. Therefore, claim 7 is rejected on the same grounds as claim 15.

Regarding claim 8; Doyle ‘710 discloses wherein the obtaining of the speech score for each spoken word of the audio input comprises: communicating the audio input to a speech service; and receiving the speech score from the speech service (i.e. A modified version of LDA can be used to find the closest alignment of two phoneme strings, generating an acoustic similarity score in the process. For example, the chart provided immediately below is a Levenstein phoneme alignment for the word "operating" with its canonical pronunciation along the vertical axis and a recognized phoneme string based on what the user said along the horizontal axis. The score in the upper right corner represents the "cost of matching" the two words. In this example, the cost is 4. It can be noted that "cl" represents an unvoiced or silent phoneme. Column 16, lines 1-18).

Regarding claim 9; Claim 9 contains substantially the same subject matter as claim 16. Therefore, claim 9 is rejected on the same grounds as claim 16.

Regarding claim 10; Doyle ‘710 discloses surfacing a visual indication of each of the determined speech errors in an application (See Appendixes A & B i.e. Appendixes A & B show a visual representation of the speech errors at Columns 23-24. The LDA chart also shows a visual representation of a simple metric for testing the similarity of strings at Column 16).

Regarding claim 11; Doyle ‘710 discloses determining one or more reading or speaking tools that correspond to the determined speech errors; and providing the one or more reading or speaking tools for a display of an application (i.e. Transcription software 420 can be utilized to view and analyze each entry in call log 136 to determine whether a user utterance was properly recognized, and if not, the possible reasons for the improper recognition. When transcription software 420 is executed, a GUI, such as that shown in Fig. 2, is provided to the human operator. The GUI may include several text and list boxes that display information about entries in call log 136. Boxes 210, 220, 230, 240, 250, 255, and 260 are exemplifying interface tools that can be utilized to implement transcription software 420's GUI. Column 9, lines 49-59).

Regarding claim 17; Claim 17 contains substantially the same subject matter as claim 12. Therefore, claim 17 is rejected on the same grounds as claim 12. However, claim 17 further recites a computer-readable storage medium having instructions stored thereon that, when executed by a processing system, cause the system to perform a method. Doyle ‘710 at Claim 21 discloses a tangible computer-readable medium having computer program logic stored thereon that enables a computing device to execute the method.

Regarding claim 18; Claim 18 contains substantially the same subject matter as claim 15. Therefore, claim 18 is rejected on the same grounds as claim 15.

Regarding claim 20; Claim 20 contains substantially the same subject matter as claim 6. Therefore, claim 20 is rejected on the same grounds as claim 6.

Regarding claim 21; Doyle ‘710 discloses wherein the speech error is an oral reading miscue made while the user is performing an oral reading (i.e. Voice recognition information produced by a voice recognition system in response to recognizing a user utterance is analyzed. The voice recognition information comprises a recognized voice command associated with the user utterance and a reference to an audio file that includes the user utterance. Based on the analysis, a recognition error may be identified and the source of the error determined. See Abstract).

Regarding claim 22; Doyle ‘710 discloses wherein evaluating the audio input from the user for speech errors further comprises evaluating the audio input from the user for a mispronunciation error (i.e. In Grammar False Accept (IGFA): The system detects an in-grammar false accept or IGFA type error, if both the transcribed utterance and the recognized voice command are part of the recognition grammar, but the semantic value of the transcribed utterance is not the same as the semantic value of the recognized voice command. That is, the utterance should be recognized as a certain word or phrase in the grammar, but instead was recognized as another word or phrase. Thus, the system detects a case of IGFA error, for example, if the transcribed utterance is "read it" and the recognized voice command is "delete it," where both "read it" and "delete it" are part of the recognition grammar. On the other hand, if a user says "please read it" and the system recognizes "read it," there is no error since both commands have the same semantic value (from the system's perspective). Column 13, line 60 thru Column 14, line 2 & Column 15, lines 32-67).
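For illustration only, the IGFA test quoted above (both the transcribed utterance and the recognized voice command are in the recognition grammar, but their semantic values differ) can be sketched as follows. This is a minimal sketch under stated assumptions: representing the grammar as a mapping from phrase to semantic value, the semantic labels, the function name `classify_recognition`, and the treatment of "please read it" as an in-grammar phrase are hypothetical choices made here.

```python
from typing import Dict


def classify_recognition(transcribed: str, recognized: str, grammar: Dict[str, str]) -> str:
    """Classify one log entry using only the IGFA rule in the quoted passage.

    `grammar` maps each in-grammar phrase to its semantic value.
    """
    if transcribed not in grammar or recognized not in grammar:
        # Out-of-grammar cases are outside the quoted passage.
        return "not covered"
    if grammar[transcribed] != grammar[recognized]:
        return "IGFA"
    return "no error"


# Hypothetical grammar with made-up semantic labels.
grammar = {"read it": "READ", "please read it": "READ", "delete it": "DELETE"}
confused = classify_recognition("read it", "delete it", grammar)
equivalent = classify_recognition("please read it", "read it", grammar)
```

The first call reproduces the "read it" / "delete it" example and yields "IGFA"; the second reproduces the "please read it" example and yields "no error" because both phrases carry the same semantic value.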


Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARCUS T. RILEY, ESQ. whose telephone number is (571)270-1581. The examiner can normally be reached 9-5 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tammy P. Goddard can be reached on 517-272-7773. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

MARCUS T. RILEY, ESQ.
Primary Examiner
Art Unit 2677



/MARCUS T RILEY/Primary Examiner, Art Unit 2677