DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments (3/24/21 Remarks: page 6, line 6 – page 7, line 30) have been fully considered but they are not persuasive.
Applicant argues (3/24/21 Remarks: page 6, line 6 – page 7, line 30, particularly page 6, lines 16-21 and page 7, lines 20-26) that the art of record fails to disclose the recited feature of “applying each of the plurality of script models each representing a script text to each of the plurality of utterances”.
However, upon further consideration of the teachings of Rtischev, this reference discloses features readable upon the amended language as set forth below (Rtischev column 6, lines 25-44 and Figure 3, apply script models (various script points P(i)) in response to user input to produce corresponding indications of whether the utterance matches the script text; Rtischev column 6, line 54 - column 7, line 5, repeating this process for each script model by proceeding to the next script step point P(i+1)).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 21-24, 27-31, & 34-38 are rejected under 35 U.S.C. 103 as being unpatentable over Rtischev (US 5634086, cited in 2/25/20 Information Disclosure Statement) in view of Girardo (US 20020077819, cited in 2/25/20 Information Disclosure Statement).
Rtischev discloses an arrangement for obtaining, segmenting, and decoding audio data.
Rtischev does not disclose expressly the generation of a transcription.
Girardo discloses voice-to-text transcription.
Rtischev and Girardo are combinable because they are from the field of speech and text processing.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to apply the transcription of Girardo to the Rtischev user script response arrangement.

Therefore, it would have been obvious to combine Rtischev with Girardo to obtain the invention as specified in claim 21.
Claim 21: A method of script identification in audio data, the method comprising:
obtaining audio data (Rtischev Figure 1 item 12, microphone input);
segmenting the audio data into a plurality of utterances (Rtischev column 5, line 47 - column 6, line 5, individual word recognition);
obtaining a plurality of script models, wherein each script model is representative of a script text (Rtischev column 5, lines 31-41 and Figure 3, obtain script model in conjunction with a plurality of HMM models, preselected script Figure 3 item 114 implying plurality of script options);
decoding the plurality of utterances (Rtischev column 6, lines 12-44 and Figure 3, decoding utterances and generating response), wherein decoding the plurality of utterances comprises applying each of the plurality of script models each representing a script text to each of the plurality of utterances and producing an indication of which of the script models are identified in the plurality of utterances (Rtischev column 6, lines 25-44 and Figure 3, apply script models (various script points P(i)) in response to user input to produce corresponding indications of whether the utterance matches the script text; Rtischev column 6, line 54 - column 7, line 5, repeating this process for each script model by proceeding to the next script step point P(i+1));
for each identified script model, transcribing an utterance containing each script from the plurality of utterances to produce an utterance transcription (Girardo Abstract, speech voice-to-text transcription);
comparing each utterance transcription to the script text (Girardo paragraph 0039, comparison/validation); and
determining a script accuracy (Girardo paragraph 0039, comparison/validation).
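The steps recited in claim 21 can be illustrated with a minimal sketch. It is offered only as an aid to reading the claim language and is not drawn from Rtischev, Girardo, or the application. It assumes the utterances are already available as text (standing in for the transcription step), represents each script model simply by its script text, and uses a word-level similarity ratio as a stand-in for script accuracy. The names identify_scripts, match_score, and detect_threshold are hypothetical.

```python
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(".,!?;:").lower() for w in text.split())
    return [w for w in words if w]


def match_score(script_text, utterance_text):
    """Word-level similarity in [0, 1] between a script text and an utterance."""
    matcher = SequenceMatcher(None, tokenize(script_text), tokenize(utterance_text))
    return matcher.ratio()


def identify_scripts(script_models, utterances, detect_threshold=0.6):
    """Apply each script model to each utterance.

    Returns {script name: (index of best-matching utterance, accuracy score)}
    for every script whose best score reaches detect_threshold (an assumed,
    hypothetical cut-off).
    """
    identified = {}
    if not utterances:
        return identified
    for name, script_text in script_models.items():
        # Score this script against every utterance, then keep the best match.
        scores = [match_score(script_text, u) for u in utterances]
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= detect_threshold:
            identified[name] = (best, scores[best])
    return identified


# Hypothetical example data, not taken from the record.
scripts = {
    "recording_notice": "this call may be recorded for quality purposes",
    "closing": "thank you for calling have a nice day",
}
utterances = [
    "hello how can I help you",
    "this call may be recorded for quality purposes",
    "thanks for calling have a nice day",
]
result = identify_scripts(scripts, utterances)
```

In this sketch an empty result corresponds to the case where none of the script models is identified in the utterances.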
Applying these teachings to claims 22-24, 27-31, & 34-38:
Claim 22: The method of claim 21 (see above), further comprising evaluating a compliance of the audio data with a script requirement threshold by comparing the determined script accuracy to the script requirement threshold (Rtischev column 7, lines 26-31, reject indicator threshold).
Claim 23: The method of claim 22 (see above), wherein the script requirement threshold is a minimum word error rate (Rtischev column 7, lines 32-38, evaluation of number of errors divided by number of words).
Claim 24: The method of claim 21 (see above), wherein if none of the plurality of script models are determined to have occurred, initiating at least one remedial action (Rtischev column 7, lines 32-44, end tracking if sufficient scripted words are not recognized).
Claim 27: The method of claim 21 (see above), wherein each script model includes compliant variations of the script text (Rtischev column 5, lines 41-44, multiple models for a given script, each being an accepted variation).
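The threshold comparison of claims 22 and 23 can likewise be sketched. The sketch is illustrative only and is not taken from Rtischev or the application. It computes a word error rate as the number of word-level edits (substitutions, insertions, deletions) divided by the number of words in the script text, and it assumes that an utterance is compliant when that rate does not exceed a hypothetical limit max_wer. The function names and the default limit are assumptions.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference script text must contain at least one word")
    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def is_compliant(reference, hypothesis, max_wer=0.25):
    """True when the word error rate is within the assumed limit max_wer."""
    return word_error_rate(reference, hypothesis) <= max_wer
```

For example, comparing the script text "thank you for calling" with the transcription "thanks for calling" requires one substitution and one deletion over four script words, so the sketch reports a rate of 0.5 and treats the utterance as non-compliant under the default limit.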
Claim 28: A non-transitory computer readable medium programmed with computer readable code that upon execution by a computer processor (Rtischev column 9, lines 8-10 and column 10, lines 9-12, implementation by computer program and computer workstation) causes the computer processor to:
obtain audio data (Rtischev Figure 1 item 12, microphone input);
segment the audio data into a plurality of utterances (Rtischev column 5, line 47 - column 6, line 5, individual word recognition);
obtain a plurality of script models, wherein each script model is representative of a script text (Rtischev column 5, lines 31-41 and Figure 3, obtain script model in conjunction with a plurality of HMM models, preselected script Figure 3 item 114 implying plurality of script options);
decode the plurality of utterances (Rtischev column 6, lines 12-44 and Figure 3, decoding utterances and generating response), wherein decoding the plurality of utterances comprises applying each of the plurality of script models each representing a script text to each of the plurality of utterances and producing an indication of which of the script models are identified in the plurality of utterances (Rtischev column 6, lines 25-44 and Figure 3, apply script models (various script points P(i)) in response to user input to produce corresponding indications of whether the utterance matches the script text; Rtischev column 6, line 54 - column 7, line 5, repeating this process for each script model by proceeding to the next script step point P(i+1));
for each identified script model, transcribe an utterance containing each script from the plurality of utterances to produce an utterance transcription (Girardo Abstract, speech voice-to-text transcription);
compare each utterance transcription to the script text (Girardo paragraph 0039, comparison/validation); and
determine a script accuracy (Girardo paragraph 0039, comparison/validation).
Claim 29: The non-transitory computer readable medium of claim 28 (see above), further causing the processor to evaluate a compliance of the audio data with a script requirement threshold by comparing the determined script accuracy to the script requirement threshold (Rtischev column 7, lines 26-31, reject indicator threshold).
Claim 30: The non-transitory computer readable medium of claim 29 (see above), wherein the script requirement threshold is a minimum word error rate (Rtischev column 7, lines 32-38, evaluation of number of errors divided by number of words).
Claim 31: The non-transitory computer readable medium of claim 28 (see above), wherein if none of the plurality of script models are determined to have occurred, further causing the processor to initiate at least one remedial action (Rtischev column 7, lines 32-44, end tracking if sufficient scripted words are not recognized).
Claim 34: The non-transitory computer readable medium of claim 28 (see above), wherein each script model includes compliant variations of the script text (Rtischev column 5, lines 41-44, multiple models for a given script, each being an accepted variation).
Claim 35: A system for identification of a script in audio data, the system comprising:
an audio data source (Rtischev Figure 1 item 12, microphone input);
(Rtischev column 5, lines 31-41 and Figure 3, obtain script model in conjunction with a plurality of HMM models, preselected script Figure 3 item 114 implying plurality of script options); and
a processing system communicatively connected to the script model database and the audio data source (Rtischev column 9, lines 8-10 and column 10, lines 9-12, implementation by computer program and computer workstation), the processing system:
obtains audio data (Rtischev Figure 1 item 12, microphone input),
segments the audio data into a plurality of utterances (Rtischev column 5, line 47 - column 6, line 5, individual word recognition),
obtains a plurality of script models, wherein each script model is representative of a script text (Rtischev column 5, lines 31-41 and Figure 3, obtain script model in conjunction with a plurality of HMM models, preselected script Figure 3 item 114 implying plurality of script options),
decodes the plurality of utterances (Rtischev column 6, lines 12-44 and Figure 3, decoding utterances and generating response), wherein decoding the plurality of utterances comprises applying each of the plurality of script models each representing a script text to each of the plurality of utterances and producing an indication of which of the script models are identified in the plurality of utterances (Rtischev column 6, lines 25-44 and Figure 3, apply script models (various script points P(i)) in response to user input to produce corresponding indications of whether the utterance matches the script text; Rtischev column 6, line 54 - column 7, line 5, repeating this process for each script model by proceeding to the next script step point P(i+1)),
for each identified script model, transcribes an utterance containing each script from the plurality of utterances to produce an utterance transcription (Girardo Abstract, speech voice-to-text transcription),
compares each utterance transcription to the script text (Girardo paragraph 0039, comparison/validation), and
determines a script accuracy (Girardo paragraph 0039, comparison/validation).
Claim 36: The system of claim 35 (see above), wherein the processing system further evaluates a compliance of the audio data with a script requirement threshold by comparing the determined script accuracy to the script requirement threshold (Rtischev column 7, lines 26-31, reject indicator threshold).
Claim 37: The system of claim 36 (see above), wherein the script requirement threshold is a minimum word error rate (Rtischev column 7, lines 32-38, evaluation of number of errors divided by number of words).
Claim 38: The system of claim 35 (see above), wherein if none of the plurality of script models are determined to have occurred, the processing system further initiates at least one remedial action (Rtischev column 7, lines 32-44, end tracking if sufficient scripted words are not recognized).
Claims 25, 32, & 39 are rejected under 35 U.S.C. 103 as being unpatentable over Rtischev in view of Girardo as applied to claims 21, 28, & 35, and further in view of Sherman.
Rtischev in view of Girardo teaches the inventions of claims 21, 28, & 35 as described above.
Rtischev in view of Girardo does not disclose an exchange including a customer service agent.
Sherman discloses an exchange including a customer service agent (Sherman paragraphs 0004-0005, system for customer service agents).
Rtischev in view of Girardo and Sherman are combinable because they are from the field of speech and text processing.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to apply the Rtischev in view of Girardo arrangement to exchanges with a customer service agent.
The suggestion/motivation for doing so would have been to apply the accuracy evaluation arrangements of Rtischev (Rtischev column 7, lines 26-31, reject indicator threshold) to the Sherman arrangement of interaction with a customer service agent.
Therefore, it would have been obvious to combine Rtischev in view of Girardo with Sherman to obtain the invention as specified in claims 25, 32, & 39.
Claim 25: The method of claim 21 (see above), wherein the audio data is an instance of an exchange including at least one customer service agent (Sherman paragraphs 0004-0005, system for customer service agents).
Claim 32: The non-transitory computer readable medium of claim 28 (see above), wherein the audio data is an instance of an exchange including at least one customer service agent (Sherman paragraphs 0004-0005, system for customer service agents).
Claim 39: The system of claim 35 (see above), wherein the audio data is an instance of an exchange including at least one customer service agent (Sherman paragraphs 0004-0005, system for customer service agents).
Claims 26, 33, & 40 are rejected under 35 U.S.C. 103 as being unpatentable over Rtischev in view of Girardo as applied to claims 21, 28 & 35, and further in view of Aleksic (US 8880398, cited in 2/25/20 Information Disclosure Statement).
Rtischev in view of Girardo teaches the inventions of claims 21, 28, & 35 as described above.
Rtischev in view of Girardo does not expressly disclose an exchange including a multiword utterance.
Aleksic discloses the processing of multiword phrases (Aleksic column 15, lines 33-38 and column 16, lines 43-58, multiword utterances e.g. “cat and dog”).
Rtischev in view of Girardo and Aleksic are combinable because they are from the field of speech and text processing.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to apply the multiword phrase processing of Aleksic to the Rtischev in view of Girardo arrangement.

The suggestion/motivation for doing so would have been to avoid limiting the operation of Rtischev in view of Girardo to single-word utterances.
Therefore, it would have been obvious to combine Rtischev in view of Girardo with Aleksic to obtain the invention as specified in claims 26, 33, & 40.
Claim 26: The method of claim 21 (see above), wherein at least one of the plurality of utterances consists of more than a single word (Aleksic column 15, lines 33-38 and column 16, lines 43-58, multiword utterances e.g. “cat and dog”).
Claim 33: The non-transitory computer readable medium of claim 28 (see above), wherein at least one of the plurality of utterances consists of more than a single word (Aleksic column 15, lines 33-38 and column 16, lines 43-58, multiword utterances e.g. “cat and dog”).
Claim 40: The system of claim 35 (see above), wherein at least one of the plurality of utterances consists of more than a single word (Aleksic column 15, lines 33-38 and column 16, lines 43-58, multiword utterances e.g. “cat and dog”).
Conclusion
Any inquiry concerning the contents of this communication or earlier communications from the examiner should be directed to Stephen M. Brinich at 571-272-7430 (voice) or 571-273-7430 (fax).
Any inquiry relating to the status of this application, entry of papers into this application, or any other inquiries of a general nature concerning application processing should be directed to the Tech Center 2600 Customer Service Center at 571-272-2600 or to the USPTO Contact Center at 800-786-9199 or 571-272-1000.
The examiner can normally be reached on weekdays 7:30-4:00 Eastern Time.
If attempts to contact the examiner and the Customer Service Center are unsuccessful, supervisor Claire Wang can be contacted at 571-270-1051.
Hand-carried correspondence may be delivered to the Customer Service Window, located at the Randolph Building, 401 Dulany Street, Alexandria, VA 22314.
/Stephen M Brinich/
Examiner, Art Unit 2663