DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 21-22, 28-29, & 35-36 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Rtischev (US 5634086, cited in 2/25/20 Information Disclosure Statement).
Claim 21: A method of script identification in audio data, the method comprising:
obtaining audio data (Rtischev Figure 1 item 12, microphone input);
segmenting the audio data into a plurality of utterances (Rtischev column 5, line 47 - column 6, line 5, individual word recognition);
obtaining a plurality of script models, wherein each of the plurality of script models is representative of a plurality of script texts (Rtischev column 5, lines 31-41 and Figure 3, obtain script model in conjunction with a plurality of HMM models, preselected script Figure 3 item 114 implying plurality of script options);
decoding the plurality of utterances (Rtischev column 6, lines 12-44 and Figure 3, decoding utterances and generating response) by applying each of the plurality of script models to the plurality of utterances (Rtischev column 6, lines 25-44 and Figure 3, apply script models (various script step points) in response to user input to produce corresponding indications);
determining if any of the plurality of script texts occurred in the audio data from the plurality of utterances, the plurality of utterances having been decoded (Rtischev column 6, line 54 - column 7, line 5, determine whether audio data corresponds to script);
compiling each of the plurality of script models from at least each of the corresponding script texts (Rtischev column 5, lines 41-44, compiling script model); and
modifying each of the plurality of script models to include acceptable variations of each of the corresponding script texts (Rtischev column 5, lines 41-44, building script models, by definition including range of accepted variations).
Claim 22: The method of claim 21 (see above), wherein each of the plurality of script models is further compiled from at least one speaker acoustic model (Rtischev column 5, lines 44-46, speech recognition model).
Claim 28: A non-transitory computer readable medium programmed with computer readable code that upon execution by a computer processor causes the computer processor to:
obtain audio data (Rtischev Figure 1 item 12, microphone input);
segment the audio data into a plurality of utterances (Rtischev column 5, line 47 - column 6, line 5, individual word recognition);
obtain a plurality of script models, wherein each of the plurality of script models is representative of a plurality of script texts (Rtischev column 5, lines 31-41 and Figure 3, obtain script model in conjunction with a plurality of HMM models, preselected script Figure 3 item 114 implying plurality of script options);
decode the plurality of utterances (Rtischev column 6, lines 12-44 and Figure 3, decoding utterances and generating response) by applying each of the plurality of script models to the plurality of utterances (Rtischev column 6, lines 25-44 and Figure 3, apply script models (various script step points) in response to user input to produce corresponding indications);
determine if any of the plurality of script texts occurred in the audio data from the plurality of utterances, the plurality of utterances having been decoded (Rtischev column 6, line 54 - column 7, line 5, determine whether audio data corresponds to script);
compile each of the plurality of script models from at least each of the corresponding script texts (Rtischev column 5, lines 41-44, compiling script model); and
modify each of the plurality of script models to include acceptable variations of each of the corresponding script texts (Rtischev column 5, lines 41-44, building script models, by definition including range of accepted variations).
Claim 29: The non-transitory computer readable medium of claim 28 (see above), wherein each of the plurality of script models is further compiled from at least one speaker acoustic model (Rtischev column 5, lines 44-46, speech recognition model).
Claim 35: A system for identification of a script in audio data, the system comprising:
an audio data source (Rtischev Figure 1 item 12, microphone input);
a script model database comprising a plurality of script models each script model of the plurality representative of at least one script text (Rtischev column 5, lines 31-41 and Figure 3, obtain script model in conjunction with a plurality of HMM models, preselected script Figure 3 item 114 implying plurality of script options); and
a processing system communicatively connected to the script model database and the audio data source (Rtischev column 3, line 66 – column 4, line 6, processor and communication arrangement), the processing system:
obtains audio data (Rtischev Figure 1 item 12, microphone input);
segments the audio data into a plurality of utterances (Rtischev column 5, line 47 - column 6, line 5, individual word recognition);
obtains a plurality of script models, wherein each of the plurality of script models is representative of a plurality of script texts (Rtischev column 5, lines 31-41 and Figure 3, obtain script model in conjunction with a plurality of HMM models, preselected script Figure 3 item 114 implying plurality of script options);
decodes the plurality of utterances (Rtischev column 6, lines 12-44 and Figure 3, decoding utterances and generating response) by applying each of the plurality of script models to the plurality of utterances (Rtischev column 6, lines 25-44 and Figure 3, apply script models (various script step points) in response to user input to produce corresponding indications);
determines if any of the plurality of script texts occurred in the audio data from the plurality of utterances, the plurality of utterances having been decoded (Rtischev column 6, line 54 - column 7, line 5, determine whether audio data corresponds to script);
compiles each of the plurality of script models from at least each of the corresponding script texts (Rtischev column 5, lines 41-44, compiling script model); and
modifies each of the plurality of script models to include acceptable variations of each of the corresponding script texts (Rtischev column 5, lines 41-44, building script models, by definition including range of accepted variations).
Claim 36: The system of claim 35 (see above), wherein each of the plurality of script models is further compiled from at least one speaker acoustic model (Rtischev column 5, lines 44-46, speech recognition model).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 23-25, 30-32, & 37-38 are rejected under 35 U.S.C. 103 as being unpatentable over Rtischev in view of Girardo (US 20020077819, cited in 2/25/20 Information Disclosure Statement).
Re claim 23, Rtischev discloses an arrangement for obtaining, segmenting, and decoding audio data and application of scripts as described in parent claim 21.
Rtischev does not expressly disclose the filtering of extraneous sound (i.e., sound other than the voice to be analyzed).
Girardo discloses filtering of extraneous sound.
Rtischev and Girardo are combinable because they are from the field of speech and text processing.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to apply the filtering of Girardo to the Rtischev user script response arrangement.
The suggestion/motivation for doing so would have been to prevent interference by extraneous sound.
Therefore, it would have been obvious to combine Rtischev with Girardo to obtain the invention as specified in claim 23.
Claim 23: The method of claim 21 (see above), the method further comprising:
filtering the plurality of utterances to include only utterances attributed to a customer service agent (Girardo paragraph 0060, filtering out extraneous sounds prior to speech processing);
extracting acoustic features from the filtered plurality of utterances (Rtischev column 6, lines 12-44 and Figure 3, decoding utterances by speech sound processing); and
using the extracted acoustic features in decoding the plurality of utterances (Rtischev column 6, lines 12-44 and Figure 3, decoding utterances by speech sound processing).
Applying these teachings to claims 24-25, 30-32, & 37-38:
Claim 24: The method of claim 21 (see above), wherein if any of the plurality of script models are determined to have occurred in the audio data, further comprising:
transcribing the utterance containing the script to produce an utterance transcription (Girardo Abstract, speech voice-to-text transcription);
comparing the script text to the utterance transcription (Girardo paragraph 0039, comparison/validation); and
determining a script compliance (Girardo paragraph 0039, comparison/validation).
Claim 25: The method of claim 24 (see above), wherein if the audio data is evaluated as non-compliant, further comprising:
initiating at least one remedial action, wherein the at least one remedial action is selected from (Note: This is a recitation in the alternative, readable upon any one option) operating a graphical display to present on screen guidance to a customer service agent (Rtischev column 6, lines 12-24, aural or visual feedback);
presenting additional information to a customer; and
producing an alert of a non-compliant script.
Claim 30: The non-transitory computer readable medium of claim 28 (see above), further causing the processor to:
filter the plurality of utterances to include only utterances attributed to a customer service agent (Girardo paragraph 0060, filtering out extraneous sounds prior to speech processing);
extract acoustic features from the filtered plurality of utterances (Rtischev column 6, lines 12-44 and Figure 3, decoding utterances by speech sound processing); and
use the extracted acoustic features in decoding the plurality of utterances (Rtischev column 6, lines 12-44 and Figure 3, decoding utterances by speech sound processing).
Claim 31: The non-transitory computer readable medium of claim 28 (see above), wherein if any of the plurality of script models are determined to have occurred in the audio data, further causing the processor to:
transcribe the utterance containing the script to produce an utterance transcription (Girardo Abstract, speech voice-to-text transcription);
compare the script text to the utterance transcription (Girardo paragraph 0039, comparison/validation); and
determine a script compliance (Girardo paragraph 0039, comparison/validation).
Claim 32: The non-transitory computer readable medium of claim 31 (see above), wherein if the audio data is evaluated as non-compliant, further causing the processor to:
initiate at least one remedial action, wherein the at least one remedial action is selected from (Note: This is a recitation in the alternative, readable upon any one option) operating a graphical display to present on screen guidance to a customer service agent (Rtischev column 6, lines 12-24, aural or visual feedback);
present additional information to a customer; and
produce an alert of a non-compliant script.
Claim 37: The system of claim 35 (see above), wherein if any of the plurality of script models are determined to have occurred in the audio data, the processing system further:
transcribes the utterance containing the script to produce an utterance transcription (Girardo Abstract, speech voice-to-text transcription);
compares the script text to the utterance transcription (Girardo paragraph 0039, comparison/validation); and
determines a script compliance (Girardo paragraph 0039, comparison/validation).
Claim 38: The system of claim 37 (see above), wherein if the audio data is evaluated as non-compliant, the processing system further:
initiates at least one remedial action, wherein the at least one remedial action is selected from (Note: This is a recitation in the alternative, readable upon any one option) operating a graphical display to present on screen guidance to a customer service agent (Rtischev column 6, lines 12-24, aural or visual feedback),
presents additional information to a customer, and
produces an alert of a non-compliant script.
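Claims 26, 33, & 39 are rejected under 35 U.S.C. 103 as being unpatentable over Rtischev in view of Sherman (US 20100093319, cited in 2/25/20 Information Disclosure Statement).
Re claim 26, Rtischev discloses an arrangement for obtaining, segmenting, and decoding audio data and application of scripts as described in parent claim 21.
Rtischev does not expressly disclose an exchange including a customer service agent.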
Sherman discloses an exchange including a customer service agent.
Rtischev and Sherman are combinable because they are from the field of speech and text processing.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to apply the Rtischev arrangement to exchanges with a customer service agent.
The suggestion/motivation for doing so would have been to apply the accuracy evaluation arrangements of Rtischev (Rtischev column 7, lines 26-31, reject indicator threshold) to the Sherman arrangement of interaction with a customer service agent.
Therefore, it would have been obvious to combine Rtischev with Sherman to obtain the invention as specified in claim 26.
Claim 26: The method of claim 21 (see above), wherein the audio data is an instance of an exchange including at least one customer service agent (Sherman paragraphs 0004-0005, system for customer service agents).
Applying these teachings to claims 33 & 39:
Claim 33: The non-transitory computer readable medium of claim 28 (see above), wherein the audio data is an instance of an exchange including at least one customer service agent (Sherman paragraphs 0004-0005, system for customer service agents).
Claim 39: The system of claim 35 (see above), wherein the audio data is an instance of an exchange including at least one customer service agent (Sherman paragraphs 0004-0005, system for customer service agents).
Claims 27, 34, & 40 are rejected under 35 U.S.C. 103 as being unpatentable over Rtischev in view of Aleksic (US 8880398, cited in 2/25/20 Information Disclosure Statement).
Re claim 27, Rtischev discloses an arrangement for obtaining, segmenting, and decoding audio data and application of scripts as described in parent claim 21.
Rtischev does not expressly disclose an exchange including a multiword utterance.
Aleksic discloses the processing of multiword phrases (Aleksic column 15, lines 33-38 and column 16, lines 43-58, multiword utterances e.g. “cat and dog”).
Rtischev and Aleksic are combinable because they are from the field of speech and text processing.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to apply the Rtischev arrangement to exchanges having multiword utterances.

Therefore, it would have been obvious to combine Rtischev with Aleksic to obtain the invention as specified in claim 27.
Claim 27: The method of claim 21 (see above), wherein at least one of the plurality of utterances consists of more than a single word (Aleksic column 15, lines 33-38 and column 16, lines 43-58, multiword utterances e.g. “cat and dog”).
Applying these teachings to claims 34 & 40:
Claim 34: The non-transitory computer readable medium of claim 28 (see above), wherein at least one of the plurality of utterances consists of more than a single word (Aleksic column 15, lines 33-38 and column 16, lines 43-58, multiword utterances e.g. “cat and dog”).
Claim 40: The system of claim 35 (see above), wherein at least one of the plurality of utterances consists of more than a single word (Aleksic column 15, lines 33-38 and column 16, lines 43-58, multiword utterances e.g. “cat and dog”).
Conclusion
Any inquiry concerning the contents of this communication or earlier communications from the examiner should be directed to Stephen M. Brinich at 571-272-7430 (voice) or 571-273-7430 (fax).

The examiner can normally be reached on weekdays 7:30-4:00 Eastern Time.
If attempts to contact the examiner and the Customer Service Center are unsuccessful, supervisor Claire Wang can be contacted at 571-270-1051.
Hand-carried correspondence may be delivered to the Customer Service Window, located at the Randolph Building, 401 Dulany Street, Alexandria, VA 22314.
/Stephen M Brinich/
Examiner, Art Unit 2663