DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments (6/28/21 Remarks: page 2, line 6 – page 4, line 4) with respect to the rejection(s) of claims 12-17, 19-21, & 48-60 under 35 USC §103 have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of Farkas (US 20120129505).
Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 12-17, 19-21, 48-49, & 59-60 are rejected under 35 U.S.C. 103 as being unpatentable over Petrushin (US 20110178803, cited in 7/10/18 Information Disclosure Statement) in view of Farkas (US 20120129505).
Re claim 12, Petrushin discloses:
Claim 12: A system for providing real-time analysis of audio signals, comprising:
a memory (Petrushin Figure 1, RAM 114), and
a processor coupled to the memory (Petrushin Figure 1, CPU 110), the processor being operable to:
(Petrushin Abstract, audio processing, paragraphs 0348 & 0354, digital audio);
generate first computed streamed signal information corresponding to each of the first digital audio signals by computing first metrics data for the first digital audio signals, the first computed streamed signal information including the first metrics data (Petrushin paragraph 0330, processing voice information at one site to generate voice characteristic information and transmitting characteristic information to another site), and the first metrics data including one or more of (Note: This is a recitation in the alternative satisfied by a teaching of any one option) conversational flow (Petrushin paragraphs 0008 & 0130, metrics data includes a speaking rate of the voice signal, which quantifies a “conversational flow”), hyper-articulation, and hypo-articulation between a first party and a second party (see below);
store the computed first streamed signal information in the memory (Petrushin paragraph 0330, comparing transmitted characteristic information to a later set of transmitted characteristic information, inherently requiring storage of the earlier data at least until this comparison occurs); and
transmit the first computed streamed signal information to one or more computing devices (Petrushin paragraph 0330, processing voice information at one site to generate voice characteristic information and transmitting characteristic information to another site where it is further processed),
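wherein the transmitting the first computed streamed signal information to the one or more computing devices causes the first computed streamed signal information to be displayed at the one or more computing devices (Petrushin paragraph 0328, display of processing (voice identification) result).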
Petrushin does not disclose expressly:
…the first metrics data including one or more of conversational flow, hyper-articulation, and hypo-articulation between a first party and a second party…
Farkas discloses metrics data including at least a measure of conversational flow between a first party and a second party (Farkas paragraphs 0053 & 0056-0057, metrics of conversational flow (frequency of speaker exchange and ratio of speaker participation) in a conversation between first and second parties).
Petrushin and Farkas are combinable because they are from the field of voice metrics analysis.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to apply the detailed metrics analysis of both parties of a conversation as taught by Farkas in conjunction with the voice analysis arrangement of Petrushin.
The suggestion/motivation for doing so would have been to analyze the voice of both the agent and the customer, enabling the emotional state of both agent and customer to be monitored as suggested by Petrushin (Petrushin paragraph 0180, “The present invention would monitor the conversation between the customer and the employee to determine whether the customer and/or the employee are becoming upset, for example.”).
Therefore, it would have been obvious to combine Petrushin with Farkas to obtain the invention as specified in claim 12.
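For illustration only, the two conversational flow measures noted above for Farkas (frequency of speaker exchange and ratio of speaker participation) can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not drawn from Farkas, Petrushin, or the application: the `Turn` record, the `conversational_flow` function, and the speaker labels are hypothetical names, and the input is assumed to be a list of already-segmented speaker turns with start and end times in seconds.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str  # hypothetical label, e.g. "agent" or "customer"
    start: float  # turn start time, in seconds
    end: float    # turn end time, in seconds


def conversational_flow(turns):
    """Return (exchanges_per_minute, participation_ratio).

    exchanges_per_minute: number of speaker changes per minute of call time.
    participation_ratio: share of total talk time taken by the first speaker.
    """
    if not turns:
        raise ValueError("at least one turn is required")
    ordered = sorted(turns, key=lambda t: t.start)

    # A speaker exchange is counted whenever consecutive turns differ in speaker.
    exchanges = sum(1 for a, b in zip(ordered, ordered[1:]) if a.speaker != b.speaker)

    elapsed = ordered[-1].end - ordered[0].start
    if elapsed <= 0:
        raise ValueError("turns must span a positive duration")

    # Accumulate talk time per speaker.
    talk = {}
    for t in ordered:
        talk[t.speaker] = talk.get(t.speaker, 0.0) + (t.end - t.start)

    first_speaker = ordered[0].speaker
    return exchanges / (elapsed / 60.0), talk[first_speaker] / sum(talk.values())


# Hypothetical 30-second excerpt of a two-party call.
turns = [Turn("agent", 0, 10), Turn("customer", 10, 25), Turn("agent", 25, 30)]
rate, ratio = conversational_flow(turns)
```

In this hypothetical excerpt there are two speaker exchanges in half a minute, and each party speaks for 15 of the 30 seconds.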
Applying these teachings in accordance with the above described rationale to claims 13-17, 19-21, 48-49, & 59-60:
Claim 13: The system of claim 12 (see above), wherein the first digital audio signals correspond to first source audio signals (Petrushin paragraph 0330, audio signals from spoken sources, paragraph 0306, voice capture via phone).
Claim 14: The system of claim 13 (see above), wherein the real-time analysis is performed within a predetermined time between a first time of receiving the plurality of first source audio signals and a second time of transmitting the first computed streamed signal information to the agent computing device (Petrushin paragraph 0330, processing voice information at one site to generate voice characteristic information and transmitting characteristic information to another site).
Claim 15: The system of claim 13 (see above), wherein the plurality of first source audio signals are received via one or more of (Note: This is a recitation in the alternative satisfied by a teaching of any one option) a voice over internet protocol (VoIP) and a public switched telephone network (PSTN) (Petrushin paragraph 0306, voice capture via phone).
Claim 16: The system of claim 12 (see above),

wherein the first computed streamed signal information includes first contextual metrics data, the contextual metrics data indicating a comparison between the first metrics data and the historical metrics data (Petrushin paragraph 0349, recognition by comparing characteristic data of stored vocabulary data and current data).
Claim 17: The system of claim 12 (see above), wherein the metrics are configured dynamically during a call associated with the first source audio signals (Petrushin paragraph 0259, adaptive filtering of sound signal) or (Note: This is a recitation in the alternative satisfied by a teaching of any one option) statically prior to the call associated with the first source audio signals.
Claim 19: The system of claim 12 (see above), wherein the processor is further operable to:
retrieve, from the memory, second digital audio signals (Petrushin Abstract, audio processing, paragraphs 0348 & 0354, digital audio, second instance of described operation readable on “second” signals, etc.);
generate second computed streamed signal information corresponding to each of the second digital audio signals by computing second metrics data for the second digital audio signals, the second computed streamed signal information including the second metrics data (Petrushin paragraph 0330, processing voice information at one site to generate voice characteristic information and transmitting characteristic information to another site);
store the computed second streamed signal information in the memory (Petrushin paragraph 0330, comparing transmitted characteristic information to a later set of transmitted characteristic information, inherently requiring storage of the earlier data at least until this comparison occurs); and
transmit the second computed streamed signal information to the one or more computing devices (Petrushin paragraph 0330, processing voice information at one site to generate voice characteristic information and transmitting characteristic information to another site where it is further processed),
wherein the transmitting the second computed streamed signal information to the one or more computing devices causes the second computed streamed signal information to be displayed at the one or more computing devices (Petrushin paragraph 0328, display of processing (voice identification) result).
Claim 20: The system of claim 19 (see above), wherein the one or more computing devices includes a supervisor computing device, and wherein the first computed streamed signal information and the second computed streamed signal information is transmitted to and caused to be displayed at the supervisor computing device (Petrushin paragraph 0328, display of result to a border guard (who functions in a supervisory capacity to control border access)).
Claim 21: The system of claim 12 (see above), wherein metrics included in the metrics data include one or more of (Note: This is a recitation in the alternative satisfied by a teaching of any one option) conversational participation, dynamic variation (Petrushin paragraphs 0008 & 0130, metrics data includes an energy range of the voice signal, which quantifies a “dynamic variation”), speaking rate (Petrushin paragraph 0126, determining speaking rate metric data), and vocal effort (Petrushin paragraph 0123, determining vocal energy metric data) between a first party and a second party (Farkas paragraphs 0053 & 0056-0057, metrics of conversational flow (frequency of speaker exchange and ratio of speaker participation) in a conversation between first and second parties).
Claim 48: The system of claim 12 (see above), wherein the first metrics data comprises at least one of (Note: This is a recitation in the alternative satisfied by a teaching of any one option) (i) to (ix):
(i) a measure of pace at which a first party has spoken and a measure of pace at which a second party has spoken, over an interval of time;
(ii) a measure of tone with which a first party has spoken and a measure of tone with which a second party has spoken, over an interval of time;
(iii) a measure of speaking rate at which a first party has spoken and a measure of speaking rate at which a second party has spoken, over an interval of time (Petrushin paragraph 0126, determining speaking rate metric data; Petrushin Abstract & paragraph 0177, conversation between at least two parties);
(iv) a measure of vocal effort with which a first party has spoken and a measure of vocal effort with which a second party has spoken, over an interval of time (Petrushin paragraph 0123, determining speaking energy metric data; Petrushin Abstract & paragraph 0177, conversation between at least two parties); and
(v) a measure of degree of articulation with which a first party has spoken and a measure of degree of articulation with which a second party has spoken, over an interval of time;

(vii) a measure of conversational engagement of parties over an interval of time;
(viii) a measure of perceived depression with which a party has spoken over an interval of time; and
(ix) a measure of conversational flow over an interval of time (Farkas paragraphs 0053 & 0056-0057, metrics of conversational flow (frequency of speaker exchange and ratio of speaker participation)).
Claim 49: The system of claim 48 (see above), wherein causing the first computed streamed signal information to be displayed at the one or more computing devices comprises rendering, by a processor of the one or more computing devices, on a continuous basis during a multi-party telephonic communication, one or more graphical user interface widgets for substantially contemporaneous presentation on a display to the first party wherein the one or more widgets are graphically representative of the one or more metrics and are rendered for display on a real-time basis (Petrushin paragraphs 0143 & 0177-0178 and Figures 7 & 10, rendering of voice characteristic information and display on a computer screen).
Claim 59: The system of claim 48 (see above), wherein the first metrics data comprises at least two of (i) to (ix) (Petrushin paragraph 0126, determining speaking rate metric data; Petrushin paragraph 0123, determining speaking energy metric data).
Claim 60: The system of claim 12 (see above), wherein the processor is further operable to receive, over a network from a plurality of providers of computing resources (Petrushin paragraph 0338, receive signals over a variety of computer or telephone networks), an analysis of parts of the first digital audio signals, the first metrics data, and/or (Note: This is a recitation in the alternative satisfied by a teaching of any one option) the first computed streamed signal information for transmitting the first computed streamed signal information to the one or more computing devices (Petrushin paragraph 0328, transmission of processed (voice identified) result for display).
Claims 50-58 are rejected under 35 U.S.C. 103 as being unpatentable over Petrushin in view of Farkas, and further in view of Noble (US 8537983, cited in 7/10/18 Information Disclosure Statement).
Re claim 50, Petrushin in view of Farkas discloses the system of claim 49 (see above).
Petrushin in view of Farkas does not disclose expressly:
Claim 50: The system of claim 49 (see above), wherein the rendering comprises rendering a timeline widget that scrolls contemporaneously with at least a portion of the telephonic communication graphically indicating when the first party is speaking and when the second party is speaking.
Noble discloses:
…wherein the rendering comprises rendering a timeline widget (Noble column 26, lines 21-45; Figure 13, timeline display) that scrolls contemporaneously with at least a portion of the telephonic communication graphically indicating when the first party is speaking and when the second party is speaking (Noble column 26, lines 21-45; Figure 13, identification of speech by each party).
Petrushin and Noble are combinable because they are from the field of speech analysis.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to provide a timeline widget in conjunction with the Petrushin display.
The suggestion/motivation for doing so would have been to provide additional information for review of a conversation, as suggested by Noble (Noble column 25, lines 62-67, use of the timeline widget in reviewing conversation).
Therefore, it would have been obvious to combine Petrushin with Noble to obtain the invention as specified in claim 50.
Claim 50: The system of claim 49 (see above), wherein the rendering comprises rendering a timeline widget (Noble column 26, lines 21-45; Figure 13, timeline display) that scrolls contemporaneously with at least a portion of the telephonic communication graphically indicating when the first party is speaking and when the second party is speaking (Noble column 26, lines 21-45; Figure 13, identification of speech by each party).
Applying these teachings to claims 51-58:
Claim 51: The system of claim 49 (see above), wherein the rendering comprises rendering a numerical representation of the measure of conversational engagement at intervals contemporaneously with at least a portion of the telephonic communication (Noble column 34, lines 23-45, detection of a conversational party’s emotional state, including states such as “excitement” inherently associated with a high degree of engagement), leaving behind a graphical record of engagement scores corresponding to intervals of time during the telephonic communication (Noble column 34, lines 43-45, graphically recording specific intervals of time during the conversation as associated with these emotional states).
Claim 52: The system of claim 51 (see above), wherein each rendering of engagement score is color-coded such that low engagement scores can be immediately visually differentiated from high engagement scores (Noble column 29, lines 33-36, describing the use of color-coding to enable certain graphic elements to be readily distinguished from others).
Claim 53: The system of claim 49 (see above), wherein the rendering comprises rendering a “tone” widget comprising a graphical element representing the measure of tone of the first party (Petrushin paragraph 0124, “frequency spectral features” inherently indicate vocal tone) in positional relation to a graphical element representing the measure of tone of the second party (Petrushin paragraph 0124, “frequency spectral features” inherently indicate vocal tone), and updating the widget substantially contemporaneously with the telephonic communication (Noble column 26, lines 21-45; Figure 13) to reflect changes in the measures of tone of the first party and second party (Noble column 26, lines 21-45; Figure 13, identification of speech by each party; Petrushin paragraph 0124, “frequency spectral features” inherently provide “tone” metric for each party).
Claim 54: The system of claim 53 (see above), wherein the tone widget graphically reflects both an “instantaneous” measure of tone of the first and second parties (Petrushin paragraph 0124, “frequency spectral features” inherently indicate vocal tone), and a rolling measure of tone of the first and second parties (Petrushin paragraph 0148, measures such as “standard deviation”, “slope of the fundamental frequency”, “range of the energy”, etc. inherently require an ongoing (i.e. “rolling”) measurement rather than only an instantaneous one, inasmuch as they require the processing of multiple separate data points).
Claim 55: The system of claim 49 (see above), wherein the rendering comprises rendering a “pace” widget comprising a graphical element representing the measure of speaking rate of the first party (Petrushin paragraph 0126, speech rate) in positional relation (Noble column 1, lines 16-32; column 1, line 56 - column 2, line 12; Figure 13, representations of agent’s (1315) and caller’s (1320) speech components) to a graphical element representing the measure of speaking rate of the second party (Petrushin paragraph 0126, speech rate), and updating the widget substantially contemporaneously with the telephonic communication to reflect changes in the measures of speaking rate of the first party and second party (Noble Figure 13, the timeline display is synchronized to the timing of the conversation).
Claim 56: The system of claim 55 (see above), wherein the pace widget graphically reflects both an “instantaneous” measure of speaking rate of the first and second parties (Petrushin paragraph 0126, speech rate), and a rolling measure of speaking rate of the first and second parties (Petrushin paragraph 0148, measures such as “standard deviation”, “slope of the fundamental frequency”, “range of the energy”, etc. inherently require an ongoing (i.e. “rolling”) measurement rather than only an instantaneous one, inasmuch as they require the processing of multiple separate data points).
Claim 57: The system of claim 49 (see above), wherein the rendering comprises rendering a “participation” widget comprising a graphical element (bar) representing the measure of amount of time the first party has spoken relative to the second party over an interval of time (Noble column 1, lines 16-32; column 1, line 56 - column 2, line 12; Figure 13, timeline representations of agent’s (1315) and caller’s (1320) speech components inherently indicate the amount of speech time by each).
Claim 58: The system of claim 57 (see above), wherein the participation widget comprises a color-coded graphic visually indicating (Noble column 29, lines 33-36, describing the use of color-coding for visual indication) whether the measure of amount of time the first party has spoken relative to the second party over the interval of time is acceptable or not (Noble column 1, lines 16-32; column 1, line 56 - column 2, line 12; Figure 13, timeline representations of agent’s (1315) and caller’s (1320) speech components inherently indicate the amount of speech time by each; the determination of what is “acceptable” in this regard is inherently a user judgment based on this provided information).
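The distinction relied upon for claims 54 and 56, between an “instantaneous” measure and a “rolling” measure that requires multiple separate data points, can be illustrated by the following Python sketch. It is offered for illustration only and is not taken from Petrushin, Noble, or the application: the `RollingMetric` class, the window length, and the sample speaking-rate readings (in words per minute) are all hypothetical assumptions.

```python
from collections import deque
from statistics import mean, pstdev


class RollingMetric:
    """Track the latest reading and statistics over a sliding window."""

    def __init__(self, window=5):
        # Only the most recent `window` readings are retained.
        self.samples = deque(maxlen=window)

    def update(self, value):
        """Record a reading; return (instantaneous, rolling_mean)."""
        self.samples.append(float(value))
        return self.samples[-1], mean(self.samples)

    def spread(self):
        """Population standard deviation over the current window."""
        return pstdev(self.samples)


# Hypothetical speaking-rate readings for one party, in words per minute.
pace = RollingMetric(window=3)
readings = [pace.update(wpm) for wpm in [120, 150, 180, 90]]
```

After the fourth reading, the instantaneous value is the latest reading alone (90), while the rolling mean (140) and the standard deviation are computed over the three most recent readings, so they cannot be obtained from a single data point.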
Conclusion
Any inquiry concerning the contents of this communication or earlier communications from the examiner should be directed 
Any inquiry relating to the status of this application, entry of papers into this application, or any other inquiries of a general nature concerning application processing should be directed to the Tech Center 2600 Customer Service Center at 571-272-2600 or to the USPTO Contact Center at 800-786-9199 or 571-272-1000.
The examiner can normally be reached on weekdays 7:30-4:00 Eastern Time.
If attempts to contact the examiner and the Customer Service Center are unsuccessful, supervisor Claire Wang can be contacted at 571-270-1051.
Hand-carried correspondence may be delivered to the Customer Service Window, located at the Randolph Building, 401 Dulany Street, Alexandria, VA 22314.
/Stephen M Brinich/
Examiner, Art Unit 2663