DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant’s arguments and amendments in the Amendment dated January 21, 2022 (herein “Amendment”) with respect to the rejection(s) of claim(s) 1-20 under 35 U.S.C. § 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of Rastrow et al., US 10,339,925 B1.

Claim Objections
Claims 1, 8 and 15, and therefore claims 3-5, 10-12, 14, and 17-19 which depend therefrom, are objected to because of the following informalities: each of claims 1, 8 and 15 recites “monitoring ... the conversation for a pause,” “the pause in the conversation,” and “without a pause being detected.” As best understood, the third “pause” recited is also “the pause in the conversation” and should be recited as such. Appropriate correction is required.
Claim 8, and therefore claims 10-12 and 14, which depend therefrom, are objected to because of the following informalities: Claim 8, next to last line of the claim, recites “visually output” but should recite “visual output.” Appropriate correction is required.



Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 3-5, 8, 10-12, 14-15, and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Lovitt et al., (US 10,176,808 B1, herein “Lovitt”) in view of G. Tur et al., "The CALO Meeting Assistant System," in IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 6, pp. 1601-1611, Aug. 2010, doi: 10.1109/TASL.2009.2038810 (herein “Tur NPL”) further in view of Herold et al., (US 7,603,413 B1, herein “Herold”) further in view of Rastrow et al., (US 10,339,925 B1, herein “Rastrow”).
Regarding claim 1, Lovitt teaches a method implemented by one or more processors, comprising (Lovitt col. 14, lines 65-67, operations of system 300 including virtual assistant 320, where col. 28, lines 33-51 disclose computer system 900 as implementing the components of system 300, and including a processor 904): 
setting an automated assistant implemented at least in part on one or more conference computing devices (Lovitt fig. 3, col. 15, lines 5-7 and line 58 – col. 16, line 23, virtual assistant (automated assistant) 320 (disclosed in col. 28, lines 45-50 as implemented by a computer system, thus being a computing device) that interacts with participants in a spoken conversation session via a conferencing system, thus being a conference computing device) to a conference mode (Lovitt col. 16, lines 62-65, interactive mode) in which the automated assistant performs speech processing on multiple distinct spoken utterances exchanged during a conversation between multiple participants (Lovitt col. 7, lines 34-47, speech recognition module (where col. 15, lines 51-52 teach the virtual assistant as including the modules of fig. 1, of which the recognition module is included), is adapted to receive an audio signal of a participant utterance and convert it to recognized speech information), without requiring explicit invocation of the automated assistant prior to each of the multiple distinct spoken utterances (Lovitt col. 16, line 62 – col. 17, line 2, virtual assistant is entered into interactive mode which will have the virtual assistant automatically process utterances until an event which can be a command to exit the interactive mode (thus while in the interactive mode, no explicit invocation is needed to process commands or queries)); 
automatically performing, by the automated assistant, processing on first generated from the speech processing of one or more of the multiple distinct spoken utterances, wherein the processing is performed without explicit participant invocation (Lovitt col. 17, lines 3-7, continuing in the processing of the utterance (which is processed automatically when in interactive mode and thus without explicit participant invocation) virtual assistant uses the interpretation module to determine an interpretation result, which can include an intent or content for the utterance, where col. 7, lines 8-33 disclose that the interpretation module generates interpretation results (including context, intent and content) from utterance information, where the utterance information can be the recognized speech information (output of the speech recognition module – the “first” generated from the speech processing)); 
generating, by the automated assistant, based on the processing, a query (Lovitt col. 8, lines 51-56, interpretation results generated by the interpretation module including the recognition of contents for a query in the user utterance); 
obtaining information that is responsive to the query (Lovitt col. 9, lines 55-61, response module generates a response for an utterance (that would have had the query content in it));
by the automated assistant, the conversation for a pause (Lovitt col. 17, lines 22-26, the virtual assistant is configured to delay presenting a response until a pause or break in the spoken conversation among participants of the session); 
in response to detecting, based on the, the pause in the conversation, providing audible output that conveys at least part of the information that is responsive to the query to the multiple participants at one or more of the conference computing devices (Lovitt fig. 3, col. 10, lines 12-31, and col. 17, lines 11-26, audio response is presented once there has been a pause or break in the conversation to avoid interrupting the discussion, where the response is text of the response that has been rendered as synthesized speech (thus audibly outputting) and is provided to the computing devices 304a, 304b and 304c of the participants) while the automated assistant is in conference mode (Lovitt col. 16, line 62 – col. 17, line 2, the interactive mode continues until a command to exit, therefore, until a user command to exit, the virtual assistant will provide responses to user utterances while in the interactive mode); 
in response to determining that one or more criteria is met (Lovitt col. 13, lines 11-31, rendering policy is applied to determine how an item is presented to participants, the policy including the communication modality for a response, where col. 18, lines 20-25 gives an example where the rendering policy indicates that for a second participant (a criteria that is to be met for execution of this particular rendering policy), a response is to be rendered visually instead of being rendered as synthesized speech audio), causing one or more displays that are perceptible to the multiple participants to render visual output that conveys at least part of the information that is responsive to the query, wherein the visual output is provided in lieu of the audible output (Lovitt col. 18, lines 20-25, the rendering policy determining that responses (at least part of the information that is responsive to the query) are to be rendered visually instead of being rendered as synthesized speech audio).
While Lovitt teaches speech recognition to convert a spoken utterance into recognized speech information, Lovitt does not explicitly teach that this recognized speech information is text. Therefore, Lovitt does not explicitly teach “speech-to-text processing.”
Further, Lovitt teaches that the participant utterances are processed to identify commands using an interpretation module to find a result including intent or content, which at least suggests a semantic processing, as semantics relate to the meaning of language, and in order to find intent, the meaning is usually considered. Nonetheless, Lovitt does not explicitly teach “semantic processing.”
Still further, while Lovitt teaches that the virtual assistant can delay presenting an audio response until a pause or break in the conversation which strongly suggests a type of monitoring to know when there is a pause or break, and which at least suggests some consideration of conversational pauses as they relate to an overall rendering policy, Lovitt does not explicitly teach “monitoring,” and Lovitt also does not explicitly teach “without a pause being detected.”
Tur NPL teaches speech-to-text processing and first text generated from the speech-to-text processing (Tur NPL page 1603, speech recognition including speech to text transcription, and page 1604, middle left column, the CALO-MA system in real-time, transcribes an audio stream of meeting participants into text), semantic processing (Tur NPL page 1606, section VII, action item detection in speech from the meeting determined from classification using semantic features of the utterances).
Herold teaches monitoring (Herold col. 3, lines 9-15, an automated agent BOT monitors conversation among participants in a chat room, and takes action when a pause of a predetermined length (one or more criteria) is detected).
Rastrow teaches without a pause being detected (Rastrow col. 5, lines 35-56, when a recipient is unavailable because a device of the recipient is outputting multimedia content (without a pause), the response is instead sent to a device with a display, so that the content of the response is visually displayed).
Therefore, taking the teachings of Lovitt and Tur NPL together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the virtual assistant operations of Lovitt to include the speech-to-text processing and semantic processing as disclosed in Tur NPL at least because doing so would improve segmentation accuracy, where proper segmentation provides accuracy in understanding what was talked about when, and increase productivity of meeting participants and non-participants (see Tur NPL pages 1606, 1605 and Abstract).
Further, taking the teachings of Lovitt and Herold together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the virtual assistant operations of Lovitt to include the predetermined length of a pause and monitoring as disclosed in Herold at least because doing so would help stimulate conversation between participants, to help keep a chat room fresh, interesting and an entertaining experience (see Herold col. 3, lines 10-15 and col. 2, lines 47-48).
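Still further, taking the teachings of Lovitt and Rastrow together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the virtual assistant operations of Lovitt to include the responding when there is no pause in a multimedia system as disclosed in Rastrow, at least because doing so would prevent a recipient from being disruptive to other individuals participating in a multimedia event (see Rastrow col. 5, lines 48-50).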
Regarding claims 3, 10 and 17, Lovitt as modified teaches since the one or more of the multiple distinct spoken utterances (Lovitt col. 17, lines 22-26, a break in spoken conversation, thus from one of the utterances of a conversation participant), but does not explicitly teach the remainder of the limitations of claim 3. 
Herold teaches wherein the one or more criteria comprise passage of a predetermined time interval (Herold col. 3, lines 9-15, an automated agent BOT monitors conversation among participants in a chat room, and takes action when a pause of a predetermined length (predetermined time interval) is detected in the conversation).
Further, taking the teachings of Lovitt and Herold together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the virtual assistant operations of Lovitt to include the predetermined length of a pause and monitoring as disclosed in Herold at least because doing so would help stimulate conversation between participants, to help keep a chat room fresh, interesting and an entertaining experience (see Herold col. 3, lines 10-15 and col. 2, lines 47-48).
Regarding claims 4, 11 and 18, Lovitt as modified does not explicitly teach the limitations of claims 4, 11 and 18. Tur NPL teaches wherein the one or more criteria comprise detecting a new topic of the conversation (Tur NPL pages 1605-1606, identifying when a topic is discussed and topic shifts from one topic to another (new topic), where amounts of silence are indicative of a topic shift, the CALO-MA system using Bayesian inference to estimate the topics and the positions of the topic shifts).
Therefore, taking the teachings of Lovitt and Tur NPL together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the virtual assistant operations of Lovitt to include the topic segmentation processing as disclosed in Tur NPL at least because doing so would improve segmentation accuracy, where proper segmentation provides accuracy in understanding what was talked about when, and increase productivity of meeting participants and non-participants (see Tur NPL pages 1606, 1605 and Abstract).
Regarding claims 5, 12 and 19, Lovitt as modified teaches wherein the one or more criteria comprise detecting a change in context of the conversation (Lovitt col. 7, lines 8-17, interpretation module that determines utterance information including context from the context module, where col. 17, lines 5-14 teach that the recipients for a response (criteria for determining which participant will get the synthesized speech response and which participant will not) are determined from the interpretation result).
Regarding claim 8, Lovitt teaches a system comprising one or more processors and memory storing instructions that, in response to execution of the instructions by the one or more processors, cause the one or more processors to (Lovitt col. 14, lines 65-67, operations of system 300 including virtual assistant 320, where col. 28, lines 33-51 disclose computer system 900 as implementing the components of system 300, and including a processor 904, with a main memory device coupled via a bus to the processor for storing instructions to be executed by the processor): 
set an automated assistant implemented at least in part on one or more conference computing devices (Lovitt fig. 3, col. 15, lines 5-7 and line 58 – col. 16, line 23, virtual assistant (automated assistant) 320 (disclosed in col. 28, lines 45-50 as implemented by a computer system, thus being a computing device) that interacts with participants in a spoken conversation session via a conferencing system, thus being a conference computing device) to a conference mode (Lovitt col. 16, lines 62-65, interactive mode) in which the automated assistant performs speech processing on multiple distinct spoken utterances exchanged during a conversation between multiple participants (Lovitt col. 7, lines 34-47, speech recognition module (where col. 15, lines 51-52 teach the virtual assistant as including the modules of fig. 1, of which the recognition module is included), is adapted to receive an audio signal of a participant utterance and convert it to recognized speech information), without requiring explicit invocation of the automated assistant prior to each of the multiple distinct spoken utterances (Lovitt col. 16, line 62 – col. 17, line 2, virtual assistant is entered into interactive mode which will have the virtual assistant automatically process utterances until an event which can be a command to exit the interactive mode (thus while in the interactive mode, no explicit invocation is needed to process commands or queries)); 
automatically perform, by the automated assistant, processing on first generated from the speech processing of one or more of the multiple distinct spoken utterances, wherein the processing is performed without explicit participant invocation (Lovitt col. 17, lines 3-7, continuing in the processing of the utterance (which is processed automatically when in interactive mode and thus without explicit participant invocation) virtual assistant uses the interpretation module to determine an interpretation result, which can include an intent or content for the utterance, where col. 7, lines 8-33 disclose that the interpretation module generates interpretation results (including context, intent and content) from utterance information, where the utterance information can be the recognized speech information (output of the speech recognition module – the “first” generated from the speech processing)); 
generate, by the automated assistant, based on the processing, a query (Lovitt col. 8, lines 51-56, interpretation results generated by the interpretation module including the recognition of contents for a query in the user utterance); 
obtain information that is responsive to the query (Lovitt col. 9, lines 55-61, response module generates a response for an utterance (that would have had the query content in it));
by the automated assistant, the conversation for a pause (Lovitt col. 17, lines 22-26, the virtual assistant is configured to delay presenting a response until a pause or break in the spoken conversation among participants of the session); 
in response to detection of the pause in the conversation, provide audible output that conveys at least part of the information that is responsive to the query to the multiple participants at one or more of the conference computing devices (Lovitt fig. 3, col. 10, lines 12-31, and col. 17, lines 11-26, audio response is presented once there has been a pause or break in the conversation to avoid interrupting the discussion, where the response is text of the response that has been rendered as synthesized speech (thus audibly outputting) and is provided to the computing devices 304a, 304b and 304c of the participants) while the automated assistant is in conference mode (Lovitt col. 16, line 62 – col. 17, line 2, the interactive mode continues until a command to exit, therefore, until a user command to exit, the virtual assistant will provide responses to user utterances while in the interactive mode); 
in response to a determination that one or more criteria is met (Lovitt col. 13, lines 11-31, rendering policy is applied to determine how an item is presented to participants, the policy including the communication modality for a response, where col. 18, lines 20-25 gives an example where the rendering policy indicates that for a second participant (a criteria that is to be met for execution of this particular rendering policy), a response is to be rendered visually instead of being rendered as synthesized speech audio), cause one or more displays that are perceptible to the multiple participants to render visual output that conveys at least part of the information that is responsive to the query, wherein the visually output is provided in lieu of the audible output (Lovitt col. 18, lines 20-25, the rendering policy determining that responses (at least part of the information that is responsive to the query) are to be rendered visually instead of being rendered as synthesized speech audio).
While Lovitt teaches speech recognition to convert a spoken utterance into recognized speech information, Lovitt does not explicitly teach that this recognized speech information is text. Therefore, Lovitt does not explicitly teach “speech-to-text processing.”
Further, Lovitt teaches that the participant utterances are processed to identify commands using an interpretation module to find a result including intent or content, which at least suggests a semantic processing, as semantics relate to the meaning of language, and in order to find intent, the meaning is usually considered. Nonetheless, Lovitt does not explicitly teach “semantic processing.”
Still further, while Lovitt teaches that the virtual assistant can delay presenting an audio response until a pause or break in the conversation which strongly suggests a type of monitoring to know when there is a pause or break, and which at least suggests some consideration of conversational pauses as they relate to an overall rendering policy, Lovitt does not explicitly teach “monitoring,” and Lovitt also does not explicitly teach “without a pause being detected.”

Tur NPL teaches speech-to-text processing and first text generated from the speech-to-text processing (Tur NPL page 1603, speech recognition including speech to text transcription, and page 1604, middle left column, the CALO-MA system in real-time, transcribes an audio stream of meeting participants into text), semantic processing (Tur NPL page 1606, section VII, action item detection in speech from the meeting determined from classification using semantic features of the utterances).
Herold teaches monitor (Herold col. 3, lines 9-15, an automated agent BOT monitors conversation among participants in a chat room, and takes action when a pause of a predetermined length (one or more criteria) is detected).
Rastrow teaches without a pause being detected (Rastrow col. 5, lines 35-56, when a recipient is unavailable because a device of the recipient is outputting multimedia content (without a pause), the response is instead sent to a device with a display, so that the content of the response is visually displayed).
Therefore, taking the teachings of Lovitt and Tur NPL together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the virtual assistant operations of Lovitt to include the speech-to-text processing and semantic processing as disclosed in Tur NPL at least because doing so would improve segmentation accuracy, where proper segmentation provides accuracy in understanding what was talked about when, and increase productivity of meeting participants and non-participants (see Tur NPL pages 1606, 1605 and Abstract).
Further, taking the teachings of Lovitt and Herold together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the virtual assistant operations of Lovitt to include the predetermined length of a pause and monitoring as disclosed in Herold at least because doing so would help stimulate conversation between participants, to help keep a chat room fresh, interesting and an entertaining experience (see Herold col. 3, lines 10-15 and col. 2, lines 47-48).
Still further, taking the teachings of Lovitt and Rastrow together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the virtual assistant operations of Lovitt to include the responding when there is no pause in a multimedia system as disclosed in Rastrow, at least because doing so would prevent a recipient from being disruptive to other individuals participating in a multimedia event (see Rastrow col. 5, lines 48-50).
Regarding claim 14, Lovitt as modified above teaches further comprising, in response to the determination that the one or more criteria is met (Lovitt col. 17, lines 3-14, considering a criteria to be a rendering policy rule specifying not to provide synthesized speech or any response to a particular participant). 
Tur NPL teaches output the data after a conclusion of the conversation (Tur NPL page 1604 left column, once the meeting is concluded, an offline recognition system generates (outputs) an accurate transcript (data) for later browsing, it is noted that “data” has antecedent basis in 1) being generated by the automated assistant and 2) being based on semantic processing, both of which Tur also performs to generate its transcript).
Herold teaches before the pause is detected (Herold col. 3, lines 9-15, an automated agent BOT monitors conversation among participants in a chat room, and takes action when a pause of a predetermined length (one or more criteria) is detected (thus, the pause must be of that predetermined length before it is detected to be a pause for which action is taken, otherwise, no action is taken/a refraining from taking action)).
Therefore, taking the teachings of Lovitt and Tur NPL together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the virtual assistant operations of Lovitt to include the speech recognition after a meeting is concluded as disclosed in Tur NPL at least because doing so would improve accuracy in understanding what was talked about when, and increase productivity of meeting participants and non-participants (see Tur NPL pages 1606, 1605 and Abstract).
Further, taking the teachings of Lovitt and Herold together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the virtual assistant operations of Lovitt to include the monitoring as disclosed in Herold at least because doing so would help stimulate conversation between participants, to help keep a chat room fresh, interesting and an entertaining experience as necessary (see Herold col. 3, lines 10-15 and col. 2, lines 47-48).
Regarding claim 15, Lovitt teaches at least one non-transitory computer-readable medium comprising instructions that, in response to execution of the instructions by one or more processors, cause the one or more processors to perform the following operations (Lovitt col. 14, lines 65-67, operations of system 300 including virtual assistant 320, where col. 28, lines 33-51 disclose computer system 900 as implementing the components of system 300, and including a processor 904, with a main memory device such as RAM (non-transitory computer-readable medium) coupled via a bus to the processor for storing instructions to be executed by the processor): 
setting an automated assistant implemented at least in part on one or more conference computing devices (Lovitt fig. 3, col. 15, lines 5-7 and line 58 – col. 16, line 23, virtual assistant (automated assistant) 320 (disclosed in col. 28, lines 45-50 as implemented by a computer system, thus being a computing device) that interacts with participants in a spoken conversation session via a conferencing system, thus being a conference computing device) to a conference mode (Lovitt col. 16, lines 62-65, interactive mode) in which the automated assistant performs speech processing on multiple distinct spoken utterances exchanged during a conversation between multiple participants (Lovitt col. 7, lines 34-47, speech recognition module (where col. 15, lines 51-52 teach the virtual assistant as including the modules of fig. 1, of which the recognition module is included), is adapted to receive an audio signal of a participant utterance and convert it to recognized speech information), without requiring explicit invocation of the automated assistant prior to each of the multiple distinct spoken utterances (Lovitt col. 16, line 62 – col. 17, line 2, virtual assistant is entered into interactive mode which will have the virtual assistant automatically process utterances until an event which can be a command to exit the interactive mode (thus while in the interactive mode, no explicit invocation is needed to process commands or queries)); 
automatically performing, by the automated assistant, processing on first generated from the speech processing of one or more of the multiple distinct spoken utterances, wherein the processing is performed without explicit participant invocation (Lovitt col. 17, lines 3-7, continuing in the processing of the utterance (which is processed automatically when in interactive mode and thus without explicit participant invocation) virtual assistant uses the interpretation module to determine an interpretation result, which can include an intent or content for the utterance, where col. 7, lines 8-33 disclose that the interpretation module generates interpretation results (including context, intent and content) from utterance information, where the utterance information can be the recognized speech information (output of the speech recognition module – the “first” generated from the speech processing)); 
generating, by the automated assistant, based on the processing, a query (Lovitt col. 8, lines 51-56, interpretation results generated by the interpretation module including the recognition of contents for a query in the user utterance); 
obtaining information that is responsive to the query (Lovitt col. 9, lines 55-61, response module generates a response for an utterance (that would have had the query content in it));
by the automated assistant, the conversation for a pause (Lovitt col. 17, lines 22-26, the virtual assistant is configured to delay presenting a response until a pause or break in the spoken conversation among participants of the session); 
in response to detecting the pause in the conversation, providing audible output that conveys at least part of the information that is responsive to the query to the multiple participants at one or more of the conference computing devices (Lovitt fig. 3, col. 10, lines 12-31, and col. 17, lines 11-26, audio response is presented once there has been a pause or break in the conversation to avoid interrupting the discussion, where the response is text of the response that has been rendered as synthesized speech (thus audibly outputting) and is provided to the computing devices 304a, 304b and 304c of the participants) while the automated assistant is in conference mode (Lovitt col. 16, line 62 – col. 17, line 2, the interactive mode continues until a command to exit, therefore, until a user command to exit, the virtual assistant will provide responses to user utterances while in the interactive mode); 
in response to determining that one or more criteria is met (Lovitt col. 13, lines 11-31, rendering policy is applied to determine how an item is presented to participants, the policy including the communication modality for a response, where col. 18, lines 20-25 gives an example where the rendering policy indicates that for a second participant (a criteria that is to be met for execution of this particular rendering policy), a response is to be rendered visually instead of being rendered as synthesized speech audio), causing one or more displays that are perceptible to the multiple participants to render visual output that conveys at least part of the information that is responsive to the query, wherein the visual output is provided in lieu of the audible output (Lovitt col. 18, lines 20-25, the rendering policy determining that responses (at least part of the information that is responsive to the query) are to be rendered visually instead of being rendered as synthesized speech audio).
While Lovitt teaches speech recognition to convert a spoken utterance into recognized speech information, Lovitt does not explicitly teach that this recognized speech information is text. Therefore, Lovitt does not explicitly teach “speech-to-text processing.”

Further, Lovitt teaches that the participant utterances are processed to identify commands using an interpretation module to find a result including intent or content, which at least suggests a semantic processing, as semantics relate to the meaning of language, and in order to find intent, the meaning is usually considered. Nonetheless, Lovitt does not explicitly teach “semantic processing.”
Still further, while Lovitt teaches that the virtual assistant can delay presenting an audio response until a pause or break in the conversation which strongly suggests a type of monitoring to know when there is a pause or break, and which at least suggests some consideration of conversational pauses as they relate to an overall rendering policy, Lovitt does not explicitly teach “monitoring,” and Lovitt also does not explicitly teach “without a pause being detected.”
Tur NPL teaches speech-to-text processing and first text generated from the speech-to-text processing (Tur NPL page 1603, speech recognition including speech to text transcription, and page 1604, middle left column, the CALO-MA system in real-time, transcribes an audio stream of meeting participants into text), semantic processing (Tur NPL page 1606, section VII, action item detection in speech from the meeting determined from classification using semantic features of the utterances).
Herold teaches monitoring (Herold col. 3, lines 9-15, an automated agent BOT monitors conversation among participants in a chat room, and takes action when a pause of a predetermined length (one or more criteria) is detected).
Rastrow teaches without a pause being detected (Rastrow col. 5, lines 35-56, when a recipient is unavailable because a device of the recipient is outputting multimedia content (without a pause), the response is instead sent to a device with a display, so that the content of the response is visually displayed).
Therefore, taking the teachings of Lovitt and Tur NPL together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the virtual assistant operations of Lovitt to include the speech-to-text processing and semantic processing as disclosed in Tur NPL at least because doing so would improve segmentation accuracy, where proper segmentation provides accuracy in understanding what was talked about when, and would increase productivity of meeting participants and non-participants (see Tur NPL pages 1606, 1605 and Abstract).
Further, taking the teachings of Lovitt and Herold together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the virtual assistant operations of Lovitt to include the predetermined length of a pause and monitoring as disclosed in Herold at least because doing so would help stimulate conversation between participants and help keep a chat room a fresh, interesting, and entertaining experience (see Herold col. 3, lines 10-15 and col. 2, lines 47-48).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHELLE M KOETH whose telephone number is (571)272-5908. The examiner can normally be reached Monday-Friday, 09:30-18:30 EDT/EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached on 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

MICHELLE M. KOETH
Primary Examiner
Art Unit 2656



/MICHELLE M KOETH/Primary Examiner, Art Unit 2656