DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claims 1, 4, 6, 8 to 11, 14, 16, and 18 to 20 are objected to because of the following informalities:  
Independent claims 1, 11, and 20 set forth a new limitation of “is directed to receiver”, which should be “is directed to a receiver”.  The limitation of “receiver” has no prior antecedent basis in these independent claims.
Appropriate correction is required.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4, 6, 8 to 11, 14, 16, and 18 to 20 are rejected under 35 U.S.C. 103 as being unpatentable over Kelly et al. (U.S. Patent No. 9,972,318) in view of LeBeau et al. (U.S. Patent Publication 2015/0310867).
Kelly et al. discloses a method, system, and computer readable medium for interpreting voice commands, comprising:
“receiving, at a digital assistant of an information handling device, an indication to initiate an audible conversational session with a user” – a user may interact with system 100 to request a recipe from a first process during a multi-turn session requiring input from the user followed by output from a first process (column 6, lines 35 to 39: Figure 1); system 100 may receive a first voice command associated with a first process, and may start a current session (“an audible conversational session with a user”) (column 7, lines 30 to 33: Figure 1); microphone 112 of device 110 captures audio 11 corresponding to a spoken utterance; device 110 using a wakeword detection module 220 then processes audio 11 to determine if a wakeword is detected in audio 11 (column 9, lines 39 to 45: Figure 1); once a wakeword is detected, local device 110 may ‘wake’ and begin transmitting audio data 211 corresponding to input audio 11 to server 120 (column 10, lines 58 to 60: Figure 1); concepts disclosed may be applied in a number of different devices and computer systems including personal digital assistants (“a digital assistant of an information handling device”) (column 36, lines 17 to 25); here, a wakeword is “an indication to initiate an audible conversational session with a user”;
“detecting a query input provided by the user during the conversational session, wherein the query input comprises one or more functions to be performed by the digital assistant, and wherein the query input comprising the one or more functions is received during an exchange within the conversational session” – a voice command instructs system 100 what the user would like to do, e.g., a user may request information, e.g., “What is the capital of France?”, or may request server 120 to perform an action (column 4, lines 9 to 26: Figure 1); a query may be processed to determine an intent or intents, where each intent corresponds to an action to be performed that is responsive to the query (column 14, lines 21 to 35: Figure 2); an NLU output can include a command to play music on a music playing application or appliance, or to request the return of search results from a search engine located on a search server (column 16, lines 10 to 23: Figure 2); here, an action corresponds to “one or more functions to be performed by the digital assistant”, i.e., performing an action of playing music or searching for information;
“wherein the exchange comprises a multi-turn exchange comprising one or more outputs provided to the user by the digital assistant in response to inputs provided by the user” – a user may interact with system 100 to request a recipe during a multi-turn session requiring input from the user following output from a first process; system 100 may receive a first voice command from the user, generate first output associated with a second step of a recipe, and, later, system 100 may receive a second voice command, and generate second output associated with a third step of the recipe; after the first process generates the first output, the first process continues the current exchange; system 100 may use session data, e.g., a session cookie, to track progress during a multi-turn exchange (“wherein the exchange comprises a multi-turn exchange”) (column 6, lines 35 to 60: Figure 1);
“completing, utilizing the digital assistant, the function; determining, based upon completion of the function, the exchange in relation to the query input has been completed” – a first process may read the recipe to the user during a multi-turn session; e.g., play synthesized speech reading step two of the recipe to the user; later, system 100 may receive a second voice command, and generate second output associated with step three of the recipe, and system 100 may output second output, e.g., play synthesized speech reading step three of the recipe to the user (column 6, lines 35 to 54: Figure 1); here, playing each step of the recipe provides for “completing . . . the function”; a server may determine that user 10 has completed a session, e.g., completed a final step in a series of steps, so that progress data is not necessary and may end the process (column 21, lines 62 to 65: Figures 6A to 6B);
“sustaining, at the digital assistant, the conversational session, responsive to determining the one or more functions have completed and responsive to determining the exchange has been completed, until a termination factor occurs, wherein the termination factor comprises an indication to end the conversational session, wherein the sustaining the conversational session comprises keeping the digital assistant activated after completion of the one or more functions and after the exchange has been completed without receiving an additional indication to maintain the conversational session” – server 120 may determine that user 10 has not completed a session, e.g., that additional steps follow a current step in a series of steps, and may determine to save the progress data before halting a process (column 21, line 66 to column 22, line 3: Figures 6A to 6B); when system 100 determines that a timeout has occurred, e.g., no input for a period of time of 8 seconds while operating a second process, server 120 may end a second process (column 22, lines 14 to 22: Figures 6A to 6B); server 120 may determine that Process 1 is complete at the time that server 120 detects a timeout, e.g., no input for a period of time of 8 seconds; when server 120 determines that Process 1 is not complete, e.g., 50% progress, which corresponds to a current step that has subsequent steps, server 120 may interpret a timeout as an implicit command to delete progress data (column 22, lines 34 to 49: Figures 6A to 6B); server 120 may interpret user 10 switching between processes as an implicit command to store progress data when the current session appears to be complete; user 10 may want to resume Process 1 from the end, or may have skipped steps and want to go back; different voice commands may instruct system 100 to restart a process or to resume from a previous session, e.g., ‘start recipes’, ‘restart recipes’, ‘begin recipes’, etc., may resume the process; server 120 may determine to store progress data when detecting that a timeout occurred; server 120 may interpret a timeout as an implicit command to store progress data (column 23, lines 1 to 33: Figures 6A to 6B); server 120 can determine to halt a process, e.g., receive a voice command to start a second process, determine that a timeout has occurred, and receive a voice command to save progress data (column 25, lines 23 to 26: Figure 8); server 120 may determine to halt a first process; server 120 may receive a voice command associated with a second process, may detect a timeout, e.g., no input to device 110 for a period of time of 8 seconds, or may receive a voice command instructing server 120 to end the first process (column 26, lines 19 to 24: Figure 9); Compare Specification, ¶[0018], ¶[0038], and ¶[0045], where one embodiment for ‘sustaining’ a conversational session includes a time out period; Compare Claims 9 and 18, which state that “a termination factor” can include “a predetermined time period”.
Kelly et al. discloses all of the limitations of these independent claims, but omits “wherein the indication to end comprises determining, using a model informed by contextual information related to the environment of the user, user input received at the digital assistant after completion of the exchange and while sustaining the conversational session is directed to receiver other than the digital assistant.”  That is, Kelly et al. discloses determining if a user has completed a session of a multi-turn dialogue as an ongoing exchange in a current session and determining that a timeout has occurred to end the session.  However, Kelly et al. omits using a ‘contextual model’ related to ‘the environment of the user’ to determine that user input is ‘directed to receiver other than the digital assistant’, e.g., the user is talking in a conversation to another user and not talking to the digital assistant.  Still, Kelly et al. discloses that performing automatic speech recognition (ASR) interprets an utterance based on a similarity between the utterance and models for sounds to identify words used in context (“using a model informed by contextual information”) (column 11, line 20 to column 12, line 38).
Concerning independent claims 1, 11 and 20, LeBeau et al. teaches these limitations directed to using a determined context for voice input until a variety of ending events occur.  Specifically, LeBeau et al. teaches separating voice input from ambient noises (“information related to the environment of the user”), e.g., background noises, and then determining whether the voice input is applicable to this mobile computing device.  When two users are having a conversation in a presence of a mobile computing device that is monitoring for voice input, this mobile computing device can determine which of the voice inputs is part of the users’ conversation (“determining . . . user input . . . is directed to receiver other than the digital assistant”).  Mobile computing device 102b can be tipped off that certain voice input is directed to the mobile computing device 102b based on changes in the voice input structure, pauses, e.g., user waiting for a response from mobile computing device 102b, changes in apparent direction of the audio signal, e.g., user faces mobile computing device 102b, changes in speed and delivery, and changes in tone and inflection.  There are a number of questions in conversations 130 to 136 between Alice 126 and Bob 128, but only the question in voice input 136 is directed to mobile device 102b.  (¶[0038]: Figure 1A)  Mobile computing device 142 detects a current context for mobile computing device 142 and a user associated with mobile computing device 144.  Mobile computing device 142 determines whether to monitor audio signals for a user request based on current context 146 of device 142 and its user 150.  (¶[0043] - ¶[0044]: Figure 1A: Steps A and B)  If a user does not provide many voice-based requests to computing device 172 in a context C 178 over time, mobile computing device 172 may stop monitoring for voice input in context C 178.  (¶[0056]: Figure 1C)  Mode selection unit 238 can examine changed context of mobile computing device 202 to determine whether to stop monitoring for voice input.  Upon determining to stop monitoring for voice input, mode selection unit 238 can instruct input subsystem 204 and input parser 210 to deactivate microphone 206a.  (¶[0084] - ¶[0085]: Figure 3A: Steps 318 to 320)  Changes in a structure associated with voice input can be identified, and based on the identified changes, a determination as to whether voice input is directed to a mobile computing device can be made.  (¶[0093]: Figure 3B: Steps 356 to 358)  It would have been obvious to one having ordinary skill in the art to modify Kelly et al. by determining that user input received by a digital assistant is directed to a receiver other than the digital assistant using a model informed by contextual information related to the environment of the user as taught by LeBeau et al. for a purpose of automatically determining when to monitor for voice input including a verbal search request.

Concerning claims 4 and 14, Kelly et al. discloses a wakeword detection module 220 that determines if a wakeword is detected in audio 11 (column 9, lines 42 to 48: Figure 1); once a wakeword is detected, local device 110 may ‘wake’, and begin transmitting audio data 211 corresponding to audio input 11 to server 120 (column 10, lines 58 to 60: Figure 1).  Here, a wakeword is “the indication to initiate the conversational session.”
Concerning claims 6 and 16, Kelly et al. discloses that server 120 may determine to halt a process by receiving a voice command to end a first process (“wherein the receiving an indication to end comprises receiving, from the user, a predetermined command”) (column 26, lines 19 to 24: Figure 9).  Similarly, LeBeau et al. teaches that a mobile computing device can continue to monitor for voice input until a variety of ending events occur including the current context of the mobile computing device changing, e.g., the user removes the mobile computing device from the car, or the user indicating that they want the voice input monitoring to end, e.g., the user providing voice input that provides this indication by “stop monitoring voice input”, or a battery running low.  (¶[0025]: Figure 1B)  Here, LeBeau et al. teaches “an indication to end the conversational session” when a user speaks “stop monitoring voice input”.
Concerning claim 8, LeBeau et al. teaches that a mobile computing device can continue to monitor for voice input until a variety of ending events occur including the current context of the mobile computing device changing (“using contextual information associated with the user input”), e.g., the user removes the mobile computing device from the car, or the user indicating that they want the voice input monitoring to end, e.g., the user providing voice input that provides this indication by “stop monitoring voice input”, or a battery running low.  (¶[0025]: Figure 1B)
Concerning claims 9 and 18, Kelly et al. discloses that if system 100 determines that a timeout has occurred, e.g., no input for a period of time of 8 seconds while operating a second process, server 120 may end a second process (column 22, lines 14 to 22: Figures 6A to 6B); server 120 may determine that Process 1 is complete at the time that server 120 detects a timeout, e.g., no input for a period of time of 8 seconds; when server 120 determines that Process 1 is not complete, e.g., 50% progress, which corresponds to a current step that has subsequent steps, server 120 may interpret a timeout as an implicit command to delete progress data (column 22, lines 34 to 49: Figures 6A to 6B); server 120 may interpret a timeout as an implicit command to store progress data (column 23, lines 1 to 33: Figures 6A to 6B); server 120 can determine to halt a process, e.g., receive a voice command to start a second process, determine that a timeout has occurred, and receive a voice command to save progress data (column 25, lines 23 to 26: Figure 8); server 120 may determine to halt a first process; server 120 may receive a voice command associated with a second process, may detect a timeout, e.g., no input to device 110 for a period of time of 8 seconds, or may receive a voice command instructing server 120 to end the first process (column 26, lines 19 to 24: Figure 9); here, a timeout is “a predetermined time period” and ending a process upon detecting a timeout is “ending the conversational session after expiration of the predetermined time period.”
Concerning claims 10 and 19, Kelly et al. discloses that a user may directly speak out a voice signal V1 including the identification information, e.g., a specific vocabulary word or name of ‘Theresa’, to wake up mobile terminal apparatus 300 to execute a voice interaction function (¶[0045] - ¶[0047] and ¶[0049]: Figures 3 and 4: Step S402); here, a wakeword is “a predetermined command” “to initiate a conversational session”.  Similarly, LeBeau et al. teaches that a mobile computing device can perform a variety of techniques to make a determination that voice inputs are part of requests for the mobile computing device to perform an operation including monitoring for particular keywords, (e.g., ‘search’, ‘mobile device’, etc.) and examining syntax (e.g., identify questions, identify commands, etc.) (“receiving a predetermined command”).

Response to Arguments
Applicants’ arguments filed 30 June 2021 have been fully considered but are moot in view of new grounds of rejection as necessitated by amendment.
Applicants’ arguments address the prior rejection over Kelly et al. (U.S. Patent No. 9,972,318) in view of Zhang (U.S. Patent Publication 2014/0309996).  Mainly, Applicants’ argument appears to be that the new limitations are not disclosed or taught by Kelly et al. or Zhang.  Additionally, Applicants characterize Kelly et al. as directed to pausing and resuming a process using voice commands.  Applicants make some general allegations, too, that there is an insufficient evidentiary basis for a combination of references.
New grounds of rejection are now applied to independent claims 1, 11, and 20 as being obvious under 35 U.S.C. §103 over Kelly et al. (U.S. Patent No. 9,972,318) in view of LeBeau et al. (U.S. Patent Publication 2015/0310867).  The rejection no longer relies upon Zhang.  Applicants’ arguments fail to consider the teachings of LeBeau et al., which were clearly relevant to the new limitations as set forth in the prior rejection.  All of the pending claims are now rejected over Kelly et al. and LeBeau et al.  It should be evident from the prior rejection of the independent claims that it might not be necessary to require Zhang, where it was stated that “Kelly et al. could be construed to disclose all of the limitations of these independent claims” in the Office Action.  Applicants’ new claim amendments do provide some narrowing of the prior limitations of LeBeau et al.  
Generally, LeBeau et al. teaches these new limitations by monitoring voice inputs for ending events based on context for which voice inputs are part of users’ conversations and which voice inputs are requests for a mobile computing device to perform an action.  Voice inputs are separated from ambient noises, and particular keywords and syntax are examined to make a determination if voice inputs are requests for a mobile computing device to perform an action or if voice inputs are only conversation between users.  (¶[0025] - ¶[0026])  A mobile computing device can be tipped off that certain voice input is directed to a mobile computing device based on changes in voice input structure, e.g., pauses, or changes in an apparent direction of an audio signal.  Conversations 130 to 136 between Alice 126 and Bob 128 include a number of questions, but only question 136 in voice input is directed at mobile computing device 102b.  (¶[0038] - ¶[0039]: Figure 1A)  Mobile computing device 142 continually receives and monitors ambient audio signals for a user request 154.  A television 156a, a person 156b, and a pet 156c can produce audio signals 158a-c.  (¶[0047])  If a user does not provide many voice-based requests to computing device 172 in context C 178 over time, mobile computing device 172 may stop monitoring for voice input in the context C 178.  (¶[0058]: Figure 1C)  Mode selection unit 238 can determine whether to start or stop monitoring audio data for a user request based on user behavior data associated with audio data monitoring that is stored in a user behavior data repository 242.  (¶[0065]: Figure 2B)  Keyword identifier 242a can determine whether a particular voice input is directed at mobile computing device 202 a and speech analysis subsystem 212.  (¶[0080] - ¶[0085]: Figure 3A: Steps 310, 312, and 320)  Changes in structure associated with voice input can be identified, and based on the identified changes, a determination as to whether voice input is directed to a mobile computing device can be made.  (¶[0093]: Figure 3B: Steps 356 to 358)
Applicants’ remaining arguments challenge the propriety of a combination of Kelly et al. and Zhang.  This argument is moot because the rejection no longer relies upon Zhang.  Applicants’ characterization of Kelly et al. as referring back to a process that was paused and then resuming the process is not persuasive to overcome the rejection because using pauses to determine timeouts is not excluded by the claim language.  Notably, LeBeau et al. suggests that pauses may be used for determining ending events, too.  Compare ¶[0025] - ¶[0026] and ¶[0038] of LeBeau et al.  Generally, it is maintained that the combination may be properly premised on an express motivation statement to automatically determine whether to monitor for voice input as taught by LeBeau et al., or on rationale (C), use of known technique to improve similar devices (methods, or products) in the same way, of KSR International Co. v. Teleflex Inc. (KSR), 550 U.S. 398 (2007).  LeBeau et al. teaches use of a known technique with contextual information to determine if a voice input is directed at a mobile device or is directed to a different user or is background noise, and this known technique is an improvement over a similar device of Kelly et al. that uses a conventional timeout period to determine whether to sustain a conversation.
The rejection of independent claims 1, 11, and 20 as being obvious under 35 U.S.C. §103 over Kelly et al. (U.S. Patent No. 9,972,318) in view of LeBeau et al. (U.S. Patent Publication 2015/0310867) is maintained to be proper.  These new grounds of rejection are necessitated by amendment.  Applicants’ arguments are moot and/or unpersuasive.  Accordingly, this rejection is properly FINAL.

Conclusion
Applicants’ amendment necessitated the new grounds of rejection presented in this Office Action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP §706.07(a).  Applicants are reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARTIN LERNER whose telephone number is (571) 272-7608.  The examiner can normally be reached on Monday-Thursday 8:30 AM-6:00 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached on (571) 272-5551.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/MARTIN LERNER/
Primary Examiner
Art Unit 2657
July 26, 2021