DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is in response to the preliminary amendment filed September 19, 2019.  Claims 1-8 have been cancelled.  Claims 9-15 have been added.  Claims 9-15 are pending.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on September 19, 2019 and June 19, 2020 are being considered by the examiner.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 9-15 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  Claims 9 and 15 recite performing speech recognition on a speaker's speech; extracting a preset keyword from a recognition result; referring to an extraction result, and determining whether the speaker's speech is a conversation; and extracting a command for operating an apparatus from the recognition result when the speech is determined not to be a 
This judicial exception is not integrated into a practical application because the recited processing circuitry amounts to no more than mere instructions to apply the exception using generic computer components or circuitry.  The recited claims as a whole merely describe how to generally apply the concept of post-processing speech recognition results.  The claims fail to recite any additional elements (specific or non-generic) that would impose meaningful limits on practicing the judicial exception.   The claim is directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because, as discussed above with respect to integration of the abstract idea, the elements of the processing circuitry to perform the various steps amount to no more than mere instructions to apply the exception using generic processing components.  Mere instructions to apply an exception using generic computer processing components cannot provide an inventive concept.  The claims are not patent eligible.
Dependent claims 10-14 do not integrate the judicial exception into a practical application and do not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Claim 10 recites limitations to acquire face-direction information of at least either a speaker or a person other than the speaker which can be achieved by the person observing the speaker and people the speaker is speaking with to ascertain their face 
Claim 11 recites limitations to acquire face-direction information of a person other than a speaker, which can be achieved by the person observing the speaker and people the speaker is speaking with to ascertain their face directions; and to detect presence or absence of a response of the other person on a basis of at least either the acquired face-direction information of the other person in response to the speaker's speech or a recognized speech response of the other person in response to the speaker's speech, which can be achieved by the person observing the interactions and conversations of each participant; and to set, when having detected the response of the other person, the speaker's speech or a part of the speaker's speech, as the keyword, which can be achieved by the person reviewing and analyzing the conversations and determining which words are set to be keywords.
Claim 12 recites limitations to determine whether an interval between speech sections in the recognition results is equal to or more than a preset threshold value, which can be achieved by the person timing each speaker's speech, and to estimate that the conversation has been terminated when the interval between the speech sections is equal to or more than the preset threshold value, which can be achieved by the person timing the conversation portions and, when a particular time has elapsed, deciding the conversation has ended.
Claim 13 recites limitations to determine whether a word indicating termination of conversation is included in the recognition result, which can be achieved by the person reviewing the written words and identifying terminating words, and to estimate that the conversation has been terminated when the word indicating termination of conversation is included, which can be achieved by the person deciding the conversation is over when the person encounters the terminating word.
Claim 14 recites limitations to perform, when determining that the speaker's speech is a conversation, a control to provide notification about a result of the determination, which can be achieved by the person, once having analyzed the words to determine a conversation, issuing notification of the people having a conversation.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 9 and 12-15 are rejected under 35 U.S.C. 103 as being unpatentable over Lehman et al (US Patent Application Publication No. 2018/0068658), hereinafter Lehman, in view of Lee et al (US Patent Application Publication No. 2013/0297317), hereinafter Lee.
Lehman teaches classifying segments of speech based on acoustic features and context.  Regarding claims 9 and 15, Lehman teaches a speech recognition device and method [Fig. 1, 2, 5; para 0009-0011] comprising:  processing circuitry [Fig.1; para 0009-0011]; performing speech recognition on a speaker's speech [para 0010 -- recognize spoken words and phrases; 0025 -- user speaking or uttering one or more words]; extracting a preset keyword from a recognition result [para 0016-- Keyword module 143 may be configured to recognize keywords. Keyword module 143 may be configured to identify keywords corresponding to the any of the one or more keywords included in keywords 136; 0025-0026 -- keyword module 143 identifies keywords and context module 145 analyzes a context of the identified keywords; 0034]; referring to an extraction result, and determining whether the speaker's speech is a conversation [para 0034-0037 – social speech]; and extracting a command for operating an apparatus from the recognition result when the speech is determined not to be a conversation [Fig. 2--201a:202a:203a:204a; para 0026], and not extracting the command from the recognition result when the speech is determined to be a conversation [Fig. 2—201c:202c:203c:204c; para 0030-0031 -- As such, at 204c, processing module 149 may terminate the process for executing the action associated with the keyword "go" before execution of the action by component 190 has begun—where one having ordinary skill in the art would have recognized the advantages of not extracting the command associated with the keyword when the speech reflects conversation, for the purpose of reducing system processing time and power].  Lehman fails to teach the preset keyword is a word indicating a personal name or a call.  Lee teaches processing conversations that applies keyword parsing process to identify keywords, where keywords can be a name or appellation of a person [para 0032].    
One having ordinary skill in the art at the time of the invention would have recognized the advantages of implementing the keyword with personal name conversation processing, as suggested by Lee, in the system of Lehman, to provide additional contextual information to be used by Lehman in determining that the user is addressing another person and having a social conversation.
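For illustration only, the steps recited in claims 9 and 15 as mapped above can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation and not the method of Lehman or Lee: the keyword set, the command table, and all function names are hypothetical, and the conversation test is simplified to "a preset keyword (a personal name or a call) was found".

```python
import re

# Hypothetical preset keywords: personal names or calls (assumed examples).
PRESET_KEYWORDS = {"alice", "bob", "hey"}
# Hypothetical table mapping spoken command words to apparatus operations.
COMMANDS = {"go": "start_route", "stop": "stop_route"}


def tokenize(recognition_result):
    """Split a recognition result into lowercase word tokens."""
    return re.findall(r"[a-z']+", recognition_result.lower())


def extract_keywords(recognition_result):
    """Extract preset keywords from the recognition result."""
    return [w for w in tokenize(recognition_result) if w in PRESET_KEYWORDS]


def is_conversation(keywords):
    """Simplified determination: any preset keyword marks a conversation."""
    return len(keywords) > 0


def extract_command(recognition_result):
    """Return a command only when the speech is determined not to be a conversation."""
    if is_conversation(extract_keywords(recognition_result)):
        return None  # conversation: the command is not extracted
    for word in tokenize(recognition_result):
        if word in COMMANDS:
            return COMMANDS[word]
    return None
```

Under these assumptions, an utterance containing a name suppresses command extraction even when it also contains a command word.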
Regarding claim 12, the combination of Lehman and Lee teaches while determining the speaker's speech to be a conversation, the processing circuitry determines whether an interval between speech sections in the recognition results is equal to or more than a preset threshold value, and estimates that the conversation has been terminated, when the interval between the speech sections is equal to or more than the preset threshold value [Lehman's VAD and silence detection at para 0019, where one having ordinary skill in the art at the time of the invention would have recognized the advantages of utilizing the processing and functions of the VAD and silence detection to determine that times between speech sections exceed thresholds and that the conversation is over, so as to ensure the system can be in a context-aware state to process additional speech and reduce the risk of a pertinent command being missed and not processed].
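For illustration only, the interval test recited in claim 12 can be sketched as follows. This is a minimal sketch, assuming speech sections are available as (start, end) times in seconds, as a voice activity detector might provide; the function name and data layout are hypothetical and are not taken from Lehman.

```python
def conversation_terminated(speech_sections, threshold_s):
    """Estimate termination when any gap between consecutive speech
    sections is equal to or more than the preset threshold.

    speech_sections: list of (start_s, end_s) tuples sorted by start time.
    """
    for (_, prev_end), (next_start, _) in zip(speech_sections, speech_sections[1:]):
        if next_start - prev_end >= threshold_s:
            return True
    return False
```

The comparison is inclusive, matching the claim language "equal to or more than".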
Regarding claim 13, the combination of Lehman and Lee teaches while determining the speaker's speech to be a conversation, the processing circuitry determines whether a word indicating termination of conversation is included in the recognition result, and estimates that the conversation has been terminated, when the word indicating termination of conversation is included [Lehman's keyword table 136 at para 0016, where the keywords can be English words, and one having ordinary skill in the art at the time of the invention would have recognized the advantages of creating a keyword list of any English words to include termination words to determine the conversation has ended, so as to ensure the system can be in a context-aware state to process additional speech and reduce the risk of a pertinent command being missed and not processed].
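For illustration only, the termination-word test recited in claim 13 can be sketched as follows. This is a minimal sketch; the word list and function name are hypothetical and do not reflect the contents of any keyword list in Lehman.

```python
import re

# Hypothetical words indicating termination of a conversation.
TERMINATION_WORDS = {"bye", "goodbye"}


def conversation_ended(recognition_result):
    """Estimate termination when a termination word appears in the result."""
    words = re.findall(r"[a-z']+", recognition_result.lower())
    return any(word in TERMINATION_WORDS for word in words)
```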
Regarding claim 14, the combination of Lehman and Lee teaches determining that the speaker's speech is a conversation, performing a control to provide notification about a result of the determination [Lehman at para 0030-0031 -- context module 245c may determine the context of the keyword based on the probability that one or more preceding segments of digitized speech 108 is a keyword, social speech, or background ... processing module 149 may terminate the process for executing the action associated with the keyword "go" before execution of the action by component 190 has begun -- where the termination provides a form of notification that the speech segments are social speech (conversation)].



Claims 10-11 are rejected under 35 U.S.C. 103 as being unpatentable over Lehman in view of Lee as applied to claim 9 above, and further in view of Teller et al (US Patent Application Publication No. 2013/0304479), hereinafter Teller.
Regarding claim 10, Lehman and Lee fail to teach acquiring face-direction information of at least either a speaker or a person other than the speaker; and to determine when the processing circuitry determines that the speech is not a conversation, whether the speaker's speech is a conversation, on a basis of whether the acquired face-direction information satisfies a preset condition.  Teller teaches sustained eye gaze for determining intent to interact by determining that a gaze direction is in a direction of a gaze target, and determining whether a predetermined time period has elapsed while the gaze direction is in the direction of the gaze target.  Teller teaches acquiring face-direction information of at least either a speaker or a person other than the speaker [para 0036-0039 -- a gaze tracking device such as a camera may be used to obtain an image of people in a field of view of the camera. The image may be processed to determine a gaze direction of one or more users in the image] and determining when the processing circuitry determines that the speech is not a conversation, whether the speaker's speech is a conversation, on a basis of whether the acquired face-direction information satisfies a preset condition [para 0039-0041 – predetermined threshold; 0044-0045 – detecting starting of a conversation based on threshold;  0060-0062 – filters out speech issued by second user who has not achieved gaze lock or other background audio—where the speech of second user and background audio can be considered conversation speech].  
Teller teaches that the use of sustained eye gaze as a trigger may allow the user to otherwise carry on a conversation.
Regarding claim 11, the combination of Lehman and Lee fails to teach acquiring face-direction information of a person other than a speaker; detecting presence or absence of a response of the other person on a basis of at least either the acquired face-direction information of the other person in response to the speaker's speech or a recognized speech response of the other person in response to the speaker's speech; and setting, when having detected the response of the other person, the speaker's speech or a part of the speaker's speech, as the keyword.  Teller teaches acquiring face-direction information of a person other than a speaker [para 0036-0039 -- a gaze tracking device such as a camera may be used to obtain an image of people in a field of view of the camera. The image may be processed to determine a gaze direction of one or more users in the image]; detecting presence or absence of a response of the other person on a basis of at least either the acquired face-direction information of the other person in response to the speaker's speech or a recognized speech response of the other person in response to the speaker's speech [para 0060-0062 -- filters out speech issued by second user who has not achieved gaze lock or other background audio -- where the speech of second user and background audio can be considered conversation speech]; and setting, when having detected the response of the other person, the speaker's speech or a part of the speaker's speech, as the keyword [para 0061 -- filters out speech issued by second user who has not achieved gaze lock or other background audio -- where the speech of speaker that has achieved gaze lock is processed and would necessarily provide the extracted keyword].
One having ordinary skill in the art at the time of the invention would have recognized the advantages of implementing the eye gaze lock and face direction processing suggested by Teller, in the Lehman/Lee system, for the purpose of allowing the user to otherwise carry on a conversation with another user without worrying about inadvertently causing execution of control commands of nearby speech controlled devices, as suggested by Teller.
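For illustration only, the sustained-gaze determination attributed to Teller above (a gaze direction held toward a gaze target while a predetermined time period elapses) can be sketched as follows. This is a minimal sketch, not Teller's implementation: the sample format, the angular tolerance, and the function name are hypothetical.

```python
def gaze_lock_achieved(samples, tolerance_deg, min_duration_s):
    """Return True if the gaze stays on the target for the required duration.

    samples: list of (timestamp_s, offset_deg) sorted by time, where
    offset_deg is the angle between the gaze direction and the direction
    of the gaze target (an assumed representation).
    """
    lock_start = None
    for timestamp, offset in samples:
        if abs(offset) <= tolerance_deg:
            if lock_start is None:
                lock_start = timestamp  # gaze has just reached the target
            if timestamp - lock_start >= min_duration_s:
                return True
        else:
            lock_start = None  # gaze left the target; restart the timer
    return False
```

In this sketch, speech from a person for whom the function returns False would be treated as conversation or background rather than as input directed to the apparatus.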


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Goto et al. discloses a speech spotter which enables a user to enter voice commands into a speech recognizer in the midst of natural human-human conversation.
Yamagata et al. discloses system request detection in conversation based on acoustic and speaker alternation features.
Reich et al. discloses a real time speech command detector for a smart control room that discriminates spoken commands directed to the system from other spoken input.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANGELA A ARMSTRONG whose telephone number is (571)272-7598.  The examiner can normally be reached on M,T,TH,F 11:30-8:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Pierre Desir, can be reached on 571-272-7799.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


ANGELA A. ARMSTRONG
Primary Examiner
Art Unit 2659



/ANGELA A ARMSTRONG/Primary Examiner, Art Unit 2659