Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4 and 8-11 are rejected under 35 U.S.C. 103 as being unpatentable over Phillips; Michael S. et al. US 20110055256 A1 (hereinafter Phillips).
Re claim 1, Phillips teaches
1. A voice interaction system that has a conversation with a user by using a voice, comprising: hardware, including at least one memory configured to store a computer program and at least one processor configured to execute the computer program; a speech acquisition unit, implemented by the hardware, configured to acquire user speech given by the user; (0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
a feature extraction unit, implemented by the hardware, configured to extract at least a feature of the acquired user speech; (feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
a response determination unit, implemented by the hardware, configured to determine a response in accordance with the extracted feature using any one of a plurality of learning models generated in advance by machine learning; (determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
a response execution unit, implemented by the hardware, configured to perform control in order to execute the determined response; (command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
a user state detection unit, implemented by the hardware, configured to detect a user state, which is a state of the user; and (user state as in the user actions, command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
a learning model selection unit, implemented by the hardware, configured to select a learning model from the plurality of learning models in accordance with the detected user state, (one of multiple models selecting depending on context of input, user state as in the user actions, command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
wherein the response determination unit, implemented by the hardware, determines the response using the learning model selected by the learning model selection unit.  (contextual response based on one of multiple models selecting depending on context of input, user state as in the user actions, command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Phillips to incorporate multiple models, using the embodiments of various contextual models in Phillips, to allow for different user scenarios as well as different applications/programs.
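For illustration only, the arrangement recited in claim 1 (select one of several pre-trained models according to a detected user state, then determine the response with the selected model) might be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation and not the system of Phillips: the state names, the toy models, and the fallback to a default model are all hypothetical.

```python
from typing import Callable, Dict, Sequence

# Hypothetical type: a "learning model" maps a feature vector to a response label.
Model = Callable[[Sequence[float]], str]


def select_model(models: Dict[str, Model], user_state: str, default: str) -> Model:
    """Return the model registered for the detected user state, else a fallback."""
    return models.get(user_state, models[default])


def determine_response(models: Dict[str, Model], user_state: str,
                       features: Sequence[float], default: str = "neutral") -> str:
    """Determine the response using the model selected for the user state."""
    model = select_model(models, user_state, default)
    return model(features)


# Toy stand-ins for models "generated in advance by machine learning" (assumed).
def active_model(features: Sequence[float]) -> str:
    return "nod" if features[0] > 0.5 else "silence"


def passive_model(features: Sequence[float]) -> str:
    return "speak" if features[0] > 0.5 else "nod"


MODELS: Dict[str, Model] = {
    "active": active_model,
    "passive": passive_model,
    "neutral": passive_model,
}
```

In this sketch the same extracted feature can yield a different response depending only on which model the detected state selects, which is the relationship the last two limitations of claim 1 recite.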


Re claim 2, Phillips teaches
2. The voice interaction system according to Claim 1, wherein the user state detection unit detects a degree of activeness of the user in the conversation as the user state, and the learning model selection unit selects the learning model that corresponds to the degree of the activeness of the user.  (contextual response based on one of multiple models selecting depending on context of input, user state as in the user actions, command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)


Re claim 3, Phillips teaches
3. The voice interaction system according to Claim 2, wherein the user state detection unit detects an amount of speech given by the user in a predetermined period or a percentage of time during which the user has made a speech with respect to a sum of time during which the voice interaction system has output a voice as a response and (time periods and phonemic durations as well as frequency of occurrence i.e. inputs within a time sample overall…contextual response based on one of multiple models selecting depending on context of input, user state as in the user actions, command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
the learning model selection unit selects the learning model that corresponds to the amount of speech given by the user or the percentage of the time during which the user has made a speech.  (correlated to the models, time periods and phonemic durations as well as frequency of occurrence i.e. inputs within a time sample overall…contextual response based on one of multiple models selecting depending on context of input, user state as in the user actions, command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
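For illustration only, the percentage recited in claim 3 might be computed as below. The quoted limitation breaks off after "a response and", so the second term of the denominator (the user's own speech time) is an assumption of this sketch, as are the function names and the 0.5 threshold used to map the ratio to a model key.

```python
def speech_time_ratio(user_speech_s: float, system_speech_s: float) -> float:
    """Fraction of total talk time occupied by the user's speech.

    Assumption: the denominator is system output time plus user speech time.
    """
    total = user_speech_s + system_speech_s
    if total <= 0:
        return 0.0  # no speech observed in the period
    return user_speech_s / total


def activeness_level(ratio: float, threshold: float = 0.5) -> str:
    """Map the ratio to a hypothetical model key ("active" or "passive")."""
    return "active" if ratio >= threshold else "passive"
```

A selection unit of the kind claimed could then use the returned key to look up the corresponding learning model.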

Re claim 4, Phillips teaches
4. The voice interaction system according to Claim 1, wherein the user state detection unit detects identification information on the user as the user state, and the learning model selection unit selects the learning model that corresponds to the identification information on the user.  (user profiles identity user, contextual response based on one of multiple models selecting depending on context of input, user state as in the user actions, command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)


Re claims 8 and 9, Phillips teaches
8. A voice interaction method performed by a voice interaction system that has a conversation with a user by using a voice, the voice interaction method comprising:
acquiring user speech given by the user; (user speech 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
extracting at least a feature of the acquired user speech; (feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
determining a response in accordance with the extracted feature using any one of a plurality of learning models generated in advance by machine learning; (contextual response based on one of multiple models selecting depending on context of input, user state as in the user actions, command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
performing control in order to execute the determined response; (command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
detecting a user state, which is a state of the user; and (user state as in the user actions, command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
selecting a learning model from the plurality of learning models in accordance with the detected user state, (based on one of multiple models selecting depending on context of input, user state as in the user actions, command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
wherein the response is determined using the selected learning model.  (contextual response based on one of multiple models selecting depending on context of input, user state as in the user actions, command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Phillips to incorporate multiple models, using the embodiments of various contextual models in Phillips, to allow for different user scenarios as well as different applications/programs.


Re claims 10 and 11, Phillips teaches
10. A learning model generation apparatus configured to generate a learning model used in a voice interaction system that has a conversation with a user by using a voice, the apparatus comprising: hardware, including at least one memory configured to store a computer program and at least one processor configured to execute the computer program; (contextual response based on one of multiple models selecting depending on context of input, user state as in the user actions, command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
(system interaction with user 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
a feature extraction unit, implemented by the hardware, configured to extract a feature vector indicating at least a feature of the acquired user speech; (feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
a sample data generation unit configured to generate sample data in which a correct label indicating a response to the user speech and the feature vector are associated with each other; (determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
a user state acquisition unit, implemented by the hardware, configured to acquire a user state, which is a state of the desired user when the user has made a speech, to associate the acquired user state with the sample data that corresponds to the user speech; (user state as in the user actions, command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
a sample data classification unit, implemented by the hardware, configured to classify the sample data for each of the user states; and (contextual response based on one of multiple models selecting depending on context of input, user state as in the user actions, command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
(model update is model generation, contextual response based on one of multiple models selecting depending on context of input, user state as in the user actions, command and control, determine response from system based on input based on multiple models, feature space 0071, 0062, 0099, 0117, 0142, 0145 with fig. 2-2b)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Phillips to incorporate multiple models, using the embodiments of various contextual models in Phillips, to allow for different user scenarios as well as different applications/programs.
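For illustration only, the classification of sample data by user state recited in claim 10, followed by generation of one model per state, might be sketched as below. This is a sketch under stated assumptions: the tuple layout (feature vector, correct label, user state) and the majority-label "training" step are hypothetical stand-ins, and neither the application nor Phillips is asserted to work this way.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical sample layout: (feature vector, correct response label, user state).
Sample = Tuple[Sequence[float], str, str]
Model = Callable[[Sequence[float]], str]


def classify_samples(samples: List[Sample]) -> Dict[str, List[Tuple[Sequence[float], str]]]:
    """Group labeled samples by the user state associated with each one."""
    groups: Dict[str, List[Tuple[Sequence[float], str]]] = defaultdict(list)
    for features, label, state in samples:
        groups[state].append((features, label))
    return dict(groups)


def train_majority_model(group: List[Tuple[Sequence[float], str]]) -> Model:
    """Toy stand-in for machine learning: always predict the most frequent label."""
    most_common_label = Counter(label for _, label in group).most_common(1)[0][0]
    return lambda features: most_common_label


def generate_models(samples: List[Sample]) -> Dict[str, Model]:
    """Generate one model per user state from that state's samples only."""
    return {state: train_majority_model(group)
            for state, group in classify_samples(samples).items()}
```

A real generator would replace `train_majority_model` with an actual learning algorithm; the point of the sketch is only that each model sees the samples of a single user state.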


Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Phillips; Michael S. et al. US 20110055256 A1 (hereinafter Phillips) in view of KIM; Dae Hoon et al. US 20180136615 A1 (hereinafter Kim).
Re claim 5, Phillips, while teaching models, fails to teach
5. The voice interaction system according to Claim 1, wherein the user state detection unit detects emotion of the user as the user state, and the learning model selection unit selects the learning model that corresponds to the emotion of the user.  (Kim 0019)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Phillips to incorporate the above claim limitations as taught by Kim to allow for an improvement of context in Phillips by adding a new context such as emotion, wherein such an addition would aid in predicting user actions.


Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Phillips; Michael S. et al. US 20110055256 A1 (hereinafter Phillips) in view of HAN; Youngwoong et al. US 20170147753 A1 (hereinafter Han).
Re claim 6, Phillips, while teaching models, fails to teach
6. The voice interaction system according to Claim 1, wherein the user state detection unit detects a health condition of the user as the user state, and the learning model selection unit selects the learning model that corresponds to the health condition of the user.  (Han 0024, 0047)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Phillips to incorporate the above claim limitations as taught by Han to allow for an improvement of context in Phillips by adding a new context such as medical/health, wherein such an addition would aid in predicting user actions, e.g., if the user has a health-related application where contextual input can comprise commands related to logging health, scheduling appointments, conditions, etc., analogous to Phillips.


Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Phillips; Michael S. et al. US 20110055256 A1 (hereinafter Phillips) in view of Biemer; Michael US 20150009010 A1 (hereinafter Biemer).
Re claim 7, Phillips, while teaching models, fails to teach
7. The voice interaction system according to Claim 1, wherein the user state detection unit detects a degree of an awakening state of the user as the user state, and the learning model selection unit selects the learning model that corresponds to the degree of the awakening state of the user.  (Biemer 0090)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Phillips to incorporate the above claim limitations as taught by Biemer to allow for an improvement of context in Phillips by adding a new context such as user alert/awake/absence status, wherein such an addition would aid in predicting user actions, e.g., whether the user is awake or not in order to preserve battery power of a device, or to alert the user in the instance he/she is driving and falls asleep by analyzing gaze, thereby correlating user actions to associated models.


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 

HERGENROEDER; Alex Lauren	US 20180277117 A1


If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at (571)272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/MICHAEL COLUCCI/
Primary Examiner, Art Unit 2656
(571)-270-1847
Examiner FAX:  (571)-270-2847
Michael.Colucci@uspto.gov