DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-10 are rejected under 35 U.S.C. 103 as being unpatentable over Hoffmeister et al. (US 2016/0379632 A1) in view of Catanzaro et al. (US 2017/0148433 A1).

As per claims 1, 9 and 10, Hoffmeister et al. teach a dialog device/method and a computer program (Fig. 2, item 110) comprising: a prediction unit configured to predict an utterance length attribute of a user utterance (0059-0061, “Offered a speech processing system that makes use of the content of speech when determining an endpoint of the utterance”); a selection unit configured to use the utterance attribute to select, as a feature model for usage in an end determination of the user utterance, at least one of an acoustic feature model or a lexical feature model (0061, “The present system considers the content of the speech using information from acoustic models and language models when determining an endpoint.”); and an estimation unit configured to estimate an end point of the user utterance using the feature model selected by the selection unit (0061, 0076, “An endpoint detector may determine an endpoint based on different hypotheses determined by the speech recognition engine 258.”, “Thus, for making the endpoint decision the endpointing module 890 may consider only hypotheses being in a language model end state, and among these hypotheses the endpointing module may select the best scoring one.”).
Hoffmeister et al. do not specifically teach the selection unit configured to use the utterance length attribute to select, as a feature model for usage in an end determination of the user utterance, at least one of an acoustic feature model or a lexical feature model. However, Catanzaro et al. do teach using the utterance length attribute to select, as a feature model for usage in an end determination of the user utterance, at least one of an acoustic feature model or a lexical feature model (Fig. 3, 0078 and 0081-0083). Therefore, it would have been obvious to one of ordinary skill in the art to incorporate the use of utterance length to select, as a feature model for usage in an end determination of the user utterance, at least one of an acoustic feature model or a lexical feature model, as taught by Catanzaro et al., in the dialog device/method and computer program of Hoffmeister et al., because doing so would greatly improve the performance of the system, particularly with long inputs or outputs (Catanzaro et al., 0044).
As per claim 2, Hoffmeister et al., in view of Catanzaro et al., teach the dialog device according to claim 1, wherein: the selection unit is configured to: set weightings for the acoustic feature model and the lexical feature model based on the utterance length attribute and a confidence value that indicates a probability that an estimation of the end point is correct; and select, from either the acoustic feature model or the lexical feature model, a model that achieves a predetermined weighting criterion (0020). 
As per claim 3, Hoffmeister et al., in view of Catanzaro et al., teach the dialog device according to claim 1, wherein: the lexical feature model includes a plurality of sub-models; the plurality of sub-models include machine learning models trained to estimate an end point in the user utterance a predetermined number of words earlier; and the selection unit is configured to: calculate, when estimating the end point in the user utterance using the lexical feature model, a delay time from an end of the user utterance to an output of a machine utterance; calculate an utterance rate of a user based on previous user utterances; and select, based on the delay time and the utterance rate of the user, a sub-model from among the plurality of sub-models that is capable of reducing the delay time to within a predetermined time (0054). 
As per claim 4, Hoffmeister et al., in view of Catanzaro et al., teach the dialog device according to claim 1, wherein: the lexical feature model is configured to: input, as a lexical feature, any one of a word, a phoneme, or a morpheme, and estimate an end point of the user utterance (Hoffmeister et al., Fig.2, Catanzaro et al., 0044). 
As per claim 5, Hoffmeister et al., in view of Catanzaro et al., teach the dialog device according to claim 1, wherein both the acoustic feature model and the lexical feature model are configured to calculate: a probability that the end point in the user utterance is a back-channel opportunity; and a probability that the end point in the user utterance is an utterance termination (0028, 0063, 0069, 0074). 
As per claim 6, Hoffmeister et al., in view of Catanzaro et al., teach the dialog device according to claim 5, wherein: the dialog device further includes a response generation unit; and the response generation unit is configured to: generate and output a back-channel response in a case that a probability that the end point in the user utterance is a back-channel opportunity achieves a predetermined back-channel probability criterion; and generate, in a case that a probability that the end point in the user utterance is an utterance termination achieves a predetermined termination probability criterion, a response (hereinafter referred to as a "machine utterance") generated by the dialog device in accordance with content of the user utterance using a natural language understanding technique (0028, 0063, 0069, 0074, 0054-0055, 0014). 
As per claim 7, Hoffmeister et al., in view of Catanzaro et al., teach the dialog device according to claim 1, wherein: the prediction unit is configured to: determine a machine utterance action indicating an intention of the machine utterance; predict the user utterance based on the machine utterance action; and predict the utterance length attribute of the user utterance by determining a user utterance action that indicates an intention of the user utterance (0051, 0053-0055, 0036-0041).
As per claim 8, Hoffmeister et al., in view of Catanzaro et al., teach the dialog device according to claim 1, wherein: the acoustic feature model and the lexical feature model can be trained by a recursive neural network (Hoffmeister et al., 0053-0055, Catanzaro et al., 0044). 
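For orientation only, the selection recited in claims 1 and 2 can be pictured with the following minimal sketch. It is not drawn from the application, from Hoffmeister et al., or from Catanzaro et al.; the function names, the "short"/"long" attribute values, the prior of 0.7, and the criterion of 0.4 are all hypothetical assumptions chosen to make the example runnable.

```python
from typing import Dict, List

# Hypothetical prior: share of weight given to the acoustic model for each
# predicted utterance length attribute (assumed values, not from the record).
ACOUSTIC_PRIOR = {"short": 0.7, "long": 0.3}


def set_weights(length_attribute: str,
                acoustic_confidence: float,
                lexical_confidence: float) -> Dict[str, float]:
    """Combine the length attribute with per-model confidence values
    and return normalized weightings for the two feature models."""
    prior = ACOUSTIC_PRIOR[length_attribute]
    acoustic = prior * acoustic_confidence
    lexical = (1.0 - prior) * lexical_confidence
    total = acoustic + lexical
    if total == 0.0:
        # No usable confidence: fall back to equal weightings.
        return {"acoustic": 0.5, "lexical": 0.5}
    return {"acoustic": acoustic / total, "lexical": lexical / total}


def select_models(weights: Dict[str, float],
                  criterion: float = 0.4) -> List[str]:
    """Return every model whose weighting meets the assumed criterion,
    so that at least one of the two models is normally selected."""
    return sorted(name for name, w in weights.items() if w >= criterion)


if __name__ == "__main__":
    weights = set_weights("short", 0.9, 0.9)
    print(weights, select_models(weights))
```

Under these assumed values, a predicted short utterance with equal confidences favors the acoustic feature model, and a predicted long utterance favors the lexical feature model; when the weightings are close, both models meet the criterion.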

Response to Arguments

Applicant’s arguments, see pages 8-19, filed 2/4/2022, with respect to the rejection of claims 1-10 under 35 U.S.C. 102 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground of rejection is made in view of Catanzaro et al. (US 2017/0148433 A1).

Applicant’s arguments, see page 8, filed 2/4/2022, with respect to claim 10 have been fully considered and are persuasive. The rejection of claim 10 under 35 U.S.C. 101 has been withdrawn.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Please see attached form PTO-892.
Willett et al. (US 2020/0043468 A1) teach a system, method, and computer-readable storage device that provide an improved speech processing approach in which hyperparameters used for speech recognition are modified dynamically or in batch mode rather than fixed statically. The method includes estimating, via a model trained on audio data and/or metadata, a set of parameters useful for performing automatic speech recognition; receiving speech at an automatic speech recognition system; applying, by the automatic speech recognition system, the set of parameters to processing the speech to yield text; and outputting the text from the automatic speech recognition system.
Gaskill et al. (US 2018/0052913 A1) teach systems and methods for selecting types of generated prompts for further data from a user in a multi-turn interactive dialog. In one scenario, a processed sequence of user inputs and machine-generated prompts improves searches for the most relevant items available for purchase in an electronic marketplace. The number of prompts may be limited to a predetermined maximum value. Prompt generation is minimized by incorporating into a knowledge graph world knowledge that helps user intent inference. Prompt generation may be suppressed if a search indicates the reply to a prompt will not lead to any satisfactory search results. Prompts can provide suggestions for available search results that either meet all query constraints, or meet only some query constraints if a search indicates no search results are available that meet all query constraints. Prompts can provide suggested incisive reply phrasing likely to improve search results through an affirmation or negation reply.


Any inquiry concerning this communication or earlier communications from the examiner should be directed to VIJAY B CHAWAN, whose telephone number is (571) 272-7601. The examiner can normally be reached 7-5, Monday through Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at 571-272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/VIJAY B CHAWAN/
Primary Examiner, Art Unit 2658