DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The certified copy has been filed in parent Application No. JP2017-141790, filed on July 21, 2017.
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Applicant cannot rely upon the certified copy of the foreign priority application to overcome this rejection because a translation of said application has not been made of record in accordance with 37 CFR 1.55. See MPEP §§ 215 and 216.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an 
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: “learning data storage unit”, “model learning unit”,  “model storage unit”, “speech satisfaction estimator”, “conversation satisfaction estimator”, “satisfaction estimating unit” in claims 1-5.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.





Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 8-9 are rejected under 35 U.S.C. 101 because the claims appear to be directed to a software embodiment rather than a hardware embodiment, whereas a machine claim is directed to a system, apparatus, or arrangement. Paragraphs [0045-0049] of the Published Specification describe the elements of the system as being implemented in software alone. The claimed limitations are capable of being performed by software alone, as described in those paragraphs, since no hardware component is claimed. Software alone is not a physical component and thus is not statutory, since software per se does not define any structural and functional interrelationships between the computer program and other claimed elements of a computer that permit the computer program's functionality to be realized. Hence, the stated functions comprise software, and the claims are thus not directed to a hardware embodiment. Data structures not claimed as embodied in computer readable media are descriptive material per se and are not statutory because they are not capable of causing functional change in the computer. See, e.g., Warmerdam, 33 F.3d at 1361, 31 USPQ2d at 1760 (claim to a data structure per se held nonstatutory). Such claimed data structures do not define any structural and functional interrelationships between data and other claimed aspects of the invention, which permit the data structure’s functionality to be realized. In contrast, a claimed computer readable medium encoded with a data structure defines structural and functional interrelationships between the data structure and 
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1-2 and 5-9 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Hammel et al. (US Pub No. 2019/0005421). 
Regarding claim 1, Hammel et al. teaches a satisfaction estimation model learning apparatus, (see [0087 also supported in provisional at page 23, line 10], AI System Controller) comprising: a learning data storage unit (see [0005 also supported in provisional at page 23, line 29], Knowledge Base) that stores learning data including a conversation voice containing a conversation including a plurality of speeches, (see [0088 or provisional at p. 24, line 15-16], where audio files are processed by the Unstructured Data Analysis Engine into homogeneous voice segments that get stored in the Knowledge Base. The voice segments may come from different speakers, and the “identification of the pieces may be accomplished with a learning algorithm as homogeneous human speech, simultaneous speech, or noninformative parts.” [0088]. For instance, see [0088], where “customer voice can be identified as Speaker A and agent voice can be identified as Speaker B” and “obtaining an ‘ABABAB’ type analysis of the conversation may be done (eg. conversation voice with a plurality of speeches), which prepares the data for Speaker A&B Feature Extraction” [0088]).  
a correct answer value of a conversation satisfaction for the conversation, and a correct answer value of a speech satisfaction for each of the speeches included in the conversation; (See [0008 or provisional at p. 3, line 27-8], where an Initial Feature Set “describes individual feature vectors” and may be stored in the Knowledge Base, and see [0089 or provisional at p. 25, line 28-31], where the individual feature vectors are extracted from the homogeneous voice segments (eg. conversation data including speeches). In context of this, see [0009 or provisional at p. 4, line 7-10], where “the system may align feature vectors with Prediction Classes. Prediction Classes comprise mathematical values designating a correct outcome or desired descriptor, also known as labels, targets, desired outputs, supervisory signals, response variables, or explained variables (eg. correct value of speech/conversation satisfaction)”)
and a model learning unit that learns a satisfaction estimation model using a feature quantity of each speech extracted from the conversation voice, (see [0090, or provisional p. 26, lines 14-24], where “AI System Controller 21 may instruct Unstructured Data Analysis Engine 32 to access Seed Knowledge Base 61 and utilize a pre-existing Emotional Model to analyze and determine emotions, emotional based behaviors, and/or emotional states from one or more extracted features from homogeneous voice segments. (eg. using a feature quantity of each speech extracted from the conversation voice) AI System Controller 21 may instruct Unstructured Data Analysis Engine 32 to access statistical functions contained in Tools Knowledge Base 63 and dynamically build an Emotional Model. The dynamically built (eg. model learning unit) Emotional Model is used by Unstructured Data Analysis Engine 32 to analyze and determine emotions, emotional based behaviors, and/or emotional states based on one or more extracted features from homogeneous voice segments”)
the correct answer value of the speech satisfaction, and the correct answer value of the conversation satisfaction, (see [0420 or provisional p. 4, lines 17-20], where a data point is “a feature vector (eg. derived from the conversation data made up of homogeneous voice segments/speech excerpts) specifically to be used for model training”; see [0009 or provisional p. 4, lines 11-13], where the Initial Mapping Ruleset(s) that “align feature vectors with Prediction Classes” also define “the needed context(s) for determining Initial Data Points (the correct values for prediction classes that will be used for the model training)” and see [0010 or provisional p. 4, lines 17-18], where “the system may identify an Initial Data Point as a single feature vector or an aggregate of feature vectors. (the aggregation of the feature vectors from homogeneous voice segments is interpreted as a conversation)”)
the satisfaction estimation model configured by connecting a speech satisfaction estimation model part that receives a feature quantity of each speech and estimates the speech satisfaction of each speech (see [0424, also supported by provisional p. 26], where the emotional model is defined as a model that takes “a numerical representation of an audio signal containing voice (eg. speech) and maps it to the most likely emotion (eg. satisfaction) or behavior being expressed therein.” and see [0090, or provisional p. 26, lines 18-24], where “AI System Controller 21 may instruct Unstructured Data Analysis Engine 32 to access Seed Knowledge Base 61 and utilize a pre-existing Emotional Model to analyze and determine emotions, emotional based behaviors, and/or emotional states (eg. estimate satisfaction) from one or more extracted features from homogeneous voice segments. (eg. using a feature quantity of each speech extracted from the conversation voice)”)
with a conversation satisfaction estimation model part that receives at least the speech satisfaction of each speech and estimates the conversation satisfaction (see [0090, or provisional p. 26 lines 24-28], where “Unstructured Data Analysis Engine 32 may dynamically create Emotional Transition State features. Statistical analysis of Emotional Transition State features determines whether consecutive agent and/or customer homogeneous voice segments separately (eg. multiple sequential homogeneous voice segments analyzed in conjunction with each other is interpreted as conversation, and the emotional state of each separate voice segment constitutes a speech satisfaction) contain changes in different emotional states. (eg. conversation satisfaction)”)
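For illustration only, the two-part arrangement recited in claim 1 (a speech-level part whose outputs feed a conversation-level part) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Hammel et al. or from the application as filed: the function names and the simple averaging stand-ins are hypothetical, and an actual estimator would be a learned model.

```python
from typing import Callable, List, Sequence, Tuple


def estimate_conversation(
    speech_features: Sequence[Sequence[float]],
    speech_part: Callable[[Sequence[float]], float],
    conversation_part: Callable[[List[float]], float],
) -> Tuple[List[float], float]:
    """Connect a per-speech estimator to a conversation-level estimator."""
    # Speech part: one satisfaction estimate per speech feature vector.
    speech_scores = [speech_part(features) for features in speech_features]
    # Conversation part: receives the speech-level estimates as its input.
    conversation_score = conversation_part(speech_scores)
    return speech_scores, conversation_score


# Hypothetical stand-ins for learned model parts (assumption: plain averages).
def mean_feature(features: Sequence[float]) -> float:
    return sum(features) / len(features)


def mean_score(scores: List[float]) -> float:
    return sum(scores) / len(scores)
```

Calling `estimate_conversation` with one feature vector per speech returns the per-speech estimates together with the single conversation-level estimate computed from them.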

Regarding claim 2, Hammel et al. teaches wherein the speech satisfaction estimation model part constitutes one speech satisfaction estimator for one speech, the speech satisfaction estimator receives the feature quantity of each speech and estimates and outputs the speech satisfaction of the speech (see [0090, or provisional p. 26, lines 14-18], where the “AI System Controller 21 may instruct Unstructured Data Analysis Engine 32 to access statistical functions contained in Tools Knowledge Base 63 and dynamically build an Emotional Model. The dynamically built (eg. one speech satisfaction estimator for one speech, as the estimator part in the model learning unit dynamically changes with each input) Emotional Model is used by Unstructured Data Analysis Engine 32 to analyze and determine emotions, emotional based behaviors, and/or emotional states based on one or more extracted features from homogeneous voice segments (eg. using a feature quantity of each speech extracted from the conversation voice)”)
the conversation satisfaction estimation model part constitutes one conversation satisfaction estimator for one speech satisfaction estimator, and the conversation satisfaction estimator receives the speech satisfaction outputted from the speech satisfaction estimator and information contributing to the estimation of the conversation satisfaction accompanied by the speech satisfaction, using information related to a speech before the speech or speeches before and after the speech and estimates and outputs the conversation satisfaction from a first speech included in the conversation to the speech using the information related to the speech before the speech. (see [0090, or provisional p. 26, lines 25-31], where “Unstructured Data Analysis Engine 32 may dynamically (eg. one conversation satisfaction estimator for one speech satisfaction estimator, as the estimator part in the model learning unit dynamically changes with each input) create Emotional Transition State features. Statistical analysis of Emotional Transition State features determines whether consecutive (eg. using information related to a speech before the speech or speeches before and after the speech) agent and/or customer homogeneous voice segments separately (eg. multiple sequential homogeneous voice segments analyzed in conjunction with each other is interpreted as conversation, and the emotional state of each separate voice segment constitutes a received speech satisfaction) contain changes in different emotional states. (eg. conversation satisfaction)… [Unstructured Data Analysis Engine 32 stores] Emotional State Transition features in Knowledge Base”)

Regarding claims 5 and 8-9, Hammel et al. teaches a model storage unit that stores the satisfaction estimation model learned by the satisfaction estimation model learning apparatus according to any one of claims 1 to 3, (see [0090, or provisional p. 26] and Figure 1 p. 1, where the parts involved in the Emotional Model, like Knowledge Base and the Unstructured Data Analysis Engine, for estimating satisfaction are stored in the Processing Database 60 and Server 20)
and a program causing a computer to function as the satisfaction estimation model learning apparatus according to any one of claims 1 to 3 or a program causing a computer to function as the satisfaction estimating apparatus according to claim 5 (see [0453 or provisional p. 156, lines 28-31 and p. 157, lines 1-15], where the “processing may be implemented in computer programs executed on programmable computers”)
Furthermore, Hammel et al. teaches a satisfaction estimating unit that inputs the feature quantity of each speech extracted from the conversation voice containing the conversation including a plurality of speeches to the satisfaction estimation model, (see [0090, or provisional p. 26, lines 14-24], where “AI System Controller 21 may instruct Unstructured Data Analysis Engine 32 to access Seed Knowledge Base 61 and utilize a pre-existing Emotional Model to analyze and determine emotions, emotional based behaviors, and/or emotional states from one or more extracted features from homogeneous voice segments. (eg. using a feature quantity of each speech extracted from the conversation voice) AI System Controller 21 may instruct Unstructured Data Analysis Engine 32 to access statistical functions contained in Tools Knowledge Base 63 and dynamically build an Emotional Model. The dynamically built (eg. model learning unit) Emotional Model is used by Unstructured Data Analysis Engine 32 to analyze and determine emotions, emotional based behaviors, and/or emotional states based on one or more extracted features from homogeneous voice segments”)
and estimates the speech satisfaction for each speech and the conversation satisfaction for the conversation, (see [0090, or provisional p. 26, lines 18-24], where “AI System Controller 21 may instruct Unstructured Data Analysis Engine 32 to access Seed Knowledge Base 61 and utilize a pre-existing Emotional Model to analyze and determine emotions, emotional based behaviors, and/or emotional states (eg. estimate satisfaction) from one or more extracted features from homogeneous voice segments. (eg. using a feature quantity of each speech extracted from the conversation voice)” and where “Unstructured Data Analysis Engine 32 may dynamically create Emotional Transition State features. Statistical analysis of Emotional Transition State features determines whether consecutive agent and/or customer homogeneous voice segments separately (eg. multiple sequential homogeneous voice segments analyzed in conjunction with each other is interpreted as conversation, and the emotional state of each separate voice segment constitutes a received speech satisfaction) contain changes in different emotional states. (eg. conversation satisfaction)”)

Regarding claim 6, Hammel et al. teaches a satisfaction estimation model learning method, wherein learning data including a conversation voice containing a conversation including a plurality of speeches, (see [0088 or provisional at p. 24, line 15-16], where audio files are processed by the Unstructured Data Analysis Engine into homogeneous voice segments that get stored in the Knowledge Base. The voice segments may come from different speakers, and the “identification of the pieces may be accomplished with a learning algorithm as homogeneous human speech, simultaneous speech, or noninformative parts.” [0088]. For instance, see [0088], where “customer voice can be identified as Speaker A and agent voice can be identified as Speaker B” and “obtaining an ‘ABABAB’ type analysis of the conversation may be done (eg. conversation voice with a plurality of speeches), which prepares the data for Speaker A&B Feature Extraction” [0088]).
a correct answer value of a conversation satisfaction for the conversation, and a correct answer value of a speech satisfaction for each of the speeches included in the conversation is stored in a learning data storage unit, (See [0008 or provisional at p. 3, line 27-8], where an Initial Feature Set “describes individual feature vectors” and may be stored in the Knowledge Base, and see [0089 or provisional at p. 25, line 28-31], where the individual feature vectors are extracted from the homogeneous voice segments (eg. conversation data including speeches). In context of this, see [0009 or provisional at p. 4, line 7-10], where “the system may align feature vectors with Prediction Classes. Prediction Classes comprise mathematical values designating a correct outcome or desired descriptor, also known as labels, targets, desired outputs, supervisory signals, response variables, or explained variables (eg. correct value of speech/conversation satisfaction)”)
learning, by a model learning unit, a satisfaction estimation model using a feature quantity of each speech extracted from the conversation voice, (see [0090, or provisional p. 26, line 14-24], where “AI System Controller 21 may instruct Unstructured Data Analysis Engine 32 to access Seed Knowledge Base 61 and utilize a pre-existing Emotional Model to analyze and determine emotions, emotional based behaviors, and/or emotional states from one or more extracted features from homogeneous voice segments. (eg. using a feature quantity of each speech extracted from the conversation voice) AI System Controller 21 may instruct Unstructured Data Analysis Engine 32 to access statistical functions contained in Tools Knowledge Base 63 and dynamically build an Emotional Model. The dynamically built (eg. model learning unit) Emotional Model is used by Unstructured Data Analysis Engine 32 to analyze and determine emotions, emotional based behaviors, and/or emotional states based on one or more extracted features from homogeneous voice segments”)
the correct answer value of the speech satisfaction, and the correct answer value of the conversation satisfaction, (see [0420 or provisional p. 4, lines 17-20], where a data point is “a feature vector (eg. derived from the conversation data made up of homogeneous voice segments/speech excerpts) specifically to be used for model training”; see [0009 or provisional p. 4, line 11-13], where the Initial Mapping Ruleset(s) that “align feature vectors with Prediction Classes” also define “the needed context(s) for determining Initial Data Points (the correct values for prediction classes that will be used for the model training)” and see [0010 or provisional p. 4, lines 17-18], where “the system may identify an Initial Data Point as a single feature vector or an aggregate of feature vectors. (the aggregation of the feature vectors from homogeneous voice segments is interpreted as a conversation)”)
the satisfaction estimation model configured by connecting a speech satisfaction estimation model part that receives a feature quantity of each speech and estimates the speech satisfaction of each speech (see [0424, also supported by provisional p. 26], where the emotional model is defined as a model that takes “a numerical representation of an audio signal containing voice (eg. speech) and maps it to the most likely emotion (eg. satisfaction) or behavior being expressed therein.” and see [0090, or provisional p. 26, lines 18-24], where “AI System Controller 21 may instruct Unstructured Data Analysis Engine 32 to access Seed Knowledge Base 61 and utilize a pre-existing Emotional Model to analyze and determine emotions, emotional based behaviors, and/or emotional states (eg. estimate satisfaction) from one or more extracted features from homogeneous voice segments. (eg. using a feature quantity of each speech extracted from the conversation voice)”)
with a conversation satisfaction estimation model part that receives at least the speech satisfaction of each speech and estimates the conversation satisfaction (see [0090, or provisional p. 26, lines 24-28], where “Unstructured Data Analysis Engine 32 may dynamically create Emotional Transition State features. Statistical analysis of Emotional Transition State features determines whether consecutive agent and/or customer homogeneous voice segments separately (eg. multiple sequential homogeneous voice segments analyzed in conjunction with each other is interpreted as conversation, and the emotional state of each separate voice segment constitutes a speech satisfaction) contain changes in different emotional states. (eg. conversation satisfaction)”)

Regarding claim 7, Hammel et al. teaches a satisfaction estimation method comprising: inputting, by a satisfaction estimating unit, the feature quantity of each speech extracted from the conversation voice containing the conversation including a plurality of speeches to the satisfaction estimation model, (see [0090, or provisional p. 26, lines 14-24], where “AI System Controller 21 may instruct Unstructured Data Analysis Engine 32 to access Seed Knowledge Base 61 and utilize a pre-existing Emotional Model to analyze and determine emotions, emotional based behaviors, and/or emotional states from one or more extracted features from homogeneous voice segments. (eg. using a feature quantity of each speech extracted from the conversation voice) AI System Controller 21 may instruct Unstructured Data Analysis Engine 32 to access statistical functions contained in Tools Knowledge Base 63 and dynamically build an Emotional Model. The dynamically built (eg. model learning unit) Emotional Model is used by Unstructured Data Analysis Engine 32 to analyze and determine emotions, emotional based behaviors, and/or emotional states based on one or more extracted features from homogeneous voice segments”)
and estimating the speech satisfaction for each speech and the conversation satisfaction for the conversation (see [0090, or provisional p. 26, lines 14-24], where “AI System Controller 21 may instruct Unstructured Data Analysis Engine 32 to access Seed Knowledge Base 61 and utilize a pre-existing Emotional Model to analyze and determine emotions, emotional based behaviors, and/or emotional states (eg. estimate satisfaction) from one or more extracted features from homogeneous voice segments. (eg. using a feature quantity of each speech extracted from the conversation voice)” and where “Unstructured Data Analysis Engine 32 may dynamically create Emotional Transition State features. Statistical analysis of Emotional Transition State features determines whether consecutive agent and/or customer homogeneous voice segments separately (eg. multiple sequential homogeneous voice segments analyzed in conjunction with each other is interpreted as conversation, and the emotional state of each separate voice segment constitutes a received speech satisfaction) contain changes in different emotional states. (eg. conversation satisfaction)”)

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

	Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Hammel et al. in view of Bengio et al. (US Patent No. 10,573,293).
Regarding claim 3, Hammel et al. teaches the speech satisfaction estimator and the conversation satisfaction estimator according to claim 2. Hammel et al. does not teach gates within an LSTM-RNN model. Bengio et al. teaches the LSTM-RNN estimator including an input gate and an output gate and an oblivion gate, (see col. 1, lines 34-39, where a neural network is described in which “each LSTM memory block can include one or more cells that each include an input gate, a forget gate (eg. oblivion gate), and an output gate that allow the cell to store previous states for the cell, e.g., for use in generating a current activation or to be provided to other components of the LSTM neural network”)
and a reset gate and an update gate (see col. 4, lines 5-6, where the decoder neural network includes “one or more gated recurrent unit (eg. GRU) neural network layers”; the applicant notes in [0031] of the as-filed specification that the reset gate and the update gate are configurations for the GRU; see col. 3, lines 47-48, where “the decoder input for the first decoder step can be an all-zero frame (i.e. a <GO> frame).”, and this initializing input is interpreted to be a reset gate; see col. 3, lines 32-37, where “For each decoder input in the sequence, the decoder neural network 118 is configured to process the decoder input and the encoded representations generated by the encoder CBHG neural network 116 to generate multiple frames of the spectrogram of the sequence of characters.” and the processing of encoded representations into frames is interpreted to be the update gate)
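For illustration only, the role of the input, forget (oblivion), and output gates in the LSTM memory cell quoted above can be sketched with the standard textbook LSTM update. This is a minimal scalar sketch under stated assumptions and is not taken from Bengio et al. or from the application as filed: the function names, the scalar state, and the weight layout are hypothetical.

```python
import math
from typing import Dict, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(
    x: float,
    h_prev: float,
    c_prev: float,
    weights: Dict[str, Tuple[float, float, float]],
) -> Tuple[float, float]:
    """One step of a scalar LSTM cell.

    `weights` maps a gate name to (input weight, recurrent weight, bias);
    this layout is an assumption made for the sketch.
    """
    def pre_activation(name: str) -> float:
        w_x, w_h, b = weights[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre_activation("input"))        # input gate: how much new content to admit
    f = sigmoid(pre_activation("forget"))       # forget gate: how much old state to keep
    o = sigmoid(pre_activation("output"))       # output gate: how much state to expose
    g = math.tanh(pre_activation("candidate"))  # candidate cell content

    c = f * c_prev + i * g   # new cell state stores previous state, scaled by the gates
    h = o * math.tanh(c)     # current activation passed to other components
    return h, c
```

With all weights set to zero, each gate evaluates to 0.5 and the candidate to 0, so the cell state is simply halved at each step; this shows how the forget gate controls retention of the previous state.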
Hammel et al. and Bengio et al. are combinable because they both describe the use of machine learning for speech processing. Therefore it would have been obvious for a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the estimators laid out in Hammel et al. with Bengio et al.’s gates regarding an LSTM-RNN model. One would be motivated to do so because, if the LSTM-RNN is used as the machine learning model for the estimation apparatus, the LSTM neural network would generally contain gates in its memory blocks for use in the network, because the gates . 

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Hammel et al. in view of Bengio et al. and Senior et al. (US Pub No. 2017/0011738).
Regarding claim 4, Hammel et al. and Bengio et al. teach the speech satisfaction estimation model part and the conversation satisfaction estimation model part from claims 1-3. Bengio et al. teaches the adjusting of weights of a loss function, which can be applied to any LSTM estimation model such as for speech satisfaction or conversation satisfaction (see col. 4, lines 38-42, where “the system 100 (or an external system) can backpropagate an estimate of a gradient of a loss function to jointly adjust the current values of all network parameters (eg. adjusting of weights) of the post-processing neural network 108 and the seq2seq network”). Hammel et al. and Bengio et al. do not teach a loss function that is a weighted sum of the loss functions of two different model parts, such as the speech satisfaction estimation model part and the conversation satisfaction estimation model part. However, Senior et al. teaches wherein a loss function is a weighted sum of two different loss functions from different neural networks for estimating what words are most likely to be identified in speech (see [0008], where “a second neural network may be trained based on the outputs of the first neural network to generate outputs indicating likelihoods for a second set of phonetic units that is different from the first set used by the first neural network.”, and see also [0017], where training the second neural network involves “using a loss function that is a weighted combination of the two or more loss functions.”).  
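For illustration only, a loss function formed as a weighted combination of two or more loss functions, as quoted from Senior et al. above, can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from any cited reference or from the application as filed: the function name, the example loss values, and the weights are hypothetical.

```python
from typing import Sequence


def weighted_sum_loss(losses: Sequence[float], weights: Sequence[float]) -> float:
    """Combine several loss values into one value as a weighted sum."""
    if len(losses) != len(weights):
        raise ValueError("each loss needs exactly one weight")
    return sum(w * loss for w, loss in zip(weights, losses))


# Hypothetical example: a conversation-level loss of 2.0 and a speech-level
# loss of 4.0, combined with assumed weights of 0.25 and 0.75.
combined = weighted_sum_loss([2.0, 4.0], [0.25, 0.75])
```

Because the combined value is a single scalar, one gradient computed from it adjusts the parameters of both model parts jointly, in proportion to the chosen weights.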
Hammel et al., Bengio et al., and Senior et al. are combinable because they describe the use of neural networks for speech processing. Therefore it would have been obvious for a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the training both speech 

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SARVAJNA KALVA whose telephone number is (571) 272-4692. The examiner can normally be reached on Monday - Friday 9 to 6. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached on 571-272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppairmy.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/PIERRE LOUIS DESIR/
Supervisory Patent Examiner, Art Unit 2659