DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This Office Action is in response to correspondence filed 11 September 2020 in reference to application 16/980,332.  Claims 1-20 are pending and have been examined.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: “a first receiving module,” “a first extracting module,” “a second receiving module,” “a second extracting module,” “a determining module,” and “a response generating module,” in claim 17 and “an update module” in claim 19.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 12 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

Claim 12 recites the limitation "the second message" in line 1.  There is insufficient antecedent basis for this limitation in the claim.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1-20 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kao et al. (US PAP 2017/0310820).

Consider claim 1, Kao teaches a method for generating a response in a human-machine conversation (abstract, figure 2, figure 1B, with IVR), comprising: 
receiving a first sound input in the conversation (0027, capturing first digitized voice segment); 
extracting a first audio attribute from the first sound input, wherein the first audio attribute indicates a first condition of a user (0031, extracting voice features from first segment, indicative of emotion for example at 0034); 
receiving a second sound input in the conversation (0068, capturing second digitized voice segment); 
extracting a second audio attribute from the second sound input, wherein the second audio attribute indicates a second condition of a user (0068, extracting characteristics of second segment, indicative of emotion); 
determining a difference between the second audio attribute and the first audio attribute, wherein the difference indicates a condition change of the user from the first condition to the second condition (0069, determining change in emotional level by comparing first and second characteristics); and 
generating a response to the second sound input based at least on the condition change (0095-97, generating customer service score for the call).

Consider claim 2, Kao teaches the method of claim 1, wherein, 
the first audio attribute comprises a first multi-dimensional vector of emotion (0031, extraction of first voice features), wherein each dimension in the first multi-dimensional vector of emotion represents an emotion category respectively, and the first condition of the user comprises a first emotion condition (0034-48, different dimensions represent different characteristics [categories] of a vector which determines emotional categories); 
the second audio attribute comprises a second multi-dimensional vector of emotion (0068, extracting characteristics of second segment), wherein each dimension in the second multi-dimensional vector of emotion represents the same emotion category respectively as the ones represented in the first multi-dimensional vector of emotion, and the second condition of the user comprises a second emotion condition (0034-48, different dimensions represent different characteristics [categories] of a vector which determines emotional categories); 
the difference between the second audio attribute and the first audio attribute comprises a multi-dimensional vector difference between the first multi-dimensional vector of emotion and the second multi-dimensional vector of emotion (0069, determining change in emotional level by determining change in extracted voice features); and 
the condition change comprises an emotion condition change of the user from the first emotion condition to the second emotion condition (0095, determining customer service score based on detected emotional change at 0069).

Consider claim 3, Kao teaches the method of claim 2, further comprising: 
assigning a weight to each dimension of the multi-dimensional vector difference (0046, each voice feature may be weighted based on importance); and 
determining, based on the one or more weighted dimensions of the multi-dimensional vector difference, the emotion condition change of the second emotion condition with respect to the first emotion condition (0046, each voice feature may be weighted based on importance, 0069, determining change in emotional level by determining change in extracted voice features).
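
For illustration only, the two limitations of claim 3 (a weight assigned to each dimension of a multi-dimensional vector difference, and a condition change determined from the weighted dimensions) can be sketched as below. This is a minimal sketch under stated assumptions: the three emotion dimensions, the weight values, the weighted-sum combination, and the function names are hypothetical and are taken neither from the claims nor from Kao.

```python
from typing import List, Sequence


def vector_difference(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """Per-dimension difference (second minus first) between two attribute vectors."""
    if len(first) != len(second):
        raise ValueError("both vectors must cover the same categories")
    return [s - f for f, s in zip(first, second)]


def weighted_change(diff: Sequence[float], weights: Sequence[float]) -> float:
    """Combine the weighted dimensions into one score (weighted sum is an assumption)."""
    if len(diff) != len(weights):
        raise ValueError("one weight is required per dimension")
    return sum(w * d for w, d in zip(weights, diff))


# Hypothetical dimensions: [calm, angry, sad]
first_emotion = [0.5, 0.25, 0.25]
second_emotion = [0.25, 0.5, 0.25]
weights = [1.0, 2.0, 1.0]  # hypothetical importance weights

diff = vector_difference(first_emotion, second_emotion)
change = weighted_change(diff, weights)
```

Here `diff` holds one entry per emotion category and `change` is a single scalar summarizing the shift between the two sound inputs.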

Consider claim 4, Kao teaches the method of claim 1, wherein, 
the first audio attribute comprises a first multi-dimensional vector of environment (0031, extraction of first voice features, 0047 environmental factors), wherein each dimension in the first multi-dimensional vector of environment represents an environment category respectively, and the first condition of the user comprises a first environment condition (0034-48, different dimensions of a vector represent different characteristics [categories] which determine emotional categories; see also 0047, environmental factors may be part of the vector as well); 
the second audio attribute comprises a second multi-dimensional vector of environment (0068, extracting characteristics of second segment, 0047, environmental factors), wherein each dimension in the second multi-dimensional vector of environment represents the same environment category respectively as the ones represented in the first multi-dimensional vector of environment, and the second condition of the user comprises a second environment condition (0034-48, different dimensions of a vector represent different characteristics [categories] which determine emotional categories; see also 0047, environmental factors may be part of the vector as well); 
the difference of the second audio attribute and the first audio attribute comprises a multi-dimensional vector difference between the first multi-dimensional vector of environment and the second multi-dimensional vector of environment (0069, determining change in emotional level by determining change in extracted voice features); and 
the condition change comprises an environment condition change of the user from the first environment condition to the second environment condition (0095, determining customer service score based on detected change at 0069).

Consider claim 5, Kao teaches the method of claim 4, further comprising: 
assigning a weight to each dimension of the multi-dimensional vector difference (0046, each feature may be weighted based on importance); and 
determining, based on the one or more weighted dimensions of the multi-dimensional vector difference, the environment condition change of the second environment condition with respect to the first environment condition (0046, each voice feature may be weighted based on importance, 0069, determining change in emotional level by determining change in extracted voice features).

Consider claim 6, Kao teaches the method of claim 1, wherein, 
the first audio attribute comprises a first multi-dimensional vector of physical status (0031, extraction of first voice features), wherein each dimension in the first multi-dimensional vector of physical status represents a physical status category, respectively, and the first condition of the user comprises a first physical status condition (0034-48, different dimensions represent different vocal characteristics [categories]); 
the second audio attribute comprises a second multi-dimensional vector of physical status (0068, extracting characteristics of second segment), wherein each dimension in the second multi-dimensional vector of physical status represents the same physical status category respectively as the ones represented in the first multi-dimensional vector of physical status, and the second condition of the user comprises a second physical status condition (0034-48, different dimensions represent different vocal characteristics [categories]); 
the difference between the second audio attribute and the first audio attribute comprises a multi-dimensional vector difference between the first multi-dimensional vector of physical status and the second multi-dimensional vector of physical status (0069, determining change in emotional level by determining change in extracted voice features); and 
the condition change comprises a physical status change of the user from the first physical status condition to the second physical status condition (0095, determining customer service score based on detected emotional change at 0069).

Consider claim 7, Kao teaches the method of claim 6, further comprising: 
assigning a weight to each dimension of the multi-dimensional vector difference (0046, each feature may be weighted based on importance); and 
determining, based on the one or more weighted dimensions of the multi-dimensional vector difference, the physical status change of the second physical status condition (0046, each voice feature may be weighted based on importance, 0069, determining change in emotional level by determining change in extracted voice features).

Consider claim 8, Kao teaches the method of claim 1, wherein, 
the first audio attribute comprises at least one of a first multi-dimensional vector of emotion, a first multi-dimensional vector of environment and a first multi-dimensional vector of physical status (0031, first characteristics; 0034-48, different dimensions represent different characteristics [categories] of a vector which determines emotional categories); and 
the second audio attribute comprises at least one of a second multi-dimensional vector of emotion, a second multi-dimensional vector of environment and a second multi-dimensional vector of physical status (0068, second characteristics; 0034-48, different dimensions represent different characteristics [categories] of a vector which determines emotional categories).

Consider claim 9, Kao teaches the method of claim 1, wherein generating the response to the second sound input is further based on at least one of: the first condition of the user, the second condition of the user (0095, determining customer service score based on detected emotional change at 0069), a first semantic information extracted from the first sound input, a second semantic information extracted from the second sound input, a conversation context, and a user profile.

Consider claim 10, Kao teaches the method of claim 1, further comprising: 
determining an initial audio attribute of the user from a user profile, wherein the initial audio attribute indicates an initial condition of the user (0032-33 baseline profile may be retrieved); 
determining, before receiving the second sound input, a difference between the first audio attribute and the initial audio attribute, wherein the difference indicates an initial condition change of the user from the initial condition to the first condition (0034, determining emotional change from baseline and first audio); and 
generating a response to the first sound input based at least on the initial condition change (0033, calibrating emotional response and detection based on differences with baseline).

Consider claim 11, Kao teaches the method of claim 10, further comprising: updating the initial audio attribute of the user based on at least one of the first audio attribute and the second audio attribute (0033, updating the user’s baseline with each call).

Consider claim 12, Kao teaches the method of claim 10, wherein generating the second message to the first sound input is further based on at least one of: the initial condition of the user, the first condition of the user, a first semantic information extracted from the first sound input, a conversation context, and the user profile (0033, calibrating emotional response and detection based on differences with baseline).

Consider claim 13, Kao teaches the method of claim 1, further comprising: 
receiving a third sound input in the conversation (0030, voice segments may be captured and analyzed throughout the conversation); 
extracting a third audio attribute from the third sound input, wherein the third audio attribute indicates a third condition of the user (0068, extracting characteristics of segment, indicative of emotion); 
determining a difference between the third audio attribute and the second audio attribute, wherein the difference indicates an additional condition change of the user from the second condition to the third condition (0030, voice segments may be captured and analyzed throughout the conversation, 0069, comparing segments to determine changes in emotion); and 
generating a response to the third sound input based at least on the condition change and the additional condition change (0030, continuously monitoring for changes, 0095, determining customer service score based on detected emotional change at 0069).

Consider claim 14, Kao teaches the method of claim 1, further comprising: 
receiving a third sound input in the conversation (0030, voice segments may be captured and analyzed throughout the conversation); 
extracting a third audio attribute from the third sound input, wherein the third audio attribute indicates a third condition of the user (0068, extracting characteristics of segment, indicative of emotion); 
determining an average attribute between the first audio attribute and the second audio attribute, wherein the average attribute indicates an average condition between the first condition and the second condition of the user (0032-33 baseline profile may be retrieved and may be updated with new utterances, 0054-66, analysis to determine emotion may be based on a comparison to the mean, or average); 
determining a difference between the third audio attribute and the average attribute indicating a second condition change of the user from the average condition to the third condition (0032-33 baseline profile may be retrieved and may be updated with new utterances, 0054-66, analysis to determine emotion may be based on a comparison to the mean, or average); and 
generating a third message in response to the third sound input based at least on the second condition change (0030, continuously monitoring for changes, 0095, determining customer service score based on detected emotional change at 0069).
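
For illustration only, the averaging and comparison limitations of claim 14 (an average attribute formed from the first and second audio attributes, then a difference between the third audio attribute and that average) can be sketched as below. This is a minimal sketch under stated assumptions: the two-dimensional vectors, the element-wise arithmetic mean, and the function names are hypothetical and are taken neither from the claims nor from Kao.

```python
from typing import List, Sequence


def average_attribute(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """Element-wise arithmetic mean of two attribute vectors (an assumed averaging rule)."""
    if len(first) != len(second):
        raise ValueError("both vectors must cover the same categories")
    return [(f + s) / 2 for f, s in zip(first, second)]


def difference_from_average(third: Sequence[float], average: Sequence[float]) -> List[float]:
    """Per-dimension difference (third minus average)."""
    if len(third) != len(average):
        raise ValueError("both vectors must cover the same categories")
    return [t - a for t, a in zip(third, average)]


# Hypothetical two-dimensional attribute vectors for three sound inputs
first_attr = [0.5, 0.25]
second_attr = [0.25, 0.5]
third_attr = [0.75, 0.125]

avg = average_attribute(first_attr, second_attr)
second_change = difference_from_average(third_attr, avg)
```

Here `avg` stands in for the claimed average condition and `second_change` for the claimed second condition change from the average condition to the third condition.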

Consider claim 15, Kao teaches the method of claim 1, further comprising: generating, before receiving the first sound input, an initial message based on user information independent of the current conversation, wherein the initial message is a request for the first sound input (0029, initial series of questions).

Consider claim 16, Kao teaches the method of claim 1, further comprising: generating, before receiving the second sound input, an intermediate response for confirming the first condition of the user or for requesting the second sound input (0068, “is there anything else I can help you with today?”).

Consider claim 17, Kao teaches an apparatus for generating a response in a human-machine conversation (abstract, figure 2, figure 1B, with IVR), comprising: 
a first receiving module (0099-0102, modules may be implemented by computer components), for receiving a first sound input in the conversation (0027, capturing first digitized voice segment); 
a first extracting module, for extracting a first audio attribute from the first sound input, wherein the first audio attribute indicates a first condition of a user (0031, extracting voice features from first segment, indicative of emotion for example at 0034); 
a second receiving module, for receiving a second sound input in the conversation (0068, capturing second digitized voice segment); 
a second extracting module, for extracting a second audio attribute from the second sound input, wherein the second audio attribute indicates a second condition of a user (0068, extracting characteristics of second segment, indicative of emotion); 
a determining module, for determining a difference between the second audio attribute and the first audio attribute, wherein the difference indicates a condition change of the user from the first condition to the second condition (0069, determining change in emotional level by comparing first and second characteristics); and 
a response generating module, for generating a response to the second sound input based at least on the condition change (0095-97, generating customer service score for the call).

Consider claim 18, Kao teaches the apparatus of claim 17, wherein, 
the first audio attribute comprises at least one of: a first multi-dimensional vector of emotion (0031, extraction of first voice features), wherein each dimension in the first multi-dimensional vector of emotion represents an emotion category respectively (0034-48, different dimensions represent different characteristics [categories] of a vector which determines emotional categories); a first multi-dimensional vector of environment, wherein each dimension in the first multi-dimensional vector of environment represents an environment category respectively (OPTIONAL); and a first multi-dimensional vector of physical status, wherein each dimension in the first multi-dimensional vector of physical status represents a physical status category respectively (OPTIONAL), 
the second audio attribute comprises at least one of: a second multi-dimensional vector of emotion (0068, extracting characteristics of second segment), wherein each dimension in the second multi-dimensional vector of emotion represents the same emotion category respectively as the ones represented in the first multi-dimensional vector of emotion (0034-48, different dimensions represent different characteristics [categories] of a vector which determines emotional categories); a second multi-dimensional vector of environment, wherein each dimension in the second multi-dimensional vector of environment represents the same environment category respectively as the ones represented in the first multi-dimensional vector of environment (OPTIONAL); a second multi-dimensional vector of physical status, wherein each dimension in the second multi-dimensional vector of physical status represents the same physical status category respectively as the ones represented in the first multi-dimensional vector of physical status (OPTIONAL); 
the first condition of the user comprises at least one of: a first emotion condition (0034-48, different dimensions represent different characteristics [categories] of a vector which determines emotional categories), a first environment condition, and a first physical status condition; 
the second condition of the user comprises at least one of: a second emotion condition (0034-48, different dimensions represent different characteristics [categories] of a vector which determines emotional categories); a second environment condition, and a second physical status condition; 
the difference between the second audio attribute and the first audio attribute comprises at least one of: a multi-dimensional vector difference between the first multi-dimensional vector of emotion and the second multi-dimensional vector of emotion (0069, determining change in emotional level by determining change in extracted voice features); a multi-dimensional vector difference between the first multi-dimensional vector of environment and the second multi-dimensional vector of environment (OPTIONAL); and a multi-dimensional vector difference between the first multi-dimensional vector of physical status and the second multi-dimensional vector of physical status (OPTIONAL); and 
the condition change comprises at least one of: an emotion condition change of the user from the first emotion condition to the second emotion condition (0095, determining customer service score based on detected emotional change at 0069); an environment condition change of the user from the first environment condition to the second environment condition (OPTIONAL); and a physical status condition change of the user from the first physical status condition to the second physical status condition (OPTIONAL).

Consider claim 19, Kao teaches the apparatus of claim 17, wherein the determining module is further for determining an initial audio attribute of the user from a user profile, wherein the initial audio attribute indicates an initial condition of the user (0032-33 baseline profile may be retrieved); and 
the apparatus further comprises an update module for updating the initial audio attribute of the user based on at least one of the first audio attribute and the second audio attribute (0033, updating the user’s baseline with each call).

Consider claim 20, Kao teaches an apparatus for generating a response in a human-machine conversation (abstract, figure 2, figure 1B, with IVR), comprising: 
one or more processors (0102, processors); and 
a memory storing computer-executable instructions that (0100, 0102, memory and instructions), when executed, cause the one or more processors to:
receive a first sound input in the conversation (0027, capturing first digitized voice segment); 
extract a first audio attribute from the first sound input, wherein the first audio attribute indicates a first condition of a user (0031, extracting voice features from first segment, indicative of emotion for example at 0034); 
receive a second sound input in the conversation (0068, capturing second digitized voice segment); 
extract a second audio attribute from the second sound input, wherein the second audio attribute indicates a second condition of a user (0068, extracting characteristics of second segment, indicative of emotion); 
determine a difference between the second audio attribute and the first audio attribute, wherein the difference indicates a condition change of the user from the first condition to the second condition (0069, determining change in emotional level by comparing first and second characteristics); and 
generate a response to the second sound input based at least on the condition change (0095-97, generating customer service score for the call); and
present a message including the generated response to the user in the conversation (0096, may display emotional changes to representative).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Triartas et al. (US PAP 2017/0084295) also tracks changes in emotion. Herzig et al. (US PAP 2019/0188261) discusses tailoring dialog responses to detected emotion.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DOUGLAS C GODBOLD whose telephone number is (571)270-1451. The examiner can normally be reached 6:30am-5pm Monday-Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached at (571)272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

DOUGLAS GODBOLD
Examiner
Art Unit 2655



/DOUGLAS GODBOLD/           Primary Examiner, Art Unit 2655