Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). The certified copy has been filed in parent Application No. JP2020-050046, filed on 03/19/2020.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 03/10/2021 and 06/17/2022 have been considered by the examiner.
Drawings
The drawing submitted on 10/08/2021 has been considered by the examiner.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitations are: an estimation unit, a decision unit, and an output unit in claims 1-10, and a learning unit in claim 9.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.

If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 5 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being incomplete for omitting essential steps, such omission amounting to a gap between the steps. See MPEP § 2172.01. The omitted steps are as follows. Claim 5 recites the limitation "the estimation unit estimates an emotion of a reception side user who receives a speech from the detection information". It is unclear how a reception side user “receives a speech from the detection information”; the claim is therefore incomplete, which amounts to a gap between steps. The claim would be clear and complete if it recited “…a reception side user who receives a speech from an utterance side user”. For examination purposes, the examiner has interpreted the limitation as “…a reception side user who receives a speech from an utterance side user”. Appropriate correction is required.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-4 and 6-12 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Yu (US 2016/0196836 A1).

Regarding Claims 1 and 11-12, Yu teaches:  An output apparatus comprising ([0051] In order to solve the problem that the communication effect is affected for the mobile terminal user is in a negative emotion in the related art, the embodiments of the present invention provide a method and device for transmitting voice data. The embodiments of the present invention will be further described in detail in combination with the accompanying drawings below. The embodiments in the present invention and the characteristics in the embodiments can be optionally combined with each other in the condition of no conflict. ): an estimation unit  (Fig.3, a comparison subunit) that estimates an emotion of a user from detection information detected by a predetermined detection device; a decision unit (Fig.3, a determination subunit) that decides information to be changed on the basis of the estimated emotion of the user (Fig. 3,  monitoring module 10) (Fig.7, voice input device (not shown))  ([0074] In the preferred embodiment, the monitoring module 10 can monitor whether the voice data are required to be adjusted with the structure of the first monitoring unit 12, or monitor whether the voice data are required to be adjusted with the structure of the second monitoring unit 14, or use the structures of the above first monitoring unit 12 and second monitoring unit 14 together, thereby improving the monitoring accuracy. In FIG. 3, only the preferred structure of the monitoring module 10 including the first monitoring unit 12 and the second monitoring unit 14 is taken as an example to make descriptions. 
[0076] The above first monitoring unit 12 includes: a comparison subunit, configured to: compare the characteristic parameter with the first characteristic parameter; wherein the first characteristic parameter is the characteristic parameter of the sent voice data when the sending end is in the abnormal emotional state; and a determination subunit, configured to: determine whether the voice data are required to be adjusted according to a comparison result.); and an output unit (Fig.4, a prompt module 40 ) that outputs information for changing the information to be changed ([0080] After the monitoring module 10 monitors that the voice data are required to be adjusted, that is, the user of the sending end is in the abnormal emotional state, the embodiment provides a preferred embodiment, and as shown in FIG. 4, besides all the above modules shown in FIG. 3, the above device also includes: a prompt module 40, configured to send a prompt signal in a case that a monitoring result of the above monitoring module 10 is that the voice data are required to be adjusted. The prompt signal can be a prompt tone or vibration, which is used for reminding the user to control emotion, tones and expressions and so on when communicating with other users. ).
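For illustration only, the comparison subunit and determination subunit of Yu at [0076] and the prompt module of Yu at [0080] can be sketched as follows. This sketch is not taken from Yu or from the application: the names `VoiceFeatures`, `matches_reference`, `needs_adjustment`, and `prompt_signal`, the relative tolerance, and the reference values are all assumptions chosen to make the cited flow concrete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VoiceFeatures:
    """Hypothetical characteristic parameters extracted from voice data."""
    pitch_hz: float
    energy: float


def matches_reference(observed: VoiceFeatures,
                      reference: VoiceFeatures,
                      tolerance: float = 0.15) -> bool:
    """Comparison subunit: compare the observed parameters with the stored
    parameters of the abnormal emotional state (relative tolerance assumed)."""
    def close(a: float, b: float) -> bool:
        return abs(a - b) <= tolerance * abs(b)

    return (close(observed.pitch_hz, reference.pitch_hz)
            and close(observed.energy, reference.energy))


def needs_adjustment(comparison_result: bool) -> bool:
    """Determination subunit: decide from the comparison result whether
    the voice data are required to be adjusted."""
    return comparison_result


def prompt_signal(adjust: bool) -> Optional[str]:
    """Prompt module: return a prompt (here a plain string) only when
    adjustment is required."""
    return "vibrate" if adjust else None


# Assumed reference parameters for the abnormal (angry) state.
ANGRY_REFERENCE = VoiceFeatures(pitch_hz=300.0, energy=0.9)
```

Under these assumptions, a sample close to `ANGRY_REFERENCE` produces a prompt, and a sample far from it produces none.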


Regarding Claim 2, Yu teaches: The output apparatus according to claim 1, wherein the decision unit decides a speech (real-time angry voice data from buffer) to be changed on the basis of the estimated emotion of the user (See rejection of claim 1 and [0091] The main function of the voice emotion identification module is equivalent to the function of the monitoring module 10 in the above embodiment, and the voice emotion identification module is configured to: extract an emotion characteristic parameter of the voice data in the voice buffer area in real time, and judge and identify whether the emotion of the user at the sending end is out of control (that is, in the abnormal emotional state) during the call according to the emotion characteristic parameter, and judges whether an indecent vocabulary exists in the call in the meantime. [0110] When the user is in the call process, the sound is input via a microphone of the mobile phone, and transcribed into an uncompressed voice file through a certain sampling frequency, bit and sound channel, and stored in the voice buffer area to be processed by the voice emotion identification module, and the voice emotion identification module extracts a characteristic parameter of the voice data in the voice buffer area, compares the characteristic parameter of the voice data with a characteristic parameter in the emotion voice database, to judge the user’s emotion at this point, and if the user is excited at the moment and is in abnormal emotional states such as anger and so on, the voice emotion identification module will trigger the reminding module to vibrate the mobile phone so as to remind the user to adjust the emotion in time, which avoids that the emotion is out of control. 
While judging the user’s emotion, the emotion voice database also will count the voice characteristic parameter of the user at the moment and the minimum interval time T between statements in anger, and will correct and adjust the data of the basic database, so that the voice emotion identification module is more easy and accurate to identify the user’s emotion and generate an adjustment parameter, and the adjustment parameter can be used as an adjustment parameter for adjusting the subsequent angry statements.).

Regarding Claim 3, Yu teaches: The output apparatus according to claim 2, wherein the decision unit predicts (judge) a speech (angry voice data) of the user on the basis of the estimated emotion of the user (compares the characteristic parameter of the voice data with a characteristic parameter in the emotion voice database) and decides a speech satisfying a predetermined condition in the predicted speech as the speech to be changed (See rejection of claim 2).

Regarding Claim 4, Yu teaches: The output apparatus according to claim 1, wherein the estimation unit estimates an emotion of an utterance side user who utters a speech (angry voice data) from the detection information, and the decision unit decides the speech of the utterance side user as the speech to be changed on the basis of the estimated emotion of the utterance side user (See rejection of claim 2).

Regarding Claim 6, Yu teaches:  The output apparatus according to claim 1, wherein the decision unit decides the information to be changed in a case where the emotion of the user satisfies a predetermined condition (See rejection of Claim 2).

Regarding Claim 7, Yu teaches: The output apparatus according to claim 6, wherein the decision unit decides the information to be changed in a case where the emotion of the user is a negative emotion (abnormal emotion i.e. anger) as the predetermined condition (See rejection of claim 2).

Regarding Claim 8, Yu teaches: The output apparatus according to claim 1, wherein the decision unit predicts a speech of the user on the basis of the estimated emotion of the user and decides the predicted speech as the speech to be changed, and the output unit outputs a voice having an opposite phase (abnormal emotional state can be adjusted to the voice data in the normal state) to a voice indicating the predicted speech (See rejection of claim 2 and [0057] In the embodiment, it is to monitor whether voice data are required to be adjusted, monitoring whether the voice data are required to be adjusted can be implemented in various ways, no matter which ways are adopted, whether the voice data are required to be adjusted should be monitored, that is, whether a user at the sending end of the voice data is in an abnormal emotional state should be monitored. Based on this, the embodiment provides a preferred embodiment, that is, based on a preset statement database to be adjusted, the step of monitoring the voice data sent by the sending end includes: extracting a characteristic parameter in the voice data; and based on whether the above characteristic parameter is matched with a first characteristic parameter stored in the above statement database to be adjusted, monitoring the above voice data; and/or, extracting a vocabulary in the above voice data; and based on whether the above vocabulary is matched with a preset vocabulary stored in the above statement database to be adjusted, monitoring the above voice data. Through the above preferred embodiment, monitoring whether the sending end is in the abnormal emotional state is implemented, which provides a basis for adjusting the voice data sent by the sending end in the above case later. 
[0063] After monitoring that the voice data sent by the sending end are required to be adjusted, that is, the user at the sending end is in the abnormal emotional state, it is required to adjust the voice data, a specific adjustment policy can be implemented in various ways, as long as the voice data sent by the user at the sending end in the abnormal emotional state can be adjusted to the voice data in the normal state. Based on this, the embodiment provides a preferred embodiment, that is, a pitch frequency parameter of the above voice data is acquired, and according to the set standard voice format, the pitch frequency parameter of the above voice data is adjusted in accordance with a time domain synchronization algorithm and a pitch frequency adjustment parameter; and/or, voice energy of the above voice data is acquired, and according to the set standard voice format, the above voice energy is adjusted in accordance with an energy adjustment parameter; and/or, a statement duration of the above voice data is extended according to the set standard voice format.
[0064] In another adjustment way, it also can search whether a polite vocabulary corresponding to the preset vocabulary exists in the statement database to be adjusted; and when the polite vocabulary corresponding to the preset vocabulary exists, the preset vocabulary is replaced with the polite vocabulary.).
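For illustration only, the vocabulary replacement described in Yu at [0064] can be sketched as a dictionary lookup. This sketch is not taken from Yu: the function name `replace_with_polite`, the word-level tokenization, and the contents of `POLITE_MAP` are assumptions.

```python
def replace_with_polite(words, polite_map):
    """Replace each preset vocabulary item that has a polite counterpart;
    words with no counterpart are passed through unchanged."""
    return [polite_map.get(word.lower(), word) for word in words]


# Assumed stand-in for the "statement database to be adjusted".
POLITE_MAP = {"stupid": "unwise"}
```

With this assumed mapping, `replace_with_polite(["That", "is", "stupid"], POLITE_MAP)` leaves the first two words unchanged and replaces only the third.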

Regarding Claim 9, Yu teaches: The output apparatus according to claim 1, further comprising a learning unit that causes a model to learn a speech of a content satisfying a predetermined condition in a speech uttered by the user when the user has a predetermined emotion, wherein the decision unit decides the speech to be changed using the model (emotion voice database) (See rejection of claim 2 and [0096] Here, an emotion voice database storing the emotion characteristic parameters in the normal call is defined as a normal voice database; and an emotion voice database storing the emotion characteristic parameters in the anger is defined as an angry voice database. After the mobile phone is out of factory and is used by the user, the user’s emotion will be judged according to the initial setting of the emotion voice database at the beginning, and the emotion voice database will correct and adjust the emotion characteristic parameters when the user is in the normal call and in the anger call through the self-learning in the meantime, and it finally compares the two groups of parameters to obtain an adjustment parameter, which is used for the following module adjusting the angry statement. In addition, the angry voice database is also used for counting a minimum interval time T between statements in the angry state, which prepares for adjusting the subsequent angry statement.).

Regarding Claim 10, Yu teaches: The output apparatus according to claim 1, wherein the estimation unit estimates the emotion of the user from biological information (pitch of the voice or voice energy) of the user among the detection information detected by the predetermined detection device (See rejection of claim 2 and [0063] Based on this, the embodiment provides a preferred embodiment, that is, a pitch frequency parameter of the above voice data is acquired, and according to the set standard voice format, the pitch frequency parameter of the above voice data is adjusted in accordance with a time domain synchronization algorithm and a pitch frequency adjustment parameter; and/or, voice energy of the above voice data is acquired, and according to the set standard voice format, the above voice energy is adjusted in accordance with an energy adjustment parameter; and/or, a statement duration of the above voice data is extended according to the set standard voice format. [0092] Common emotion characteristic parameters are introduced in Table 1. Wherein, the duration of vocal cords opening and closing once namely a vibration period is called a tone period or a pitch period, and a reciprocal thereof is called a pitch frequency, and it also can be called a radical frequency for short.).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Yu in view of Wen (US 2016/0240213 A1).

Regarding Claim 5, Yu teaches: estimating an emotion of an utterance side (sender) user from the speech of that user, and deciding the speech of the utterance side user as the speech to be changed on the basis of the estimated emotion of the sender side user (See rejection of claim 2).

Yu does not teach: The output apparatus according to claim 1, wherein the estimation unit estimates an emotion of a reception side user who receives a speech from the detection information, and the decision unit decides a speech of the reception side user as the speech to be changed on the basis of the estimated emotion of the reception side user.
Wen teaches: the estimation unit estimates an emotion of a reception side user who receives a speech from the detection information, and the decision unit decides a speech of the reception side user as the speech to be changed on the basis of the estimated emotion of the reception side user ([0074] In another example, if voice signals of a user of a local communication terminal are received via a microphone of the local communication terminal, the device 100 may obtain the voice signals of the user of the local communication terminal via the local communication terminal. In another example, if voice signals of a user of a remote communication terminal are received via a microphone of the remote communication terminal, the device 100 may obtain the voice signals of the user of the remote communication terminal from the remote communication terminal. [0102] According to an exemplary embodiment, the content of a virtual speech may be determined based on expression attributes (like an emotion state) of speech information input to a communication terminal. Emotion states regarding a local user and a remote user may be analyzed and obtained at a communication terminal or the device 100, where the content of a virtual speech may be controlled according to the emotion states. [0103] For example, if speech information input to a communication terminal by a local user is related to a remote user of a current communication and the type of the emotion state of the speech information is anger, the content of a virtual speech may include prompt information for persuading the local user to control their emotion. For example, if the topic of the speech information of the local user includes a topic that triggers anger of the remote user (e.g., the age of the remote user), the content of the virtual speech may include a topic different from the topic mentioned in the speech information, such as weather and sports. 
[0112] The device 100 according to an exemplary embodiment may determine the emotion state of a participant in a remote call based on speech information input to a communication terminal by the participant. For example, if an emotion state obtained from speech information of a participant in a remote call is a negative type emotion state (e.g., anger type), the emotion state of the corresponding participant may be indicated as abnormal.[0128] For example, if speech information input to a communication terminal during a communication includes a pre-set sensitive keyword (e.g., opposition) that may induce a quarrel, a virtual character of a local communication terminal may delay transmission of speech information to a remote communication terminal and may provide a virtual speech including a suggestion for changing a topic or controlling an emotion to a local user.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Yu to include the teaching of Wen above in order to determine a virtual speech to be provided to the at least one participant based on the speech information and the state information, and to output the determined virtual speech.
Alternate Rejection of Claim 10
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Yu in view of Hwang et al. (US 2020/0349938 A1).
Regarding Claim 10, Yu teaches: The output apparatus according to claim 1, wherein the estimation unit estimates the emotion of the user from voice information (pitch of the voice or voice energy) of the user among the detection information detected by the predetermined detection device (See rejection of claim 2).
Yu does not teach: the estimation unit estimates the emotion of the user from biological information of the user.
Hwang et al. teach: the estimation unit estimates the emotion of the user from biological information of the user ([0068] According to an embodiment, the electronic device 1000 may obtain actual emotion information of the user related to the user event by obtaining biometric information of the user (e.g., heart rate information, pulse information, breathing information, eye twitching information, facial expression change information, sweating information, body temperature change information, or voice input), and modifying the default emotion information related to the user event, based on the biometric information of the user. For example, the electronic device 1000 may obtain the default emotion information related to at least one event from the table in which events are mapped to emotion information, identify an emotional state of the user for the at least one event, based on information obtained by analyzing voice data of an utterance of the user, and learn the emotion information of the user for the at least one event by modifying the default emotion information, based on the emotional state of the user. The modifying of the default emotion information may mean adjusting a weight corresponding to each of emotion elements included in the default emotion information. For example, when the default emotion information is ‘nervousness weight: 0.3, joy weight: 0.7’, the electronic device 1000 may increase the nervousness weight from ‘0.3’ to ‘0.6’ and reduce the joy weight from ‘0.7’ to ‘0.4’, based on the biometric information of the user.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Yu to include the teaching of Hwang et al. above in order to obtain actual emotion information of the user related to the user event by obtaining biometric information of the user, and to modify the default emotion information related to the user event based on the biometric information of the user.
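For illustration only, the weight modification in the example quoted from Hwang et al. at [0068] (nervousness weight 0.3 to 0.6, joy weight 0.7 to 0.4) can be sketched as a transfer of weight between two emotion elements. This sketch is not taken from Hwang et al.: the function name `shift_weight`, the fixed transfer amount, and the rounding to two decimals are assumptions, and the reference does not state how the size of the adjustment is derived from the biometric information.

```python
def shift_weight(weights, source, target, amount):
    """Move `amount` of weight from one emotion element to another,
    leaving the input mapping unmodified (rounding is an assumption
    made to avoid floating-point noise in the result)."""
    adjusted = dict(weights)
    adjusted[source] = round(adjusted[source] - amount, 2)
    adjusted[target] = round(adjusted[target] + amount, 2)
    return adjusted


# Default emotion information from the example quoted at [0068].
DEFAULT_EMOTION = {"nervousness": 0.3, "joy": 0.7}
```

Calling `shift_weight(DEFAULT_EMOTION, "joy", "nervousness", 0.3)` reproduces the modified weights given in the quoted example.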
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The prior art of record, O’Connor et al. (US 2018/0068226 A1), teaches: A user sentiment associated with a system response provided by the dialog system as part of the chat flow is determined based on observation of the user. A next system response is rerouted from a planned sequence of the chat flow to a sentiment-based repair sequence to alter content delivered to the user based on a detected aspect of the user sentiment.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMAD K ISLAM whose telephone number is (571)270-5878. The examiner can normally be reached Monday-Friday, EST (IFP).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta can be reached on 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MOHAMMAD K ISLAM/Primary Examiner, Art Unit 2656