DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on January 30, 2020 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Objections
Claim 14 is objected to because of the following informalities:  
The claim is grammatically improper. The claim appears to be a literal translation into English from a foreign document.
Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
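For illustration only, the three-prong test set out above can be sketched as a simple decision function. The class `Limitation`, its field names, and the function `invokes_112f` are hypothetical and do not appear in the MPEP; the sketch reduces each prong to a yes/no input, whereas the actual analysis depends on the specification and on how one of ordinary skill in the art would read the claim.

```python
from dataclasses import dataclass


@dataclass
class Limitation:
    """Hypothetical summary of one claim limitation (illustrative fields only)."""
    uses_means_or_step: bool            # prong (A): literal "means" or "step"
    uses_generic_placeholder: bool      # prong (A): nonce term such as "unit"
    has_functional_language: bool       # prong (B): e.g. "for", "configured to"
    recites_sufficient_structure: bool  # prong (C) is not met when this is True


def invokes_112f(lim: Limitation) -> bool:
    """Return True when all three prongs are met under this simplified model."""
    prong_a = lim.uses_means_or_step or lim.uses_generic_placeholder
    prong_b = lim.has_functional_language
    prong_c = not lim.recites_sufficient_structure
    return prong_a and prong_b and prong_c


# A placeholder term with functional language and no recited structure.
unit_limitation = Limitation(False, True, True, False)
# A "means" limitation whose presumption is rebutted by recited structure.
rebutted_means = Limitation(True, False, True, True)
```

Under this simplified model, `invokes_112f(unit_limitation)` is true and `invokes_112f(rebutted_means)` is false.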
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: “A state observation unit,” “a reward calculation unit,” and “a value function update unit” in claim 1, “a decision making unit” in claim 6, “a work intention recognition unit” in claim 9, and “a speech recognition unit” in claim 10. 
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-7 and 16 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
Claims 1-7 are directed toward a device, i.e. a machine. Claim 16 is directed toward a method. Therefore, all claims are within at least one of the four statutory categories.
Claim 1 is directed to the abstract idea of a mathematical concept of calculating a reward and updating a function. The limitations directed to the abstract idea of a mathematical concept are as follows:
… calculating a reward based on control data… (a mathematical calculation based on data is a mathematical concept. Furthermore, Examiner notes the language “for controlling the robot…” is intended use and therefore given no patentable weight)
… updating an action value function… (Updating a function can be as simple as adding an integer to said function, which is a mathematical calculation and/or formula. Furthermore, Examiner notes the language “for controlling a movement of the robot…” is intended use and therefore given no patentable weight).
The limitations not directed to an abstract idea are:
a state observation unit observing a state variable of the robot…
a reward calculation unit…
a value function update unit…
This judicial exception is not integrated into a practical application. Observing a state variable of the robot is considered to be insignificant data gathering of necessary inputs for the mathematical concept described above. Therefore, this limitation merely adds insignificant extra-solution activity to the judicial exception (see MPEP 2106.05(g)).
Additionally, the additional element of using a state observation unit, a reward calculation unit, and a value function update unit (i.e., a general purpose computer or processor as described in [0045] of the specification) amounts to no more than applying the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept (see MPEP 2106.05).
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the above reasons. Furthermore, the observation step is considered well-understood, routine, and conventional receiving of data over a network (see MPEP 2106.05(d)).

Regarding claims 2-7, the claims specify and/or further limit the abstract idea addressed above and do not recite additional elements that present a practical application or amount to “significantly more,” for similar reasons as above.

Regarding claim 16, the claim recites analogous language to claim 1 above, and is therefore rejected under the same premise. 


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-16 are rejected under 35 U.S.C. 103 as being unpatentable over Ozaki et al. (US 2018/0056520 A1, from Applicant’s IDS dated January 30, 2020, hereinafter “Ozaki”) in view of Hwang et al. (NPL: K. S. Hwang, J. L. Ling, Y. Chen and W. Wang, "Reward shaping for reinforcement learning by emotion expressions," 2014 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2014, pp. 1288-1293, doi: 10.1109/SMC.2014.6974092. hereinafter “Hwang”).

	Regarding claim 1, Ozaki teaches:
	A machine learning device learning a movement of a robot where a human and the robot collaboratively work (see at least Abstract), the device comprising:
 a state observation unit observing a state variable representing a state of the robot when the human and the robot collaboratively work (see at least Fig. 1, element 21; [0028], disclosing a state observation unit; see also [0011]);
a reward calculation unit calculating a reward based on control data for controlling the robot, the state variable, an action of the human (see at least fig. 1, element 22; [0028], disclosing a reward calculation unit; see also [0011]), 
a value function update unit updating an action value function for controlling a movement of the robot, based on the reward and the state variable (see at least fig. 1, element 23; [0028], disclosing a value function update unit; see also [0011]).
Ozaki does not explicitly teach calculating a reward based on a facial expression of the human.
However, in the same field of endeavor, robots learning from human teaching, Hwang teaches calculating a reward based on a facial expression of the human (see at least Section III A, pg. 1289, disclosing "The proposed system uses non expert human's facial expressions as reward to train robots learning appropriate actions"; see also Table 1).
It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to have modified the reward of Ozaki to incorporate the addition of human facial expressions as a reward, as taught by Hwang.  One would have been motivated to make this modification in order to better teach robots how to interact and to provide an easier way for humans to train robots, as suggested by Hwang in at least Section 1, pg. 1288.
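For illustration only, the proposed combination can be sketched as a single value-function update in which the reward includes a term derived from the human's facial expression. Neither Ozaki nor Hwang is represented here as using this exact rule: the tabular Q-learning form, the learning rate `alpha`, the discount factor `gamma`, the state and action labels, and the numeric reward values are all assumptions made for this sketch.

```python
from collections import defaultdict


def update_action_value(q, state, action, reward, next_state, actions,
                        alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step to the table q (modified in place)."""
    # Best value attainable from the next state under the current estimates.
    best_next = max(q[(next_state, a)] for a in actions)
    # Move the current estimate toward reward + discounted future value.
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


# Hypothetical example: reward built from a human-action term plus a
# facial-expression term (magnitudes are assumed, not from either reference).
q_table = defaultdict(float)
actions = ["hand_over", "wait"]
reward = 1.0 + 0.5
new_value = update_action_value(q_table, "s0", "hand_over", reward, "s1", actions)
```

The sketch shows only where a facial-expression term would enter the update; it does not model how the expression is detected.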

Regarding claim 2, the combination of Ozaki and Hwang teaches:
The machine learning device according to claim 1, wherein the state variable includes an output from an image sensor, a camera, a force sensor, a microphone, and a tactile sensor (Ozaki: see at least [0013], [0031], disclosing the camera, force sensor, microphone, and tactile sensor; Hwang: see at least abstract, disclosing the system recognized human's facial expressions captured by a web camera, i.e. an image sensor).
It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to have modified the reward of Ozaki to incorporate an image sensor such as a webcam to facilitate the capturing of human facial expressions to calculate the reward, as taught by Hwang.  One would have been motivated to make this modification in order to better teach robots how to interact and to provide an easier way for humans to train robots, as suggested by Hwang in at least Section 1, pg. 1288.

Regarding claim 3, the combination of Ozaki and Hwang teaches:
The machine learning device according to claim 1, wherein the reward calculation unit calculates the reward by adding a second reward based on the action of the human and a third reward based on the facial expression of the human (Hwang: see at least Section III A, pg. 1289, disclosing an emotion reward ER evaluated by facial expression) to a first reward based on the control data and the state variable (see at least [0012], [0061]-[0062], disclosing a second reward, i.e. an action, may be added to the first reward, i.e. control data and state variable).
It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to have modified the reward of Ozaki to incorporate the addition of human facial expressions as a reward, as taught by Hwang, to the total reward as taught by Ozaki.  One would have been motivated to make this modification in order to better teach robots how to interact and to provide an easier way for humans to train robots, as suggested by Hwang in at least Section 1, pg. 1288.
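As a minimal sketch of the additive reward recited in claim 3, the three terms can be summed directly. The function name `total_reward` and the parameter names are hypothetical, and equal weighting of the three terms is an assumption; the claim language requires only that the second and third rewards be added to the first.

```python
def total_reward(first_reward: float, second_reward: float,
                 third_reward: float) -> float:
    """Sum the three reward terms named in claim 3 (equal weights assumed)."""
    # first_reward: based on the control data and the state variable
    # second_reward: based on the action of the human
    # third_reward: based on the facial expression of the human
    return first_reward + second_reward + third_reward


# Hypothetical values: a small task reward, a positive human action,
# and a negative facial expression.
example_total = total_reward(0.25, 1.0, -0.5)
```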

Regarding claim 4, the combination of Ozaki and Hwang teaches:
 The machine learning device according to claim 3, wherein as the second reward, a positive reward is set when the robot is stroked via the tactile sensor provided at the robot, and a negative reward is set when the robot is hit (Ozaki: see at least [0012], [0029], disclosing a positive reward when the robot is stroked and a negative reward when the robot is hit), 
or a positive reward is set when the robot is praised via a microphone provided at a part of the robot or near the robot or worn by the human, and a negative reward is set when the robot is reprimanded (Ozaki: see at least [0012], [0030], disclosing a positive reward when the robot is praised and a negative reward when the robot is scolded).
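The sign convention for the second reward in claim 4 can be sketched as a lookup from a detected human action to a signed value. The event labels, the magnitudes of 1.0, and the zero default for unrecognized events are assumptions for illustration; the claim specifies only whether each reward is positive or negative.

```python
# Hypothetical event labels and magnitudes; the claim specifies only the sign.
SECOND_REWARD = {
    "stroked": 1.0,       # tactile sensor: robot is stroked
    "hit": -1.0,          # tactile sensor: robot is hit
    "praised": 1.0,       # microphone: robot is praised
    "reprimanded": -1.0,  # microphone: robot is reprimanded
}


def second_reward(event: str) -> float:
    """Return the action-based reward for an event, or 0.0 if unrecognized."""
    return SECOND_REWARD.get(event, 0.0)
```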

Regarding claim 5, the combination of Ozaki and Hwang teaches:
The machine learning device according to claim 3, wherein as the third reward, the facial expression of the human is recognized via the image sensor provided at the robot (Hwang: see at least Section III A, disclosing face detection and facial expression recognition), and a positive reward is set when the facial expression of the human is a smile or an expression of pleasure (Hwang: see at least Table 1, disclosing a positive reward when the shape of a mouth is curved and not a frown, i.e. a smile), and a negative reward is set when the facial expression of the human is a frown or a cry (Hwang: see at least Table 1, disclosing a negative reward for a frown).
It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to have modified the camera of Ozaki to include detecting facial expressions and providing a reward based on the facial expressions, as taught by Hwang. One would have been motivated to make this modification in order to better teach robots how to interact and to provide an easier way for humans to train robots, as suggested by Hwang in at least Section 1, pg. 1288.

Regarding claim 6, the combination of Ozaki and Hwang teaches:
The machine learning device according to claim 1, further comprising a decision making unit deciding command data prescribing a movement of the robot, based on an output from the value function update unit (Ozaki: see at least [0013], [0031], disclosing a decision unit 24).

Regarding claim 7, the combination of Ozaki and Hwang teaches:
The machine learning device according to claim 2, wherein the image sensor is provided directly at the robot or in a periphery of the robot, the camera is provided directly at the robot or in an upper periphery of the robot, the force sensor is provided at a base part or a hand part of the robot or at a peripheral facility, or the tactile sensor is provided at a part of the robot or at a peripheral facility (Ozaki: see at least Fig. 4, disclosing the camera 44 directly at the robot, the force sensor 45 at the base, and the tactile sensor 41 is at a part of the robot).
Furthermore, Examiner notes the placement of the image sensor would have been an obvious design choice to one of ordinary skill in the art. One of ordinary skill in the art would elect to place the image sensor in an area that is most capable of capturing the facial expressions of a human and would therefore elect to place the image sensor on the robot or in the upper periphery of the robot.

Regarding claim 8, the combination of Ozaki and Hwang teaches:
A robot system comprising:
 the machine learning device according to claim 1; (cited above)
 the robot working collaboratively with the human (Ozaki: see at least Fig. 4, disclosing a worker 1 working collaboratively with the robot 3); and
a robot control unit controlling a movement of the robot, wherein the machine learning device learns the movement of the robot by analyzing distribution of a feature point or a workpiece after the human and the robot collaboratively work (Ozaki: see at least Fig. 1, element 30, disclosing a robot control unit; [0014]).

Regarding claim 9, the combination of Ozaki and Hwang teaches:
The robot system according to claim 8, further comprising:
an image sensor (Hwang: see at least abstract, disclosing the system recognized human's facial expressions captured by a web camera, i.e. an image sensor), a camera, a force sensor, a tactile sensor, a microphone, and an input device (Ozaki: see at least [0013], [0031]);
and a work intention recognition unit receiving an output from the image sensor, the camera, the force sensor, the tactile sensor, the microphone, and the input device, (Ozaki: see at least Fig. 6, element 51; [0015], disclosing a task intention recognition unit) and recognizing an intention of work (Ozaki: see at least [0064], disclosing receiving outputs from the camera, force sensor, tactile sensor, the microphone, and the input device).
It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to have modified the reward of Ozaki to incorporate an image sensor such as a webcam to facilitate the capturing of human facial expressions to calculate the reward, as taught by Hwang, and therefore allow the work intention unit of Ozaki to receive an output from the image sensor.  One would have been motivated to make this modification in order to better teach robots how to interact and to provide an easier way for humans to train robots, as suggested by Hwang in at least Section 1, pg. 1288.

Regarding claim 10, the combination of Ozaki and Hwang teaches:
The robot system according to claim 9, further comprising a speech recognition unit recognizing a speech of the human inputted from the microphone (Ozaki: see at least Fig. 6, element 52, disclosing a voice recognition unit; see also [0065]),
wherein the work intention recognition unit corrects the movement of the robot, based on the speech recognition unit (Ozaki: see at least [0065]).

Regarding claim 11, the combination of Ozaki and Hwang teaches:
The robot system according to claim 10, further comprising: 
a question generation unit generating a question to the human, based on an analysis of work intention by the work intention recognition unit (Ozaki: see at least Fig. 6, element 53, disclosing a question generation unit; see also [0066]); and 
a speaker delivering the question generated by the question generation unit to the human (Ozaki: see at least Fig. 6, element 46, disclosing a speaker, see also [0066]).

Regarding claim 12, the combination of Ozaki and Hwang teaches:
The robot system according to claim 11, wherein the microphone receives a response from the human to the question from the speaker, and the speech recognition unit recognizes the response from the human inputted via the microphone and outputs the response to the work intention recognition unit (Ozaki: see at least [0066]).

Regarding claim 13, the combination of Ozaki and Hwang teaches:
The robot system according to claim 9, wherein the state variable inputted to the state observation unit of the machine learning device is an output from the work intention recognition unit, and the work intention recognition unit converts a positive reward based on the action of the human into a state variable that is set to the positive reward, and outputs the state variable to the state observation unit, converts a negative reward based on the action of the human into a state variable that is set to the negative reward, and outputs the state variable to the state observation unit (Ozaki: see at least [0067]),
converts a positive reward based on the facial expression of the human into a state variable that is set to the positive reward (Hwang: see at least section III A; Table 1, disclosing transforming human's facial expressions to an emotion value such that these emotional expressions can be used as reward value. A positive reward is given based on facial expression), 
and outputs the state variable to the state observation unit (Ozaki: see at least [0067]), 
and converts a negative reward based on the facial recognition of the human into a state variable that is set to the negative reward (Hwang: see at least section III A; Table 1, disclosing transforming human's facial expressions to an emotion value such that these emotional expressions can be used as reward value. A negative reward is given based on facial expression),
and outputs the state variable to the state observation unit (Ozaki: see at least [0067]).
It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to have modified the reward of Ozaki to incorporate the addition of human facial expressions as a reward, as taught by Hwang.  One would have been motivated to make this modification in order to better teach robots how to interact and to provide an easier way for humans to train robots, as suggested by Hwang in at least Section 1, pg. 1288.

Regarding claim 14, the combination of Ozaki and Hwang teaches:
 The robot system according to claim 8, wherein the machine learning device is able to be set not to learn any more a movement learned up to a predetermined time point (Ozaki: see at least [0068]).

Regarding claim 15, the combination of Ozaki and Hwang teaches:
The robot system according to claim 9, wherein the robot control unit stops the robot when the tactile sensor detects a slight collision (Ozaki: see at least [0068]).

Regarding claim 16, Ozaki teaches:
	A machine learning method for learning a movement of a robot where a human and the robot collaboratively work (see at least Abstract), the method comprising:
observing a state variable representing a state of the robot when the human and the robot collaboratively work (see at least Fig. 1, element 21; [0028], disclosing a state observation unit; see also [0011]);
calculating a reward based on control data for controlling the robot, the state variable, an action of the human (see at least fig. 1, element 22; [0028], disclosing a reward calculation unit; see also [0011]), 
updating an action value function for controlling a movement of the robot, based on the reward and the state variable (see at least fig. 1, element 23; [0028], disclosing a value function update unit; see also [0011]).
Ozaki does not explicitly teach calculating a reward based on a facial expression of the human.
However, in the same field of endeavor, robots learning from human teaching, Hwang teaches calculating a reward based on a facial expression of the human (see at least Section III A, pg. 1289, disclosing "The proposed system uses non expert human's facial expressions as reward to train robots learning appropriate actions"; see also Table 1).
It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to have modified the reward of Ozaki to incorporate the addition of human facial expressions as a reward, as taught by Hwang.  One would have been motivated to make this modification in order to better teach robots how to interact and to provide an easier way for humans to train robots, as suggested by Hwang in at least Section 1, pg. 1288.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Noda et al. (US 2019/0308317 A1), disclosing a robot recognizing reward based on human facial expression.
Lee et al. (US 2018/0178372 A1), disclosing calculating a reward value by analyzing the facial expression of a user or voice signals from a user.
Gu et al. (US 2013/0114852 A1), disclosing that using a webcam as an image sensor for facial recognition is well known.
Nugent (US 2010/0280982 A1), disclosing reinforcing robotic behavior based on facial expressions.
Ueda et al. (US 2010/0114807 A1)

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOSHUA ALEXANDER GARZA whose telephone number is (469)295-9178. The examiner can normally be reached 7:30-4:30 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JEFF BURKE, can be reached at 469-295-9067. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
JOSHUA ALEXANDER GARZA
Examiner
Art Unit 3664
/JEFF A BURKE/Supervisory Patent Examiner, Art Unit 3664