DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
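(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 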
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
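For illustration only, the two rebuttable presumptions described above can be sketched as a small decision routine. This is a minimal sketch and not part of MPEP § 2181 or of the record: the names `Limitation`, `Analysis`, and `analyze` are hypothetical, and the three boolean fields stand in for determinations that are actually made by reading the claim in light of the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limitation:
    text: str
    uses_means_or_step: bool    # literal "means" or "step" used with functional language
    recites_function: bool      # limitation is defined by the function it performs
    sufficient_structure: bool  # structure, material, or acts that entirely perform the function


@dataclass(frozen=True)
class Analysis:
    presumed: bool  # presumption created by presence or absence of "means"/"step"
    rebutted: bool  # whether the limitation's content overcomes that presumption

    @property
    def applies(self) -> bool:
        # 112(f) governs when its presumption stands, or when the contrary presumption falls.
        return self.presumed != self.rebutted


def analyze(lim: Limitation) -> Analysis:
    if lim.uses_means_or_step:
        # Presumed to invoke 112(f); rebutted by reciting sufficient structure.
        return Analysis(presumed=True, rebutted=lim.sufficient_structure)
    # Presumed not to invoke 112(f); rebutted by function without sufficient structure.
    return Analysis(
        presumed=False,
        rebutted=lim.recites_function and not lim.sufficient_structure,
    )


# Hypothetical inputs for a generic placeholder coupled with functional language.
unit = Limitation("pre-processing unit generating force state data", False, True, False)
print(analyze(unit).applies)  # True
```

Keeping the presumption and its rebuttal as separate fields, rather than collapsing them into one boolean, mirrors the way the presumption shifts depending on whether the word “means” (or “step”) appears.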
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “control unit” in claim 1, “data acquisition unit” in claim 1, “pre-processing unit” in claims 1, 2, 4, and 5, “learning unit” in claim 2, “learning model storage unit” in claim 3, and “decision-making unit” in claim 3.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-7 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Hasegawa (US 20180222048 A1).
Regarding Claim 1, Hasegawa teaches a controller controlling an industrial robot having a function of detecting a force and a moment applied to a manipulator, the controller comprising (Fig. 1, element 40 control device, element P force sensor, element (TCP) tool center point, and element 23 gripper; [0058] the control device 40 controls a position of the TCP and an acting force acting on the TCP): a control unit controlling the industrial robot based on a control command ([0061] the control unit 43 drives the arms of the robots 1 to 3); a data acquisition unit acquiring at least one of the force and the moment applied to the manipulator of the industrial robot as acquisition data ([0099] the contact determination portion 43 c acquires an output from the force sensor P of each of the robots 1 to 3, and determines that the robots 1 to 3 come into contact with an object); and a pre-processing unit generating force state data including information related to the force applied to the manipulator and control command adjustment data indicating an adjustment behavior of the control command related to the manipulator as state data based on the acquisition data ([0120] the state observation portion 41 a observes a result generated by changing the parameters 44 a, as a state variable. Thus, the state observation portion 41 a may acquire, as state variables, a control result of the servo 43 d, values of the encoders E1 to E6, an output from the force sensor P, and an image acquired by the detection unit 42), wherein the controller performs a process of machine learning related to the adjustment behavior of the control command related to the manipulator based on the state data (Fig. 6; [0119] the control device 40 includes the calculation unit 41 in order to automatically determine the parameters 44 a. In the present embodiment, the calculation unit 41 can calculate optical parameters, operation parameters, and force control parameters by using machine learning).
Regarding Claim 2, Hasegawa teaches wherein the pre-processing unit generates determination data indicating a determination result of an operating state of the manipulator after the adjustment behavior is further performed, based on the acquisition data ([0120] the state observation portion 41 a observes a result generated by changing the parameters 44 a, as a state variable. Thus, the state observation portion 41 a may acquire, as state variables, a control result of the servo 43 d, values of the encoders E1 to E6, an output from the force sensor P, and an image acquired by the detection unit 42), and the controller further comprises a learning unit generating a learning model obtained by applying reinforcement learning to the adjustment behavior of the control command related to the manipulator with respect to the state of the force applied to the manipulator, as a process of the machine learning, using the state data and the determination data ([0120] a learning portion 41 b learning the parameters 44 a on the basis of an observed state variable. [0123] The learning portion 41 b may optimize the parameters 44 a through learning, and optimizes the parameters 44 a through reinforcement learning in the present embodiment. The learning portion 41 b determines a behavior of changing the parameters 44 a on the basis of a state variable, and performs the behavior. If a reward is evaluated according to a state after the behavior, a behavior value of the behavior is determined. Therefore, the calculation unit 41 optimizes the parameters 44 a by repeating observation of a state variable, determination of a behavior corresponding to the state variable, and evaluation of a reward obtained through the behavior).
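For illustration only, the cycle quoted from Hasegawa [0123] (observing a state variable, determining a behavior, and evaluating a reward after the behavior) can be sketched as a generic reinforcement learning loop. This is a minimal sketch under stated assumptions and is not asserted to be the implementation of Hasegawa or of the claimed controller: tabular Q-learning is a swapped-in, textbook technique, and the five force-error buckets, the three adjustments, the reward, and the names `step`, `train`, and `greedy_policy` are all hypothetical.

```python
import random

ACTIONS = (-1, 0, 1)  # hypothetical adjustments to a control command
N_STATES = 5          # hypothetical force-error buckets; 0 means on target


def step(state, action):
    """Toy environment: the adjustment shifts the force-error bucket."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = -next_state  # a smaller residual force error earns a higher reward
    return next_state, reward


def train(episodes=2000, steps=10, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = rng.randrange(N_STATES)  # observe a state variable
        for _ in range(steps):
            if rng.random() < epsilon:   # occasionally try another behavior
                action = rng.choice(ACTIONS)
            else:                        # otherwise take the best-valued behavior
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward = step(state, action)  # perform it, evaluate the reward
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Standard Q-learning update of the behavior value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """Best-valued adjustment for each force-error bucket."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES)}


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the learned table plays the role of a learning model, and `greedy_policy` corresponds to estimating an adjustment behavior from the current state.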
Regarding Claim 3, Hasegawa teaches further comprising a learning model storage unit storing a learning model obtained by applying reinforcement learning to the adjustment behavior of the control command related to the manipulator with respect to the state of the force applied to the manipulator ([0079] The storage unit 44 stores a robot program 44 b for controlling the robots 1 to 3 in addition to the parameters 44 a.), and a decision-making unit estimating the adjustment behavior of the control command related to the manipulator using the learning model stored in the learning model storage unit, based on the state data, as the process of the machine learning. ([0128] behavior information 44 d indicating a learning target parameter and a behavior which can be taken is recorded in the storage unit 44 in advance. In other words, an optical parameter described as a learning target in the behavior information 44 d is a learning target)
Regarding Claim 4, Hasegawa teaches wherein the pre-processing unit generates load determination data indicating a degree of load applied to the manipulator after the adjustment behavior is performed, as the determination data. ([0175] a reward is evaluated on the basis of whether work is good or bad performed by the robot 3. The learning portion 41 b observes whether the work is good or bad so as to evaluate whether the work is good or bad. The learning portion 41 b determines a reward for the behavior a, and the states s and s′ on the basis of whether the work is good or bad. [0185] The currents of the motors M1 to M6, the values of the encoders E1 to E6, and the output from the force sensor P directly indicate operations of the robot 3, and the operations directly indicate whether work is good or bad.)
Regarding Claim 5, Hasegawa teaches wherein the pre-processing unit generates operation time data indicating a degree of operation time of the manipulator after the adjustment behavior is performed, as the determination data.  ([0016] The learning portion may evaluate a reward for the behavior on the basis of whether work performed by the robot is good or bad. [0018] According to the configuration in which the reward is evaluated to be positive in a case where a required time for work is shorter than a reference, it is possible to easily calculate an operation parameter for causing the robot to perform work in a short period of time)
Regarding Claim 6, Hasegawa teaches A control system which is a system in which a plurality of devices are connected to each other via a network, wherein the plurality of devices include a first controller which is the controller according to claim 2.  (Fig. 1 element 40 control device [0274] The control device may be formed of a plurality of devices, and the control unit 43 and the calculation unit 41 may be formed of different devices. The control device may be a robot controller, a teaching pendant, a PC, a server connected to a network, or the like, and may include these devices.)
Regarding Claim 7, Hasegawa teaches wherein the plurality of devices include a computer having a machine learning device therein (Abstract: A control device includes a processor that is configured to execute computer-executable instructions so as to control a robot, wherein the processor is configured to calculate an operation parameter related to an operation of a robot by using machine learning), the computer acquires a learning model as at least one result of the reinforcement learning of the first controller ([0123] The learning portion 41 b may optimize the parameters 44 a through learning, and optimizes the parameters 44 a through reinforcement learning in the present embodiment), and the machine learning device provided in the computer optimizes or streamlines based on the acquired learning model.  ([0123] The learning portion 41 b may optimize the parameters 44 a through learning)
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over Hasegawa (US 20180222048 A1) in view of Levine (US 20190232488 A1).
Regarding Claim 8, Hasegawa does not disclose, but Levine discloses wherein the plurality of devices include a second controller different from the first controller, and a learning result by the first controller is shared with the second controller. ([0059] As illustrated in FIG. 1 and described in more detail herein, the experience collector engine 112 receives instances of experience data generated by the robot 180A and the robot 180B (and optionally additional robot(s)) while they are performing episodes. Each instance of experience data is generated by a corresponding robot based on input applied to, and/or output generated over, the policy network of the robot in a corresponding iteration. For example, each instance of experience data may indicate a current state of the robot, an action to be performed based on the output of the policy network, a state of the robot after implementation of the action, and/or a reward for the action (as indicated by the output generated over the policy network and/or a separate reward function). [0064] all or aspects of one or more of those components may be implemented on one or more computer systems that are separate from, but in network communication with, robots 180A and 180B. In some of those implementations, experience data can be transmitted from a robot to the components over one or more networks, and updated policy parameters can be transmitted from the components to the robot over one or more of the networks)
In this way, the system of Levine includes one or more components in communication with the network and robots. Like Hasegawa, Levine is concerned with using reinforcement learning to improve the robot performance.
Therefore, from these teachings of Hasegawa and Levine, one of ordinary skill in the art before the effective filing date of the claimed invention would have found it obvious to apply the teachings of Levine to the system of Hasegawa, since doing so would enhance the system by transmitting data through communication between the components, robots, and the network.
Regarding Claim 9, Hasegawa does not disclose, but Levine discloses wherein the plurality of devices include a second controller different from the first controller, and data observed by the second controller is available for reinforcement learning by the first controller via the network. ([0070] the system may apply the current state as input to a reinforcement learning policy model and generate, over the model based on the input, output that indicates an action to implement. [0102] the robot control system 660 may perform one or more aspects of methods 300, 400, and/or 500 described herein [0103] all or aspects of control system 660 may be implemented on one or more computing devices that are in wired and/or wireless communication with the robot 620, such as computing device 710. [0104] Network interface subsystem 716 provides an interface to outside networks and is coupled to corresponding interface devices in other computing devices.)
In this way, the system of Levine includes one or more components in communication with the network and robots. Like Hasegawa, Levine is concerned with using reinforcement learning to improve the robot performance.
Therefore, from these teachings of Hasegawa and Levine, one of ordinary skill in the art before the effective filing date of the claimed invention would have found it obvious to apply the teachings of Levine to the system of Hasegawa, since doing so would enhance the system by transmitting data through communication between the components, robots, and the network.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SARAH TRAN whose telephone number is (313)446-6642.  The examiner can normally be reached on 7:30am-4:30pm M-Th.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Khoi Tran can be reached on (571) 272-6919.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/S.A.T./
Examiner, Art Unit 3664

/KHOI H TRAN/
Supervisory Patent Examiner, Art Unit 3664