DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 10-14 and 17-18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Hasegawa (US 20180222048 A1).
	Regarding Claim 10, Hasegawa teaches A controller configured to control an industrial robot having a function of detecting at least a force applied to a manipulator, the controller comprising: (Fig. 1 element 40 control device, element P force sensor, element (TCP) tool center point, and element 23 gripper [0058] the control device 40 controls a position of the TCP and an acting force acting on the TCP)
at least one memory; and ([0060] The control device 40 includes hardware resources…The hardware resources may include…a memory like a RAM, a ROM, and the like)
at least one processor configured to: ([0060] The control device 40 includes hardware resources…The hardware resources may include a processor like a CPU)
obtain at least the force applied in a feed direction of the manipulator and a control command of the industrial robot as acquisition data; ([0061] the control unit 43 drives the arms of the robots 1 to 3 [0099] the contact determination portion 43 c acquires an output from the force sensor P of each of the robots 1 to 3, and determines that the robots 1 to 3 come into contact with an object)
generate, based on the acquisition data, data including information indicating the force applied to the manipulator in the feed direction and information indicating the control command of the industrial robot; and ([0120] the state observation portion 41 a observes a result generated by changing the parameters 44 a, as a state variable. Thus, the state observation portion 41 a may acquire, as state variables, a control result of the servo 43 d, values of the encoders E1 to E6, an output from the force sensor P, and an image acquired by the detection unit 42.)
generate, based on the generated data, a neural network configured to output an adjustment action of the control command in the feed direction of the manipulator.  (Fig. 6 [0119] the control device 40 includes the calculation unit 41 in order to automatically determine the parameters 44 a. In the present embodiment, the calculation unit 41 can calculate optical parameters, operation parameters, and force control parameters by using machine learning. [0123] Specifically, the learning portion 41 b determines a behavior of changing the parameters 44 a on the basis of a state variable, and performs the behavior. If a reward is evaluated according to a state after the behavior, a behavior value of the behavior is determined. Therefore, the calculation unit 41 optimizes the parameters 44 a by repeating observation of a state variable, determination of a behavior corresponding to the state variable, and evaluation of a reward obtained through the behavior)
Regarding Claim 11, Hasegawa teaches A controller configured to control an industrial robot having a function of detecting at least a force applied to a manipulator, the controller comprising: (Fig. 1 element 40 control device, element P force sensor, element (TCP) tool center point, and element 23 gripper [0058] the control device 40 controls a position of the TCP and an acting force acting on the TCP)
at least one memory; and ([0060] The control device 40 includes hardware resources…The hardware resources may include…a memory like a RAM, a ROM, and the like)
at least one processor configured to: ([0060] The control device 40 includes hardware resources…The hardware resources may include a processor like a CPU)
obtain at least the force applied in a feed direction of the manipulator and a control command of the industrial robot as acquisition data; ([0061] the control unit 43 drives the arms of the robots 1 to 3 [0099] the contact determination portion 43 c acquires an output from the force sensor P of each of the robots 1 to 3, and determines that the robots 1 to 3 come into contact with an object)
generate, based on the acquisition data, data including information indicating the force applied to the manipulator in the feed direction and information indicating the control command of the industrial robot; and ([0120] the state observation portion 41 a observes a result generated by changing the parameters 44 a, as a state variable. Thus, the state observation portion 41 a may acquire, as state variables, a control result of the servo 43 d, values of the encoders E1 to E6, an output from the force sensor P, and an image acquired by the detection unit 42.)
generate, based on the generated data, a learning model of a neural network obtained by applying reinforcement learning to an adjustment behavior of the control command related to the manipulator with respect to a state of the force in the feed direction of the manipulator.  (Fig. 6 [0119] the control device 40 includes the calculation unit 41 in order to automatically determine the parameters 44 a. In the present embodiment, the calculation unit 41 can calculate optical parameters, operation parameters, and force control parameters by using machine learning. [0123] Specifically, the learning portion 41 b determines a behavior of changing the parameters 44 a on the basis of a state variable, and performs the behavior. If a reward is evaluated according to a state after the behavior, a behavior value of the behavior is determined. Therefore, the calculation unit 41 optimizes the parameters 44 a by repeating observation of a state variable, determination of a behavior corresponding to the state variable, and evaluation of a reward obtained through the behavior. The learning portion 41 b may optimize the parameters 44 a through learning, and optimizes the parameters 44 a through reinforcement learning in the present embodiment)
Regarding Claim 12, Hasegawa teaches A controller according to the claim 11, wherein, the at least one processor generates load determination data indicating a degree of load applied to the manipulator after the adjustment behavior is performed, as the generated data.  ([0175] a reward is evaluated on the basis of whether work is good or bad performed by the robot 3. The learning portion 41 b observes whether the work is good or bad so as to evaluate whether the work is good or bad. The learning portion 41 b determines a reward for the behavior a, and the states s and s′ on the basis of whether the work is good or bad. [0185] The currents of the motors M1 to M6, the values of the encoders E1 to E6, and the output from the force sensor P directly indicate operations of the robot 3, and the operations directly indicate whether work is good or bad.)
Regarding Claim 13, Hasegawa teaches A control system in which a plurality of devices are connected to each other via a network, wherein, the plurality of devices include a first controller which is the controller according to claim 11.  (Fig. 1 element 40 control device [0274] The control device may be formed of a plurality of devices, and the control unit 43 and the calculation unit 41 may be formed of different devices. The control device may be a robot controller, a teaching pendant, a PC, a server connected to a network, or the like, and may include these devices.)
Regarding Claim 14, Hasegawa teaches The control system according to claim 13, wherein, the plurality of devices include a computer having a machine learning device therein (Abstract: A control device includes a processor that is configured to execute computer-executable instructions so as to control a robot, wherein the processor is configured to calculate an operation parameter related to an operation of a robot by using machine learning), the computer acquires the learning model as at least one result of the reinforcement learning of the first controller ([0123] The learning portion 41 b may optimize the parameters 44 a through learning, and optimizes the parameters 44 a through reinforcement learning in the present embodiment), and the machine learning device provided in the computer optimizes or streamlines based on the acquired learning model.  ([0123] The learning portion 41 b may optimize the parameters 44 a through learning)
Regarding Claim 17, Hasegawa teaches A controller configured to control an industrial robot having a function of detecting at least a force applied to a manipulator, the controller comprising: (Fig. 1 element 40 control device, element P force sensor, element (TCP) tool center point, and element 23 gripper [0058] the control device 40 controls a position of the TCP and an acting force acting on the TCP)
at least one memory; and ([0060] The control device 40 includes hardware resources…The hardware resources may include…a memory like a RAM, a ROM, and the like)
at least one processor configured to: ([0060] The control device 40 includes hardware resources…The hardware resources may include a processor like a CPU)
obtain at least the force applied in a feed direction of the manipulator and a control command of the industrial robot as acquisition data; ([0061] the control unit 43 drives the arms of the robots 1 to 3 [0099] the contact determination portion 43 c acquires an output from the force sensor P of each of the robots 1 to 3, and determines that the robots 1 to 3 come into contact with an object)
generate, based on the acquisition data, data including information indicating the force applied to the manipulator in the feed direction and information indicating the control command of the industrial robot; and ([0120] the state observation portion 41 a observes a result generated by changing the parameters 44 a, as a state variable. Thus, the state observation portion 41 a may acquire, as state variables, a control result of the servo 43 d, values of the encoders E1 to E6, an output from the force sensor P, and an image acquired by the detection unit 42.)
store in the at least one memory a learning model obtained by applying reinforcement learning to an adjustment behavior of the control command related to the manipulator with respect to a state of the force in the feed direction of the manipulator by considering a work time of the manipulator based on the control command; and ([0123] The learning portion 41 b may optimize the parameters 44 a through learning, and optimizes the parameters 44 a through reinforcement learning [0152] the state s, the behavior a, and the reward r are stored in the storage unit 44 in correlation with each trial number t, and may be referred to at any timing)
estimate, based on the generated data, the adjustment behavior of the control command related to the manipulator using the learning model stored in the at least one memory.  (Fig. 6 [0119] the control device 40 includes the calculation unit 41 in order to automatically determine the parameters 44 a. In the present embodiment, the calculation unit 41 can calculate optical parameters, operation parameters, and force control parameters by using machine learning. [0123] Specifically, the learning portion 41 b determines a behavior of changing the parameters 44 a on the basis of a state variable, and performs the behavior. If a reward is evaluated according to a state after the behavior, a behavior value of the behavior is determined. Therefore, the calculation unit 41 optimizes the parameters 44 a by repeating observation of a state variable, determination of a behavior corresponding to the state variable, and evaluation of a reward obtained through the behavior.)
Regarding Claim 18, Hasegawa teaches A controller configured to control an industrial robot having a function of detecting at least a force applied to a manipulator, the controller comprising: (Fig. 1 element 40 control device, element P force sensor, element (TCP) tool center point, and element 23 gripper [0058] the control device 40 controls a position of the TCP and an acting force acting on the TCP)
at least one memory; and ([0060] The control device 40 includes hardware resources…The hardware resources may include…a memory like a RAM, a ROM, and the like)
at least one processor configured to: ([0060] The control device 40 includes hardware resources…The hardware resources may include a processor like a CPU)
obtain at least the force applied in a feed direction of the manipulator and a control command of the industrial robot as acquisition data; ([0061] the control unit 43 drives the arms of the robots 1 to 3 [0099] the contact determination portion 43 c acquires an output from the force sensor P of each of the robots 1 to 3, and determines that the robots 1 to 3 come into contact with an object)
generate, based on the acquisition data, data including information indicating the force applied to the manipulator in the feed direction and information indicating the control command of the industrial robot; and ([0120] the state observation portion 41 a observes a result generated by changing the parameters 44 a, as a state variable. Thus, the state observation portion 41 a may acquire, as state variables, a control result of the servo 43 d, values of the encoders E1 to E6, an output from the force sensor P, and an image acquired by the detection unit 42.)
obtain an adjustment action of the control command in the feed direction of the manipulator based on an output from a neural network when the generated data is input to a pre-generated neural network. (Fig. 6 [0119] the control device 40 includes the calculation unit 41 in order to automatically determine the parameters 44 a. In the present embodiment, the calculation unit 41 can calculate optical parameters, operation parameters, and force control parameters by using machine learning. [0123] Specifically, the learning portion 41 b determines a behavior of changing the parameters 44 a on the basis of a state variable, and performs the behavior. If a reward is evaluated according to a state after the behavior, a behavior value of the behavior is determined. Therefore, the calculation unit 41 optimizes the parameters 44 a by repeating observation of a state variable, determination of a behavior corresponding to the state variable, and evaluation of a reward obtained through the behavior.)
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Hasegawa (US 20180222048 A1) in view of Tsuda (US 20170028553 A1).
Regarding Claim 15, Hasegawa does not expressly disclose but Tsuda discloses The control system according to claim 13, wherein, the plurality of devices include a second controller different from the first controller (Fig. 8 [0095] The robot system 4 includes a first robot controller 2 a which controls the first robot 1 a, and a second robot controller 2 b which controls the second robot 1 b.), and a learning result by the first controller is shared with the second controller.  ([0097] In this manner, pieces of information learned by the respective robot controllers 2 a and 2 b can be shared by the first robot controller 2 a and the second robot controller 2 b. )
In this way, the system of Tsuda includes a machine learning device for a robot that allows a human and the robot to work cooperatively. Like Hasegawa, Tsuda is concerned with robotic control.
Therefore, from these teachings of Hasegawa and Tsuda, one of ordinary skill in the art before the effective filing date of the claimed invention would have found it obvious to apply the teachings of Tsuda to the system of Hasegawa, since doing so would enhance the system by sharing, e.g., action patterns for learning to increase the number of learning operations. This can improve the learning accuracy ([0097]).
Regarding Claim 16, Hasegawa does not expressly disclose but Tsuda discloses The control system according to claim 13, wherein, the plurality of devices include a second controller different from the first controller (Fig. 8 [0095] The robot system 4 includes a first robot controller 2 a which controls the first robot 1 a, and a second robot controller 2 b which controls the second robot 1 b.), and data observed by the second controller is available for reinforcement learning by the first controller via the network.  ([0049] A variety of machine learning techniques are available, which are roughly classified into, e.g., “supervised learning,” “unsupervised learning,” and “reinforcement learning.” To implement these techniques, another technique called “deep learning” in which extraction of feature amounts themselves is learned is available. [0097] In this manner, pieces of information learned by the respective robot controllers 2 a and 2 b can be shared by the first robot controller 2 a and the second robot controller 2 b. )
In this way, the system of Tsuda includes a machine learning device for a robot that allows a human and the robot to work cooperatively. Like Hasegawa, Tsuda is concerned with robotic control.
Therefore, from these teachings of Hasegawa and Tsuda, one of ordinary skill in the art before the effective filing date of the claimed invention would have found it obvious to apply the teachings of Tsuda to the system of Hasegawa, since doing so would enhance the system by sharing, e.g., action patterns for learning to increase the number of learning operations. This can improve the learning accuracy ([0097]).
Response to Arguments
Applicant's arguments filed on 6/1/2022 have been fully considered as follows:
Applicant argues that the 35 USC 112(b) rejection of claim 14 should not be maintained in view of the amendment. This argument is persuasive in view of the amendment. Therefore, the rejection is not maintained.
Applicant argues that the 35 USC 103 rejection of the claims should not be maintained because “Hasegawa fails to obtain the force applied in a feed direction of the manipulator. Rather, in Hasegawa, the force sensor P detects magnitudes of forces which are parallel to three detecting axes orthogonal to each other, and magnitudes of torques about the three detection axes (id., [0055]).” However, Hasegawa teaches “[0057] force control of controlling a force acting on the robot can be performed, and the force control is performed such that an acting force acting on any point becomes a target force. Forces applied to various parts are defined in a force control coordinate system which is a three-dimensional orthogonal coordinate system. The target force (including a torque) may be expressed by a vector having an acting point of force expressed in the force control coordinate system as a starting point, and a starting point of the target force vector is the origin of the force control coordinate system, and a direction of the acting force matches one axis direction of the force control coordinate system.” Therefore, the rejection is maintained.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SARAH TRAN whose telephone number is (313)446-6642. The examiner can normally be reached 7:30am-4:30pm M-Th.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Khoi Tran, can be reached at (571) 272-6919. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/S.A.T./Examiner, Art Unit 3664
/KHOI H TRAN/Supervisory Patent Examiner, Art Unit 3664