DETAILED ACTION
Notice of AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim(s) 1-9 are pending.
Claim(s) 1-9 are rejected.
Response to Amendment
This Office Action is responsive to the amendment filed on 06/17/2022.
Claims 1 and 6-8 are amended. Accordingly, the amended claims are being fully considered by the examiner.
Applicant’s amendments to claims 6 and 7 have overcome all the 35 USC § 112 rejections of claims 6 and 7 as set forth in the previous office action. However, upon further consideration of the amended claims, new grounds of rejection under 35 USC § 112 have been introduced.
This action is MADE FINAL. Please see response to arguments section for further details.






Claim Rejections - 35 USC § 112
35 U.S.C. 112(a)
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claims 5 and 9 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor at the time the application was filed, had possession of the claimed invention.
Claim 5:
	Claim 5 recites, “an optimization action information output unit configured to output the adjustment information of the coefficient of the compensation unit” in lines 3-4.
	This limitation describes that the “optimization action information output unit” outputs “adjustment information” “of the coefficient of the compensation unit.” 
	Independent claim 1 describes that “one or more coefficients” (note: please see the 112(b) rejections) are acquired by the “state information acquisition unit,” and then describes that the action information output unit outputs “adjustment information of the coefficients included in the state information” to “the compensation unit.” However, claim 1 doesn’t describe “adjustment information” “of the coefficient of the compensation unit.”
	The specification doesn’t disclose outputting “adjustment information” “of the coefficient of the compensation unit” by the “optimization action information output unit.”
	Specification describes:
In ¶115, “an action information output unit (for example, an action information output unit 303) configured to output action information including adjustment information of the coefficient included in the state information to the compensation unit,”
In ¶120, “The machine learning device outputs action information including adjustment information of the coefficient to the compensation unit.” 
In ¶101, “in step S22, the optimization action information output unit 305 generates the optimization action information on the basis of the value function Q and outputs the generated optimization action information to the servo control unit 100.”
In ¶102, “generate the optimization action information on the basis of the value function Q obtained by the machine learning unit 300 performing the learning, thereby simplifying the adjustment based on the optimization action information of the coefficients set presently of the coefficients a1 to a6 in Expression 1 of the position error compensation unit 111, the coefficients b1 to b6 in Expression 2 of the velocity command compensation unit 112, and the coefficients c1 to c6 in Expression 3 of the torque command compensation unit 113, resulting in enabling to improve the quality of a machining surface of a workpiece.”

	Applicant’s specification describes that “action information output unit” outputs “action information,” where the action information includes “adjustment information of the coefficient included in the state information.”
	However, the specification doesn’t teach “adjustment information of the coefficient of the compensation unit” that is outputted by the “action information output unit.” In other words, according to the specification, “the coefficient” here is from “action information including adjustment information of the coefficients included in the state information” outputted by the “action information output unit,” but not from the “compensation unit.”
	One of ordinary skill in the art, based on the description in the specification, would not understand that the “optimization action information output unit” outputs “adjustment information” “of the coefficient of the compensation unit,” because, as described above, according to applicant’s specification, “the coefficient” here is from “action information including adjustment information of the coefficients included in the state information” outputted by the “action information output unit,” but not from the “compensation unit.”
	Appropriate correction is required.


Claim 9:
	Claim 9 recites, “outputting the adjustment information of the coefficient of the compensation unit” in line 2.
	This limitation describes that “adjustment information” “of the coefficient of the compensation unit” is outputted.
	Independent claim 8 describes “state information” including “one or more coefficients” (note: please see the 112(b) rejections). Claim 8 further describes outputting “adjustment information of the coefficients included in the state information” to “the compensation unit.” However, claim 8 doesn’t describe “adjustment information” “of the coefficient of the compensation unit.”
	The specification doesn’t disclose outputting “adjustment information” “of the coefficient of the compensation unit.”
	Specification describes:
In ¶115, “an action information output unit (for example, an action information output unit 303) configured to output action information including adjustment information of the coefficient included in the state information to the compensation unit,”
In ¶120, “The machine learning device outputs action information including adjustment information of the coefficient to the compensation unit.” 
In ¶101, “in step S22, the optimization action information output unit 305 generates the optimization action information on the basis of the value function Q and outputs the generated optimization action information to the servo control unit 100.”
In ¶102, “generate the optimization action information on the basis of the value function Q obtained by the machine learning unit 300 performing the learning, thereby simplifying the adjustment based on the optimization action information of the coefficients set presently of the coefficients a1 to a6 in Expression 1 of the position error compensation unit 111, the coefficients b1 to b6 in Expression 2 of the velocity command compensation unit 112, and the coefficients c1 to c6 in Expression 3 of the torque command compensation unit 113, resulting in enabling to improve the quality of a machining surface of a workpiece.”

	Applicant’s specification describes that “action information output unit” outputs “action information,” where the action information includes “adjustment information of the coefficient included in the state information.”
	However, the specification doesn’t teach “adjustment information of the coefficient of the compensation unit” that is outputted. In other words, according to the specification, “the coefficient” here is from “action information including adjustment information of the coefficients included in the state information” but not from the “compensation unit.”
	One of ordinary skill in the art, based on the description in the specification, would not understand outputting “adjustment information” “of the coefficient of the compensation unit,” because, as described above, according to applicant’s specification, “the coefficient” here is from the outputted “action information including adjustment information of the coefficients included in the state information,” but not from the “compensation unit.”
	Appropriate correction is required.

	










35 U.S.C. 112(b)

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 1-9 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
-Unclear limitations and insufficient antecedent basis:
Claim 1:
	Claim 1 recites:
“A machine learning device configured to perform machine learning with respect to a plurality of servo control units configured to control a plurality of motors, the motors configured to drive a machine having a plurality of axes, with one axis among the plurality of axes receiving interference generated by movement along at least one of the other axes, 
	a first servo control unit related to the one axis receiving the interference among the plurality of servo control units, the first servo control unit comprising: 
	a compensation unit configured to obtain a compensation value for compensating for at least one of a position error, a velocity command, and a torque command of the first servo control unit on the basis of one or more functions including at least one of variables related to a position command and variables related to position feedback information of a second servo control unit related to the axis generating the interference, and the machine learning device comprising: 
	a state information acquisition unit configured to acquire state information including first servo control information of the first servo control unit, second servo control information of the second servo control unit, and coefficient of the function; 
	an action information output unit configured to output action information including adjustment information of the coefficients included in the state information to the compensation unit; 
	a reward output unit configured to output a reward value for reinforcement learning using an evaluation function serving as a function of the first servo control information; and 
	a value function updating unit configured to update a value function related to the adjustment information of the coefficients on the basis of the reward value output by the reward output unit, the state information, and the action information.”

There is insufficient antecedent basis for the limitations “coefficient,” “the coefficients,” “the function,” “one or more functions” in the claim.
The claim recites “one or more functions,” and then further refers to “the function” (i.e., a singular function). It is not clear which function from the “one or more functions” is referred to by “the function.”
The claim recites “coefficient of the function” (i.e., a singular coefficient of a function) and then further recites “the coefficients” (i.e., plural coefficients). It is not clear which coefficients are referred to by the plural “the coefficients” when only one coefficient is defined.
	For examination purposes, claim 1 is construed as: 
	“A machine learning device configured to perform machine learning with respect to a plurality of servo control units configured to control a plurality of motors, the motors configured to drive a machine having a plurality of axes, with one axis among the plurality of axes receiving interference generated by movement along at least one of the other axes, 
	a first servo control unit related to the one axis receiving the interference among the plurality of servo control units, the first servo control unit comprising: 
	a compensation unit configured to obtain a compensation value for compensating for at least one of a position error, a velocity command, and a torque command of the first servo control unit on the basis of one or more functions including at least one of variables related to a position command and variables related to position feedback information of a second servo control unit related to the axis generating the interference, and the machine learning device comprising: 
	a state information acquisition unit configured to acquire state information including first servo control information of the first servo control unit, second servo control information of the second servo control unit, and one or more coefficients of the one or more functions; 
	an action information output unit configured to output action information including adjustment information of the one or more coefficients included in the state information to the compensation unit; 
	a reward output unit configured to output a reward value for reinforcement learning using an evaluation function serving as a function of the first servo control information; and 
	a value function updating unit configured to update a value function related to the adjustment information of the one or more coefficients on the basis of the reward value output by the reward output unit, the state information, and the action information.”
	Appropriate correction is required.

Claim 5:
	Claim 5 recites the limitation “an optimization action information output unit configured to output the adjustment information of the coefficient of the compensation unit on the basis of the value function updated by the value function updating unit.”
	There is insufficient antecedent basis for the limitation “the coefficient” in the claim.
	The independent claim doesn’t describe any “coefficient of the compensation unit.” Also refer to the 35 U.S.C. 112(a) rejections above.
	For examination purposes, the above-described limitation is construed as, “an optimization action information output unit configured to output the adjustment information of the one or more coefficients on the basis of the value function updated by the value function updating unit.”
	Appropriate correction is required.

Claim 6:
	Claim 6 recites the limitation “the torque command of the first servo control unit on the basis of the function including the at least one of the variable related to the position command” in lines 10-11.
	There is insufficient antecedent basis for the limitation “the function” in the claim.
	For examination purposes, the above-described limitation is construed as, “the torque command of the first servo control unit on the basis of the one or more functions including the at least one of the variable related to the position command.”
	Appropriate correction is required.

Claim 6:
	Claim 6 recites the limitation “the machine learning device outputs the action information including the adjustment information of the coefficient to the compensation unit” in lines 14-15.
	There is insufficient antecedent basis for the limitation “the coefficient” in the claim.
	For examination purposes, the above-described limitation is construed as, “the machine learning device outputs the action information including the adjustment information of the one or more coefficients to the compensation unit.”
	Appropriate correction is required.




Claim 7:
	Claim 7 recites the limitation “the torque command of the first servo control unit on the basis of the function including the at least one of the variable related to the position command” in lines 10-11.
	There is insufficient antecedent basis for the limitation “the function” in the claim.
	For examination purposes, the above-described limitation is construed as, “the torque command of the first servo control unit on the basis of the one or more functions including the at least one of the variable related to the position command.”
	Appropriate correction is required.

Claim 7:
	Claim 7 recites the limitation “the machine learning device outputs the action information including the adjustment information of the coefficient to the compensation unit” in lines 14-15.
	There is insufficient antecedent basis for the limitation “the coefficient” in the claim.
	For examination purposes, the above-described limitation is construed as, “the machine learning device outputs the action information including the adjustment information of the one or more coefficients to the compensation unit.”
	Appropriate correction is required.
Claim 8:
	Claim 8 recites:
“A machine learning method for a machine learning device configured to perform machine learning with respect to a plurality of servo control units configured to control a plurality of motors, the motors configured to drive a machine having a plurality of axes, with one axis among the plurality of axes receiving interference generated by movement along at least one of the other axes, 
	a first servo control unit related to the one axis receiving the interference among the plurality of servo control units, the first servo control unit comprising a compensation unit configured to obtain a compensation value for compensating for at least one of a position error, a velocity command, and a torque command of the first servo control unit on the basis of one or more functions including at least one of variables related to a position command and variables related to position feedback information of a second servo control unit related to the axis generating the interference, and 
	the machine learning method comprising the steps of:
	acquiring state information including first servo control information of the first servo control unit, second servo control information of the second servo control unit, and a coefficient of the function; 
	outputting action information including adjustment information of the coefficient included in the state information to the compensation unit; 
	outputting a reward value for reinforcement learning using an evaluation function serving as a function of the first servo control information; and 
	updating a value function related to the adjustment information of the coefficients on the basis of the reward value, the state information, and the action information.”

There is insufficient antecedent basis for the limitations “coefficient,” “the coefficients,” “the function,” “one or more functions” in the claim.
The claim recites “one or more functions,” and then further refers to “the function” (i.e., a singular function). It is not clear which function from the “one or more functions” is referred to by “the function.”
The claim recites “coefficient of the function” (i.e., a singular coefficient of a function) and then further recites “the coefficients” (i.e., plural coefficients). It is not clear which coefficients are referred to by the plural “the coefficients” when only one coefficient is defined.
	For examination purposes, claim 8 is construed as: 
	“A machine learning method for a machine learning device configured to perform machine learning with respect to a plurality of servo control units configured to control a plurality of motors, the motors configured to drive a machine having a plurality of axes, with one axis among the plurality of axes receiving interference generated by movement along at least one of the other axes, 
	a first servo control unit related to the one axis receiving the interference among the plurality of servo control units, the first servo control unit comprising a compensation unit configured to obtain a compensation value for compensating for at least one of a position error, a velocity command, and a torque command of the first servo control unit on the basis of one or more functions including at least one of variables related to a position command and variables related to position feedback information of a second servo control unit related to the axis generating the interference, and 
	the machine learning method comprising the steps of:
	acquiring state information including first servo control information of the first servo control unit, second servo control information of the second servo control unit, and one or more coefficients of the one or more functions; 
	outputting action information including adjustment information of the one or more coefficients included in the state information to the compensation unit; 
	outputting a reward value for reinforcement learning using an evaluation function serving as a function of the first servo control information; and 
	updating a value function related to the adjustment information of the one or more coefficients on the basis of the reward value, the state information, and the action information.”
	Appropriate correction is required.

Claim 9:
	Claim 9 recites the limitation “outputting the adjustment information of the coefficient of the compensation unit” in line 2.
	There is insufficient antecedent basis for the limitation “the coefficient” in the claim.
	The independent claim doesn’t describe any “coefficient of the compensation unit.” Also refer to the 35 U.S.C. 112(a) rejections above.
	For examination purposes, the above-described limitation is construed as, “outputting the adjustment information of the one or more coefficients.”
	Appropriate correction is required.

Claims 2-7:
	Based on their dependencies on claim 1, claims 2-7 also include the same deficiencies as claim 1; therefore, for the same reasons as described above in claim 1, claims 2-7 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.

Claim 9:
	Based on its dependency on claim 8, claim 9 also includes the same deficiencies as claim 8; therefore, for the same reasons as described above in claim 8, claim 9 is rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.




Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claim(s) 1-3, 6-8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Namie et al. (US20170153614A1) [hereinafter Namie], and further in view of Kawai et al. (US20170154283A1) [hereinafter Kawai].
Claim 1 (amended):
	Regarding claim 1, Namie discloses, “A machine learning device configured to perform machine learning with respect to a plurality of servo control units configured to control a plurality of motors,” “the motors configured to drive a machine having a plurality of axes,”  [See the machine learning system. See plurality of servo control units (e.g.; 200(1)-200(n)) that control plurality of motors (e.g.; control targets 3(1) to 3(n)). See motors (e.g.; control targets 3(1) to 3(n)) drive plurality of axes (e.g.; x, y, z, or rotational axis): “the controlled variable acquiring part further includes a learning controller” (¶15)… “plural lower-level controllers 200(1) to 200(n), and plural control targets 3(1) to 3(n) in which the drive is controlled with each of the lower-level controllers 200(1) to 200(n).” (¶76)… “controller 200 in order to perform drive control (such as “orbit follow-up control” and “orbit control in a working machine”) of the control target 3,” (¶34)… “the control target 3 is a servo motor, the lower-level controller 200 drives the servo motor such that the servo motor” (¶35)… “a control target 3 (a machine such as a servo motor and a machine element driven with the servo motor).” (¶33)];
	“with one axis among the plurality of axes receiving interference generated by movement along at least one of the other axes,” [See feedback from one of the controllers 200(1)-200(n) (e.g.; interference generated by movement along at least one of the other axes controlled by the motor of machine 3 that is controlled by controllers 200) is received by the controller 140 (e.g.; received by  command value correcting device 1; the one axis of the motor of control target 3 controlled by controller 140): “The higher-level controller 100 includes a target orbit generator 4 and the command value correcting device 1.” “command value correcting device 1 corrects the reference control command value generated with the target orbit generator 4 using a controlled variable that is output from the control target 3 as feedback information, and outputs a corrected command value. That is, the higher-level controller 100 transmits the corrected command value to the lower-level controller 200. The corrected command value is a value in which the command value correcting device 1 corrects a reference command value generated from the target orbit data (target orbit) in each control cycle using the controlled variable of the control target 3.” (¶34)];
	“a first servo control unit related to the one axis receiving the interference among the plurality of servo control units,” [See the first servo control unit (e.g.; 140) related to the axis (e.g.; axis controlled by the motor of the machine 3 controlled by controller 140) receiving interference from the other servo control units (e.g.; 200(1)-200(n)) that control plurality of motors such as control targets 3(1) to 3(n))): “The higher-level controller 100 includes a target orbit generator 4 and the command value correcting device 1.” “command value correcting device 1 corrects the reference control command value generated with the target orbit generator 4 using a controlled variable that is output from the control target 3 as feedback information, and outputs a corrected command value. That is, the higher-level controller 100 transmits the corrected command value to the lower-level controller 200. The corrected command value is a value in which the command value correcting device 1 corrects a reference command value generated from the target orbit data (target orbit) in each control cycle using the controlled variable of the control target 3.” (¶34)];
	“the first servo control unit comprising: a compensation unit configured to obtain a compensation value for compensating for at least one of a position error, a velocity command, and a torque command of the first servo control unit on the basis of one or more functions including at least one of a variable related to a position command and a variable related to position feedback information of a second servo control unit related to the axis generating the interference,” [Examiner notes that claim requires obtaining a compensation value for compensating for only one of 1. a position error, 2. a velocity command, and 3. a torque command of the first servo control unit. The obtaining is performed on the basis of one or more functions including only one of : i. a variable related to a position command and ii. a variable related to position feedback information of a second servo control unit related to the axis generating the interference.
	Namie teaches: obtaining compensation value for compensating for a position error based on a function including a variable related to a position feedback information of a second servo control unit related to the axis generating the interference
	See that the first servo controller (e.g., 140) includes compensator 1, which corrects the command position (e.g., correcting position error) based on a function including a variable related to position feedback information of a second servo control unit related to the axis generating the interference (e.g., feedback information including position feedback from any of the controllers 200(1)-200(n)): “The command value correcting device 1 corrects the command position (reference command position) that is the reference control command value generated with the target orbit generator 4 using the controlled variable (detection position, that is, feedback position) that is the output of the control target 3 as the feedback information, and outputs the corrected command value (corrected command position).” (¶58)… “The command value correcting device 1 corrects the reference control command value generated with the target orbit generator 4 using a controlled variable that is output from the control target 3 as feedback information, and outputs a corrected command value.” (¶34)… “The command value correcting device 1 corrects the command value” “using the feedback information (the controlled variable of the control target 3) used in the lower-level controller 200” (¶39)], but doesn’t explicitly disclose, “the machine learning device comprising: a state information acquisition unit configured to acquire state information including first servo control information of the first servo control unit, second servo control information of the second servo control unit, and one or more coefficients of the one or more functions;” “an action information output unit configured to output action information including adjustment information of the one or more coefficients included in the state information to the compensation unit;” “a reward output unit configured to output a reward value for reinforcement learning using an evaluation function serving as a function of the first servo control information; and” “a value function updating unit configured to update a value function related to the adjustment information of the one or more coefficients on the basis of the reward value output by the reward output unit, the state information, and the action information.”
	However, Kawai discloses, “the machine learning device comprising: a state information acquisition unit configured to acquire state information including first servo control information of the first servo control unit, second servo control information of the second servo control unit, and one or more coefficients of the one or more functions;” [Examiner notes that claim requires only one coefficient of one function.
	See machine leaning device acquires state information including first and second servo control information (e.g.; state information including control information of each axis including a first and second axis) and a coefficient (e.g.; γ or α) : “The machine learning apparatus 1 includes a state observation unit 11 and a learning unit 12.” (¶38)… “The state observation unit 11 observes a state variable composed of at least one of data relating to the number of errors between the position command relative to a rotor of a motor which is drive-controlled by the motor control apparatus” “any command of the position command, the speed command, or the current command in the motor control apparatus,” “data relating to a state of the machine tool including the motor control apparatus.” (¶39)… “In the above equation (1), st represents a state of the environment at a time t, and at represents an action at the time t. The action at changes the state to st+1. rt+1 represents a reward that can be gained via the change of the state. Further, the term with max is the Q-value multiplied by γ for the case where the action a for the highest Q-value known at that time is selected under the state st+1. γ is a parameter of 0<γ≦1, and referred to as discount rate. α is a learning factor, which is in the range of 0<α≦1.” (¶95)];
	“an action information output unit configured to output action information including adjustment information of the one or more coefficients included in the state information to the compensation unit;” [See action information is output and used in the learning, where action information includes adjustment information of the coefficient included in the state information (e.g.; adjustment information such as γ can be adjusted within 0<γ≦1, α can be adjusted within 0<α≦1): “In the above equation (1), s_t represents a state of the environment at a time t, and a_t represents an action at the time t. The action a_t changes the state to s_{t+1}. r_{t+1} represents a reward that can be gained via the change of the state. Further, the term with max is the Q-value multiplied by γ for the case where the action a for the highest Q-value known at that time is selected under the state s_{t+1}. γ is a parameter of 0<γ≦1, and referred to as discount rate. α is a learning factor, which is in the range of 0<α≦1.” (¶95)];
	“a reward output unit configured to output a reward value for reinforcement learning using an evaluation function serving as a function of the first servo control information; and” [See system outputs a reward value for reinforcement learning using evaluation function serving as a function of the first servo control information (e.g.; using function of first servo control information such as function of information related to the control information of the servo motor): “the reward calculation unit 21 may be configured to increase the reward when the number of errors observed by the state observation unit 11 is smaller than the number of errors observed by the state observation unit 11 before the current number of errors, and reduce the reward when larger” “increase the reward when the number of errors observed by the state observation unit 11 is inside a specified range, and to reduce the reward when the number of errors is outside the specified range.” (¶51)];
	“a value function updating unit configured to update a value function related to the adjustment information of the one or more coefficients on the basis of the reward value output by the reward output unit, the state information, and the action information.” [See system updates the value function related to the adjustment information (e.g.; information related to the adjustment) of the coefficient (i.e.; correction coefficient) on the basis of reward value that was outputted, state information (e.g.; state variable observed) and action information (e.g.; action value table): “The function update unit 22 updates a function (action value table) for calculating the number of corrections used to correct any command of the position command, the speed command, or the current command in the motor control apparatus, based on the state variable observed by the state observation unit 11 and the reward calculated by the reward calculation unit 21.” (¶52)… “The learning unit 12 may calculate,” “the state variable observed by the state observation unit 11 and update the function (action value table) in real time.” “the function update unit 22” “update the function for calculating the number of corrections used to correct any command of the position command, the speed command, or the current command in the motor control apparatus, based on the state variable observed by the state observation unit 11 and the reward calculated by the reward calculation unit 21” (¶53)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the capability of acquiring state information including first and second servo control information of the first and second servo control unit and a coefficient of the function; combined the capability of outputting adjustment information of the coefficient included in the state information; combined the capability of outputting a reward value using a function of the first servo control information; and combined the capability of updating a value function related to the adjustment information (e.g.; information related to the adjustment) of the coefficient (i.e.; correction coefficient) on the basis of the reward value output, the state information, and the action information taught by Kawai with the device taught by Namie as discussed above. A person of ordinary skill in the servo control optimization field would have been motivated to make such combination in order to perform efficient learning of data and operation of the servo controller [Kawai: “it is possible to use, in unsupervised learning, data that can be acquired without actually operating the motor control apparatus (for example, data of simulation) and perform learning efficiently.” (¶54)].
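For illustration only, and not as part of the cited disclosure: the passage of Kawai quoted above (¶95) describes terms that match the standard tabular Q-learning update, in which the learning factor α and the discount rate γ weight the reward and the highest Q-value known for the next state. The following is a minimal sketch under that assumption. The function name `q_update`, the state labels, and the action labels are hypothetical and do not appear in Kawai or Namie.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value.

    alpha is the learning factor (0 < alpha <= 1) and gamma is the
    discount rate (0 < gamma <= 1), matching the ranges quoted from ¶95.
    """
    if not (0.0 < alpha <= 1.0 and 0.0 < gamma <= 1.0):
        raise ValueError("alpha and gamma must each lie in (0, 1]")
    # "The term with max": highest Q-value known for the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical action value table and action labels, for illustration only.
q_table = defaultdict(float)
ACTIONS = ("decrease", "hold", "increase")
first = q_update(q_table, "s0", "increase", 1.0, "s1", ACTIONS, alpha=0.5, gamma=0.9)
```

In this sketch α and γ are fixed hyperparameters of the update rule, while the quantity being learned is the entry of the action value table for the selected state and action.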

Claim 2:
	Regarding claim 2, Namie and Kawai disclose all the elements of claim 1,
	Namie further discloses, “wherein the first servo control information includes a position command and position feedback information of the first servo control unit or a first position error of the first servo control unit, and” [See the first servo control information includes first position error of the first servo control unit (e.g.; correcting position error): “The command value correcting device 1 corrects the command position (reference command position) that is the reference control command value generated with the target orbit generator 4 using the controlled variable (detection position, that is, feedback position) that is the output of the control target 3 as the feedback information, and outputs the corrected command value (corrected command position).” (¶58)… “The command value correcting device 1 corrects the reference control command value generated with the target orbit generator 4 using a controlled variable that is output from the control target 3 as feedback information, and outputs a corrected command value.” (¶34)… “The command value correcting device 1 corrects the command value” “using the feedback information (the controlled variable of the control target 3) used in the lower-level controller 200” (¶39)], but doesn’t explicitly disclose, “the evaluation function outputs the reward value on the basis of a value including a second position error obtained from the position command and the position feedback information of the first servo control unit or the first position error, an absolute value of the first or second position error, or a second power of the absolute value.”
	However, Kawai discloses, “the evaluation function outputs the reward value on the basis of a value including a second position error obtained from the position command and the position feedback information of the first servo control unit or the first position error, an absolute value of the first or second position error, or a second power of the absolute value.” [Examiner notes that the claim requires outputting the reward value on the basis of only one of 1. a value including a second position error obtained from the position command and the position feedback information of the first servo control unit or the first position error, 2. an absolute value of the first or second position error, or 3. a second power of the absolute value. Further, examiner notes that for #1 the claim requires only one of i. a value including a second position error obtained from the position command and the position feedback information of the first servo control unit, or ii. the first position error.
	Kawai teaches, outputting the reward value on the basis of a first position error.
	See system outputs a reward value based on position error information: “the reward calculation unit 21 may be configured to increase the reward when the number of errors observed by the state observation unit 11 is smaller than the number of errors observed by the state observation unit 11 before the current number of errors, and reduce the reward when larger” “increase the reward when the number of errors observed by the state observation unit 11 is inside a specified range, and to reduce the reward when the number of errors is outside the specified range.” (¶51)];
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Kawai with the device taught by Namie and Kawai as discussed above. A person of ordinary skill in the servo control optimization field would have been motivated to make such combination for the same reasons as described in claim 1 above.
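For illustration only, and not as part of the cited disclosure: the reward rules quoted from Kawai (¶51) can be sketched as two small functions, one comparing the current error with the previously observed error and one checking whether the error lies inside a specified range. Kawai does not state numeric reward values in the quoted passage, so the `step` size, the function names, and the use of the absolute value of the error are assumptions made for this sketch.

```python
def reward_from_trend(current_error, previous_error, step=1.0):
    """Increase the reward when the error shrinks, reduce it when it grows."""
    now, before = abs(current_error), abs(previous_error)
    if now < before:
        return step
    if now > before:
        return -step
    return 0.0  # unchanged error: neutral reward (an assumption)


def reward_from_range(current_error, limit, step=1.0):
    """Increase the reward inside the specified range, reduce it outside."""
    return step if abs(current_error) <= limit else -step


improved = reward_from_trend(current_error=0.2, previous_error=0.5)
worsened = reward_from_trend(current_error=-0.7, previous_error=0.5)
in_range = reward_from_range(current_error=0.05, limit=0.1)
```

Taking the absolute value before comparison treats positive and negative position errors of equal size alike, which is one way of evaluating an error magnitude.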

Claim 3:
	Regarding claim 3, Namie and Kawai disclose all the elements of claim 1,
	Namie further discloses, “wherein the variable related to the position command of the second servo control unit is at least one of the position command, a differentiated value of the position command, and a double-differentiated value of the position command of the second servo control unit, and” [Examiner notes that variable related to the position command of the second servo control unit requires only one of 1. the position command, 2. a differentiated value of the position command, and 3. a double-differentiated value of the position command of the second servo control unit.
	Namie teaches: variable related to the position command of the second servo control unit is position command.
	See the second servo control unit (e.g.; any one of 200s), where the variable related to the position command is a position command: “The servo driver 220 receives the command value (command position), particularly the corrected command position, which is corrected with the command value correcting device 1 using the controlled variable of the control target 3, from the controller 120, and controls the control target 3 based on the received corrected command position.” (¶59)];
	“the variable related to the position feedback information of the second servo control unit is at least one of the position feedback information, a differentiated value of the position feedback information, and a double-differentiated value of the position feedback information of the second servo control unit.” [Examiner notes that the variable related to the position feedback information of the second servo control unit requires only one of 1. the position feedback information, 2. a differentiated value of the position feedback information, and 3. a double-differentiated value of the position feedback information of the second servo control unit.
	Namie teaches: variable related to the position feedback information of the second servo control unit is position feedback information.
	See the second servo control unit (e.g.; any one of 200s), where variable related to the position feedback information of the second servo control unit is a position feedback (e.g.; feedback information including position feedback from any of the controllers 200(1)-200(n)): “The command value correcting device 1 corrects the command position (reference command position) that is the reference control command value generated with the target orbit generator 4 using the controlled variable (detection position, that is, feedback position) that is the output of the control target 3 as the feedback information, and outputs the corrected command value (corrected command position).” (¶58)… “The command value correcting device 1 corrects the reference control command value generated with the target orbit generator 4 using a controlled variable that is output from the control target 3 as feedback information, and outputs a corrected command value.” (¶34)… “The command value correcting device 1 corrects the command value” “using the feedback information (the controlled variable of the control target 3) used in the lower-level controller 200” (¶39)].
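For illustration only, and not as part of the cited disclosure: the three alternatives recited in claim 3 (the position command, its differentiated value, and its double-differentiated value) can be related through a discrete-time difference. The following is a minimal sketch assuming uniformly sampled values; the helper name `differentiate`, the sample data, and the sampling period are hypothetical.

```python
def differentiate(samples, dt):
    """Return first-order finite differences of uniformly sampled values."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    return [(b - a) / dt for a, b in zip(samples, samples[1:])]


# Hypothetical position command samples taken at a period of 1.0.
position_command = [0.0, 1.0, 4.0, 9.0]
first_derivative = differentiate(position_command, 1.0)
second_derivative = differentiate(first_derivative, 1.0)
```

The same helper applies equally to sampled position feedback information.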

Claim 6 (amended):
	Regarding claim 6, Namie and Kawai disclose all the elements of claim 1,
	Namie further discloses, “A servo control device including: the machine learning device according to claim 1; and the plurality of servo control units configured to control the plurality of motors, the motors configured to drive the machine having the plurality of axes,” [See the machine learning system. See plurality of servo control units (e.g.; 200(1)-200(n)) that control plurality of motors (e.g.; control targets 3(1) to 3(n)). See motors (e.g.; control targets 3(1) to 3(n)) drive plurality of axes (e.g.; x, y, z, or rotational axis): “the controlled variable acquiring part further includes a learning controller” (¶15)… “plural lower-level controllers 200(1) to 200(n), and plural control targets 3(1) to 3(n) in which the drive is controlled with each of the lower-level controllers 200(1) to 200(n).” (¶76)… “controller 200 in order to perform drive control (such as “orbit follow-up control” and “orbit control in a working machine”) of the control target 3,” (¶34)… “the control target 3 is a servo motor, the lower-level controller 200 drives the servo motor such that the servo motor” (¶35)… “a control target 3 (a machine such as a servo motor and a machine element driven with the servo motor).” (¶33)];
	“with the one axis among the plurality of axes receiving the interference generated by the movement along at least one of the other axes, wherein” [See feedback from one of the controllers 200(1)-200(n) (e.g.; interference generated by movement along at least one of the other axes controlled by the motor of machine 3 that is controlled by controllers 200) is received by the controller 140 (e.g.; received by  command value correcting device 1; the one axis of the motor of control target 3 controlled by controller 140): “The higher-level controller 100 includes a target orbit generator 4 and the command value correcting device 1.” “command value correcting device 1 corrects the reference control command value generated with the target orbit generator 4 using a controlled variable that is output from the control target 3 as feedback information, and outputs a corrected command value. That is, the higher-level controller 100 transmits the corrected command value to the lower-level controller 200. The corrected command value is a value in which the command value correcting device 1 corrects a reference command value generated from the target orbit data (target orbit) in each control cycle using the controlled variable of the control target 3.” (¶34)];
	“the first servo control unit related to the one axis receiving the interference among the plurality of servo control units” [See the first servo control unit (e.g.; 140) related to the axis (e.g.; axis controlled by the motor of the machine 3 controlled by controller 140) receiving interference from the other servo control units (e.g.; 200(1)-200(n)) that control a plurality of motors (such as control targets 3(1) to 3(n)): “The higher-level controller 100 includes a target orbit generator 4 and the command value correcting device 1.” “command value correcting device 1 corrects the reference control command value generated with the target orbit generator 4 using a controlled variable that is output from the control target 3 as feedback information, and outputs a corrected command value. That is, the higher-level controller 100 transmits the corrected command value to the lower-level controller 200. The corrected command value is a value in which the command value correcting device 1 corrects a reference command value generated from the target orbit data (target orbit) in each control cycle using the controlled variable of the control target 3.” (¶34)];
	“comprises the compensation unit configured to obtain the compensation value for compensating for the at least one of the position error, the velocity command, and the torque command of the first servo control unit on the basis of the one or more functions including the at least one of the variable related to the position command and the variable related to the position feedback information of the second servo control unit related to the axis generating the interference, and” [Examiner notes that claim requires obtaining a compensation value for compensating for only one of 1. a position error, 2. a velocity command, and 3. a torque command of the first servo control unit. The obtaining is performed on the basis of a function including only one of : i. a variable related to a position command and ii. a variable related to position feedback information of a second servo control unit related to the axis generating the interference.
	Namie teaches: obtaining compensation value for compensating for a position error based on a variable related to a position feedback information of a second servo control unit related to the axis generating the interference
	See the first servo controller (e.g.; 140) includes compensator 1 that corrects the command position (e.g.; correcting position error) based on a variable related to a position feedback information of a second servo control unit related to the axis generating the interference (e.g.; feedback information including position feedback from any of the controllers 200(1)-200(n)): “The command value correcting device 1 corrects the command position (reference command position) that is the reference control command value generated with the target orbit generator 4 using the controlled variable (detection position, that is, feedback position) that is the output of the control target 3 as the feedback information, and outputs the corrected command value (corrected command position).” (¶58)… “The command value correcting device 1 corrects the reference control command value generated with the target orbit generator 4 using a controlled variable that is output from the control target 3 as feedback information, and outputs a corrected command value.” (¶34)… “The command value correcting device 1 corrects the command value” “using the feedback information (the controlled variable of the control target 3) used in the lower-level controller 200” (¶39)];
	“the machine learning device outputs the action information including the adjustment information of the one or more coefficients to the compensation unit.” [See action information is output and used in the learning, where action information includes adjustment information of the coefficient included in the state information (e.g.; adjustment information such as γ can be adjusted within 0<γ≦1, α can be adjusted within 0<α≦1): “In the above equation (1), s_t represents a state of the environment at a time t, and a_t represents an action at the time t. The action a_t changes the state to s_{t+1}. r_{t+1} represents a reward that can be gained via the change of the state. Further, the term with max is the Q-value multiplied by γ for the case where the action a for the highest Q-value known at that time is selected under the state s_{t+1}. γ is a parameter of 0<γ≦1, and referred to as discount rate. α is a learning factor, which is in the range of 0<α≦1.” (¶95)].

Claim 7 (amended):
	Regarding claim 7, Namie and Kawai disclose all the elements of claim 1,
	Namie further discloses, “A servo control system including: the machine learning device according to claim 1; and the plurality of servo control units configured to control the plurality of motors, the motors configured to drive the machine having the plurality of axes,” [See the machine learning system. See plurality of servo control units (e.g.; 200(1)-200(n)) that control plurality of motors (e.g.; control targets 3(1) to 3(n)). See motors (e.g.; control targets 3(1) to 3(n)) drive plurality of axes (e.g.; x, y, z, or rotational axis): “the controlled variable acquiring part further includes a learning controller” (¶15)… “plural lower-level controllers 200(1) to 200(n), and plural control targets 3(1) to 3(n) in which the drive is controlled with each of the lower-level controllers 200(1) to 200(n).” (¶76)… “controller 200 in order to perform drive control (such as “orbit follow-up control” and “orbit control in a working machine”) of the control target 3,” (¶34)… “the control target 3 is a servo motor, the lower-level controller 200 drives the servo motor such that the servo motor” (¶35)… “a control target 3 (a machine such as a servo motor and a machine element driven with the servo motor).” (¶33)];
	“with the one axis among the plurality of axes receiving the interference generated by the movement along at least one of the other axes, wherein” [See feedback from one of the controllers 200(1)-200(n) (e.g.; interference generated by movement along at least one of the other axes controlled by the motor of machine 3 that is controlled by controllers 200) is received by the controller 140 (e.g.; received by  command value correcting device 1; the one axis of the motor of control target 3 controlled by controller 140): “The higher-level controller 100 includes a target orbit generator 4 and the command value correcting device 1.” “command value correcting device 1 corrects the reference control command value generated with the target orbit generator 4 using a controlled variable that is output from the control target 3 as feedback information, and outputs a corrected command value. That is, the higher-level controller 100 transmits the corrected command value to the lower-level controller 200. The corrected command value is a value in which the command value correcting device 1 corrects a reference command value generated from the target orbit data (target orbit) in each control cycle using the controlled variable of the control target 3.” (¶34)];
	“the first servo control unit related to the one axis receiving the interference among the plurality of servo control units” [See the first servo control unit (e.g.; 140) related to the axis (e.g.; axis controlled by the motor of the machine 3 controlled by controller 140) receiving interference from the other servo control units (e.g.; 200(1)-200(n)) that control a plurality of motors (such as control targets 3(1) to 3(n)): “The higher-level controller 100 includes a target orbit generator 4 and the command value correcting device 1.” “command value correcting device 1 corrects the reference control command value generated with the target orbit generator 4 using a controlled variable that is output from the control target 3 as feedback information, and outputs a corrected command value. That is, the higher-level controller 100 transmits the corrected command value to the lower-level controller 200. The corrected command value is a value in which the command value correcting device 1 corrects a reference command value generated from the target orbit data (target orbit) in each control cycle using the controlled variable of the control target 3.” (¶34)];
	“comprises the compensation unit configured to obtain the compensation value for compensating for the at least one of the position error, the velocity command, and the torque command of the first servo control unit on the basis of the one or more functions including the at least one of the variable related to the position command and the variable related to the position feedback information of the second servo control unit related to the axis generating the interference, and” [Examiner notes that claim requires obtaining a compensation value for compensating for only one of 1. a position error, 2. a velocity command, and 3. a torque command of the first servo control unit. The obtaining is performed on the basis of a function including only one of : i. a variable related to a position command and ii. a variable related to position feedback information of a second servo control unit related to the axis generating the interference.
	Namie teaches: obtaining compensation value for compensating for a position error based on a variable related to a position feedback information of a second servo control unit related to the axis generating the interference
	See the first servo controller (e.g.; 140) includes compensator 1 that corrects the command position (e.g.; correcting position error) based on a variable related to a position feedback information of a second servo control unit related to the axis generating the interference (e.g.; feedback information including position feedback from any of the controllers 200(1)-200(n)): “The command value correcting device 1 corrects the command position (reference command position) that is the reference control command value generated with the target orbit generator 4 using the controlled variable (detection position, that is, feedback position) that is the output of the control target 3 as the feedback information, and outputs the corrected command value (corrected command position).” (¶58)… “The command value correcting device 1 corrects the reference control command value generated with the target orbit generator 4 using a controlled variable that is output from the control target 3 as feedback information, and outputs a corrected command value.” (¶34)… “The command value correcting device 1 corrects the command value” “using the feedback information (the controlled variable of the control target 3) used in the lower-level controller 200” (¶39)];
	“the machine learning device outputs the action information including the adjustment information of the one or more coefficients to the compensation unit.” [See action information is output and used in the learning, where action information includes adjustment information of the coefficient included in the state information (e.g.; adjustment information such as γ can be adjusted within 0<γ≦1, α can be adjusted within 0<α≦1): “In the above equation (1), s_t represents a state of the environment at a time t, and a_t represents an action at the time t. The action a_t changes the state to s_{t+1}. r_{t+1} represents a reward that can be gained via the change of the state. Further, the term with max is the Q-value multiplied by γ for the case where the action a for the highest Q-value known at that time is selected under the state s_{t+1}. γ is a parameter of 0<γ≦1, and referred to as discount rate. α is a learning factor, which is in the range of 0<α≦1.” (¶95)].

Claim 8 (amended):
	Regarding claim 8, Namie discloses, “A machine learning method for a machine learning device configured to perform machine learning with respect to a plurality of servo control units configured to control a plurality of motors, the motors configured to drive a machine having a plurality of axes,”  [See the machine learning system. See plurality of servo control units (e.g.; 200(1)-200(n)) that control plurality of motors (e.g.; control targets 3(1) to 3(n)). See motors (e.g.; control targets 3(1) to 3(n)) drive plurality of axes (e.g.; x, y, z, or rotational axis): “the controlled variable acquiring part further includes a learning controller” (¶15)… “plural lower-level controllers 200(1) to 200(n), and plural control targets 3(1) to 3(n) in which the drive is controlled with each of the lower-level controllers 200(1) to 200(n).” (¶76)… “controller 200 in order to perform drive control (such as “orbit follow-up control” and “orbit control in a working machine”) of the control target 3,” (¶34)… “the control target 3 is a servo motor, the lower-level controller 200 drives the servo motor such that the servo motor” (¶35)… “a control target 3 (a machine such as a servo motor and a machine element driven with the servo motor).” (¶33)];
	“with one axis among the plurality of axes receiving interference generated by movement along at least one of the other axes,” [See feedback from one of the controllers 200(1)-200(n) (e.g.; interference generated by movement along at least one of the other axes controlled by the motor of machine 3 that is controlled by controllers 200) is received by the controller 140 (e.g.; received by  command value correcting device 1; the one axis of the motor of control target 3 controlled by controller 140): “The higher-level controller 100 includes a target orbit generator 4 and the command value correcting device 1.” “command value correcting device 1 corrects the reference control command value generated with the target orbit generator 4 using a controlled variable that is output from the control target 3 as feedback information, and outputs a corrected command value. That is, the higher-level controller 100 transmits the corrected command value to the lower-level controller 200. The corrected command value is a value in which the command value correcting device 1 corrects a reference command value generated from the target orbit data (target orbit) in each control cycle using the controlled variable of the control target 3.” (¶34)];
	“a first servo control unit related to the one axis receiving the interference among the plurality of servo control units,” [See the first servo control unit (e.g.; 140) related to the axis (e.g.; axis controlled by the motor of the machine 3 controlled by controller 140) receiving interference from the other servo control units (e.g.; 200(1)-200(n)) that control a plurality of motors (such as control targets 3(1) to 3(n)): “The higher-level controller 100 includes a target orbit generator 4 and the command value correcting device 1.” “command value correcting device 1 corrects the reference control command value generated with the target orbit generator 4 using a controlled variable that is output from the control target 3 as feedback information, and outputs a corrected command value. That is, the higher-level controller 100 transmits the corrected command value to the lower-level controller 200. The corrected command value is a value in which the command value correcting device 1 corrects a reference command value generated from the target orbit data (target orbit) in each control cycle using the controlled variable of the control target 3.” (¶34)];
	“the first servo control unit comprising a compensation unit configured to obtain a compensation value for compensating for at least one of a position error, a velocity command, and a torque command of the first servo control unit on the basis of one or more functions including at least one of a variable related to a position command and a variable related to position feedback information of a second servo control unit related to the axis generating the interference,” [Examiner notes that claim requires obtaining a compensation value for compensating for only one of 1. a position error, 2. a velocity command, and 3. a torque command of the first servo control unit. The obtaining is performed on the basis of one or more functions including only one of : i. a variable related to a position command and ii. a variable related to position feedback information of a second servo control unit related to the axis generating the interference.
	Namie teaches: obtaining compensation value for compensating for a position error based on a function including a variable related to a position feedback information of a second servo control unit related to the axis generating the interference
	See the first servo controller (e.g.; 140) includes compensator 1 that corrects the command position (e.g.; correcting position error) based on a function including a variable related to a position feedback information of a second servo control unit related to the axis generating the interference (e.g.; feedback information including position feedback from any of the controllers 200(1)-200(n)): “The command value correcting device 1 corrects the command position (reference command position) that is the reference control command value generated with the target orbit generator 4 using the controlled variable (detection position, that is, feedback position) that is the output of the control target 3 as the feedback information, and outputs the corrected command value (corrected command position).” (¶58)… “The command value correcting device 1 corrects the reference control command value generated with the target orbit generator 4 using a controlled variable that is output from the control target 3 as feedback information, and outputs a corrected command value.” (¶34)… “The command value correcting device 1 corrects the command value” “using the feedback information (the controlled variable of the control target 3) used in the lower-level controller 200” (¶39)], but doesn’t explicitly disclose, “and the machine learning method comprising the steps of: acquiring state information including first servo control information of the first servo control unit, second servo control information of the second servo control unit, and one or more coefficients of the one or more functions; outputting action information including adjustment information of the one or more coefficient included in the state information to the compensation unit; outputting a reward value for reinforcement learning using an evaluation function serving as a function of the first servo control information; and updating a value function related to the adjustment information of the one or more coefficients on the basis of the reward value, the state information, and the action information.”
	However, Kawai discloses, “the machine learning method comprising the steps of: acquiring state information including first servo control information of the first servo control unit, second servo control information of the second servo control unit, and one or more coefficients of the one or more functions;” [Examiner notes that claim requires only one coefficient of one function.
	See machine learning device acquires state information including first and second servo control information (e.g.; state information including control information of each axis including a first and second axis) and a coefficient (e.g.; γ or α) : “The machine learning apparatus 1 includes a state observation unit 11 and a learning unit 12.” (¶38)… “The state observation unit 11 observes a state variable composed of at least one of data relating to the number of errors between the position command relative to a rotor of a motor which is drive-controlled by the motor control apparatus” “any command of the position command, the speed command, or the current command in the motor control apparatus,” “data relating to a state of the machine tool including the motor control apparatus.” (¶39)… “In the above equation (1), st represents a state of the environment at a time t, and at represents an action at the time t. The action at changes the state to st+1. rt+1 represents a reward that can be gained via the change of the state. Further, the term with max is the Q-value multiplied by γ for the case where the action a for the highest Q-value known at that time is selected under the state st+1. γ is a parameter of 0<γ≦1, and referred to as discount rate. α is a learning factor, which is in the range of 0<α≦1.” (¶95)];
	“outputting action information including adjustment information of the one or more coefficient included in the state information to the compensation unit;” [See action information is output and used in the learning, where action information includes adjustment information of the coefficient included in the state information (e.g.; adjustment information such as γ can be adjusted between 0<γ≦1, α can be adjusted between 0<α≦1): “In the above equation (1), st represents a state of the environment at a time t, and at represents an action at the time t. The action at changes the state to st+1. rt+1 represents a reward that can be gained via the change of the state. Further, the term with max is the Q-value multiplied by γ for the case where the action a for the highest Q-value known at that time is selected under the state st+1. γ is a parameter of 0<γ≦1, and referred to as discount rate. α is a learning factor, which is in the range of 0<α≦1.” (¶95)];
	“outputting a reward value for reinforcement learning using an evaluation function serving as a function of the first servo control information;” [See system outputs a reward value for reinforcement learning using evaluation function serving as a function of the first servo control information (e.g.; using function of first servo control information such as function of information related to the control information of the servo motor): “the reward calculation unit 21 may be configured to increase the reward when the number of errors observed by the state observation unit 11 is smaller than the number of errors observed by the state observation unit 11 before the current number of errors, and reduce the reward when larger” “increase the reward when the number of errors observed by the state observation unit 11 is inside a specified range, and to reduce the reward when the number of errors is outside the specified range.” (¶51)];
	“updating a value function related to the adjustment information of the one or more coefficients on the basis of the reward value, the state information, and the action information.” [See system updates the value function related to the adjustment information (e.g.; information related to the adjustment) of the coefficient (i.e.; correction coefficient) on the basis of reward value that was outputted, state information (e.g.; state variable observed) and action information (e.g.; action value table): “The function update unit 22 updates a function (action value table) for calculating the number of corrections used to correct any command of the position command, the speed command, or the current command in the motor control apparatus, based on the state variable observed by the state observation unit 11 and the reward calculated by the reward calculation unit 21.” (¶52)… “The learning unit 12 may calculate,” “the state variable observed by the state observation unit 11 and update the function (action value table) in real time.” “the function update unit 22” “update the function for calculating the number of corrections used to correct any command of the position command, the speed command, or the current command in the motor control apparatus, based on the state variable observed by the state observation unit 11 and the reward calculated by the reward calculation unit 21” (¶53)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the capability of acquiring state information including first and second servo control information of the first and second servo control unit and a coefficient of the function; combined the capability of outputting adjustment information of the coefficient included in the state information; combined the capability of outputting a reward value using a function of the first servo control information; and combined the capability of updating a value function related to the adjustment information (e.g.; information related to the adjustment) of the coefficient (i.e.; correction coefficient) on the basis of the reward value output, the state information, and the action information taught by Kawai with the method taught by Namie as discussed above. A person of ordinary skill in the servo control optimization field would have been motivated to make such combination in order to perform efficient learning of data and operation of the servo controller [Kawai: “it is possible to use, in unsupervised learning, data that can be acquired without actually operating the motor control apparatus (for example, data of simulation) and perform learning efficiently.” (¶54)].
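For illustration only, the reward and value-function update steps mapped above can be sketched as a tabular Q-learning update of the kind described in the quoted passage of Kawai (¶95), with a reward rule paraphrasing ¶51. This is a minimal sketch and is not taken from Kawai's disclosure: the function names, the reward magnitudes of ±1.0, the candidate adjustments, and the state labels are all hypothetical assumptions.

```python
from collections import defaultdict

# Hypothetical candidate adjustments to a single coefficient (assumption).
ACTIONS = (-0.1, 0.0, 0.1)


def reward_from_error(prev_error: float, curr_error: float) -> float:
    """Raise the reward when the error shrinks, lower it when the error grows.

    The +/-1.0 magnitudes are assumptions chosen only for this sketch.
    """
    if abs(curr_error) < abs(prev_error):
        return 1.0
    if abs(curr_error) > abs(prev_error):
        return -1.0
    return 0.0


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)),
    with learning factor 0 < alpha <= 1 and discount rate 0 < gamma <= 1.
    """
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# One learning step: the error dropped from 0.8 to 0.5 after adjusting by +0.1.
q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0
r = reward_from_error(prev_error=0.8, curr_error=0.5)
new_q = q_update(q_table, state="s0", action=0.1, reward=r, next_state="s1")
```

In this sketch the state, the action (a coefficient adjustment), and the reward together determine how the value table changes, which corresponds to the "on the basis of the reward value, the state information, and the action information" language of the claim.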

Claim(s) 4-5 and 9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Namie and Kawai, and further in view of Ikai et al. (US20180364678A1) [hereinafter Ikai].
Claim 4:
	Regarding claim 4, Namie and Kawai disclose all the elements of claim 1,
	Namie further discloses, “wherein a machining program during learning for controlling the first and second servo control units makes the first and second servo control units perform movement along the axis generating the interference” [See controlling first (e.g.; controlling controller 140) and second servo control units (e.g.; any of controller 200(1)-200(n)) to generate movement along the axis generating the interference (e.g.; generating feedback): “The command value correcting device 1 corrects the command position (reference command position) that is the reference control command value generated with the target orbit generator 4 using the controlled variable (detection position, that is, feedback position) that is the output of the control target 3 as the feedback information, and outputs the corrected command value (corrected command position).” (¶58)… “The command value correcting device 1 corrects the reference control command value generated with the target orbit generator 4 using a controlled variable that is output from the control target 3 as feedback information, and outputs a corrected command value.” (¶34)… “The command value correcting device 1 corrects the command value” “using the feedback information (the controlled variable of the control target 3) used in the lower-level controller 200” (¶39)], but doesn’t explicitly disclose, “wherein a machining program during learning for controlling the first and second servo control units makes the first and second servo control units” “stop movement along the axis receiving the interference during the machine learning.”
	However, Ikai discloses, “wherein a machining program during learning for controlling the first and second servo control units makes the first and second servo control units” “stop movement along the axis receiving the interference during the machine learning.” [See learning control of the motors are performed, and during learning operation, the system stops movement along the axis (e.g.; the motor 201 that moves the table in the X axis direction makes a transition from rotation to stop) receiving the interference (e.g.; X axis receiving interference due to the movement of the Y axis): “The machine learning device 300 searches by trial and error the optimal action a with which the total reward r for the future becomes the maximum. Thereby, the machine learning device 300 can select the optimal action a (that is, the optimal control parameters ai, bj)” (¶49)... “the motor 202 that moves the table in the Y axis direction makes a transition from stop to rotation operation, the motor 201 that moves the table in the X axis direction makes a transition from rotation to stop, and the table makes a transition from linear operation of the X axis direction to linear operation of the Y axis direction.” (¶52)… “the operation characteristics of when the motor 202 that drives the Y axis makes a transition from stop to the rotation operation are evaluated, and when the motor 202 that drives the Y axis rotates in the C2 point, the operation characteristics of when the motor 201 that drives the X axis makes a transition from the rotation operation to stop are evaluated. However, only with the operation in which the geometry is a square with corners R, operation characteristics in a shape in which movement starts in the same direction in stop and before the stop, cannot be evaluated.” (¶60)];
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the capability of stopping movement of an axis that is receiving interference during a machine learning process taught by Ikai with the device taught by Namie and Kawai as discussed above. A person of ordinary skill in the servo control optimization field would have been motivated to make such combination in order to efficiently adjust a control parameter [Ikai: “controlling a motor by operating a motor control unit related to a machine tool, a robot, or an industrial machine, to efficiently adjust a control parameter, and the like of the motor control unit on the basis of a result of the control, an evaluation method, and a control device can be provided.” (¶19)].

Claim 5:
	Regarding claim 5, Namie and Kawai disclose all the elements of claim 1, but they do not explicitly disclose, “an optimization action information output unit configured to output the adjustment information of the one or more coefficients on the basis of the value function updated by the value function updating unit.”
	However, Ikai discloses, “an optimization action information output unit configured to output the adjustment information of the one or more coefficients on the basis of the value function updated by the value function updating unit.” [See the coefficients are adjusted based on the updated value function (e.g.; updated optimal evaluation): “then the evaluation program is operated in the CNC device 100, and thereby, operation characteristics of the CNC device related to control parameters ai, bj is observed. Thereby, the machine learning device 300 can adjust (learn) the coefficients ai, bj with which operation characteristics of a machine tool of when the machine tool is operated by the evaluation program are optimal, from among a set of coefficients ai, bj that have been set to arbitrary values. Thus, the machine learning device 300 uses the position detection value, and the like obtained by feedback from the motors 201, 202, to learn the control parameters ai, bj for feedforward compensation, and set the optimal control parameters with respect to the motor control units 103, 104.” (¶47)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Ikai with the device taught by Namie and Kawai as discussed above. A person of ordinary skill in the servo control optimization field would have been motivated to make such combination for the same reasons as described in claim 1 above.
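For illustration only, outputting adjustment information "on the basis of the value function updated" can be sketched as selecting, for the current state, the adjustment with the highest learned value. This is a minimal sketch under stated assumptions and is not taken from Ikai or the other references: the greedy selection rule, the function name `best_adjustment`, the table contents, and the starting coefficient are all hypothetical.

```python
def best_adjustment(q, state, actions):
    """Return the adjustment whose learned value is highest for this state.

    Greedy selection is an assumption made for this sketch.
    """
    return max(actions, key=lambda a: q.get((state, a), 0.0))


# Hypothetical learned values for three candidate adjustments in state "s0".
q_table = {("s0", -0.1): 0.2, ("s0", 0.0): 0.1, ("s0", 0.1): 0.7}

coefficient = 1.0  # hypothetical current coefficient of the compensation function
adjustment = best_adjustment(q_table, "s0", (-0.1, 0.0, 0.1))
coefficient += adjustment  # apply the selected adjustment to the coefficient
```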

Claim 9:
	Regarding claim 9, Namie and Kawai disclose all the elements of claim 8, but they do not explicitly disclose, “the machine learning device outputting adjustment information of the one or more coefficients, serving as optimization action information on the basis of the updated value function.”
	However, Ikai discloses, “the machine learning device outputting adjustment information of the one or more coefficients, serving as optimization action information on the basis of the updated value function.” [See the coefficients are adjusted based on the updated value function (e.g.; updated optimal evaluation): “then the evaluation program is operated in the CNC device 100, and thereby, operation characteristics of the CNC device related to control parameters ai, bj is observed. Thereby, the machine learning device 300 can adjust (learn) the coefficients ai, bj with which operation characteristics of a machine tool of when the machine tool is operated by the evaluation program are optimal, from among a set of coefficients ai, bj that have been set to arbitrary values. Thus, the machine learning device 300 uses the position detection value, and the like obtained by feedback from the motors 201, 202, to learn the control parameters ai, bj for feedforward compensation, and set the optimal control parameters with respect to the motor control units 103, 104.” (¶47)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the capability of outputting adjustment information of the coefficient of the compensation unit, serving as optimization action information on the basis of the updated value function taught by Ikai with the method taught by Namie and Kawai as discussed above. A person of ordinary skill in the servo control optimization field would have been motivated to make such combination in order to efficiently adjust a control parameter [Ikai: “controlling a motor by operating a motor control unit related to a machine tool, a robot, or an industrial machine, to efficiently adjust a control parameter, and the like of the motor control unit on the basis of a result of the control, an evaluation method, and a control device can be provided.” (¶19)].

Response to Arguments
Applicant's arguments filed 06/17/2022 have been fully considered but they are not persuasive.
Applicant responds
(a)	Rejections under 35 U.S.C. § 103
	The Applicant asserts that Kawai fails to disclose or suggest machine learning and making adjustment to the coefficients a1 to a6 of a position error compensation unit, the coefficients b1 to b6 of a velocity command compensation unit, and the coefficients c1 to c6 of a torque command compensation unit.
(Page: 9)

With respect to (a) above, Examiner appreciates the interpretative description given by Applicant in response.
	In response to applicant's argument that the references fail to show certain features of applicant’s invention, it is noted that the features upon which applicant relies (i.e., “making adjustment to the coefficients a1 to a6 of a position error compensation unit, the coefficients b1 to b6 of a velocity command compensation unit, and the coefficients c1 to c6 of a torque command compensation unit”) are not recited in the rejected claim(s).  Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
The claim recites, “acquire state information including first servo control information of the first servo control unit, second servo control information of the second servo control unit, and one or more coefficients of the one or more functions;” “output action information including adjustment information of the one or more coefficients included in the state information” “update a value function related to the adjustment information of the one or more coefficients.”
Under the broadest reasonable interpretation, the claim describes updating one or more coefficients, where the coefficients can be any coefficients. The claim does not specifically recite “making adjustment to the coefficients” “of a position error compensation unit, the coefficients” “of a velocity command compensation unit, and the coefficients” “of a torque command compensation unit.”
Applicant’s arguments are fully considered, but for the above described reasons, they are not persuasive; therefore, claims 1-9 are rejected under 35 U.S.C. § 103 in view of the references as set forth in the current office action.

(b)	Rejections under 35 U.S.C. § 103
	As noted above, although the Examiner relies on the combination of Namie and Kawai for disclosing or suggesting all the features recited in independent claims 1 and 8, the Examiner appears to rely specifically on Kawai for disclosing or suggesting the features of the reward output unit and a value function updating unit, as similarly recited in claims 1 and 8. 
	Based on the deficiencies noted above in Kawai, the Applicant asserts that no combination of Namie and Kawai would result in, or otherwise render obvious, the features of claims 1 and 8. Additionally, no combination of Namie and Kawai would result in, or otherwise render obvious, the features of claims 2, 3, and 6 and 7 by virtue of their dependencies from independent claim 1.

	Claims 4 and 5 depend from independent claim 1, and claim 9 depends from independent claim 8. As noted above, no combination of Namie and Kawai would result in, or otherwise render obvious, the features of claims 1 and 8. Additionally, the Applicant asserts that Ikai fails to overcome the deficiencies noted in the combination of Namie and Kawai. Accordingly, no combination of Namie and Kawai with Ikai would result in, or otherwise render obvious, the features of claims 4, 5, and 9 at least by virtue of their respective dependencies from independent claims 1 and 8.
	In light of the above, the Applicant submits that all the pending claims are patentable over the prior art of record. The Applicant respectfully requests that the Examiner withdraw the rejections presented in the outstanding Office Action and pass the present application to issue.
(Page: 9-10)

With respect to (b) above, Examiner appreciates the interpretative description given by Applicant in response.
Applicant’s arguments are fully considered, but for the same reasons as described above in (a), they are not persuasive; therefore, claims 1-9 are rejected under 35 U.S.C. §103 in view of the references as set forth in the current office action.
Conclusion
	The prior art made of record and not relied upon, which is considered pertinent to applicant's disclosure, is listed in the PTO-892 Notice of References Cited.
US20150148952A1 – Robot control apparatus and robot control method:
	Shiratsuchi describes a positional-relation-matrix generating unit that generates, on the basis of N first command values for learning per one robot, which are position-corrected command values for positioning the first robot and the second robot on tracks of the first robot and the second robot during the synchronous driving, concerning the respective N first command values for learning, positional relation matrices for defining a positional relation during the synchronous driving between a first command value for learning related to the first robot and a first command value for learning related to the second robot; a first command-value output unit that outputs a first command value for driving at each of M (M>N) operation periods (¶11).
	THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMED SHAFAYET whose telephone number is (571)272-8239. The examiner can normally be reached M-F 8:30 AM-5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kenneth M Lo can be reached on (571)272-9774. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/M.S./
Examiner
Art Unit 2116



/KENNETH M LO/Supervisory Patent Examiner, Art Unit 2116