DETAILED ACTION

Notice of Pre-AIA or AIA Status

1. 	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
2.	Applicant’s AMENDMENTS TO THE CLAIMS filed December 18, 2020 is respectfully acknowledged. Claims 1-20 are pending for examination.

Response to Arguments
3.	Applicant's arguments filed December 18, 2020 have been fully considered but they are not persuasive. 
	Applicant states that Coenen does not teach the claimed two learning phases because, while “the phase (i) is the phase where the drone learns to take off to reach the desired position, phase (ii) is the phase where the drone learns to maintain stabilization of the drone position, and phase (iii) is the phase where the drone is in normal operation”, “phase (ii) cannot be the claimed operations and secondary learning phase because the drone has yet to learn to maintain stabilization of the drone position after phase (i)”, and “phase (iii) cannot be the claimed operations and secondary learning phase because no learning is taking place during normal operation”. However, Examiner’s position is that the Applicant appears to focus on what Coenen teaches rather than showing a contrast between the teaching of Coenen and the claim language. Particularly, the claim language does not specifically appear to define what happens during the two phases in a manner that distinguishes from the phases as 
	Applicant’s arguments with respect to claim(s) 1 that “Coenen makes no mention of using observation data for learning” have been considered but are moot because the new ground of rejection does not rely on all references applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Interpretation
4. 	The following is a quotation of 35 U.S.C. 112(f):
	(f) Element in Claim for a Combination. - An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

5.    	The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
	An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

6.    	This application includes one or more claim limitations that do not use the word "means," but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: "control and learning module" in claims 1-20, "state estimation module" in claims 5, 6, 8, and 16; "dynamics modeling module" in claims


	Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
	If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
7. 	In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

8. 	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


9. 	Claims 1, 2, 5-11 and 16-19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Coenen et al. (US 9,189,730 B1) in view of Ponulak et al. (US 9,597,797 B2).

	Regarding claim 1, Coenen et al. discloses a control and learning module for controlling a robotic arm (i.e. a controller apparatus 520 configured to control a plant 510 which may include, for example, a robotic arm - ¶86, 88; FIG. 5), comprising:
at least one learning module including at least one neural network (i.e. controller 520 of FIG. 5 may comprise spiking neuron network, e.g., the network 400 of FIG. 4 - ¶109), wherein the at least one neural network is configured to receive and be trained by both state measurements based on measurements of current state by sensors and observation measurements based on observation data obtained by an observation system (i.e. the input signal x(t) may comprise desired motion trajectory, for example, in order to predict future state of the robot on the basis of current state and desired motion; state variables q associated with the control model may be provided to the learning block 420 via the pathway 405, the learning block 420 of the neuron 430 may receive the output spike train y(t) via the pathway 408_1; the input signal x(t) may comprise a multi-dimensional input, comprising two or more input streams from individual sensor channels - ¶72, 88, 131).
	Coenen et al. does not disclose wherein the at least one neural network is configured to receive and be trained during the initial phase, and is configured to be re-tuned by updated observation data for improved performance during an operations and secondary learning phase when the robotic arm is in normal operation and after the initial learning phase in one embodiment.
	However, Coenen et al. in another embodiment (FIGS. 9A-9C) discloses that the position error is shown as a function of time during the following phases of operation: (i) initialization, where the pendulum motors may be activated and controller network initial weights assigned; (ii) training, where the pendulum controller may be provided with reinforcement signal while adjusting learning parameters of the network in order to stabilize pendulum orientation; (iii) operation after learning where the controller may maintain pendulum position without requiring further reinforcement indication; and/or other phases of operation (¶253).
	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the embodiment of at least FIG. 5 of Coenen et al. to include the embodiment of FIGS. 9A-9C of Coenen et al. in order to reduce energy use by the controller.
	Coenen et al. does not disclose wherein the state measurements represent actual values of the current state and the observation measurements represent estimated values of the current state.
	However, Ponulak et al. discloses that estimation of the force/torque exerted by a trainer (the, so called, contact forces) may be achieved using an internal model configured to predict expected unconstrained readings from the force/torque sensors and by comparing them with the actual sensory readings (¶49).
	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the module of Coenen et al. to include the features of Ponulak et al. in order to reduce the discrepancy between the target behavior and the actual behavior of the robot.
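	For illustration only, the comparison described in ¶49 of Ponulak et al. (expected unconstrained readings versus actual sensory readings) can be sketched as a simple residual computation. The sketch below is not taken from either reference: the function names, the linear stand-in for the internal model, and the sample readings are all hypothetical assumptions chosen to make the example self-contained.

```python
# Illustrative sketch only. The "internal model" here is a hypothetical
# linear stand-in; Ponulak et al. does not specify this form.
from typing import List, Sequence


def predict_unconstrained(torque_cmd: Sequence[float], gain: float = 1.0) -> List[float]:
    """Hypothetical internal model: expected force/torque sensor readings
    when no trainer is in contact with the robot."""
    return [gain * u for u in torque_cmd]


def estimate_contact_force(actual: Sequence[float], expected: Sequence[float]) -> List[float]:
    """Estimate contact forces as actual readings minus expected readings."""
    if len(actual) != len(expected):
        raise ValueError("reading vectors must have the same length")
    return [a - e for a, e in zip(actual, expected)]


# Assumed sample values, one entry per sensor channel.
expected = predict_unconstrained([1.0, 2.0, 0.5])
contact = estimate_contact_force([1.5, 2.0, 0.0], expected)
```

	Under these assumptions, a nonzero entry in `contact` corresponds to a channel where the actual reading departs from the predicted unconstrained reading, which is the sense in which the actual values and the estimated values of the current state are distinguished above.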

	Regarding claim 2, Coenen et al. further discloses the control and learning module according to Claim 1, wherein the current state relates to information external to the robotic arm (i.e. sensors configured to detect feedback from a user and/or the environment - ¶48).

	Regarding claim 5, Coenen et al. further discloses the control and learning module according to Claim 1, wherein the at least one learning module comprises:
	a state estimation module configured to provide an estimated state based on only the observation measurements (i.e. learning block 420 of FIG. 4 may receive multidimensional sensory input signals x(t) and internal state variables - ¶120); and
	a dynamics modeling module configured to generate a dynamics model and a dynamics model output variance, the dynamics model output variance representing an uncertainty of the dynamics model (i.e. generalized dynamics equations for spiking neurons models may be expressed as a superposition of input, interaction between the input current and the neuronal state variables - ¶56).

	Regarding claim 6, Coenen et al. further discloses the control and learning module according to Claim 5, wherein the state estimation module is configured to output a first estimated current state and a variance associated with the first estimated current state (the input signal x(t) may comprise desired motion trajectory, for example, in order to predict future state of the robot on the basis of current state and desired motion - ¶88).

	Regarding claim 7, Coenen et al. further discloses the control and learning module according to Claim 6, wherein the dynamics modeling module is configured to output a second estimated current state (i.e. learning method implementation may be advantageous in applications where the performance function F(t) may depend on the current values of the inputs x, outputs y, state variables q, and/or signal r - ¶114).

	Regarding claim 8, neither Coenen et al. nor Ponulak et al. discloses the control and learning module according to Claim 7, wherein the state estimation module and the dynamics modeling module are each configured to receive an input relating to a difference between the first estimated current state and the second estimated current state to improve performance during the operations and secondary learning phase.
	However, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the module of Ponulak et al. to include wherein the state estimation module and the dynamics modeling module are each configured to receive an input relating to a difference between the first estimated current state and the second estimated current state to improve performance during the operations and secondary learning phase, since the module of Ponulak et al. facilitates sensor value comparison so a discrepancy between the predicted state and actual state may be determined.

	Regarding claim 9, Coenen et al. further discloses the control and learning module according to Claim 5, wherein the estimated state includes estimated positions of obstacles and target objects in an environment (i.e. signal x(t) may comprise a stream of raw sensor data, e.g., proximity, inertial, terrain imaging, and/or other raw sensor data and/or preprocessed data, e.g., velocity, extracted from accelerometers, distance to obstacle, positions, and/or other preprocessed data; such as those involving object recognition, the signal x(t) may comprise an array of pixel values in the input image - ¶72).

	Regarding claim 10, Coenen et al. further discloses the control and learning module according to Claim 5, further comprising a control policy module configured to generate a control policy/rule command and a control policy/rule variance associated with the control policy command based on the estimated state from the state estimation module (i.e. M multiple performance measures P_l(x,y,q,r,t), l=1...M, may be utilized in order to, for example, implement more than one learning rule simultaneously, and potentially in parallel, and/or to different sections of the preprocessing pathways. In one or more implementations, some of the multiple performance measures may be associated with sensory processing areas. In some implementations some of the multiple performance measures may correspond to areas providing state information and/or other type of information to the controllers - ¶194).

	Regarding claim 11, neither Coenen et al. nor Ponulak et al. disclose the control and learning module according to Claim 5, wherein the control policy module is configured to generate the control policy command and the control policy variance only during the operations and secondary learning phase.
	However, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the module of Coenen et al. to include wherein the control policy module is configured to generate the control policy command and the control policy variance only during the operations and secondary learning phase, since the embodiment of FIG. 9A-9C facilitates multiple phases of operation.

	Regarding claim 16, Coenen et al. does not disclose the control and learning module according to Claim 10, wherein the state estimation module, the dynamics modeling module, and the control policy module each include a neural network which receives training in both the initial learning phase and the operations and secondary learning phase in the same embodiment.
	However, Coenen et al. in another embodiment (FIGS. 9A-9C) discloses that the position error is shown as a function of time during the following phases of operation: (i) initialization, where the pendulum motors may be activated and controller network initial weights assigned; (ii) training, where the pendulum controller may be provided with reinforcement signal while adjusting learning parameters of the network in order to stabilize pendulum orientation; (iii) operation after learning where the controller may maintain pendulum position without requiring further reinforcement indication; and/or other phases of operation (¶253).
	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the embodiment of at least FIG. 5 of Coenen et al. to include the embodiment of FIGS. 9A-9C of Coenen et al. in order to reduce energy use by the controller.

	Regarding claim 17, neither Coenen et al. nor Ponulak et al. disclose the control and learning module according to Claim 16, wherein the state estimation module, the dynamics modeling module, and the control policy module each output a variance representing uncertainty of each of the state estimation module, the dynamics modeling module, and the control policy module.
	However, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the module of Coenen et al. to include wherein the state estimation module, the dynamics modeling module, and the control policy module each output a variance representing uncertainty of each of the state estimation module, the dynamics modeling module, and the control policy module, since the system of Coenen et al. shows that it is capable of facilitating outputting an uncertainty/probability in its components.

	Regarding claim 18, Coenen et al. further discloses the control and learning module according to Claim 5, wherein the dynamics modeling module includes a preliminary dynamics model and a complementary dynamics model, the preliminary dynamics model being predetermined and providing state prediction based on existing knowledge about system dynamics of the robotic arm (i.e. the reinforcement learning process of the controller 520 of FIG. 5 may be based on the sensor input 504, 502 and the reinforcement signal 514, e.g., obstacle collision signal from robot bumpers, distance from robotic arm endpoint to the target position. The reinforcement signal r(t) may inform the adaptive controller that the previous behavior led to "desired" or "undesired" results, corresponding to positive and negative reinforcements, respectively. While the plant must be controllable and the control system may be required to have access to appropriate sensory information, the detailed knowledge of motor actuator dynamics or of structure and significance of sensory signals may not be required to be known by the controller apparatus 520 - ¶114).

	Regarding claim 19, Coenen et al. further discloses the control and learning module according to Claim 18, wherein the complementary dynamics model is configured to generate a correction parameter to correct the state prediction provided by the preliminary dynamics model (i.e. controller 520 may be configured to generate the output 508 so as to optimize the performance measure. Optimizing the performance measure may minimize the error - ¶98).

10. 	Claim(s) 3 is/are rejected under 35 U.S.C. 103 as being unpatentable over Coenen et al. (US 9,189,730 B1) in view of Ponulak et al. (US 9,597,797 B2) as applied to claims 1, 2, 5-11 and 16-19 above, and further in view of Narayanan et al. (US 2018/0361514 A1).

	Regarding claim 3, neither Coenen et al. nor Ponulak et al. disclose the control and learning module according to Claim 1, wherein the at least one neural network is represented as a Bayesian neural network.
	However, Narayanan et al. discloses that the machine learning system 100 may employ, for example, a support vector machine, a tensor processing unit, a graphics processing unit, an artificial neural network, a Bayesian network, or a learning classifier system [0035].
	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the module of Coenen et al. to include the features of Narayanan et al. in order to be able to more quickly and easily hone in on variables or parameters that will result in quality welds and characterize a quality weld in accordance with code requirements before going through all of the trouble of performing the mechanical testing.

11.    	Claim(s) 4 is/are rejected under 35 U.S.C. 103 as being unpatentable over Coenen et al. (US 9,189,730 B1) in view of Ponulak et al. (US 9,597,797 B2) as applied to claims 1, 2, 5-11 and 16-19 above, and further in view of Calise et al. (US 7,769,703 B2).

	Regarding claim 4, neither Coenen et al. nor Ponulak et al. disclose the control and learning module according to Claim 1, wherein the at least one neural network is configured to generate an output relating to an output task and a variance associated with the output, the variance being a measure of uncertainty relating to reliability of the output task.
	However, Calise et al. discloses a diagram of a neural network comprising an input layer, hidden layer and output layer of neurons with connection weights N, M updated using the estimated training error signal to account for unmodeled dynamics and uncertainty in the parameters input to the neural network (¶4).
	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the module of Coenen et al. to include the features of Calise et al. in order to provide augmentation of an EKF with an NN which accounts for the unmodeled dynamics of the target and platform used to observe the target.

12.    	Claim(s) 12-15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Coenen et al. (US 9,189,730 B1) in view of Ponulak et al. (US 9,597,797 B2) as applied to claims 1, 2, 5-11 and 16-19 above, and further in view of Okada et al. (US 2018/0218262 A1).

	Regarding claim 12, neither Coenen et al. nor Ponulak et al. disclose the control and learning module according to Claim 10, further comprising an optimal control module configured to generate an optimal control command based on the dynamics model from the dynamics modeling module and one of the state measurements and estimated states.
	However, Okada et al. discloses a control method for use in a control device for performing optimal control by path integral, including inputting a current state of a control target and an initial control sequence being a control sequence having a plurality of control parameters for the control target as its components into a neural network including a machine-learned dynamics model and cost function, and outputting a control sequence for controlling the control target, the control sequence being calculated by the neural network by path integral from the current state and the initial control sequence by using the dynamics model and the cost function [0038].
	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the module of Coenen et al. to include the features of Okada et al. in order to conceive a control device and control method capable of achieving optimal control using a neural network.

	Regarding claim 13, neither Coenen et al. nor Ponulak et al. nor Okada et al. disclose the control and learning module according to Claim 12, wherein the optimal control module is configured to override the control policy command from the control policy module when the control policy variance is larger than a predefined variance threshold value.
	However, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the module of Coenen et al. to include wherein the optimal control module is configured to override the control policy command from the control policy module when the control policy variance is larger than a predefined variance threshold value, since it has been held that discovering the optimum or workable ranges involves only routine skill in the art. In re Aller, 105 USPQ 233.
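	For clarity of the record, the limitation of claim 13 as read by the Examiner amounts to a threshold comparison that selects between two commands. The following is a minimal sketch of that reading only; the function name, the scalar command type, and the sample values are hypothetical and do not appear in the claims or in any cited reference.

```python
# Illustrative sketch of the claim 13 limitation as read above.
# All names and values are hypothetical.

def select_command(policy_cmd: float,
                   policy_variance: float,
                   optimal_cmd: float,
                   variance_threshold: float) -> float:
    """Return the optimal control command when the control policy variance
    is larger than the predefined threshold; otherwise keep the control
    policy command."""
    if policy_variance > variance_threshold:
        return optimal_cmd   # policy output is too uncertain: override it
    return policy_cmd        # policy output is trusted


# Assumed sample values: threshold 0.3, policy variance 0.5 (too uncertain).
chosen = select_command(policy_cmd=0.2, policy_variance=0.5,
                        optimal_cmd=0.1, variance_threshold=0.3)
```

	In this sketch the only design parameter is `variance_threshold`, i.e. the predefined variance threshold value recited in the claim.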

	Regarding claim 14, neither Coenen et al. nor Ponulak et al. nor Okada et al. disclose the control and learning module according to Claim 13, further comprising a reachability analysis module configured to receive the state measurements, the dynamics model parameters and the associated output variance from the dynamics modeling module, and determine whether the current state is in a safe state.
	However, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the module of Coenen et al. to include a reachability analysis module configured to receive the state measurements, the dynamics model parameters and the associated output variance from the dynamics modeling module, and determine whether the current state is in a safe state, because the control system of Coenen et al. is shown to facilitate safe operations between the user and the robot.

	Regarding claim 15, neither Coenen et al. nor Ponulak et al. nor Okada et al. disclose the control and learning module according to Claim 14, wherein the reachability analysis module is configured to generate a robust control command overriding the optimal control command from the optimal control module when the reachability analysis module determines that the current state is in an unsafe state.
	However, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the module of Coenen et al. to include wherein the reachability analysis module is configured to generate a robust control command overriding the optimal control command from the optimal control module when the reachability analysis module determines that the current state is in an unsafe state, since it is well known in the art to be desirable to place the control and learning module of Coenen et al. in a safe state of operation to prevent injury to the user.
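	By way of illustration, the safe-state determination of claim 14 and the override of claim 15 can be sketched as a bounds check followed by a command selection. This is a deliberately simplified, hypothetical one-step check, not a full reachability analysis and not a description of Coenen et al.; the interval limits, the factor `k`, and all names are assumptions.

```python
# Illustrative sketch only: a one-step interval check standing in for the
# claimed reachability analysis. All names and parameters are hypothetical.
import math


def is_safe(state: float, predicted_delta: float, model_variance: float,
            lower: float, upper: float, k: float = 2.0) -> bool:
    """Treat the state as safe if the predicted next state, widened by
    k standard deviations of the dynamics model output, stays inside
    the assumed safe interval [lower, upper]."""
    margin = k * math.sqrt(model_variance)
    next_state = state + predicted_delta
    return lower <= next_state - margin and next_state + margin <= upper


def arbitrate(optimal_cmd: float, robust_cmd: float, safe: bool) -> float:
    """The robust control command overrides the optimal control command
    when the state is determined to be unsafe."""
    return optimal_cmd if safe else robust_cmd


# Assumed sample values: the state is near the upper limit, so the
# widened prediction leaves the safe interval.
safe = is_safe(state=0.8, predicted_delta=0.1, model_variance=0.01,
               lower=-1.0, upper=1.0)
cmd = arbitrate(optimal_cmd=0.5, robust_cmd=0.0, safe=safe)
```

	Under these assumptions a larger dynamics model output variance widens the margin, so the robust control command takes over sooner as the model becomes less certain.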

13. 	Claim(s) 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Coenen et al. (US 9,189,730 B1) in view of Ponulak et al. (US 9,597,797 B2) as applied to claims 1, 2, 5-11 and 16-19 above, and further in view of Schreiber et al. (US 10,635,074 B2).

	Regarding claim 20, neither Coenen et al. nor Ponulak et al. disclose the control and learning module according to Claim 17, wherein the complementary dynamics model is configured to generate the dynamics model variance associated with the correction parameter.
	However, Schreiber et al. discloses that a dynamic model is typically defined by determining the dynamic equations of the manipulator based on the weight and geometric dimensions of the manipulator's components, with the variance between the dynamic model and reality having been shown to easily be 5-10% (¶9-11).
	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the module of Coenen et al. to include the features of Schreiber et al. for increased robustness of the recognition of the release request results.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RODNEY P KING whose telephone number is (571)270-7823.  The examiner can normally be reached on 7am - 4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Khoi Tran can be reached on 571-272-6919.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


RODNEY P. KING
Examiner
Art Unit 3664



/KHOI H TRAN/Supervisory Patent Examiner, Art Unit 3664