DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
This is a final office action for application Serial No. 16/159,194. The amendment filed on 04/27/2021 has been entered and fully considered.
Claims 3 and 12 have been amended.
Claims 1-20 are pending in the instant application.
Response to Arguments/Rejections
Applicant's arguments, see remarks filed 04/27/2021, have been fully considered but they are not persuasive. Applicant argues that the prior art does not teach the features with respect to Claims 1, 10 and 19. Specifically, “Seo fails to disclose outputting at least one goal-oriented action as determined during the current image frame, wherein the at least one goal-oriented action is based on the at least one past image frame, the current image frame, and at least one predicted action and controlling a vehicle to be autonomously driven based on a naturalistic driving behavior data set that includes the at least one goal-oriented action”, and further states “Seo and/or Shalev-Shwartz also fail to disclose controlling a vehicle to be autonomously driven based on a naturalistic driving behavior data set that includes the at least one goal-oriented action. There is no disclosure by Seo and/or Shalev-Shwartz of a naturalistic driving behavior data set that includes at least one goal-oriented action that is based on the at least one past image frame, the current image frame, and at least one predicted action that is utilized to control a vehicle to be autonomously driven”.
With respect to the argument that “Seo fails to disclose outputting at least one goal-oriented action as determined during the current image frame, wherein the at least one goal-oriented action is based on the at least one past image frame, the current image frame, and at least one predicted action and controlling a vehicle to be autonomously driven based on a naturalistic driving behavior data set that includes the at least one goal-oriented action” as recited in claim 1, the Examiner admitted in the previous office action that Seo does not teach the claim feature. However, Seo in view of Shalev-Shwartz teaches the feature, where the Shalev-Shwartz reference was brought in to teach the claim feature “outputting at least one goal-oriented action as determined during the current image frame, wherein the at least one goal-oriented action is based on the at least one past image frame, the current image frame”.
	The Examiner points to paragraph [0115] of the Shalev-Shwartz reference, where the citation is interpreted such that the captured images analyzed are the at least one past image frame and the current image frame. Furthermore, it is interpreted that the analysis of these images provides “control signals to one or more of throttling system 220, braking system 230, and steering system 240 to navigate vehicle 200 (e.g., by causing an acceleration, a turn, a lane shift, etc.)”, which is controlling a vehicle to be autonomously driven based on a naturalistic driving behavior. As stated in paragraph [0200] of Shalev-Shwartz, the data set is a set of navigational goals for the host vehicle, and the data set is used to provide driving comfort and human-like driving behavior of the system, which is interpreted as the claim feature of controlling a vehicle to be autonomously driven based on a naturalistic driving behavior data set that includes the at least one goal-oriented action developed by the safety model and executed in an autonomous driving state.

Applicant’s arguments, with respect to the rejection(s) of claim(s) 3 and 12 under 35 USC § 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of Brown et al. (US-2020/0051252).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 9-10 and 18-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Seo et al. (US-2018/0246521) in view of Shalev-Shwartz (US-2018/0032082).
	Regarding Claim 1, Seo discloses a computer-implemented method for utilizing a temporal recurrent network for online action detection (see at least Seo:  Abstract), comprising: 
	receiving image data that is based on at least one image captured by a vehicle camera system (Seo: Para. [0052], [0074] an operation of the image processing apparatus 100 is described based on the first frame 110 and the second frame 121, but the following description may be also applicable to other frames of the input image);
	analyzing the image data (Seo: Para. [0061], By adjusting the exposure of the second frame 121, the image processing apparatus 100 may generate a first frame 130 of the synthesized image by synthesizing the first frame 110 and the second frame 123. For example, the image processing apparatus 100 may synthesize the remaining region not including the target object 115 in the first frame 110 and a region 125 corresponding to the target object 115 in the second frame 123) to determine a plurality of image frames (Seo: Para. [0052], The input image may include a plurality of consecutive or substantially consecutive frames. For example, the input image includes a first frame 110 and a second frame 121) wherein the plurality of image frames include at least one past image frame, and a current image frame (Seo:  Para. [0056], The image processing apparatus 100 may adjust an exposure of the second frame 121 in response to the target object 115 being recognized in the first frame 110 (* i.e., past image frame). A second frame 123 (* i.e., current image frame) indicates a state in which the exposure is adjusted).
	Seo does not explicitly teach:
	outputting at least one goal-oriented action as determined during the current image frame, wherein the at least one goal-oriented action is based on the at least one past image frame, the current image frame, and at least one predicted action; and 
	controlling a vehicle to be autonomously driven based on a naturalistic driving behavior data set that includes the at least one goal-oriented action. 
	However, in the same field of endeavor, Shalev–Shwartz teaches:
	outputting at least one goal-oriented action as determined during the current image frame, wherein the at least one goal-oriented action is based on the at least one past image frame, the current image frame (Shalev–Shwartz: Para. [0115], lines 3-14, As indicated in FIG. 2F, vehicle 200 may include throttling system 220, braking system 230, and steering system 240. System 100 may provide inputs (e.g., control signals) to one or more of throttling system 220, braking system 230, and steering system 240 over one or more data links (e.g., any wired and/ or wireless link or links for transmitting data). For example, based on analysis of images acquired by image capture devices 122, 124, and/or 126, system 100 may provide control signals to one or more of throttling system 220, braking system 230, and steering system 240 to navigate vehicle 200 (e.g., by causing an acceleration, a turn, a lane shift, etc.)), and at least one predicted action (Shalev–Shwartz: Para. [0145], lines 8-10, processing unit 110 may estimate parameters for the detected object and compare the object's frame-by-frame position data to a predicted position); and 
	controlling a vehicle to be autonomously driven based on a naturalistic driving behavior data set (Shalev–Shwartz: Para. [0200], lines 12-18, The set of Desires, together with a set of hard constraints that are defined directly based on the sensed state, establish an optimization problem whose solution is the trajectory for the vehicle. The hard constraints may be employed to further increase the safety of the system, and the Desires can be used to provide driving comfort and human-like driving behavior of the system) that includes the at least one goal-oriented action (Shalev–Shwartz: Para. [0151], lines 1-6, At step 558, processing unit 110 may consider additional sources of information to further develop a safety model for vehicle 200 in the context of its surroundings. Processing unit 110 may use the safety model to define a context in which system 100 may execute autonomous control of vehicle 200 in a safe manner). 
	Accordingly, it would have been obvious to one of ordinary skill in the art before the time of filing the invention to modify the computer-implemented method as taught by Seo and combine outputting at least one goal-oriented action as determined during the current image frame, wherein the at least one goal-oriented action is based on the at least one past image frame, the current image frame, and at least one predicted action; and controlling a vehicle to be autonomously driven based on a naturalistic driving behavior data set that includes the at least one goal-oriented action as taught by Shalev-Shwartz. One of ordinary skill in the art would 
	Regarding Claim 9, Seo in view of Shalev–Shwartz teaches the computer-implemented method of claim 1. Seo further teaches the method including classifying at least one stimulus-driven action based on evaluating at least one behavioral event and the image data (see at least Seo: Para. [0052], The second frame 121, according to one or more embodiments, may be a next frame following the first frame 110. However, any preceding or succeeding frame within a predetermined time window (e.g. suitably adjusted for speed of the vehicle and/or the object) may suitably be employed), wherein an external stimuli is determined to be a cause of the at least one behavioral event (see at least Seo:  Para. [0094], the control apparatus 920 may automatically maintain a distance between vehicles, indicate that the autonomous driving vehicle 900 is departing a lane or staying in a lane, and inform the autonomous driving vehicle 900 of an obstacle around the autonomous driving vehicle 900 based on the synthesized image), wherein controlling the vehicle to be autonomously driven is based on the naturalistic driving behavior data set that includes the at least one stimulus-driven action (see at least Seo: Para. [0094], the autonomous driving vehicle 900 such that the autonomous driving vehicle 900 is able to drive itself without a driver performing a steering wheel operation, an acceleration operation, a turn indicator operation, and a deceleration operation (amongst other operations) in an autonomous driving state).
	Regarding Claim 10, the claim(s) recites analogous limitations to claim(s) 1 above, and is/are therefore rejected on the similar premise.
	Regarding Claim 18, the claim(s) recites analogous limitations to claim(s) 9 above, and is/are therefore rejected on the similar premise.
	Regarding Claim 19 the claim(s) recites analogous limitations to claim(s) 1 above, and is/are therefore rejected on the same premise. Seo further discloses a non-transitory computer readable storage medium storing instructions that when executed by a computer, which includes a processor perform a method (see at least Seo: Para. [0015], A non-transitory computer-readable storage medium may store instructions that, when executed by the processor, cause the processor to perform the method)…
	Regarding Claim 20 the claim(s) recites analogous limitations to claim(s) 9 and 18 above, and is/are therefore rejected on the same premise.
Claims 2 and 11 is/are rejected under 35 U.S.C. 103 as being unpatentable over Seo in view of Shalev-Shwartz as applied to claim 1 above, and further in view of Delgado et al. (US-2018/0053103).
	Regarding Claim 2, Seo in view of Shalev-Shwartz teaches the computer-implemented method of claim 1. Neither Seo nor Shalev-Shwartz teaches:
	 wherein analyzing the image data to determine the plurality of image frames includes down sampling the image data, wherein the down sampled image data is converted into the plurality of image frames. 
	However, in the same field of endeavor, Delgado teaches:
	where the image data includes down sampling the image data, wherein the down sampled image data is converted into the plurality of image frames (see at least Delgado: Para. [0100], visual data can be down-sampled to an optimal or preferred frames per second (fps) rate (e.g., 10 fps to 50 fps) deemed sufficient to accurately represent the visual behavior or allocation of the driver 303p while removing unnecessary data. Down-sampling can also allow the visual allocation management system to process visual data in a standard or uniform way, such that visual allocation management results can be more accurate).  
	Accordingly, it would have been obvious to one of ordinary skill in the art before the time of filing the invention to modify the method according to claim 1 as taught by Seo in view of Shalev-Shwartz and combine wherein the down sampled image data is converted into the plurality of image frames as taught by Delgado. One of ordinary skill in the art would have been motivated to make this modification in order to save memory space while still having optimal performance (Delgado: Para. [0100]).
	Regarding Claim 11, Seo in view of Shalev-Shwartz teaches the system of claim 10. Neither Seo nor Shalev-Shwartz explicitly teaches wherein analyzing the image data to determine the plurality of image frames includes down sampling the image data, wherein the down sampled image data is converted into the plurality of image frames.
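For illustration only, the down-sampling discussed above (reducing visual data to a preferred frames-per-second rate) can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Delgado or from the instant application; the function name `downsample_frames` and its parameters are hypothetical.

```python
def downsample_frames(frames, source_fps, target_fps):
    """Keep roughly target_fps frames per second from a source_fps stream.

    Hypothetical helper: 'frames' is any sequence of frame objects.
    """
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    if target_fps >= source_fps:
        # Nothing to drop when the target rate is not lower than the source.
        return list(frames)

    step = source_fps / target_fps  # source frames per kept frame
    kept = []
    next_pick = 0.0
    for index, frame in enumerate(frames):
        if index >= next_pick:
            kept.append(frame)
            next_pick += step
    return kept


# One second of 30 fps video reduced to 10 fps keeps every third frame.
one_second = list(range(30))
reduced = downsample_frames(one_second, source_fps=30, target_fps=10)
```

The retained frames then serve as the plurality of image frames for later processing; the uniform stride is one possible choice among several.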
	However, in the same field of endeavor, Delgado teaches:  
	image data includes down sampling the image data, wherein the down sampled image data is converted into the plurality of image frames (see at least Delgado: Para. [0100], visual data can be down-sampled to an optimal or preferred frames per second (fps) rate (e.g., 10 fps to 50 fps) deemed sufficient to accurately represent the visual behavior or allocation of the driver 303p while removing unnecessary data. Down-sampling can also allow the visual allocation management system to process visual data in a standard or uniform way, such that visual allocation management results can be more accurate).
	Accordingly, it would have been obvious to one of ordinary skill in the art before the time of filing the invention to modify the system according to claim 10 as taught by Seo in view of Shalev-Shwartz and combine wherein the down sampled image data is converted into the plurality of image frames as taught by Delgado. One of ordinary skill in the art would have been motivated to make this modification in order to save memory space while still having optimal performance (Delgado: Para. [0100]).
Claims 3-5 and 12-14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Seo in view of Shalev-Shwartz as applied to claim 1 above, and further in view of Brown et al. (US-2020/0051252).
	Regarding claim 3, (Currently Amended) Seo in view of Shalev-Shwartz teaches the computer-implemented method of claim 1. Neither Seo nor Shalev-Shwartz teaches wherein analyzing the image data to determine a plurality of image frames includes performing spatial-temporal feature extraction on the at least one past image frame and the current image frame.
	However, in the same field of endeavor, Brown teaches:
	wherein analyzing the image data to determine a plurality of image frames (Brown: Para. [0026], lines 7-12, sensory relatedness can be determined by analyzing image frames that are both similar and different. Accordingly, sets of image frame triplets can be generated for training as discussed in more detail elsewhere herein. In this example, a reference frame 302 is selected from the video sequence) includes performing spatial-temporal feature extraction on the at least one past image frame and the current image frame (Brown: Para. [0021], lines 4-13, various embodiments utilize generalized vector space models that are created from temporal sequences of high-dimensional sensory input, such as a sequence of video frames captured within the physical environment. Such a temporal sequence can provide a spatial metric representative of the topology of the environment, which can be used as a foundation for intelligent navigation. In various embodiments, deep convolutional neural networks (CNNs) are leveraged to extract feature vectors from visual sensory data; Para. [0046], lines 8-15, capturing image data of its current location and feeding that image data (or the corresponding feature vectors) to the trained embedding network. It should also be mentioned that the CNN for feature extraction and the CNN for the embedding can be trained or optimized in very different ways. The shape of the embedding network is a CNN, similar to that of the feature extractor. The very distinct difference is that the models are trained on two completely different objective functions).
	Accordingly, it would have been obvious to one of ordinary skill in the art before the time of filing the invention to modify the computer-implemented method of claim 1 as taught by the combination of Seo and Shalev-Shwartz and combine wherein analyzing the image data to determine a plurality of image frames includes performing spatial-temporal feature extraction on the at least one past image frame and the current image frame as taught by Brown. One of ordinary skill in the art would have been motivated to make this modification because it can be desirable to capture sensory data for multiple paths through an environment, as well as potentially duplicative paths, in order to obtain additional data and/or improve precision (Brown: Para. [0029], lines 6-9).
	Regarding claim 4, (Original) the combination of Seo, Shalev-Shwartz and Brown teaches the computer-implemented method of claim 3. Neither Seo nor Shalev-Shwartz teaches wherein at least one feature vector is extracted from the at least one past image frame and the current image frame that is associated with at least one spatial-temporal feature within the at least one past image frame and the current image frame.
	However, in the same field of endeavor, Brown teaches:
	wherein at least one feature vector is extracted from the at least one past image frame and the current image frame (Brown: The feature extractor in this example can analyze individual image frames using a deep convolutional neural network to obtain a set of feature vectors for sets (i.e., triplets) of image frames. In other embodiments the feature extractor can extract the feature vectors for all frames, which can then be selected for relevant image frame sets) that is associated with at least one spatial-temporal feature within the at least one past image frame and the current image frame (Brown: Para. [0021], lines 4-13, various embodiments utilize generalized vector space models that are created from temporal sequences of high-dimensional sensory input, such as a sequence of video frames captured within the physical environment. Such a temporal sequence can provide a spatial metric representative of the topology of the environment, which can be used as a foundation for intelligent navigation. In various embodiments, deep convolutional neural networks (CNNs) are leveraged to extract feature vectors from visual sensory data).
	Accordingly, it would have been obvious to one of ordinary skill in the art before the time of filing the invention to modify the computer-implemented method according to claim 3 as taught in the combination of Seo, Shalev-Shwartz and Brown and combine wherein at least one feature vector is extracted from the at least one past image frame and the current image frame that is associated with at least one spatial-temporal feature within the at least one past image frame and the current image frame as taught by Brown. One of ordinary skill in the art would have been motivated to make this modification because it can be desirable to capture sensory data for multiple paths through an environment, as well as potentially duplicative paths, in order to obtain additional data and/or improve precision (Brown: Para. [0029], lines 6-9).
	Regarding Claim 5, the combination of Seo, Shalev-Shwartz and Brown teaches the computer-implemented method according to claim 3. Seo does not explicitly teach:
	wherein outputting the at least one goal-oriented action includes decoding to output the at least one predicted action based on a feature representation that is predicted to occur at an immediate future point in time.  
	However, in the same field of endeavor, Shalev-Shwartz teaches:
	wherein outputting the at least one goal-oriented action includes decoding to output the at least one predicted action based on a feature representation that is predicted to occur at an immediate future point in time (Shalev-Shwartz: Para. [0212], lines 4-11, In the supervised learning phase, a differentiable mapping from (s_t, a_t) to ŝ_{t+1} can be learned such that ŝ_{t+1} ≈ s_{t+1}. This may be similar to "model-based" reinforcement learning. However, in the forward loop of the network, ŝ_{t+1} may be replaced by the actual value of s_{t+1}, therefore eliminating the problem of error accumulation. The role of prediction of ŝ_{t+1} is to propagate messages from the future back to past actions).  *** Examiner interprets that the “model-based” learning algorithm calculates, in the forward loop of the network, an output value that determines a prediction propagated from the immediate future back to past actions.
	Accordingly, it would have been obvious to one of ordinary skill in the art before the time of filing the invention to modify the computer-implemented method according to claim 3 as taught in the combination of Seo, Shalev-Shwartz and Brown and combine wherein outputting the at least one goal-oriented action includes decoding to output the at least one predicted action based on a feature representation that is predicted to occur at an immediate future point in time as taught by Shalev-Shwartz. One of ordinary skill in the art would have been motivated to make this modification in order to take into account a variety of factors and make appropriate decisions based on those factors to safely and accurately reach an intended destination (Shalev-Shwartz: Para. [0003], lines 4-6).
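For illustration only, the point in the cited passage that replacing the predicted next state with the actual next state avoids error accumulation can be sketched as follows. This is a toy sketch under stated assumptions, not the method of Shalev-Shwartz or of the instant application: the names `one_step_predictions`, `open_loop_rollout`, `true_step`, and `biased_model` are hypothetical, and the "learned" mapping is simply a hand-written function with a constant bias.

```python
def one_step_predictions(states, actions, model):
    """Predict each next state from the ACTUAL current state (no feedback of predictions)."""
    return [model(s, a) for s, a in zip(states[:-1], actions)]


def open_loop_rollout(initial_state, actions, model):
    """Predict each next state from the PREVIOUS PREDICTION, so errors compound."""
    predictions = []
    state = initial_state
    for action in actions:
        state = model(state, action)
        predictions.append(state)
    return predictions


def true_step(state, action):
    # Assumed toy dynamics: the next state is the state plus the action.
    return state + action


def biased_model(state, action):
    # Assumed imperfect mapping: off by a constant 1 on every step.
    return state + action + 1


actions = [1, 1, 1]
states = [0]
for a in actions:
    states.append(true_step(states[-1], a))  # actual trajectory: 0, 1, 2, 3

grounded = one_step_predictions(states, actions, biased_model)
compounded = open_loop_rollout(states[0], actions, biased_model)

grounded_errors = [abs(p - s) for p, s in zip(grounded, states[1:])]
compounded_errors = [abs(p - s) for p, s in zip(compounded, states[1:])]
```

In this toy setting the per-step error stays constant when each prediction starts from the actual state, and grows with each step when predictions are fed back into the model.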
	Regarding Claim 12 the claim(s) recites analogous limitations to claim(s) 3 above, and is/are therefore rejected on the similar premise.
	Regarding Claim 13 the claim(s) recites analogous limitations to claim(s) 4 above, and is/are therefore rejected on the similar premise.
	Regarding Claim 14 the claim(s) recites analogous limitations to claim(s) 5 above, and is/are therefore rejected on the similar premise.
Claims 6 and 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Seo, Shalev-Shwartz and Brown as applied to claim 5 above, and further in view of Lucey et al. (US-2015/0347918).
	Regarding Claim 6, the combination of Seo, Shalev-Shwartz and Brown teaches the computer-implemented method of claim 5. Seo does not explicitly teach:
	wherein the at least one predicted action is output based on the future representation of the at least one predicted action.
	 However, in the same field of endeavor, Shalev-Shwartz teaches: 
	wherein the at least one predicted action is output based on the future representation of the at least one predicted action (Shalev-Shwartz: Para. [0212], lines 4-11, In the supervised learning phase, a differentiable mapping from (s_t, a_t) to ŝ_{t+1} can be learned such that ŝ_{t+1} ≈ s_{t+1}. This may be similar to "model-based" reinforcement learning. However, in the forward loop of the network, ŝ_{t+1} may be replaced by the actual value of s_{t+1}, therefore eliminating the problem of error accumulation. The role of prediction of ŝ_{t+1} is to propagate messages from the future back to past actions).

	Neither Seo, Shalev-Shwartz nor Brown teaches
	 wherein a future representation of the at least one predicted action is obtained by average pooling hidden states based on the feature vectors extracted from the at least one past image frame and the current image frame.  
	However, in the same field of endeavor, Lucey teaches:
	wherein a future representation of the at least one predicted action is obtained by average pooling hidden states based on the feature vectors extracted from the at least one past image frame and the current image frame (see at least Lucey:  Para. [0044], As illustrated in FIG. 8A, various features indicative of the likely shot location may be used to construct the feature-vector x. For example, information such as the shot start location, the opponent recent movements, recent shots average speed, and the player's recent movements may influence the hidden states h and the future event y in the a-HCRF model 200. These features may be extracted, for instance, from positional data captured by a camera system 420 designed to detect and track).
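For illustration only, the claim term "average pooling hidden states" can be read as an element-wise mean over a sequence of hidden-state vectors. The sketch below assumes that reading; it does not reproduce Lucey's a-HCRF model or the applicant's implementation, and the name `average_pool` is hypothetical.

```python
def average_pool(hidden_states):
    """Element-wise mean of equally sized hidden-state vectors."""
    if not hidden_states:
        raise ValueError("at least one hidden state is required")
    dim = len(hidden_states[0])
    if any(len(h) != dim for h in hidden_states):
        raise ValueError("all hidden states must have the same length")
    count = len(hidden_states)
    return [sum(h[i] for h in hidden_states) / count for i in range(dim)]


# Two hypothetical hidden states pooled into one representation.
pooled = average_pool([[1.0, 2.0], [3.0, 4.0]])
```

The pooled vector has the same length as each hidden state, so it can stand in for a single summary representation of the sequence.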

	Regarding Claim 15, the claim(s) recites analogous limitations to claim(s) 6 above, and is/are therefore rejected on the similar premise.
Claims 7-8 and 16-17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Seo, Shalev-Shwartz, Brown and Lucey as applied to claim 6 above, and further in view of Watters et al. (US-2020/0092565).
	Regarding Claim 7, the combination of Seo, Shalev-Shwartz, Brown and Lucey teaches the computer-implemented method of claim 6. Seo does not explicitly teach:
	 a future feature associated with the future representation of the predicted action. 
	However, in the same field of endeavor, Shalev-Shwartz teaches:
	 a future feature associated with the future representation of the predicted action (Shalev-Shwartz: Para. [0324], lines 1-9, At step 1625, the navigational system for the host vehicle may select a navigational action for the host vehicle based on a comparison of expected rewards, not just based on the potential actions identified relative to a current navigational state (e.g., at steps 1605, 1607, and 1609), but also based on expected rewards determined as a result of potential future actions available in response to predicted future states (e.g., determined at steps 1613, 1615, and 1617)).  
	Accordingly, it would have been obvious to one of ordinary skill in the art before the time of filing the invention to modify the computer-implemented method according to claim 6 as taught in the combination of Seo, Shalev-Shwartz, Brown and Lucey and combine a future feature associated with the future representation of the predicted action as taught by Shalev-Shwartz. One of ordinary skill in the art would have been motivated to make this modification in order to take into account a variety of factors and make appropriate decisions based on those factors to safely and accurately reach an intended destination (Shalev-Shwartz: Para. [0003], lines 4-6).
	None of Seo, Shalev-Shwartz, Brown, or Lucey teaches:
 	wherein outputting the at least one goal-oriented action includes an encoder of the temporal recurrent network concatenating at least one feature vector that is extracted for the at least one past image frame, the current image frame.
	However, in the same field of endeavor, Watters teaches:
	wherein outputting the at least one goal-oriented action includes an encoder of the temporal recurrent network concatenating (see at least Watters:  Para. [0053], The interaction network for each temporal offset in the dynamics predictor operates as follows. For each slot of a state code 302, e.g. slot 302a as illustrated, a relation neural network 304 is applied to the concatenation of the slot 302a with each other slot 302b, 302c, 302d, 302e. Each slot corresponds to a respective object (the repetition of slots 203b-e in FIG. 2 is merely for convenience of illustration). A self-dynamics neural network 306 is applied to the slot 302a itself) at least one feature vector that is extracted for the at least one past image frame, the current image frame (see at least Watters: Para. [0031],  The visual encoder 102 takes a sequence, in the example a triplet of consecutive frames 120a, 102b as input and outputs for each triplet a state code 122a, 122b. Each frame shows objects. A state is a list of each object's position and velocity vector. A state code is a list of vectors, one for each object in the scene. Each of these vectors is a distributed representation of the position and velocity of its corresponding object).
	Accordingly, it would have been obvious to one of ordinary skill in the art before the time of filing the invention to modify the method according to claim 6 as taught in the combination of Seo, Shalev-Shwartz, Brown and Lucey and combine an encoder of the temporal recurrent network concatenating at least one feature vector that is extracted for the at least one past image frame, the current image frame, and a future feature as taught by Watters. One of ordinary skill in the art would have been motivated to make this modification in order to convey that the system predicts dynamics accurately (Watters: Para. [0028]).
	Regarding Claim 8, the combination of Seo, Shalev-Shwartz, Brown, Lucey and Watters teaches the computer-implemented method of claim 7. Seo further teaches:
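For illustration only, the concatenation recited in claim 7 (feature vectors for the past image frame(s) and the current image frame, together with a future feature) can be sketched as joining the vectors end to end. This is a minimal sketch under that assumption; it is not taken from Watters or from the instant application, and the name `concatenate_features` and its parameters are hypothetical.

```python
def concatenate_features(past_features, current_feature, future_feature):
    """Join past, current, and future feature vectors into one flat vector.

    past_features is a list of vectors (one per past frame); the other two
    arguments are single vectors.
    """
    combined = []
    for vector in past_features:
        combined.extend(vector)
    combined.extend(current_feature)
    combined.extend(future_feature)
    return combined


# Two hypothetical past-frame vectors, one current, one future feature.
encoder_input = concatenate_features([[1, 2], [3, 4]], [5, 6], [7, 8])
```

The resulting vector preserves temporal order (past, current, future), which is one natural layout for an encoder input.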
	wherein a driving scene is evaluated to determine at least one driver action that is conducted absent any external stimuli that is presented within a surrounding environment of the vehicle (see at least Seo: Para. [0049], For example, in an autonomous driving state, a vehicle drives itself without a driver performing a steering wheel operation, an acceleration operation, and a deceleration operation, or to provide supplemental/information regarding the same to the driver. A technology for automatically maintaining a distance between vehicles, a technology for indicating that the vehicle is departing a lane or staying in a lane, and a technology for informing the vehicle of an obstacle around the vehicle are cooperatively used in examples to enable the vehicle to drive autonomously).  
	Seo does not explicitly teach:
	wherein outputting the at least one goal-oriented action includes outputting at least one action determined during a current frame based on the concatenation completed by the encoder of the temporal recurrent network.
	However, in the same field of endeavor, Watters teaches: 
	wherein outputting the at least one goal-oriented action includes outputting at least one action determined during a current frame based on the concatenation completed by the encoder of the temporal recurrent network (see at least Watters: Para. [0053], The interaction network for each temporal offset in the dynamics predictor operates as follows. For each slot of a state code 302, e.g. slot 302a as illustrated, a relation neural network 304 is applied to the concatenation of the slot 302a with each other slot 302b, 302c, 302d, 302e. Each slot corresponds to a respective object (the repetition of slots 203b-e in FIG. 2 is merely for convenience of illustration). A self-dynamics neural network 306 is applied to the slot 302a itself).
	Accordingly, it would have been obvious to one of ordinary skill in the art before the time of filing the invention to modify the method according to claim 7 as taught in the combination of Seo, Shalev-Shwartz, Brown, Lucey and Watters and combine outputting the at least one goal-oriented action includes outputting at least one action determined during a current frame based on the concatenation completed by the encoder of the temporal recurrent network as taught by Watters. One of ordinary skill in the art would have been motivated to make this modification in order to convey that the system predicts dynamics accurately (Watters: Para. [0028]).
	Regarding Claim 16, the claim(s) recites analogous limitations to claim(s) 7 above, and is/are therefore rejected on the same premise.
	Regarding Claim 17, the claim(s) recites analogous limitations to claim(s) 8 above, and is/are therefore rejected on the same premise.
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BAKARI UNDERWOOD whose telephone number is (571)272-8462.  The examiner can normally be reached on M - F 8:00 TO 4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Geepy (GP) Pe can be reached on (571) 270-3703.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

/B.U./Examiner, Art Unit 3663
/BAO LONG T NGUYEN/Primary Examiner, Art Unit 3664