DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending in the instant application.

Priority
Examiner acknowledges Applicant’s claim to priority benefits of 62/695,618 filed 07/09/2018.

Information Disclosure Statement
The information disclosure statement(s) (IDS) submitted on 02/09/2020 is/are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement(s) is/are being considered if signed and initialed by the Examiner.

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  



Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claim(s) 1-5, 7-12, 14-15, 17, and 19-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Shalev-Shwartz et al. (USPGPub 2018/0032082).
As per claim 1, Shalev-Shwartz discloses a system for controlling a vehicle, comprising:
	a processor (see at least paragraph 0071; wherein processing unit 110), the processor being configured to execute instructions stored on a non-transitory computer readable medium (see at least paragraph 0071; wherein the memory may store software that, when executed by the processor, controls the operation of the system);
	a sensor (see at least Figure 1; items 120, 122, 124, 126, 130) coupled to the processor (see at least Figure 1; item 110) and configured to receive sensory input (see at least paragraph 0170; wherein processing unit 110 may combine the processed information derived from each of image capture devices 122, 124, and 126); and
	a controller coupled to the processor and configured to control the vehicle (see at least paragraph 0177; wherein control module 805, which may also be implemented using processing unit 110, may develop control instructions for one or more actuators or controlled devices associated with the host vehicle);
	wherein the processor (see at least Figure 1; item 110) is further configured to:
		create a synthetic image based on the sensory input (see at least paragraph 0174; wherein sensing, which may include data from cameras and/or any other available sensors, along with map information, may be collected, analyzed, and formulated into a “sensed state,” describing information extracted from a scene in the environment of the host vehicle. The sensed state may include sensed information relating to target vehicles, lane markings, pedestrians, traffic lights, road geometry, lane shape, obstacles, distances to other objects/vehicles, relative velocities, relative accelerations, among any other potential sensed information);
		derive a deep reinforcement learning (RL) policy using the synthetic image (see at least paragraph 0179; wherein training of the system using reinforcement learning may involve learning a driving policy in order to map from sensed states to navigational actions), wherein the deep RL policy determines a longitudinal control for the vehicle (see at least paragraph 0176; wherein the output of driving policy module 803 may include at least one navigational action for the host vehicle and may include a desired acceleration (which may translate to an updated speed for the host vehicle), a desired yaw rate for the host vehicle, a desired trajectory, among other potential desired navigational actions…see at least paragraph 0178; wherein trained system trained through reinforcement learning may be used to implement driving policy module 803); and
		instruct the controller to control the vehicle based on the deep RL policy (see at least paragraph 0177; wherein control module 805 may be responsible for developing and outputting instructions to controllable components of the host vehicle in order to implement the desired navigational goals or requirements of driving policy module 803).
As per claims 2 and 9, Shalev-Shwartz discloses wherein the sensory input includes at least a position of the vehicle (see at least paragraph 0073; wherein position sensor 130 may include any type of device suitable for determining a location associated with at least one component of system 100).
As per claims 3 and 10, Shalev-Shwartz discloses wherein the sensory input includes at least a speed of the vehicle (see at least paragraph 0074; wherein system 100 may include components such as a speed sensor (e.g., a speedometer) for measuring a speed of vehicle 200).
As per claims 4, 11, and 19, Shalev-Shwartz discloses wherein the sensory input corresponds to another vehicle (see at least paragraph 0164; wherein processing unit 110 may execute stereo image analysis module 404 to perform stereo image analysis of the first and second plurality of images to create a 3D map of the road in front of the vehicle and detect features within the images, such as lane markings, vehicles, pedestrians, road signs, highway exit ramps, traffic lights, road hazards, and the like).
As per claims 5, 12, and 20, Shalev-Shwartz discloses wherein the sensory input corresponds to an object proximate the vehicle (see at least paragraph 0164; wherein processing unit 110 may execute stereo image analysis module 404 to perform stereo image analysis of the first and second plurality of images to create a 3D map of the road in front of the vehicle and detect features within the images, such as lane markings, vehicles, pedestrians, road signs, highway exit ramps, traffic lights, road hazards, and the like).
As per claims 7 and 14, Shalev-Shwartz discloses wherein the processor is further configured to derive the deep RL policy using an artificial neural network (see at least paragraph 0173; wherein any of the modules (e.g., modules 801, 803, and 805) disclosed herein may implement techniques associated with a trained system (such as a neural network or a deep neural network)).
As per claim 8, Shalev-Shwartz discloses a method for controlling a vehicle, comprising:
	receiving a sensory input from at least one sensor of the vehicle (see at least paragraph 0170; wherein processing unit 110 may combine the processed information derived from each of image capture devices 122, 124, and 126);
	creating a synthetic image based on the sensory input (see at least paragraph 0174; wherein sensing, which may include data from cameras and/or any other available sensors, along with map information, may be collected, analyzed, and formulated into a “sensed state,” describing information extracted from a scene in the environment of the host vehicle. The sensed state may include sensed information relating to target vehicles, lane markings, pedestrians, traffic lights, road geometry, lane shape, obstacles, distances to other objects/vehicles, relative velocities, relative accelerations, among any other potential sensed information);
	deriving a policy based on the synthetic image (see at least paragraph 0179; wherein training of the system using reinforcement learning may involve learning a driving policy in order to map from sensed states to navigational actions), wherein the deep RL policy indicates a longitudinal control for the vehicle (see at least paragraph 0176; wherein the output of driving policy module 803 may include at least one navigational action for the host vehicle and may include a desired acceleration (which may translate to an updated speed for the host vehicle), a desired yaw rate for the host vehicle, a desired trajectory, among other potential desired navigational actions…see at least paragraph 0178; wherein trained system trained through reinforcement learning may be used to implement driving policy module 803); and
	selectively controlling the vehicle based on the longitudinal control indicated in by deep RL policy (see at least paragraph 0177; wherein control module 805 may be responsible for developing and outputting instructions to controllable components of the host vehicle in order to implement the desired navigational goals or requirements of driving policy module 803).
As per claim 15, Shalev-Shwartz discloses an apparatus for controlling a vehicle, comprising:
	a processor (see at least paragraph 0071; wherein processing unit 110) in communication with a non-transitory computer readable medium that stores instructions that (see at least paragraph 0071; wherein the memory may store software that, when executed by the processor, controls the operation of the system), when executed by the processor (see at least paragraph 0071; wherein processing unit 110), cause the processor to:
		receive sensory input from at least one sensor of the vehicle (see at least paragraph 0170; wherein processing unit 110 may combine the processed information derived from each of image capture devices 122, 124, and 126);
		generate a synthetic image based on the sensory input (see at least paragraph 0174; wherein sensing, which may include data from cameras and/or any other available sensors, along with map information, may be collected, analyzed, and formulated into a “sensed state,” describing information extracted from a scene in the environment of the host vehicle. The sensed state may include sensed information relating to target vehicles, lane markings, pedestrians, traffic lights, road geometry, lane shape, obstacles, distances to other objects/vehicles, relative velocities, relative accelerations, among any other potential sensed information);
		use an artificial neural network (see at least paragraph 0173; wherein any of the modules (e.g., modules 801, 803, and 805) disclosed herein may implement techniques associated with a trained system (such as a neural network or a deep neural network)) to derive a deep reinforcement learning (RL) policy based the synthetic image (see at least paragraph 0179; wherein training of the system using reinforcement learning may involve learning a driving policy in order to map from sensed states to navigational actions), wherein the deep RL policy indicates a longitudinal control for the vehicle (see at least paragraph 0176; wherein the output of driving policy module 803 may include at least one navigational action for the host vehicle and may include a desired acceleration (which may translate to an updated speed for the host vehicle), a desired yaw rate for the host vehicle, a desired trajectory, among other potential desired navigational actions…see at least paragraph 0178; wherein trained system trained through reinforcement learning may be used to implement driving policy module 803); and
		selectively instruct a controller of the vehicle to control the vehicle based on the longitudinal control indicated in by deep RL policy (see at least paragraph 0177; wherein control module 805 may be responsible for developing and outputting instructions to controllable components of the host vehicle in order to implement the desired navigational goals or requirements of driving policy module 803).
As per claim 17, Shalev-Shwartz discloses wherein the controller controls the vehicle based on the longitudinal control indicated in by deep RL policy by performing automatic cruise control functions (see at least paragraph 0191; wherein Automatic Cruise Control (ACC) policy, o_ACC: S → A: this policy always outputs a yaw rate of 0 and only changes the speed so as to implement smooth and accident-free driving).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains.  Patentability shall not be negatived by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103(a) are summarized as follows:
1.	Determining the scope and contents of the prior art.
2.	Ascertaining the differences between the prior art and the claims at issue.
3.	Resolving the level of ordinary skill in the pertinent art.
4.	Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 6, 13, and 16 are rejected under 35 U.S.C. 103(a) as being unpatentable over Shalev-Shwartz et al. (USPGPub 2018/0032082) in view of Jung et al. (USPGPub 2019/0325025).
As per claims 6, 13, and 16, Shalev-Shwartz does not explicitly mention wherein the processor is further configured to create the synthetic image using domain knowledge.
However, Jung does disclose:
	wherein the processor is further configured to create the synthetic image using domain knowledge (see at least paragraph 0078; wherein the teacher agent 240 samples domain information from the domain knowledge preprocessor 241 and provides the sampled domain information to the sense-making training set generator 242).
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to utilize the teachings as in Jung with the teachings as in Shalev-Shwartz. The motivation for doing so would have been to improve sense-making knowledge through cooperation between the teacher agent 240 and the learning agent 230, see Jung paragraph 0076.

Claim 18 is rejected under 35 U.S.C. 103(a) as being unpatentable over Shalev-Shwartz et al. (USPGPub 2018/0032082) in view of Chen et al. (CN105083278A).
As per claim 18, Shalev-Shwartz does not explicitly mention wherein the controller controls the vehicle based on the longitudinal control indicated in by deep RL policy by performing lane keeping functions.
However, Chen does disclose:
	wherein the controller controls the vehicle based on the longitudinal control indicated in by deep RL policy by performing lane keeping functions (see at least abstract; wherein when the driving mode is the lane keeping mode or the intelligent obstacle avoidance mode or the independent vehicle following mode, the vehicle is controlled by adopting a reinforcement learning method).
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to utilize the teachings as in Chen with the teachings as in Shalev-Shwartz. The motivation for doing so would have been to provide better stability, high reliability, and better flexibility when controlling a vehicle, see Chen paragraph 0076.

Relevant Art
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:
	USPGPub 2019/0368133 – Provides UAV-based navigation through a dynamic correction path to inspect one or more assets in one or more infrastructures.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MAHMOUD S ISMAIL, whose telephone number is (571) 272-1326. The examiner can normally be reached M-F, 10 AM-6 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jelani Smith, can be reached at 571-270-3969. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MAHMOUD S ISMAIL/
Primary Examiner, Art Unit 3662