DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in reply to the response filed on 11/23/2021.
Claims 1-20 are currently pending and have been examined. 
This action is made FINAL.
Response to Arguments
Applicant's arguments filed 11/23/2021 have been fully considered but they are not persuasive. 
Applicant has argued that Battles does not describe the determination of abnormal behavior by a vehicle, in that Battles merely teaches blocking a lane change made in a potentially conflicting manner but does not teach that this behavior is abnormal (Remarks, pages 10-11). Examiner respectfully disagrees. It is well known in the art that vehicle collisions are to be avoided in order to prevent destruction of property and injury to vehicle users. Under this principle, although it is not explicitly stated in the description of Fig. 2D, when the system of Battles recognizes that a collision will occur and blocks the vehicle from colliding with another, the system is implicitly recognizing abnormal behavior (merging lanes in an unsafe manner that would cause a collision) and preventing the vehicle from following through on that abnormal behavior.

Applicant has further argued that Battles does not disclose the use of a plurality of controlled vehicles to control the anomaly vehicle (Remarks, pages 11-12). Examiner respectfully disagrees. Note that the amended claim language has necessitated a new ground of rejection and an additional

Applicant has further argued that Battles does not teach determining that an abnormally behaving vehicle is operated in a non-autonomous mode (Remarks, page 12). As stated in the specification of Battles, “The second vehicle 210 may be an autonomous or non-autonomous vehicle traveling at a higher speed than the autonomous vehicle 205, and the third vehicle 220 may also be an autonomous or non-autonomous vehicle traveling at a higher speed than the autonomous vehicle 205.” [49]. The vehicle has been sensed as abnormal, as shown in the argument above regarding abnormal behavior. This section states that the abnormal vehicle may be in an autonomous or a non-autonomous mode, so the system encompasses preventing a vehicle in a non-autonomous mode from taking an anomalous action. Battles explains that the action taken by the controlled vehicle to control the abnormal/dangerous actions of the other vehicle is based on the dangerous action of that vehicle, and elaborates that the abnormal vehicle may be in either mode. In both cases, the controlled vehicle takes the same course of action to prevent the abnormal vehicle from merging into a collision event in the example given. Because the controlled vehicle would take the same action no matter which mode the abnormal vehicle is in, Battles does teach control of an anomaly vehicle that is operating in a non-autonomous mode.

Applicant’s arguments with respect to claim(s) 1, 7, & 14 regarding the use of past anomalous behavior (Remarks, pages 11-12) have been considered but are moot in view of the new ground of rejection.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.


Claim(s) 1, 5-9, 13-16, & 20 are rejected under 35 U.S.C. 103 as being unpatentable over Battles (U.S. Pub No. 20190049997) in view of Benhammou (U.S. Pub No. 20080095403).

Regarding claim 1
	Battles teaches:
A system for traffic control system comprising:
one or more processors; and
a memory in communication with the one or more processors, the memory having: (“The vehicle 900 may also include other mechanisms or systems not illustrated in FIG. 9, such as a braking or stopping mechanism, a communication system, processors, memories, input/output devices, and/or other mechanisms or systems.” [120])

an environment module having instructions that, when executed by the one or more processors, causes the one or more processors to obtain a state of an environment having a universe of vehicles operating therein, (“Visual sensing devices 930 may include imaging devices, imaging sensors, radar, LIDAR, time of flight sensors, thermal or infrared sensors, or other visual sensing devices to detect broadcast visual identifiers, license plates, displays, colors, symbols, or other identifiers on one or more surfaces or locations of a sensed vehicle.” [138]; “Further, the autonomous vehicle 900 may detect actions of other vehicles using any of the sensing devices,” [142])

an anomaly detection module having instructions that, when executed by the one or more processors, causes the one or more processors to identify one or more anomaly vehicles operating in a non-autonomous mode from the universe of vehicles operating in the environment, (“if the autonomous vehicle and the second vehicle are facing each other at an intersection having a four-way stop sign, and the autonomous vehicle detects the strategy mode and/or action of the second vehicle as performing a left turn operation which may be an unsafe or otherwise undesirable operation (e.g., due to an accident, construction, traffic congestion, or other incident in that direction),” [30]; The unit has detected that a vehicle is an anomaly because it is taking strange actions, such as driving toward a closed or dangerous area; “in which the autonomous vehicle 205 may select a preventative strategy mode 108. The second vehicle 210 may be an autonomous or non-autonomous vehicle traveling at a higher speed than the autonomous vehicle 205, and the third vehicle 220 may also be an autonomous or non-autonomous vehicle traveling at a higher speed than the autonomous vehicle 205.” [49]; Fig. 2D; Here vehicle 210 is the anomaly vehicle and is described as a manual/non-autonomous vehicle)

an action selector module having instructions that, when executed by the one or more processors, causes the one or more processors to select one or more actions to control a plurality of controlled vehicles to control the operation of one or more anomaly vehicles, and (“An autonomous vehicle may select a particular strategy mode based on a variety of information related to the autonomous vehicle itself, other vehicles, and/or the environment.” [31]; “When operating in a preventative strategy mode, an autonomous vehicle may select actions that prevent a strategy mode and/or action of a second vehicle from being successfully performed by causing a change to current actions of the autonomous vehicle.” [30]; An action has been selected to control another vehicle; “The autonomous vehicle management system 640 may also determine or receive one or more operational goals for the system as a whole. The operational goals for the system may include one or more of safety, resolvability, efficiency, time, priority, throughput, on-time completion, average speed, number of incidents or conflicts, fuel or resource utilization, load distribution across the system, risk distribution across the system, and/or other goals. The autonomous vehicle management system 640 may then process all the information received from the vehicles 601 and the autonomous vehicle 605, while also taking into account the operational goals for the system. The autonomous vehicle management system 640 may then instruct modifications to operations of one or more autonomous vehicles in order to achieve the operational goals for the system.” [97]; The system is capable of controlling the plurality of controlled vehicles to act together toward a common goal, which includes safety.
Figure 6 shows the vehicles of the system working together to increase the safety of the roadway. Although not specifically stated, this indicates that the system could use a group of vehicles to prevent a merging collision such as the one shown in the earlier example)

a direction module having instructions that, when executed by the one or more processors, causes the one or more processors to direct the plurality of controlled vehicles to execute the one or more actions to prevent the one or more anomaly vehicles operating in the non-autonomous mode from performing the one or more abnormal actions. (“an autonomous vehicle may select actions that prevent a strategy mode and/or action of a second vehicle from being successfully performed by causing a change to current actions of the autonomous vehicle. For example, if the autonomous vehicle and the second vehicle are facing each other at an intersection having a four-way stop sign, and the autonomous vehicle detects the strategy mode and/or action of the second vehicle as performing a left turn operation which may be an unsafe or otherwise undesirable operation…” [30]; “The autonomous vehicle management system 640 may then instruct modifications to operations of one or more autonomous vehicles in order to achieve the operational goals for the system.” [97]; The vehicle has selected an action, been given a direction, and will act based on the direction to execute the action. The system works whether the second vehicle is driven manually or autonomously; therefore, the system has prevented the non-autonomous mode vehicle from performing an abnormal action)

	Battles does not teach an anomaly detection module that identifies anomaly vehicles based on a prior performance of one or more abnormal actions. However, Benhammou does explicitly teach:

an anomaly detection module having instructions that, when executed by the one or more processors, causes the one or more processors to identify one or more anomaly vehicles operating in a non-autonomous mode from the universe of vehicles operating in the environment, based on a prior performance of one or more abnormal actions, (“Various metrics can then be created from the individual vehicle data including vehicle size, speed, direction of travel, position relative to a lane, and any abnormality activity. Abnormalities may be triggered by a vehicle falling outside of the normal behavior (e.g., statistical outliers). For example, traveling in the wrong direction, unusually high or low rates of speed, frequent lane changing, or similar behavior of a single vehicle may cause the behavior to be considered abnormal” [13])

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Battles to include the teachings of Benhammou, so that the vehicle system observes and connects the repeated dangerous and abnormal actions of vehicles in the environment, thereby improving its identification of dangerous vehicles and its ability to take action.

Regarding claim 5
	As shown in the rejection above, Battles and Benhammou disclosed the limitations of claim 1.

Benhammou further teaches

wherein the one or more abnormal actions includes:
a number of lane changes over a period of time;
and a vehicle speed outside a specific range; (“Various metrics can then be created from the individual vehicle data including vehicle size, speed, direction of travel, position relative to a lane, and any abnormality activity. Abnormalities may be triggered by a vehicle falling outside of the normal behavior (e.g., statistical outliers). For example, traveling in the wrong direction, unusually high or low rates of speed, frequent lane changing, or similar behavior of a single vehicle may cause the behavior to be considered abnormal… Alerts to human or other computerized systems may be created from the detection of abnormalities.” [13]; By detecting the frequency of lane changes, the system must inherently be detecting a number of lane changes in order to know that the vehicle has changed lanes more than once. The system also compares the vehicle speed against the range it considers normal, and uses these values to alert the system to an abnormal vehicle)
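
For illustration only, the two abnormal actions discussed above (a number of lane changes over a period of time, and a vehicle speed outside a specific range) can be sketched as follows. This is a minimal sketch and is not taken from Benhammou or from the application: the names `VehicleTrack` and `abnormal_actions`, the speed range, and the lane-change threshold are all hypothetical assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class VehicleTrack:
    """Hypothetical record of one vehicle over a single observation window."""
    vehicle_id: str
    speeds_mph: List[float]  # sampled speeds during the window
    lane_ids: List[int]      # lane occupied at each sample


def count_lane_changes(lane_ids: List[int]) -> int:
    # A lane change is any pair of consecutive samples in different lanes.
    return sum(1 for prev, cur in zip(lane_ids, lane_ids[1:]) if prev != cur)


def abnormal_actions(
    track: VehicleTrack,
    speed_range: Tuple[float, float] = (25.0, 70.0),  # assumed "normal" range
    max_lane_changes: int = 3,                        # assumed threshold
) -> List[str]:
    """Return the abnormal actions observed for this vehicle in the window."""
    low, high = speed_range
    found = []
    if count_lane_changes(track.lane_ids) > max_lane_changes:
        found.append("frequent_lane_changes")
    if any(s < low or s > high for s in track.speeds_mph):
        found.append("speed_out_of_range")
    return found
```

Under these assumptions, a vehicle would be treated as an anomaly vehicle when the returned list is non-empty; counting lane changes is a prerequisite for judging their frequency, which mirrors the inherency reasoning above.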


Regarding claim 6 
As shown in the rejection above, the combination of Battles and Benhammou disclosed the limitations of claim 1.
		
Battles further teaches
wherein a number of the plurality of controlled vehicles is equal to or greater than a number of the one or more anomaly vehicles. (“in which the autonomous vehicle 205 may select a preventative strategy mode 108. The second vehicle 210 may be an autonomous or non-autonomous vehicle traveling at a higher speed than the autonomous vehicle 205, and the third vehicle 220 may also be an autonomous or non-autonomous vehicle traveling at a higher speed than the autonomous vehicle 205.” [49]; Vehicles 205 and 220 are a plurality of autonomous/controlled vehicles, so the anomaly vehicle 210 is outnumbered by the controlled vehicles)

Regarding claim 7:
	Claim 7 recites a method having substantially the same limitation as claim 1 above, therefore it is rejected for the same reason as claim 1.

Regarding claim 8:
	As shown in the rejection above, the combination of Battles and Benhammou disclosed the limitations of claim 7.
Claim 8 recites a method having substantially the same limitation as claim 6 above, therefore it is rejected for the same reason as claim 6.

Regarding claim 13:
	As shown in the rejection above, Battles and Benhammou disclosed the limitations of claim 7.
Claim 13 recites a method having substantially the same limitation as claim 5 above, therefore it is rejected for the same reason as claim 5.


Regarding claim 14:

	Battles further teaches 
A non-transitory computer-readable medium storing instructions for controlling a plurality of controlled vehicles with one or more controlled vehicles that, (“FIG. 12 is a block diagram illustrating various components of an example autonomous vehicle control system 950 for selection, broadcast, and determination of strategy modes for autonomous vehicle operations, according to an implementation. In various examples, the block diagram may be illustrative of one or more aspects of the autonomous vehicle control system 950 that may be used to implement the various systems and processes discussed above. In the illustrated implementation, the autonomous vehicle control system 950 includes one or more processors 1202, coupled to a non-transitory computer readable storage medium 1220” [154])

The remaining limitations of Claim 14 recite a system having substantially the same limitation as claim 1 above, therefore it is rejected for the same reason as claim 1.

Regarding claim 15:
	As shown in the rejection above, the combination of Battles and Benhammou disclosed the limitations of claim 14.
Claim 15 recites a system having substantially the same limitation as claim 6 above, therefore it is rejected for the same reason as claim 6.

Regarding claim 16:
	As shown in the rejection above, the combination of Battles and Benhammou disclosed the limitations of claim 14.


Regarding claim 20:
	As shown in the rejection above, Battles and Benhammou disclosed the limitations of claim 14.
Claim 20 recites a method having substantially the same limitation as claim 5 above, therefore it is rejected for the same reason as claim 5.


Claim(s) 2-3, 10-11, 17, & 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Battles and Benhammou in further view of Du (U.S. Pub No. 20200174471)

Regarding Claim 2
	As shown in the rejection above, the combination of Battles and Benhammou disclosed the limitations of claim 1.

The combination of Battles and Benhammou does not teach wherein the action selector module further comprises instructions that, when executed by the one or more processors, causes the one or more processors to select the one more actions by utilizing a reinforcement-learning trained algorithm; however, Du does explicitly teach:

wherein the action selector module further comprises instructions that, when executed by the one or more processors, causes the one or more processors to select the one more actions by utilizing a reinforcement-learning trained algorithm. (Fig. 2, 57-58, 60; “the each of the agents 56 may refer to one or more modules of a respective vehicle. The agents 56 may transmit first signals 58 including vehicle and/or obstacle information to the RLP module 54 and receive second signals 60 including vehicle control commands.” [30]; The control command of the vehicle is selected through a reinforcement learning process)

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Battles and Benhammou to include the teachings of Du in order to make safe decisions. Reinforcement learning aims to repeat rewarded actions; in this case, “Actions that lead to a collision result in negative feedback. Actions that lead to safe driving behaviors result in positive feedback.” See at least [90].

Regarding Claim 3
	As shown in the rejection above, the combination of Battles, Benhammou and Du disclosed the limitations of claim 2.

Battles and Benhammou do not teach wherein the memory further comprises a reward controller module that, when executed by the one or more processors, causes the one or more processors to generate a reward based on a reward function; however, Du does explicitly teach:
wherein the memory further comprises a reward controller module that, when executed by the one or more processors, causes the one or more processors to generate a reward based on a reward function (“A mapping between actions and rewards and/or punishments may be stored in the memory 208. The actions performed that yielded high or low rewards may also be stored in the memory 208. As an example, rewards may be generated based on a predetermined reward function. The rewards may be determined by, for example, the RLP module 206 or some other module of the vehicle. An agent may receive and/or generate a reward for each action performed by the agent. A reward may be in the form of a scoring value (e.g., a value between 0 and 1 or a value between −1 and 1) for a particular set of one or more actions performed by a vehicle. In an embodiment, a positive value refers to a reward and a negative value refers to a punishment” [34])

and adjust one or more weights of the reinforcement-learning trained algorithm based on the reward. (“At 432, the RLP module 206, the target network module 204, and/or the target network 205 determines a loss value, which may be equal to a difference between Q.sub.target and Q.sub.eval of the evaluation and target networks 203… Back propagation is used to train the evaluation network 203, which includes updating the weights of the evaluation network 203.” [80-81]; The system updates the model weights, and, as shown in the RLP process, this update is based on the rewards given)
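
For illustration only, the relationship described above between the reward, the loss value (the difference between the target and evaluated Q values), and the weight update can be sketched as follows. This is a minimal sketch and not Du's implementation: a linear Q function stands in for the evaluation network, and the function names, the discount factor `gamma`, and the learning rate `lr` are assumptions made for the example.

```python
from typing import List, Tuple


def clip_reward(reward: float) -> float:
    # Keep the reward inside [-1, 1], the range mentioned in the cited passage.
    return max(-1.0, min(1.0, reward))


def td_update(
    weights: List[float],
    features: List[float],
    reward: float,
    q_target_next: float,
    gamma: float = 0.9,  # assumed discount factor
    lr: float = 0.1,     # assumed learning rate
) -> Tuple[List[float], float]:
    """One update step for a linear Q function (a stand-in for a network)."""
    # Evaluated Q value for the current state-action features.
    q_eval = sum(w * x for w, x in zip(weights, features))
    # Target Q value: clipped reward plus discounted estimate for the next state.
    q_target = clip_reward(reward) + gamma * q_target_next
    # Loss value taken as the difference between target and evaluated values.
    loss = q_target - q_eval
    # Gradient step on 0.5 * loss**2 with respect to each weight.
    new_weights = [w + lr * loss * x for w, x in zip(weights, features)]
    return new_weights, loss
```

Because the reward enters the target value, and the target value enters the loss, every weight adjustment in this sketch depends on the reward, which is the dependency the rejection relies on.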

Regarding claim 10:
	As shown in the rejection above, the combination of Battles and Benhammou disclosed the limitations of claim 7.
Claim 10 recites a method having substantially the same limitation as claim 2 above, therefore it is rejected for the same reason as claim 2.

Regarding claim 11:
	As shown in the rejection above, the combination of Battles, Benhammou and Du disclosed the limitations of claim 10. Claim 11 recites a method having substantially the same limitation as claim 3 above, therefore it is rejected for the same reason as claim 3.

Regarding claim 17:
	As shown in the rejection above, the combination of Battles and Benhammou disclosed the limitations of claim 14.
Claim 17 recites a system having substantially the same limitation as claim 2 above, therefore it is rejected for the same reason as claim 2.

Regarding Claim 19
	As shown in the rejection above, the combination of Battles, Benhammou and Du disclosed the limitations of claim 17.

Battles and Benhammou do not teach wherein the reinforcement-learning trained algorithm is a deep reinforcement-learning trained algorithm. However, Du does explicitly teach:

wherein the reinforcement-learning trained algorithm is a deep reinforcement-learning trained algorithm (“In an embodiment, the collaborative RLP algorithm is a model free deep reinforcement learning method” [67])


Claim(s) 4, 12, & 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Battles, Benhammou, and Du in further view of Emam (U.S. Pub No. 20070282519)

Regarding claim 4
	As shown in the rejection above, the combination of Battles, Benhammou, and Du disclosed the limitations of claim 3.

Du further teaches the reward function (“The collaboration enabling module 226 and/or the policy module 228 monitor reward values for collaboration purposes. As an example, Table 1 below includes example events shown in reward based ranking from a most negative ranking (or worst reward value) to a most positive ranking (or best reward value). The precise reward values is a matter of tuning. The rewards may be constrained to be in a range of [−1,1]…
TABLE 1
Reward Ranking | Event
Most Negative −−− | Colliding with another vehicle on the treadmill (or road).
−− | Impacting the side of the treadmill (or road) by moving off the treadmill (or road) to the left or the right excluding the situation of a planned exit from the treadmill (or road).
−− | Falling off the treadmill (or road) from the front or back.” [62]; Du negatively impacts the reward function of the learning algorithm when negative actions are taken by the ego vehicle)

Du does not teach [wherein the reward function] is based on at least one of:
an average speed of the one or more anomaly vehicles;
a number of lane changes of the one or more anomaly vehicles;
a number of braking events of the one or more anomaly vehicles; and
a number of acceleration events of the one or more anomaly vehicles.
However, Emam does explicitly teach:

wherein [the reward function is based] on at least one of:
an average speed of the one or more anomaly vehicles; (“an abnormal situation on the road is detected (e.g., a car is speeding above 70 miles per hour)” [61]; Here the vehicle is sensed as abnormal because its detected average speed over a time frame was found to be above a certain speed)
a number of lane changes of the one or more anomaly vehicles; 
a number of braking events of the one or more anomaly vehicles; and
a number of acceleration events of the one or more anomaly vehicles. 

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Battles, Benhammou, and Du to include the teachings of Emam, so as to negatively impact the reward function of the learning algorithm when the sensed anomaly vehicle is detected executing an unwanted action, such as having an average speed outside of a desired range. In combination with Emam, “The reported data is collected and analyzed in a central traffic management system in order to take appropriate actions.” See at least [1].
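
For illustration only, a reward function of the kind discussed above, one that is reduced when the anomaly vehicle is observed performing unwanted actions, can be sketched as follows. This is a minimal sketch and is not taken from Du, Emam, or the application: the function name `anomaly_reward`, the penalty sizes, and the lower speed bound are hypothetical assumptions. Only the 70 miles per hour figure echoes the example quoted from Emam, and the [−1, 1] bound echoes the range quoted from Du.

```python
from typing import Tuple


def anomaly_reward(
    avg_speed_mph: float,
    lane_changes: int,
    braking_events: int,
    acceleration_events: int,
    speed_range: Tuple[float, float] = (25.0, 70.0),  # assumed desired range
    speed_penalty: float = 0.5,                       # assumed penalty size
    penalty_per_event: float = 0.1,                   # assumed penalty size
) -> float:
    """Reward for the controlled vehicles based on anomaly-vehicle behavior."""
    low, high = speed_range
    penalty = 0.0
    # Penalize an average speed outside the desired range.
    if not (low <= avg_speed_mph <= high):
        penalty += speed_penalty
    # Penalize each counted lane change, braking event, and acceleration event.
    penalty += penalty_per_event * (
        lane_changes + braking_events + acceleration_events
    )
    # Start from the best reward and constrain the result to [-1, 1].
    return max(-1.0, min(1.0, 1.0 - penalty))
```

The claim language requires only "at least one of" the listed quantities; the sketch uses all four simply to show how each could lower the reward.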

Regarding claim 12:
	As shown in the rejection above, the combination of Battles, Benhammou, and Du disclosed the limitations of claim 11. Claim 12 recites a method having substantially the same limitation as claim 4 above, therefore it is rejected for the same reason as claim 4.

Regarding claim 18

	As shown in the rejection above, the combination of Battles, Benhammou, and Du disclosed the limitations of claim 17.

		Du further teaches: 
wherein the reinforcement-learning trained algorithm was trained to maximize a reward function, (“Reinforcement learning is a family of decision making methods for finding an optimal policy for taking actions in an MDP to maximize expected rewards over time using only experience and rewards.” [59])

	The remainder of Claim 18 recites a system having substantially the same limitation as claim 4 above, therefore it is rejected for the same reason as claim 4.


Conclusion             
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
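A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.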
                                                                                                         
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MATTHEW PARULSKI whose telephone number is (571)272-5922.  The examiner can normally be reached on Mon-Fri 8:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, James J Lee can be reached on (571) 270-5965.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JAMES J LEE/
Supervisory Patent Examiner, Art Unit 3668