Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they do not include the following reference sign(s) mentioned in the description.
“sensor data 160”, listed in ¶[0028]
“a vehicle 306”, listed in ¶[0033]
“ordinal 6”, “ordinal 1”, listed in ¶[0052]
“FIG. 1B”, listed in ¶[0073]
“1215 a feature vector”, “1225 statistical distribution of the predicted output values”, “1230 the statistical distribution of the predicted output values”, listed in ¶[0079]

The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference character(s) not mentioned in the description.  
“1030”, “1035”, listed in FIG. 10

The drawings are objected to as failing to comply with 37 CFR 1.84(p)(4) because reference characters "112" in FIG. 1 and "82" in ¶[0074] have both been used to designate the “model training system”.  

Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.


Specification
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.

The disclosure is objected to because of the following informalities.  
¶[0052] of the specification refers to ordinal values in FIG. 6; however, there is no “ordinal 4” or “ordinal 1” in FIG. 6
Paragraph numbering starting on page 37 is inconsistent
¶[00100] starting on page 37 ends on page 44
¶[0100] starts on page 44 and contains repeated paragraph numbers
¶[00117] continues out of order on page 44
Appropriate correction is required.


Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 4-5, 8, 13-15 and 18-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Regarding claim 4, the phrase “the statistical distribution is a histogram of a rate of occurrence of each ordinal value in user responses” renders the claim indefinite because it is unclear what is meant by “a histogram of a rate of occurrence”. To a person having ordinary skill in the art a histogram has an accepted meaning of a graph that shows the frequency of numerical data using rectangles.  
Where applicant acts as his or her own lexicographer to specifically define a term of a claim contrary to its ordinary meaning, the written description must clearly redefine the claim term and set forth the uncommon definition so as to put one reasonably skilled in the art on notice that the applicant intended to so redefine that claim term. Process Control Corp. v. HydReclaim Corp., 190 F.3d 1350, 1357, 52 USPQ2d 1029, 1033 (Fed. Cir. 1999). The term “histogram” in claim 4 is used by the claim to mean “a distribution” while the accepted meaning is “a graphical representation.” The term is indefinite because the specification does not clearly redefine the term.
For the purposes of examination, the examiner will take “the statistical distribution is a histogram of a rate of occurrence of each ordinal value in user responses” as — the statistical distribution is  a rate of occurrence of each ordinal value in user responses —, based on ¶[0085] of the specification.
Dependent claim 5 inherits and does not cure the deficiencies of claim 4 and is therefore rejected on the same basis as outlined above.

Regarding claim 5, the phrase “wherein the statistical distribution of user responses having a user response time above the threshold value is a first histogram and the statistical distribution of user responses having a user response time below the threshold value is a second histogram, wherein the measure of difference is an aggregate of the differences of the rate of occurrences of the ordinal values between the first histogram and the second histogram” renders the claim indefinite because it is unclear what is meant by “a first histogram... a second histogram... the measure of difference is an aggregate of the differences of the rate of occurrences of the ordinal values between the first histogram and the second histogram”. To a person having ordinary skill in the art a histogram has an accepted meaning of a graph that shows the frequency of numerical data using rectangles.  
Where applicant acts as his or her own lexicographer to specifically define a term of a claim contrary to its ordinary meaning, the written description must clearly redefine the claim term and set forth the uncommon definition so as to put one reasonably skilled in the art on notice that the applicant intended to so redefine that claim term. Process Control Corp. v. HydReclaim Corp., 190 F.3d 1350, 1357, 52 USPQ2d 1029, 1033 (Fed. Cir. 1999). The term “histogram” in claim 5 is used by the claim to mean “a distribution” while the accepted meaning is “a graphical representation.” The term is indefinite because the specification does not clearly redefine the term.
For the purposes of examination, the examiner will take “wherein the statistical distribution of user responses having a user response time above the threshold value is a first histogram and the statistical distribution of user responses having a user response time below the threshold value is a second histogram, wherein the measure of difference is an aggregate of the differences of the rate of occurrences of the ordinal values between the first histogram and the second histogram” as —wherein the statistical distribution of user responses having a user response time above the threshold value is a first  distribution and the statistical distribution of user responses having a user response time below the threshold value is a second  distribution, wherein the measure of difference is an aggregate of the differences of the rate of occurrences of the ordinal values between the first  distribution and the second  distribution—, based on ¶[0085] of the specification.
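For illustration only, the interpretation adopted above for claims 4 and 5 can be sketched in code: each "histogram" is treated as a distribution giving the rate of occurrence of each ordinal value, and the measure of difference is an aggregate over the per-ordinal differences. The sketch below is a minimal Python example under stated assumptions. The 1-to-5 ordinal scale, the sample responses, the function names, and the use of a sum of absolute differences as the aggregate are all hypothetical and are not taken from the application or the cited references.

```python
from collections import Counter

ORDINALS = [1, 2, 3, 4, 5]  # hypothetical ordinal scale (assumption)


def rate_of_occurrence(responses, ordinals=ORDINALS):
    """Return the fraction of responses equal to each ordinal value."""
    counts = Counter(responses)
    total = len(responses)
    if total == 0:
        return {o: 0.0 for o in ordinals}
    return {o: counts.get(o, 0) / total for o in ordinals}


def aggregate_difference(first, second):
    """Sum of absolute per-ordinal differences (an assumed aggregate)."""
    return sum(abs(first[o] - second[o]) for o in first)


# Hypothetical responses split by response time relative to a threshold.
above_threshold = [1, 1, 2, 5]
below_threshold = [5, 5, 4, 5]

first_distribution = rate_of_occurrence(above_threshold)
second_distribution = rate_of_occurrence(below_threshold)
measure = aggregate_difference(first_distribution, second_distribution)
print(first_distribution, second_distribution, measure)
```

In this reading, no graphical rendering is required; the two distributions are compared numerically, which is consistent with the construction taken for examination.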

Regarding claim 8, the phrase “wherein the hidden context represents a degree of awareness of the autonomous vehicle by a user represented by the traffic entity” renders the claim indefinite because it is unclear whether the degree of awareness is attributed to the autonomous vehicle or the user represented by the traffic entity. 
For the purposes of examination, the examiner will take “wherein the hidden context represents a degree of awareness of the autonomous vehicle by a user represented by the traffic entity” as — wherein the hidden context represents a degree of awareness of  a user represented by the traffic entity about the autonomous vehicle —, based on ¶[0064] of the specification.

Claim 13 recite(s) “the particular threshold”, “the determined measure of differences”, “the plurality of thresholds”, which lack antecedent basis. For the purposes of examining, the examiner will take the preamble of claim 13 “The non-transitory computer readable storage medium of claim 11” as —The non-transitory computer readable storage medium of claim [[11]] 12—, with reference to the recited terms in claim 12, based on similar claims 2 and 3.
Dependent claim 14 inherits and does not cure the deficiencies of claim 13 and is therefore rejected on the same basis as outlined above.

Regarding claim 14, the phrase “the statistical distribution is a histogram of a rate of occurrence of each ordinal value in user responses” renders the claim indefinite because it is unclear what is meant by “a histogram of a rate of occurrence”. To a person having ordinary skill in the art a histogram has an accepted meaning of a graph that shows the frequency of numerical data using rectangles.  
Where applicant acts as his or her own lexicographer to specifically define a term of a claim contrary to its ordinary meaning, the written description must clearly redefine the claim term and set forth the uncommon definition so as to put one reasonably skilled in the art on notice that the applicant intended to so redefine that claim term. Process Control Corp. v. HydReclaim Corp., 190 F.3d 1350, 1357, 52 USPQ2d 1029, 1033 (Fed. Cir. 1999). The term “histogram” in claim 14 is used by the claim to mean “a distribution” while the accepted meaning is “a graphical representation.” The term is indefinite because the specification does not clearly redefine the term.
For the purposes of examination, the examiner will take “the statistical distribution is a histogram of a rate of occurrence of each ordinal value in user responses” as — the statistical distribution is  a rate of occurrence of each ordinal value in user responses —, based on ¶[0085] of the specification.

Claim 15 recite(s) “the machine learning based model” which lacks antecedent basis. For the purposes of examining, the examiner will take “the machine learning based model” as — the  neural network—, with reference to the recited term in claim 11, based on similar claim 10.

Claim 18 recite(s) “the particular threshold”, “the determined measure of differences”, “the plurality of thresholds”, which lacks antecedent basis. For the purposes of examining, the examiner will take the preamble of claim 18 “The computer system of claim 16” as — The computer system of claim [[16]] 17—, with reference to the recited terms in claim 17, based on similar claims 2 and 3.
Dependent claim 19 inherits and does not cure the deficiencies of claim 18 and is therefore rejected on the same basis as outlined above.

Regarding claim 19, the phrase “the statistical distribution is a histogram of a rate of occurrence of each ordinal value in user responses” renders the claim indefinite because it is unclear what is meant by “a histogram of a rate of occurrence”. To a person having ordinary skill in the art a histogram has an accepted meaning of a graph that shows the frequency of numerical data using rectangles.  
Where applicant acts as his or her own lexicographer to specifically define a term of a claim contrary to its ordinary meaning, the written description must clearly redefine the claim term and set forth the uncommon definition so as to put one reasonably skilled in the art on notice that the applicant intended to so redefine that claim term. Process Control Corp. v. HydReclaim Corp., 190 F.3d 1350, 1357, 52 USPQ2d 1029, 1033 (Fed. Cir. 1999). The term “histogram” in claim 19 is used by the claim to mean “a distribution” while the accepted meaning is “a graphical representation.” The term is indefinite because the specification does not clearly redefine the term.
For the purposes of examination, the examiner will take “the statistical distribution is a histogram of a rate of occurrence of each ordinal value in user responses” as — the statistical distribution is  a rate of occurrence of each ordinal value in user responses —, based on ¶[0085] of the specification.

Claim 20 recite(s) “the machine learning based model” which lacks antecedent basis. For the purposes of examining, the examiner will take “the machine learning based model” as — the  neural network—, with reference to the recited term in claim 16, based on similar claim 10.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-4, 6-7, and 9-20 are rejected under 35 U.S.C. 103 as being unpatentable over Huval (US 2018/0373980 A1) in view of Zhang et al. (US 2019/0072966 A1) and Dernoncourt et al. (US 2019/0384807 A1), hereinafter Huval, Zhang, and Dernoncourt, respectively.

Regarding claim 1, Huval discloses:
A method comprising:
(Huval,
¶[0001]: “This invention relates... to a new and useful method for training and refining an artificial intelligence in the field of autonomous vehicles”)

receiving, by one or more autonomous vehicles, sensor data from sensors mounted on each autonomous vehicle; 
(Huval, FIG. 1; 
¶[0008]: “...receiving a first optical image recorded by an optical sensor integrated into a road vehicle in Block S130”;
¶[0011]: “...the computer system can execute Blocks of the method S100 to automatically: collect a large volume of optical data generated at various road vehicles (e.g., manually-operated and/or autonomous vehicles)”;
¶[0016]: “The method S100 can be executed by a computer system (e.g., a remote server) in conjunction with an autonomous vehicle. The autonomous vehicle can include: a suite of sensors configured to collect information about the autonomous vehicle's environment...”;
Where the method includes receiving, by road vehicles, i.e. autonomous vehicles (receiving, by one or more autonomous vehicles), optical image data (sensor data) from a suite of sensors integrated into the road vehicle (from sensors mounted on each autonomous vehicle))

storing a plurality of images extracted from the sensor data, each image displaying a traffic entity; 
(Huval, FIG. 1; FIG. 2B; FIG. 3;
¶[0018]: “The autonomous vehicle can also implement one or more local neural networks to process LIDAR feeds (i.e., sequences of LIDAR images), video feeds (or sequences of color photographic images), and/or other sensor data... the neural network can also detect and characterize dynamic objects—such as other vehicles, pedestrians, and cyclists—in the LIDAR and video feeds”;
¶[0021]: “...in Block S110, the remote computer system aggregates labeled optical data, such as stored in a remote database, into a training set on which a localization, perception, motion planning, and/or other neural network can then be trained”;
¶[0022]: “For example, the remote computer system can access: still LIDAR images, still color photographic images, LIDAR feeds, and/or video feeds recorded by multiple (e.g., a fleet) autonomous vehicles and/or manually-operated vehicles while in operation over time; and labels for static (e.g., localization) objects, dynamic (e.g., perception) objects...”;
Where the method includes aggregating labeled optical data and storing the labeled optical data in a remote database (storing a plurality of images extracted from the sensor data), where the labels correspond to static or dynamic objects the autonomous vehicle encounters during operation, i.e. traffic entities (each image displaying a traffic entity))

for each of the plurality of images, the image displaying a traffic entity: 
(Huval, FIG. 1; FIG. 2B; FIG. 3;
¶[0018]: “The autonomous vehicle can also implement one or more local neural networks to process LIDAR feeds (i.e., sequences of LIDAR images), video feeds (or sequences of color photographic images), and/or other sensor data... the neural network can also detect and characterize dynamic objects—such as other vehicles, pedestrians, and cyclists—in the LIDAR and video feeds”;
¶[0021]: “...in Block S110, the remote computer system aggregates labeled optical data, such as stored in a remote database, into a training set on which a localization, perception, motion planning, and/or other neural network can then be trained”;
¶[0022]: “For example, the remote computer system can access: still LIDAR images, still color photographic images, LIDAR feeds, and/or video feeds recorded by multiple (e.g., a fleet) autonomous vehicles and/or manually-operated vehicles while in operation over time; and labels for static (e.g., localization) objects, dynamic (e.g., perception) objects...”;
Where the method includes aggregating labeled optical data and storing the labeled optical data in a remote database, where the labels correspond to static or dynamic objects the autonomous vehicle encounters during operation, i.e. traffic entities (for each of the plurality of images, the image displaying a traffic entity))

sending the image for presentation to a set of users, and 
(Huval, FIG. 1; FIG. 2B; FIG. 3; FIG. 4;
¶[0009]: “...receiving a first manual label attributed to the first optical image by a first human annotator at the local computer system in Block S142; in response to the first manual label differing from the first automated label, serving the first optical image, the first manual label, and the first automated label to a set of annotation portals for manual confirmation of one of the first manual label and the first automated label for the first optical image by a set of human annotators in Block S150 and receiving confirmations of one of the first manual label and the first automated label for the first optical image from the set of human annotators in Block S152...”;
Where the first optical image, i.e. each image of the aggregated optical data, is sent to human annotators in block S150 (sending the image for presentation to a set of users))

for each of the set of users, receiving a user response describing a hidden context attribute [for the traffic entity, wherein each user response is associated with a user response time]; 
(Huval, FIG. 1; FIG. 2B; FIG. 3; FIG. 4;
¶[0079]: “...the remote computer system can return the optical image to the same human annotator and/or to other human annotators for confirmation of the object type and location of a label attributed to the optical image in Block S150...”;
¶[0099]: “As shown in FIGS. 4 and 5, one variation of the method S100 for training and refining an artificial intelligence includes: accessing a training set including discrete sequences of optical images in Block S110, each discrete sequence of optical images in the training set including a label linked to a navigational action represented in the discrete sequence of optical images... appending the training set... in Block S114; and retraining the neural network, with the training set, to identify navigational actions in discrete sequences of optical images in Block S124”;
¶[0107]: “...the annotation portal: enables the human annotator to activate and deactivate various predefined navigational actions...”;
Where the method includes for each of the human annotators (for each of the set of users) determining a navigational action (receiving a user response describing a hidden context attribute [...]) based on the optical data)

[...]
generating training data set using the [...] user responses; 
(Huval, FIG. 1; FIG. 2B; FIG. 3; FIG. 4;
¶[0099]: “As shown in FIGS. 4 and 5, one variation of the method S100 for training and refining an artificial intelligence includes: ... appending the training set with the first sequence of optical images including one of the first manual label and the first automated label based on confirmation received from the set of human annotators in Block S114; and retraining the neural network, with the training set, to identify navigational actions in discrete sequences of optical images in Block S124”;
Where the method includes appending the training set with the optical images annotated by the set of human annotators (generating training data set using the [...] user responses))

training a neural network using the training data set, the neural network configured to receive an input image and predict the hidden context attribute for the input image; 
(Huval, FIG. 1; FIG. 2B; FIG. 3; FIG. 4;
¶[0099]: “As shown in FIGS. 4 and 5, one variation of the method S100 for training and refining an artificial intelligence includes: ... appending the training set with the first sequence of optical images including one of the first manual label and the first automated label based on confirmation received from the set of human annotators in Block S114; and retraining the neural network, with the training set, to identify navigational actions in discrete sequences of optical images in Block S124”;
Where the method includes appending the training set with the optical images annotated by the set of human annotators and retraining the neural network (training a neural network using the training data set), where the neural network is configured to receive optical images (the neural network configured to receive an input image) and identify navigational actions for the optical images (and predict the hidden context attribute for the input image))

navigating an autonomous vehicle, based on the neural network.  
(Huval, FIG. 1; FIG. 2B; FIG. 3; FIG. 4; ¶[0118]-¶[0120];
¶[0096]: “The remote computer system can then push a copy of the neural network (or a simplified version of the neural network) to the road vehicle for implementation during autonomous execution of a subsequent route”;
Where the method includes appending the training set with the optical images annotated by the set of human annotators, retraining the neural network, and navigating an autonomous vehicle along a route (navigating an autonomous vehicle) based on the retrained neural network (based on the neural network)).
Huval fails to disclose receiving a user response describing a hidden context attribute for the traffic entity, wherein each user response is associated with a user response time, 
determining a threshold value for user response times based on a difference between a statistical distribution of user responses having a user response time above the threshold value and a statistical distribution of user responses having a user response time below the threshold value; 
selecting a subset of user responses having a user response time above the determined threshold value; and 
generating training data set using the subset of user responses, the limitations bolded for emphasis.

However, in the same field of endeavor, Zhang teaches:
[for each of the set of users, receiving a user response describing a hidden context attribute] for the traffic entity, [...]; 
(Zhang, FIG. 2; FIG. 4; FIG. 5; 
¶[0048]: “The training data collection system 201 can thereby collect actual trajectories of vehicles and corresponding ground truth data under different scenarios and different driver actions and intentions in a context... The driver actions, behaviors, and intentions can correspond to a driver's short term driving goals, such as turning left or right, accelerating or decelerating, merging, making right turn at an intersection, making a U-turn, and the like”;
¶[0049]: “...the prediction-based trajectory planning system 202 can use the trained trajectory prediction module 175 and the real-world perception data 210 (shown in FIG. 6) in the operational phase to generate proximate vehicle or object trajectories”;
¶[0051]: “The directionality and rate behaviors of the proximate vehicles can be used when the training data is generated to enable the trajectory prediction module 175 to learn and thus, predict the likely behavior of proximate vehicles based on the human driving data embodied in the training data. For example, images included in the training data can be labeled using human labelers or automated processes to associate a label having behavior and direction information with each instance of a vehicle in the training data”;
Where the training data collection system 201 includes training data labeled by human labelers ([for each of the set of users, receiving a user response]) that includes a context or intention, such as turning, merging, accelerating, etc. ([describing a hidden context attribute]), of vehicles and objects proximate to the autonomous vehicle (for the traffic entity)).

It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine the method of Huval with the features taught by Zhang because “...the unexpected behavior of a proximate dynamic obstacle may result in a collision with the conventional autonomous vehicle” (Zhang, ¶[0005]) and “...Once the predicted behavior of proximate vehicles and objects is determined, the example embodiments can use a motion planning process to generate the predicted trajectory for each of the proximate vehicles. The predicted trajectory for each of the proximate vehicles can be compared with the desired or proposed trajectory for the host vehicle and potential conflicts can be determined. The trajectory for the host vehicle can be modified to avoid the potential conflicts with the proximate vehicles.” (Zhang, ¶[0044]).

The combination of Huval and Zhang fails to teach [...] wherein each user response is associated with a user response time, 
determining a threshold value for user response times based on a difference between a statistical distribution of user responses having a user response time above the threshold value and a statistical distribution of user responses having a user response time below the threshold value; 
selecting a subset of user responses having a user response time above the determined threshold value; and 
generating training data set using the subset of user responses, the limitations bolded for emphasis.

However, in the same field of endeavor (generating training data for neural networks), Dernoncourt teaches:
[... ] wherein each user response is associated with a user response time; 
(Dernoncourt, FIG. 1; FIG. 2; FIG. 4B; FIG. 12; 
¶[0023]: “...the digital document annotation system can determine annotator performance data such as annotator questions, annotator responses, time spent by the annotators”;
¶[0089]: “...as shown in FIG. 4B, the digital document annotation system 110 can utilize the annotator client device 104a to track time periods of an annotator as annotation performance data...”;
¶[0094]: “Moreover, the digital document annotation system 110 can utilize the annotator client device 104a to track start times and completion times for the electronic document review...”;
Where the digital annotation system associates annotator responses with the time spent by annotators ([... ] wherein each user response is associated with a user response time))

determining a threshold value for user response times based on a difference between a statistical distribution of user responses having a user response time above the threshold value and a statistical distribution of user responses having a user response time below the threshold value; 
(Dernoncourt,  FIG. 1; FIG. 2; FIG. 4B; FIG. 12; 
¶[0007]: “...the disclosed systems utilize user topic preferences to provide electronic documents to annotators (e.g., users of computing devices providing crowdsourced annotations for one or more documents)...”;
¶[0122]: “...the digital document annotation system 110 can utilize time thresholds to generate a final dataset of digital annotations... the digital document annotation system 110 can compare the time an annotator spent reviewing the entire electronic document to a time threshold that the digital document annotation system 110 has determined to be a minimum amount of time for an adequate review of the electronic document...”;
¶[0125]: “...the digital document annotation system 110 can generate a histogram of time periods for an annotator and compare the histogram to a distribution of review threshold time for the electronic document...”;
Where the digital document annotation system 110 determines a minimum amount of time for adequate review (determining a threshold value for user response times) based on a distribution of review threshold time for the electronic document and whether the annotator’s review time for the document is above or below the minimum review time (based on a difference between a statistical distribution of user responses having a user response time above the threshold value and a statistical distribution of user responses having a user response time below the threshold value))
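To make the claim limitation being mapped here concrete, the following is a minimal Python sketch of one way a threshold could be determined from the difference between the above-threshold and below-threshold response distributions. It illustrates the claim language only and is not the procedure disclosed by Dernoncourt or by the application. The candidate thresholds, the sample data, the treatment of a response time equal to the threshold as "below", and the rule of selecting the candidate with the largest difference are all assumptions made for illustration.

```python
from collections import Counter

ORDINALS = [1, 2, 3, 4, 5]  # hypothetical ordinal scale (assumption)


def distribution(values, ordinals=ORDINALS):
    """Rate of occurrence of each ordinal value, as a list aligned with ordinals."""
    counts = Counter(values)
    n = len(values)
    return [counts.get(o, 0) / n if n else 0.0 for o in ordinals]


def choose_threshold(responses, candidates):
    """Pick a response-time threshold from the candidates.

    responses: list of (ordinal_value, response_time) pairs.
    Assumption: the selected candidate is the one whose above/below
    distributions differ most (sum of absolute differences).
    """
    best_threshold, best_difference = None, -1.0
    for t in candidates:
        above = [v for v, rt in responses if rt > t]
        below = [v for v, rt in responses if rt <= t]
        if not above or not below:
            continue  # a one-sided split gives no meaningful comparison
        d = sum(abs(a - b) for a, b in zip(distribution(above), distribution(below)))
        if d > best_difference:
            best_threshold, best_difference = t, d
    return best_threshold, best_difference


# Hypothetical (ordinal value, response time in seconds) pairs.
responses = [(5, 0.2), (1, 0.3), (3, 0.4), (2, 1.5), (2, 2.0), (3, 2.5)]
threshold, difference = choose_threshold(responses, candidates=[0.35, 1.0, 2.2])
print(threshold, difference)
```

Other selection rules (for example, choosing the smallest candidate at which the difference falls under some bound) would also fit the words "based on a difference"; the sketch takes no position on which rule the claims require.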

selecting a subset of user responses having a user response time above the determined threshold value;
(Dernoncourt,  FIG. 1; FIG. 2; FIG. 4B; FIG. 5; FIG. 12; 
¶[0063]: “...the digital document annotation system 110 can filter annotators and corresponding annotations based on whether annotation performance data satisfies one or more performance thresholds...”;
¶[0119]: “...as shown in FIG. 5, the digital document annotation system 110 receives annotator digital annotations 502a-502d and annotation performance data 504a-504d and utilizes an annotation performance data filter 506 to generate a final set of digital annotations 508 for the electronic document”;
¶[0122]: “...the digital document annotation system 110 can compare the time an annotator spent reviewing the entire electronic document to a time threshold that the digital document annotation system 110 has determined to be a minimum amount of time for an adequate review of the electronic document...”;
Where the digital document annotation system 110 filters annotators and corresponding annotations based on whether the annotator spent a minimum threshold amount of time reviewing the document, i.e. only selects annotators and corresponding annotations based on the annotator review time meeting or exceeding a minimum review time (selecting a subset of user responses having a user response time above the determined threshold value))

[generating training data set using the] subset of [user responses]; 
(Dernoncourt,  FIG. 1; FIG. 2; FIG. 4B; FIG. 5; FIG. 12; 
¶[0063]: “...the digital document annotation system 110 can filter annotators and corresponding annotations based on whether annotation performance data satisfies one or more performance thresholds...”;
¶[0119]: “...as shown in FIG. 5, the digital document annotation system 110 receives annotator digital annotations 502a-502d and annotation performance data 504a-504d and utilizes an annotation performance data filter 506 to generate a final set of digital annotations 508 for the electronic document”;
¶[0063]: “...the digital document annotation system 110 performs an act 206 of generating a final set of digital annotations based on annotation performance data...”;
¶[0064]: “...the digital document annotation system 110 performs an act 208 of training a neural network. For example, the digital document annotation system 110 can provide the generated final set of digital annotations (i.e., as ground-truth digital annotations) and the electronic documents corresponding to the final set of digital annotations to a neural network...”;
Where the final dataset is used to train a neural network ([generating training data set]) with only the annotators and corresponding annotations that met the minimum review time ([using the] subset of [user responses])).
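
For illustration only, the filtering step mapped above can be sketched as follows. This is a minimal sketch under stated assumptions, not Dernoncourt's implementation or the claimed method: the names `UserResponse`, `response_time_s`, `select_subset`, and `build_training_set` are hypothetical, and the comparison uses "meets or exceeds" to follow the mapping above.

```python
from dataclasses import dataclass


@dataclass
class UserResponse:
    annotator_id: str
    label: str
    response_time_s: float  # hypothetical field: time the annotator spent, in seconds


def select_subset(responses, threshold_s):
    """Keep only responses whose response time meets or exceeds the threshold."""
    return [r for r in responses if r.response_time_s >= threshold_s]


def build_training_set(responses, threshold_s):
    """Build (annotator, label) training pairs from the filtered subset only."""
    return [(r.annotator_id, r.label) for r in select_subset(responses, threshold_s)]


# Made-up example data; the values carry no meaning beyond the illustration.
responses = [
    UserResponse("a1", "crossing", 1.2),
    UserResponse("a2", "waiting", 4.8),
    UserResponse("a3", "crossing", 6.1),
]
training_set = build_training_set(responses, threshold_s=4.0)
```

In this sketch the response from annotator "a1" is excluded because its response time falls below the assumed 4.0 second threshold.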

It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine the method of Huval and Zhang with the features taught by Dernoncourt because “...conventional electronic document management systems oftentimes utilize unreliable training data to test or train annotation models, which increases the time and computational resources to converge on a trained model” (Dernoncourt, ¶[0003]). “...the digital document annotation system can compare time spent by annotators to review time thresholds... In this manner, the digital document annotation system can determine reliability of annotators and corresponding annotations” (Dernoncourt, ¶[0028]) and “...by generating and utilizing reliable ground-truth annotation data, the digital document annotation system can train machine learning models with less training data and fewer training iterations. Consequently, the digital document annotation system can utilize less time and computational resources to accurately train a machine learning model” (Dernoncourt, ¶[0031]).


Regarding claim 2, Huval, Zhang, and Dernoncourt teach the method of claim 1. Dernoncourt further teaches:
wherein the threshold value is a particular threshold value, wherein determining the threshold value comprises: 
(Dernoncourt,  FIG. 1; FIG. 2; FIG. 4B; FIG. 12; 
¶[0123]: “...an annotator may have a recorded time of five minutes for the time spent reviewing the electronic document. The digital document annotation system 110 can compare this annotation performance data to a required threshold review time for the electronic document to determine if the annotator is reliable and/or accurate. For instance, the digital document annotation system 110 can determine that a minimum threshold review time for the electronic document is twenty-five minutes”;
Where the minimum threshold time is, for example, twenty-five minutes (wherein the threshold value is a particular threshold value))

determining a plurality of threshold values for user response times; 
(Dernoncourt,  FIG. 1; FIG. 2; FIG. 4B; FIG. 12; 
¶[0122]: “...the digital document annotation system 110 can utilize time thresholds to generate a final dataset of digital annotations... the digital document annotation system 110 can compare various time periods and/or timestamps from annotation performance data (collected in accordance to FIG. 4) of an annotator to predetermined time thresholds to determine if the annotator has produced reliable and/or accurate digital annotations”;
Where the digital document annotation system utilizes time thresholds for the time spent by the annotators to review the electronic document (determining a plurality of threshold values for user response times))

for each of the plurality of threshold values: 
determining a statistical distribution of user responses having a user response time above the threshold value and a statistical distribution of user responses having a user response time below the threshold value, and 
(Dernoncourt,  FIG. 1; FIG. 2; FIG. 4B; FIG. 12; ¶[0007];
¶[0063]: “...the digital document annotation system 110 can filter annotators and corresponding annotations based on whether annotation performance data satisfies one or more performance thresholds...”;
¶[0122]: “...the digital document annotation system 110 can utilize time thresholds to generate a final dataset of digital annotations... the digital document annotation system 110 can compare the time an annotator spent reviewing the entire electronic document to a time threshold that the digital document annotation system 110 has determined to be a minimum amount of time for an adequate review of the electronic document...”;
¶[0125]: “...the digital document annotation system 110 can generate a histogram of time periods for an annotator and compare the histogram to a distribution of review threshold time for the electronic document...”;
Where the digital document annotation system determines a minimum amount of time for adequate review based on a distribution of review threshold time for the electronic document and determines which annotators meet or exceed the minimum review time threshold (determining a statistical distribution of user responses having a user response time above the threshold value) and which annotators do not meet or exceed the minimum review time threshold (and a statistical distribution of user responses having a user response time below the threshold value))

determining a measure of difference between the statistical distribution of user responses having a user response time above the threshold value and the statistical distribution of user responses having a user response time below the threshold value; 
(Dernoncourt,  FIG. 1; FIG. 2; FIG. 4B; FIG. 12;
¶[0125]: “...the digital document annotation system 110 can generate a histogram of time periods for an annotator based on the annotation performance data 502a-502b to utilize in filtering the digital annotation data and the annotators. For example, the digital document annotation system 110 can generate a histogram of time periods for an annotator and compare the histogram to a distribution of review threshold time for the electronic document. In one or more embodiments, the digital document annotation system 110 can determine if the annotator is accurate and/or reliable by determining the amount of variation between the time period histogram and the distribution of review threshold times”;
Where the digital document annotation system determines a variation (determining a measure of difference) between the annotator’s time period histogram and the distribution of review threshold times in order to determine whether the annotator is accurate or reliable based on the minimum review time threshold (between the statistical distribution of user responses having a user response time above the threshold value and the statistical distribution of user responses having a user response time below the threshold value))

selecting the particular threshold value based on the determined measure of differences for the plurality of threshold values.  
(Dernoncourt,  FIG. 1; FIG. 2; FIG. 4B; FIG. 12;
¶[0123]: “...an annotator may have a recorded time of five minutes for the time spent reviewing the electronic document. The digital document annotation system 110 can compare this annotation performance data to a required threshold review time for the electronic document to determine if the annotator is reliable and/or accurate. For instance, the digital document annotation system 110 can determine that a minimum threshold review time for the electronic document is twenty-five minutes”;
¶[0125]: “...the digital document annotation system 110 can generate a histogram of time periods for an annotator based on the annotation performance data 502a-502b to utilize in filtering the digital annotation data and the annotators. For example, the digital document annotation system 110 can generate a histogram of time periods for an annotator and compare the histogram to a distribution of review threshold time for the electronic document. In one or more embodiments, the digital document annotation system 110 can determine if the annotator is accurate and/or reliable by determining the amount of variation between the time period histogram and the distribution of review threshold times”;
Where the minimum threshold time is, for example, twenty-five minutes (selecting the particular threshold value) determined based on the amount of variation between the annotator’s time period histogram and the distribution of review threshold times (based on the determined measure of differences for the plurality of threshold values)).
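
For illustration only, the sequence of steps recited in claim 2 can be sketched as follows. This is a sketch under stated assumptions and not a characterization of Dernoncourt or of the applicant's disclosure: the function names are hypothetical, total variation distance is an assumed choice of "measure of difference", and picking the candidate with the largest difference is an assumed selection rule. None of these choices is specified in the text above.

```python
from collections import Counter


def histogram(values, categories):
    """Rate of occurrence of each category among values (all zeros if empty)."""
    counts = Counter(values)
    n = len(values)
    return [counts[c] / n if n else 0.0 for c in categories]


def total_variation(p, q):
    """Assumed measure of difference between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def select_threshold(responses, candidates, categories):
    """responses: list of (response_time, value) pairs.

    For each candidate threshold, split responses into those above and those
    at or below it, compare the two distributions, and return the
    (threshold, difference) pair with the largest difference, or None if no
    candidate yields two non-empty groups.
    """
    best = None
    for t in candidates:
        above = [v for rt, v in responses if rt > t]
        below = [v for rt, v in responses if rt <= t]
        if not above or not below:
            continue  # a one-sided split has nothing to compare
        d = total_variation(histogram(above, categories),
                            histogram(below, categories))
        if best is None or d > best[1]:
            best = (t, d)
    return best


# Made-up example data: (response time in seconds, ordinal response value).
responses = [(1.0, 1), (1.5, 1), (2.0, 1), (5.0, 3), (6.0, 3), (7.0, 2)]
best = select_threshold(responses, candidates=[1.2, 3.0, 5.5], categories=[1, 2, 3])
```

With this made-up data, the candidate 3.0 separates the fast responses (all value 1) from the slow responses (values 2 and 3), so it yields the largest difference under the assumed measure.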


Regarding claim 3, Huval, Zhang, and Dernoncourt teach the method of claim 2. Dernoncourt further teaches:
wherein selecting the particular threshold value based on the determined measure of differences for each of the plurality of threshold values comprises selecting the particular threshold value responsive to determining that the determined measure of differences for the plurality of threshold values indicates more than a threshold amount of difference.  
(Dernoncourt,  FIG. 1; FIG. 2; FIG. 4B; FIG. 12;
¶[0123]: “...an annotator may have a recorded time of five minutes for the time spent reviewing the electronic document. The digital document annotation system 110 can compare this annotation performance data to a required threshold review time for the electronic document to determine if the annotator is reliable and/or accurate. For instance, the digital document annotation system 110 can determine that a minimum threshold review time for the electronic document is twenty-five minutes”;
¶[0125]: “...the digital document annotation system 110 can generate a histogram of time periods for an annotator based on the annotation performance data 502a-502b to utilize in filtering the digital annotation data and the annotators. For example, the digital document annotation system 110 can generate a histogram of time periods for an annotator and compare the histogram to a distribution of review threshold time for the electronic document. In one or more embodiments, the digital document annotation system 110 can determine if the annotator is accurate and/or reliable by determining the amount of variation between the time period histogram and the distribution of review threshold times”;
Where the minimum threshold time is, for example, twenty-five minutes (wherein selecting the particular threshold value based on the determined measure of differences for each of the plurality of threshold values) determined based on the amount of variation between the annotator’s time period histogram and the distribution of review threshold times (comprises selecting the particular threshold value responsive to determining that the determined measure of differences for the plurality of threshold values indicates more than a threshold amount of difference); where in order to select a threshold amount of time based on the variation, the system must inherently have a minimum variation value (a threshold amount of difference) that triggers the threshold value determination).


Regarding claim 4, Huval, Zhang, and Dernoncourt teach the method of claim 2. Huval further discloses:
wherein the user response is an ordinal value selected from a plurality of ordinal values and the statistical distribution is a histogram of a rate of occurrence of each ordinal value in user responses.  
(Huval, FIG. 1; FIG. 2B; FIG. 3; FIG. 4;
¶[0103]: “...the remote computer system can pass the video feed through the navigational neural network for attribution of an automated navigational label, such as representing one of various predefined navigational actions and states including: accelerating, coasting, actively braking, turning left, turning right, veering left, veering right, changing lanes, turning into a different lane, swerving, drifting out of a lane, wandering between lanes, stopped, reversing, clipping a curb, etc...”;
¶[0106]: “As in Blocks S140 and S142 described above, the remote computer system can then serve the new video feed to an annotation portal—executing on a local computer system—for manual labeling by a human annotator; and then receive a manual navigational label attributed to the new video feed by the human annotator”;
¶[0108]-¶[0109];
¶[0113]: “...as in Blocks S150 and S152 described above, the remote computer system can detect navigational label conflicts for a video feed, return this video feed to multiple human annotators for confirmation of these navigational labels or entire relabeling of the video feed with a new sequence of navigational actions, and then compile responses from these human annotators into a final sequence of navigational labels for the video feed”;
See claim 4 interpretation based on the rejection under 35 U.S.C. 112(b), above.
Where the method further includes the human annotator annotating a navigational action to the optical data including a speed context associated with the vehicle, such as stopped (wherein the user response is an ordinal value) selected from a plurality of speed contexts such as actively braking, coasting, cruising, accelerating (selected from a plurality of ordinal values), where the speed contexts represent ordinal values in an increasing amount of speed; and where, in the event of a conflict, the optical data is sent to multiple human annotators in order to confirm the labeling (and the statistical distribution is a histogram of a rate of occurrence of each ordinal value in user responses), where the additional human annotators confirming the labeling results in a dataset that includes a rate of occurrence of the labeled speed context, i.e.  a rate of occurrence of the ordinal value).
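
For illustration only, a histogram of the rate of occurrence of each ordinal value, as recited in claim 4, can be sketched as follows. The ordering of speed contexts in `SPEED_CONTEXTS` and the function name `ordinal_histogram` are assumptions made for the sketch; Huval does not state this ordering or this data structure.

```python
from collections import Counter

# Assumed ordering of speed contexts by increasing speed (hypothetical).
SPEED_CONTEXTS = ["stopped", "actively braking", "coasting", "cruising", "accelerating"]


def ordinal_histogram(labels, ordinal_values=SPEED_CONTEXTS):
    """Map each ordinal value to the fraction of labels equal to it."""
    counts = Counter(labels)
    total = len(labels)
    return {v: (counts[v] / total if total else 0.0) for v in ordinal_values}


# Made-up labels from five hypothetical annotators for the same video feed.
labels = ["stopped", "coasting", "coasting", "cruising", "coasting"]
hist = ordinal_histogram(labels)
```

Here three of the five hypothetical annotators chose "coasting", so that value has the highest rate of occurrence in the resulting histogram.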


Regarding claim 6, Huval, Zhang, and Dernoncourt teach the method of claim 1. Zhang further teaches:
wherein the hidden context represents a state of mind of a user represented by the traffic entity.  
(Zhang, FIG. 2; FIG. 4; FIG. 5; 
¶[0048]: “The training data collection system 201 can thereby collect actual trajectories of vehicles and corresponding ground truth data under different scenarios and different driver actions and intentions in a context... The driver actions, behaviors, and intentions can correspond to a driver's short term driving goals, such as turning left or right, accelerating or decelerating, merging, making right turn at an intersection, making a U-turn, and the like”;
¶[0049]: “...the prediction-based trajectory planning system 202 can use the trained trajectory prediction module 175 and the real-world perception data 210 (shown in FIG. 6) in the operational phase to generate proximate vehicle or object trajectories”;
¶[0051]: “The directionality and rate behaviors of the proximate vehicles can be used when the training data is generated to enable the trajectory prediction module 175 to learn and thus, predict the likely behavior of proximate vehicles based on the human driving data embodied in the training data. For example, images included in the training data can be labeled using human labelers or automated processes to associate a label having behavior and direction information with each instance of a vehicle in the training data”;
Where the training data collection system 201 includes training data labeled by human labelers that includes a driver behavior or intention, i.e. a state of mind (wherein the hidden context represents a state of mind of a user) of vehicles proximate to the autonomous vehicle (of a user represented by the traffic entity)).


Regarding claim 7, Huval, Zhang, and Dernoncourt teach the method of claim 1. Zhang further teaches:
wherein the hidden context represents a task that a user represented by the traffic entity is planning on accomplishing.  
(Zhang, FIG. 2; FIG. 4; FIG. 5; 
¶[0048]: “The training data collection system 201 can thereby collect actual trajectories of vehicles and corresponding ground truth data under different scenarios and different driver actions and intentions in a context... The driver actions, behaviors, and intentions can correspond to a driver's short term driving goals, such as turning left or right, accelerating or decelerating, merging, making right turn at an intersection, making a U-turn, and the like”;
¶[0049]: “...the prediction-based trajectory planning system 202 can use the trained trajectory prediction module 175 and the real-world perception data 210 (shown in FIG. 6) in the operational phase to generate proximate vehicle or object trajectories”;
¶[0051]: “The directionality and rate behaviors of the proximate vehicles can be used when the training data is generated to enable the trajectory prediction module 175 to learn and thus, predict the likely behavior of proximate vehicles based on the human driving data embodied in the training data. For example, images included in the training data can be labeled using human labelers or automated processes to associate a label having behavior and direction information with each instance of a vehicle in the training data”;
Where the training data collection system 201 includes training data labeled by human labelers that includes a short term goal such as turning, merging, accelerating, etc. (wherein the hidden context represents a task), of the driver of vehicles proximate to the autonomous vehicle (that a user represented by the traffic entity is planning on accomplishing)).


Regarding claim 9, Huval, Zhang, and Dernoncourt teach the method of claim 1. Zhang further teaches:
wherein the hidden context represents a goal of a user represented by the traffic entity, wherein the user expects to achieve the goal within a threshold time interval.  
(Zhang, FIG. 2; FIG. 4; FIG. 5; 
¶[0048]: “The training data collection system 201 can thereby collect actual trajectories of vehicles and corresponding ground truth data under different scenarios and different driver actions and intentions in a context... The driver actions, behaviors, and intentions can correspond to a driver's short term driving goals, such as turning left or right, accelerating or decelerating, merging, making right turn at an intersection, making a U-turn, and the like. The driver actions, behaviors, and intentions can also correspond to a set of driver or vehicle control actions to accomplish a particular short term driving goal”;
¶[0049]: “...the prediction-based trajectory planning system 202 can use the trained trajectory prediction module 175 and the real-world perception data 210 (shown in FIG. 6) in the operational phase to generate proximate vehicle or object trajectories”;
¶[0051];
¶[0054]: “...The trained trajectory prediction module 175 serves to enable generation of predicted trajectories of proximate vehicles near the host vehicle...”;
Where the training data collection system 201 includes training data labeled by human labelers that includes a short term goal including a predicted trajectory such as turning, merging, accelerating, etc., (wherein the hidden context represents a goal) of the driver of vehicles proximate to the autonomous vehicle (of a user represented by the traffic entity), wherein the short term goal is a predicted trajectory which inherently requires the predicted maneuver to occur within a threshold time interval because a predicted trajectory requires a predicted vehicle movement within a certain time frame).


Regarding claim 10, Huval, Zhang, and Dernoncourt teach the method of claim 1. Huval further discloses:
wherein navigating the autonomous vehicle comprises: 
capturing an image; 
(Huval, FIG. 1; ¶[0008];
¶[0016]: “The method S100 can be executed by a computer system (e.g., a remote server) in conjunction with an autonomous vehicle. The autonomous vehicle can include: a suite of sensors configured to collect information about the autonomous vehicle's environment...”;
¶[0018]: “The autonomous vehicle can also implement one or more local neural networks to process LIDAR feeds (i.e., sequences of LIDAR images), video feeds (or sequences of color photographic images), and/or other sensor data substantially in real-time in order to localize the autonomous vehicle to a known location and orientation in real space, to interpret (or “perceive”) its surroundings, and to then select and execute navigational actions...”;
Where the method for navigating the autonomous vehicle (wherein navigating the autonomous vehicle comprises:) includes obtaining images from sensors on the autonomous vehicle (capturing an image))

providing the image as input to the neural network; and 
(Huval, FIG. 1; ¶[0016];
¶[0018]: “...a controller integrated into the autonomous vehicle can: pass LIDAR and video feeds into a localization/perception neural network to detect and characterize static objects—such as lane markers, lane reflectors, curbs, road signs, telephone poles, and building facades—near the autonomous vehicle substantially in real-time... the neural network can also detect and characterize dynamic objects—such as other vehicles, pedestrians, and cyclists—in the LIDAR and video feeds...”;
Where the method for navigating the autonomous vehicle includes obtaining images from sensors on the autonomous vehicle and processing the image data through a neural network (providing the image as input to the neural network) to detect static and dynamic objects)

determining signals sent to the controls of the autonomous vehicle based on the output of the neural network.  
(Huval, FIG. 1; 
¶[0016]: “The autonomous vehicle can include... a controller. The controller can: determine the location of the autonomous vehicle in real space based on sensor data collected from the suite of sensors and the localization map; determine the context of a scene around the autonomous vehicle based on these sensor data; elect a future action (e.g., a navigational decision) based on the context of the scene around the autonomous vehicle, the real location of the autonomous vehicle, and the navigation map, such as further based on a deep learning and/or artificial intelligence model; and control actuators within the vehicle (e.g., accelerator, brake, and steering actuators) according to elected decisions”;
¶[0018]: “...a controller integrated into the autonomous vehicle can: pass LIDAR and video feeds into a localization/perception neural network to detect and characterize static objects—such as lane markers, lane reflectors, curbs, road signs, telephone poles, and building facades—near the autonomous vehicle substantially in real-time... the neural network can also detect and characterize dynamic objects—such as other vehicles, pedestrians, and cyclists—in the LIDAR and video feeds...”;
Where the method for navigating the autonomous vehicle includes obtaining images from sensors on the autonomous vehicle, processing the image data through a neural network to detect static and dynamic objects, electing a future action for the vehicle and controlling actuators within the vehicle to implement the navigational action (determining signals sent to the controls of the autonomous vehicle) based on the image processing performed by the neural network (based on the output of the neural network)).


Regarding claim 11, the claim recites a non-transitory computer readable storage medium having limitations similar to those of claim 1 and is therefore rejected on the same basis, as outlined above.
Regarding the additional limitations recited in claim 11, Huval further teaches:
A non-transitory computer readable storage medium storing instructions that when executed by a computer processor, cause the computer processor to perform steps comprising: 
(Huval, FIG. 1; ¶[0016];
¶[0121]: “...systems and methods of the embodiment can be embodied and/or implemented at least in part as a machine configured to receive a computer-readable medium storing computer-readable instructions. The instructions can be executed by computer-executable components integrated with apparatuses and networks of the type described above. The computer-readable medium can be stored on any suitable computer readable media such as RAMs, ROMs, flash memory, EEPROMs, optical devices (CD or DVD), hard drives, floppy drives, or any suitable device. The computer-executable component can be a processor...”;
Where instructions are stored on computer readable medium (A non-transitory computer readable storage medium storing instructions) that, when executed by a processor (that when executed by a computer processor), cause performance of the disclosed process (cause the computer processor to perform steps)).


Regarding claim 12, the claim recites a non-transitory computer readable storage medium having limitations similar to those of claim 2 and is therefore rejected on the same basis, as outlined above.


Regarding claim 13, the claim recites a non-transitory computer readable storage medium having limitations similar to those of claim 3 and is therefore rejected on the same basis, as outlined above. See claim 13 interpretation based on the rejection under 35 U.S.C. 112(b), above.


Regarding claim 14, Huval, Zhang, and Dernoncourt teach the non-transitory computer readable storage medium of claim 13. Huval further discloses:
wherein the user response is an ordinal value selected from a plurality of ordinal values and the statistical distribution is a histogram of a rate of occurrence of each ordinal value in user responses.  
(Huval, FIG. 1; FIG. 2B; FIG. 3; FIG. 4;
¶[0103]: “...the remote computer system can pass the video feed through the navigational neural network for attribution of an automated navigational label, such as representing one of various predefined navigational actions and states including: accelerating, coasting, actively braking, turning left, turning right, veering left, veering right, changing lanes, turning into a different lane, swerving, drifting out of a lane, wandering between lanes, stopped, reversing, clipping a curb, etc...”;
¶[0106]: “As in Blocks S140 and S142 described above, the remote computer system can then serve the new video feed to an annotation portal—executing on a local computer system—for manual labeling by a human annotator; and then receive a manual navigational label attributed to the new video feed by the human annotator”;
¶[0108]-¶[0109];
¶[0113]: “...as in Blocks S150 and S152 described above, the remote computer system can detect navigational label conflicts for a video feed, return this video feed to multiple human annotators for confirmation of these navigational labels or entire relabeling of the video feed with a new sequence of navigational actions, and then compile responses from these human annotators into a final sequence of navigational labels for the video feed”;
See claim 14 interpretation based on the rejection under 35 U.S.C. 112(b), above.
Where the method further includes the human annotator annotating a navigational action to the optical data including a speed context associated with the vehicle, such as stopped (wherein the user response is an ordinal value) selected from a plurality of speed contexts such as actively braking, coasting, cruising, accelerating (selected from a plurality of ordinal values), where the speed contexts represent ordinal values in an increasing amount of speed; and where, in the event of a conflict, the optical data is sent to multiple human annotators in order to confirm the labeling (and the statistical distribution is a histogram of a rate of occurrence of each ordinal value in user responses), where the additional human annotators confirming the labeling results in a dataset that includes a rate of occurrence of the labeled speed context, i.e.  a rate of occurrence of the ordinal value).


Regarding claim 15, Huval, Zhang, and Dernoncourt teach the non-transitory computer readable storage medium of claim 11. Huval further discloses:
wherein the machine learning based model is used for navigating an autonomous vehicle, wherein the instructions further cause the computer processor to perform steps comprising: 
(Huval, FIG. 1; ¶[0008]; ¶[0121];
¶[0016]: “The method S100 can be executed by a computer system (e.g., a remote server) in conjunction with an autonomous vehicle. The autonomous vehicle can include: a suite of sensors configured to collect information about the autonomous vehicle's environment... and a controller”;
¶[0018]: “The autonomous vehicle can also implement one or more local neural networks to process LIDAR feeds (i.e., sequences of LIDAR images), video feeds (or sequences of color photographic images), and/or other sensor data substantially in real-time in order to localize the autonomous vehicle to a known location and orientation in real space, to interpret (or “perceive”) its surroundings, and to then select and execute navigational actions...”;
See claim 15 interpretation based on the rejection under 35 U.S.C. 112(b), above.
Where the neural network is used to select and execute navigational actions for the autonomous vehicle (wherein the machine learning based model is used for navigating an autonomous vehicle), and where the autonomous vehicle implements the neural network using a controller (wherein the instructions further cause the computer processor to perform steps comprising))

capturing an image by the autonomous vehicle; 
(Huval, FIG. 1; ¶[0008];
¶[0016]: “The method S100 can be executed by a computer system (e.g., a remote server) in conjunction with an autonomous vehicle. The autonomous vehicle can include: a suite of sensors configured to collect information about the autonomous vehicle's environment... and a controller”;
¶[0018]: “The autonomous vehicle can also implement one or more local neural networks to process LIDAR feeds (i.e., sequences of LIDAR images), video feeds (or sequences of color photographic images), and/or other sensor data substantially in real-time in order to localize the autonomous vehicle to a known location and orientation in real space, to interpret (or “perceive”) its surroundings, and to then select and execute navigational actions...”;
Where navigating the autonomous vehicle includes obtaining images from sensors on the autonomous vehicle (capturing an image by the autonomous vehicle) and processing the image data through a neural network to detect static and dynamic objects)

providing the image as input to the machine learning based model; and 
(Huval, FIG. 1; ¶[0008];
¶[0016]: “The method S100 can be executed by a computer system (e.g., a remote server) in conjunction with an autonomous vehicle. The autonomous vehicle can include: a suite of sensors configured to collect information about the autonomous vehicle's environment... and a controller”;
¶[0018]: “The autonomous vehicle can also implement one or more local neural networks to process LIDAR feeds (i.e., sequences of LIDAR images), video feeds (or sequences of color photographic images), and/or other sensor data substantially in real-time in order to localize the autonomous vehicle to a known location and orientation in real space, to interpret (or “perceive”) its surroundings, and to then select and execute navigational actions...”;
See claim 15 interpretation based on the rejection under 35 U.S.C. 112(b), above.
Where navigating the autonomous vehicle includes obtaining images from sensors on the autonomous vehicle and processing the image data through a neural network (providing the image as input to the machine learning based model) to detect static and dynamic objects)

determining signals sent to controls of the autonomous vehicle based on the output of the machine learning based model.  
(Huval, FIG. 1; 
¶[0016]: “The autonomous vehicle can include... a controller. The controller can: determine the location of the autonomous vehicle in real space based on sensor data collected from the suite of sensors and the localization map; determine the context of a scene around the autonomous vehicle based on these sensor data; elect a future action (e.g., a navigational decision) based on the context of the scene around the autonomous vehicle, the real location of the autonomous vehicle, and the navigation map, such as further based on a deep learning and/or artificial intelligence model; and control actuators within the vehicle (e.g., accelerator, brake, and steering actuators) according to elected decisions”;
¶[0018]: “...a controller integrated into the autonomous vehicle can: pass LIDAR and video feeds into a localization/perception neural network to detect and characterize static objects—such as lane markers, lane reflectors, curbs, road signs, telephone poles, and building facades—near the autonomous vehicle substantially in real-time... the neural network can also detect and characterize dynamic objects—such as other vehicles, pedestrians, and cyclists—in the LIDAR and video feeds...”;
See claim 15 interpretation based on the rejection under 35 U.S.C. 112(b), above.
Where navigating the autonomous vehicle includes obtaining images from sensors on the autonomous vehicle, processing the image data through a neural network to detect static and dynamic objects, electing a future action for the vehicle and controlling actuators within the vehicle to implement the navigational action (determining signals sent to the controls of the autonomous vehicle) based on the image processing performed by the neural network (based on the output of the machine learning based model)).


Regarding claim 16, the claim recites a computer system having limitations similar to those of claim 1 and is therefore rejected on the same basis, as outlined above.
Regarding the additional limitations recited in claim 16, Huval further teaches:
A computer system comprising: 
(Huval, FIG. 1; 
¶[0016]: “The autonomous vehicle can include... a controller...”;
Where the autonomous vehicle includes a controller, i.e., a computer (A computer system))

a computer processor; and 
(Huval, FIG. 1; ¶[0016];
¶[0121]: “...systems and methods of the embodiment can be embodied and/or implemented at least in part as a machine configured to receive a computer-readable medium storing computer-readable instructions. The instructions can be executed by computer-executable components integrated by computer-executable components integrated with apparatuses and networks of the type described above. The computer-readable medium can be stored on any suitable computer readable media such as RAMs, ROMs, flash memory, EEPROMs, optical devices (CD or DVD), hard drives, floppy drives, or any suitable device. The computer-executable component can be a processor...”;
Where instructions are stored on computer readable medium to be executed by a processor (a computer processor) in order to cause performance of the disclosed process)

a non-transitory computer readable storage medium storing instructions that when executed by the computer processor, cause the computer processor to perform steps comprising: 
(Huval, FIG. 1; ¶[0016];
¶[0121]: “...systems and methods of the embodiment can be embodied and/or implemented at least in part as a machine configured to receive a computer-readable medium storing computer-readable instructions. The instructions can be executed by computer-executable components integrated by computer-executable components integrated with apparatuses and networks of the type described above. The computer-readable medium can be stored on any suitable computer readable media such as RAMs, ROMs, flash memory, EEPROMs, optical devices (CD or DVD), hard drives, floppy drives, or any suitable device. The computer-executable component can be a processor...”;
Where instructions are stored on computer readable medium (A non-transitory computer readable storage medium storing instructions) that, when executed by the processor (that when executed by a computer processor), cause performance of the disclosed process (cause the computer processor to perform steps)).


Regarding claim 17, the claim recites a computer system having limitations similar to those of claim 2 and is therefore rejected on the same basis, as outlined above.


Regarding claim 18, the claim recites a computer system having limitations similar to those of claim 3 and is therefore rejected on the same basis, as outlined above. See claim 18 interpretation based on the rejection under 35 U.S.C. 112(b), above.


Regarding claim 19, Huval, Zhang, and Dernoncourt teach the computer system of claim 18. Huval further discloses:
wherein the user response is an ordinal value selected from a plurality of ordinal values and the statistical distribution is a histogram of a rate of occurrence of each ordinal value in user responses.  
(Huval, FIG. 1; FIG. 2B; FIG. 3; FIG. 4;
¶[0103]: “...the remote computer system can pass the video feed through the navigational neural network for attribution of an automated navigational label, such as representing one of various predefined navigational actions and states including: accelerating, coasting, actively braking, turning left, turning right, veering left, veering right, changing lanes, turning into a different lane, swerving, drifting out of a lane, wandering between lanes, stopped, reversing, clipping a curb, etc...”;
¶[0106]: “As in Blocks S140 and S142 described above, the remote computer system can then serve the new video feed to an annotation portal—executing on a local computer system—for manual labeling by a human annotator; and then receive a manual navigational label attributed to the new video feed by the human annotator”;
¶[0108]-¶[0109];
¶[0113]: “...as in Blocks S150 and S152 described above, the remote computer system can detect navigational label conflicts for a video feed, return this video feed to multiple human annotators for confirmation of these navigational labels or entire relabeling of the video feed with a new sequence of navigational actions, and then compile responses from these human annotators into a final sequence of navigational labels for the video feed”;
See claim 19 interpretation based on the rejection under 35 U.S.C. 112(b), above.
Where the method further includes the human annotator attributing to the optical data a navigational label that includes a speed context associated with the vehicle, such as stopped (wherein the user response is an ordinal value), selected from a plurality of speed contexts such as actively braking, coasting, cruising, and accelerating (selected from a plurality of ordinal values), where the speed contexts represent ordinal values in order of increasing speed; and where, in the event of a conflict, the optical data is sent to multiple human annotators in order to confirm the labeling (and the statistical distribution is a histogram of a rate of occurrence of each ordinal value in user responses), where the confirmations by the additional human annotators result in a dataset that includes a rate of occurrence of the labeled speed context, i.e., a rate of occurrence of the ordinal value).
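
For illustration only, the rate-of-occurrence reading applied to claim 19 can be sketched in Python. The sketch is not taken from Huval: the function name `response_histogram`, the ordering of the speed contexts in `SPEED_CONTEXTS`, and the sample annotator responses are all hypothetical and serve only to show how a set of confirming labels from multiple annotators yields a rate of occurrence for each ordinal value.

```python
from collections import Counter

# Hypothetical ordinal scale of speed contexts, ordered by increasing speed.
# The ordering is an assumption made for this sketch.
SPEED_CONTEXTS = ["stopped", "actively braking", "coasting", "cruising", "accelerating"]


def response_histogram(responses, scale=SPEED_CONTEXTS):
    """Return the rate of occurrence of each ordinal value in the responses."""
    unknown = set(responses) - set(scale)
    if unknown:
        raise ValueError(f"labels outside the ordinal scale: {sorted(unknown)}")
    total = len(responses)
    if total == 0:
        # No responses: every ordinal value has a rate of zero.
        return {label: 0.0 for label in scale}
    counts = Counter(responses)
    # Counter returns 0 for labels that no annotator selected.
    return {label: counts[label] / total for label in scale}


# Hypothetical labels returned by four annotators for one video feed.
annotator_labels = ["stopped", "stopped", "coasting", "stopped"]
histogram = response_histogram(annotator_labels)
```

Under these assumed inputs, three of the four responses select "stopped", so the histogram assigns that ordinal value a rate of 0.75 and "coasting" a rate of 0.25, with the remaining values at zero.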


Regarding claim 20, the claim recites a computer system having limitations similar to those of claim 15 and is therefore rejected on the same basis, as outlined above. See claim 20 interpretation based on the rejection under 35 U.S.C. 112(b), above.


Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Huval, Zhang, and Dernoncourt as applied to claim 1 above, and further in view of Shalev-Shwartz et al. (US 2019/0291728 A1), henceforth known as Shalev-Shwartz.

Regarding claim 8, Huval, Zhang, and Dernoncourt teach the method of claim 1. The combination of Huval, Zhang, and Dernoncourt fails to explicitly teach the limitations of claim 8 as a whole. 
However, in the same field of endeavor, Shalev-Shwartz teaches:
wherein the hidden context represents a degree of awareness of the autonomous vehicle by a user represented by the traffic entity.  
(Shalev-Shwartz, FIG. 12; FIG. 13; FIG. 14; ¶[0007]; ¶[0093]-¶[0100]; ¶[0164];
¶[0247]: “...the machine learning system may be trained using a desired set of constraints as training guidelines and, therefore, the trained system may select an action in response to a sensed navigational state that accounts for and adheres to the limitations of applicable navigational constraints...”;
¶[0301]: “...the at least one navigational constraint relaxation factor may include a determination (based on image analysis) that the eyes of a pedestrian are looking in a direction of the host vehicle. In such cases, it may more safely be assumed that the pedestrian is aware of the host vehicle. As a result, a confidence level may be higher that the pedestrian will not engage in unexpected actions that cause the pedestrian to move into a path of the host vehicle...”;
See claim 8 interpretation based on the rejection under 35 U.S.C. 112(b), above.
Where the autonomous vehicle includes a neural network trained using machine learning in order to identify whether a pedestrian is aware of the autonomous vehicle (wherein the hidden context represents a degree of awareness of the autonomous vehicle by a user represented by the traffic entity)).

It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine the method of Huval, Zhang, and Dernoncourt with the features taught by Shalev-Shwartz because “...In such cases, it may more safely be assumed that the pedestrian is aware of the host vehicle. As a result, a confidence level may be higher that the pedestrian will not engage in unexpected actions that cause the pedestrian to move into a path of the host vehicle...” (Shalev-Shwartz, ¶[0301]). That is, if it can be determined a pedestrian is aware of the autonomous vehicle, the autonomous vehicle can plan a future action with higher confidence and safety. 


Allowable Subject Matter
Claim 5 is objected to as being dependent upon a rejected base claim, but would be allowable if amended to overcome the rejections under 35 U.S.C. 112(b) and rewritten in independent form including all of the limitations of the base claim and any intervening claims.
As allowable subject matter has been indicated, applicant's reply must either comply with all formal requirements or specifically traverse each requirement not complied with. See 37 CFR 1.111(b) and MPEP § 707.07(a).


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Martinson et al. (US 2018/0053102 A1) discloses a method that includes aggregating local sensor data from vehicle system sensors, detecting a driver action using the local sensor data, and extracting features related to predicting driver action from the local sensor data during the operation of the vehicle. The method may include adapting a stock machine learning-based driver action prediction model to a customized machine learning-based driver action prediction model using one or more of the extracted features and the detected driver action, the stock machine learning-based driver action prediction model initially generated using a generic model configured to be applicable to a generalized driving populace.
Ellenbogen et al. (US 2017/0099200 A1) discloses a system in which data is received characterizing a request for agent computation of sensor data. The request includes a required confidence and required latency for completion of the agent computation. Agents to query are determined based on the required confidence. Data is transmitted to query the determined agents to provide analysis of the sensor data and comprises an agent response time and averaging/weighting agent responses. 
Dorner (US 11,375,256 B1) discloses a machine learning system that builds and uses computer models for identifying or predicting intensity of emotional reactions elicited by a particular video. Such computer models may also determine which particular emotional reaction corresponds to certain times during the video, and whether these reactions are positive or negative for a particular user. The computer models can also predict emotional reactions likely to be elicited by new videos based on learned correlations between video features and elicited emotional reactions.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Tawri M Matsushige whose telephone number is (571)272-3715. The examiner can normally be reached M-Th (0800-1400).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, James Lee, can be reached on (571)270-5965. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/T.M.M./Examiner, Art Unit 3668
/JAMES J LEE/Supervisory Patent Examiner, Art Unit 3668