DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 2/22/2021 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 1, and similarly claims 9 and 17, recite “executing, by the processor a machine learned model that is trained using self-supervised learning to identify change detections from inputs to the model, whether the differences satisfy change detection criteria for updating the sensor-based reference map”. The limitation “executing… whether the differences satisfy change detection criteria” is indefinite because “executing . . . whether” appears to be an incomplete thought: it is unclear how executing the model relates to whether the differences satisfy the change detection criteria.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-4, 8-12, and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Arditi (US2019147331A1)(hereinafter “Arditi”) in view of Zhang (US20190080203A1)(hereinafter “Zhang”).
	With respect to claim 1, and similarly 9 and 17,
	Arditi discloses:
receiving, from a sensor device of a vehicle, (Arditi ¶45 “a computing system of an autonomous vehicle driving on a road may obtain sensor data at a particular location (e.g., corresponding to an x latitude and y longitude).”)
an indication that a registered object is detected in proximity to the vehicle; (Arditi Fig. 7, 740 “Compare identified objects with HD map at the location”; Arditi Fig. 7, 745 “Object(s) exist in HD map at the location”; Arditi ¶46 “the computing system of the autonomous vehicle may process the sensor data to identify any objects of interest. In particular embodiments, the autonomous vehicle may use an object classifier to process the sensor data to detect and identify objects. Using the scenario depicted in FIG. 6 as an example, the object classifier, based on the sensor data ( e.g., camera or LiDAR data), may detect the existence of the objects (i.e., the box 620 and the pothole 630) in the road, as well as other objects such as buildings, road dividers, sidewalks, etc.”; Arditi ¶47 “the computing system may access the HD map stored on the autonomous vehicle for map data associated with the particular location ( e.g., x, y coordinates). Then at step 740, the computing system may compare the map data associated with the location ( e.g., x, y coordinates) with the object detected in step 720 to determine whether the detected objects exist in the map data… At step 745, if the comparison results in a determination that the detected object(s) exists or is known in the HD map ( e.g., the confidence score in the object existing in the map is higher than a threshold)”; Arditi ¶45 “autonomous vehicle driving on a road”)
determining, by a processor of the vehicle, based on the indication, (Arditi ¶47 “the computing system may compare the map data associated with the location ( e.g., x, y coordinates) with the object detected in step 720 to determine whether the detected objects exist in the map data… In particular embodiments, the system may generate a confidence score representing the likelihood of the detected object being accounted for in the map data.”; Arditi ¶80 “computer system 1000 includes a processor 1002”; Arditi ¶46 “the computing system of the autonomous vehicle may process the sensor data to identify any objects of interest.”)
differences between features of the registered object and features of a sensor-based reference map, (Arditi ¶47 “The confidence score may be based on, for example, a similarity comparison of the measured size, dimensions, classification, and/or location of the detected object with known objects in the map data.”)
the features of the sensor-based reference map comprising a map location that corresponds to a coordinate location of the registered object; (Arditi ¶47 “the computing system may access the HD map stored on the autonomous vehicle for map data associated with the particular location ( e.g., x, y coordinates). Then at step 740, the computing system may compare the map data associated with the location ( e.g., x, y coordinates) with the object”)
executing, by the processor, a machine-learned model that is trained using self-supervised learning to identify change detections from inputs to the model, (Arditi ¶48 “The data transmitted to the server may be processed by the server… In particular embodiments, the server may transform the received data into generated map data, using the trained machine-learning model (e.g., CNN(s) and DCNN) described elsewhere herein, and compare the generated map data with the corresponding map data in the current HD map. If no mismatch is detected, the server may decide to not update the HD map. On the other hand, if the server determines that the current HD map does not include the detected object, the server may update the server-copy of the HD map as well as the local copies of the HD map on autonomous vehicles.”)
whether the differences satisfy change detection criteria for updating the sensor-based reference map; (Arditi ¶47 “The confidence score may be based on, for example, a similarity comparison of the measured size, dimensions, classification, and/or location of the detected object with known objects in the map data.”)
responsive to determining that the differences satisfy the change detection criteria, causing, by the processor, (Arditi ¶44 “In particular embodiments, whether the discovery of a new object would trigger the HD map to be updated may depend on the classification of the new object (e.g., as previously described, the machine-learning model may output map data with labeled objects). For example, the HD map may not be updated if the object is classified as being a person, an animal, a car, or any other object that is likely to move on its own. As another example, the HD map may be updated if the object is classified as a pothole, a broken-down car ( e.g., a stationary car with emergency lights on), a fallen tree branch, or any other inanimate object that is likely to remain stationary. In particular embodiments, inanimate objects may be detected based on a determination that data gathered at a particular location by different vehicles at different times within a time frame ( e.g., 1 minute, 5 minutes, 30 minutes, etc.) consistently include the same new object, which may indicate that the object is inanimate and may continue to remain in the street. As such, the object may be considered as a hazard, which may warrant the HD map to be updated.”)
the sensor-based reference map to be updated to reduce the differences; (Arditi ¶44 “As another example, the HD map may be updated if the object is classified as a pothole, a broken-down car (e.g., a stationary car with emergency lights on), a fallen tree branch, or any other inanimate object that is likely to remain stationary. In particular embodiments, inanimate objects may be detected based on a determination that data gathered at a particular location by different vehicles at different times within a time frame (e.g., 1 minute, 5 minutes, 30 minutes, etc.) consistently include the same new object, which may indicate that the object is inanimate and may continue to remain in the street. As such, the object may be considered as a hazard, which may warrant the HD map to be updated.”)
and causing, by the processor, the vehicle to operate in an autonomous mode that relies on the sensor-based reference map for navigating the vehicle (Arditi ¶48 “In particular embodiments, the server may prioritize updating the HD maps of vehicles that would most likely be impacted by the HD map update. For example, the server may prioritize autonomous vehicles that are in the region (e.g., within a threshold distance) of where the new object is detected or have trajectories that would result in the vehicles being in that region in the near future.”; Arditi ¶60 “In particular embodiments, the autonomous vehicle system 850 may include a driving/navigation module 860 for controlling the autonomous vehicle's driving and navigation functions. The driving/navigation module 860 may be configured to access a local HD map stored in an HD map data sore 876. In particular embodiments, the local HD map may be generated or updated using embodiments described herein, or it may be from a third-party map provider.”; Arditi ¶61 “Once updated, the HD map data 876 may continue to be used by the driving/navigation module 860 to assist with the vehicle's autonomous driving operations.”)
Arditi fails to explicitly disclose: 
Operating the vehicle in an autonomous mode while under the condition of being in proximity to the coordinate location of the registered object
However, Zhang, from the same field of endeavor, discloses:
Operating the vehicle in an autonomous mode while under the condition of being in proximity to the coordinate location of the registered object (Zhang ¶44 “During the driving of the autonomous vehicle, a lidar installed on the autonomous vehicle may collect information of the outside environment in real time, generate a laser point cloud, and transmit the laser point cloud to the electronic device. The electronic device may analyze and process the received laser point cloud to identify and track an obstacle in the environment around the vehicle, and predict the travel route of the obstacle for further path planning and driving control of the vehicle.”)
Accordingly, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement the autonomous driving mode while in proximity to a registered object, as taught by Zhang, in the system of Arditi, in order to more easily distinguish between different types of objects. (Zhang ¶45 “Therefore, firstly, the electronic device may test each frame of the received laser point cloud to distinguish between which laser point data in the laser point cloud are used for describing an obstacle, which laser point data are used for describing a non-obstacle (e.g., a driving area), and which laser point data are used for describing a given obstacle.”) Therefore, the autonomous vehicle can navigate while avoiding these obstacles, which improves safety. (Zhang ¶43 “Here, the obstacle may include a static obstacle and a moving obstacle… An autonomous vehicle needs to avoid static obstacles and moving obstacles during driving.”)

	With respect to claim 2, and similarly 10 and 18,
	Arditi in view of Zhang discloses:
wherein the sensor device comprises a radar device (Arditi ¶40 “the computing system may generate map data by processing the accessed sensor data (e.g., LiDAR data) and associated metadata using the training machine-learning model. As indicated above, the machine learning model may also process additional types of data (e.g., camera and/or radar) gathered by the data-gathering vehicle at the same geographic location.”)
and the sensor-based reference map comprises a reference map at least partially derived from radar data. (Arditi ¶40 “the computing system may generate map data by processing the accessed sensor data”; Arditi ¶44 “HD map may be updated periodically in real-time using gathered sensor data”)

With respect to claim 3, and similarly 11 and 19,
Arditi in view of Zhang discloses:
wherein the sensor device comprises a lidar device (Arditi ¶40 “the computing system may generate map data by processing the accessed sensor data (e.g., LiDAR data)”)
and the sensor-based reference map comprises a reference map at least partially derived from point cloud data. (Arditi ¶14 “the detailed road-level information gathered by the sensors, together with the corresponding road-layout information, may be used to generate the three-dimensional model of a HD map”)

	With respect to claim 4, and similarly 12 and 20,
Arditi in view of Zhang discloses:
causing, by the processor, the machine-learned model to train using self-supervised learning by generating multiple change detection criteria for determining whether to update the sensor-based reference map. (Arditi ¶47 “the computing system may compare the map data associated with the location (e.g., x, y coordinates) with the object detected… The confidence score may be based on, for example, a similarity comparison of the measured size, dimensions, classification, and/or location of the detected object with known objects in the map data.” These compared attributes constitute multiple change detection criteria.)

With respect to claim 8 and similarly 16,
Arditi in view of Zhang discloses:
wherein the map location comprises a three-dimensional region of space, (Arditi ¶14 “the detailed road-level information gathered by the sensors, together with the corresponding road-layout information, may be used to generate the three-dimensional model of a HD map”; Arditi ¶74 “LiDAR is an effective sensor for measuring distances to targets, and as such may be used to generate a three-dimensional (3D) model of the external environment of the autonomous vehicle 940”)
and the coordinate location of the registered object comprises a three-dimensional coordinate location in space. (Arditi ¶47 “the computing system may access the HD map stored on the autonomous vehicle for map data associated with the particular location (e.g., x, y coordinates). Then at step 740, the computing system may compare the map data associated with the location (e.g., x, y coordinates) with the object detected”)

Claims 5-6 and 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Arditi in view of Zhang, further in view of Gao (WO2016187472A1)(hereinafter “Gao”).
With respect to claim 5, and similarly 13,
Arditi in view of Zhang discloses:
wherein generating the multiple change detection criteria for determining whether to update the sensor-based reference map comprises self-supervised learning (Arditi ¶20 “The machine-learning model may be trained using any suitable training techniques, including using supervised machine learning to learn from labeled training data… A training sample may further be associated with a known, target output, which in particular embodiments may be existing HD map data 480 at that particular location (x, y), which may include labeled segments or bounding boxes that indicate particular types of objects (e.g., curbs, walls, dividers, buildings, structures, etc.) in the HD map data. As an example, a labeled segment or bounding box may indicate that a known lane divider, for example, is within a boundary of a particular three-dimensional region in the HD map. In the context of training of the machine-learning model, the HD map data 480, which serves as the target or desired output for the associated training sample, may be referred to as the label for that training sample. Through training, the machine-learning model is iteratively updated to become better at recognizing relationships between feature patterns observed in the input training data and the corresponding target outputs ( or labels). Once trained, the machine-learning model may be used to process a given input data, recognize the feature patterns of interest in the input data, and generate an output based on relationships between feature patterns and desired output learned from the training data.”)
Arditi in view of Zhang fails to disclose:
Wherein the self-supervised learning is based on training data that includes pretext tasks that are specifically in a natural language. 
However, Gao, from the same field of endeavor, discloses:
Wherein the self-supervised learning is based on training data that includes pretext tasks that are specifically in a natural language. (Gao Fig. 5, 515 “Receive a question input in a natural language sentence related to the image input”; Gao Fig. 5, 520 “Encode the natural language sentence into a dense vector representation”; Gao ¶71 “the first LSTM component 210 of the model (i.e., the LSTM to extract the question embedding) was replaced with the average embedding of the words in the question using word2vec”)
Accordingly, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement the pretext tasks in a natural language, as taught by Gao, in the system of Arditi in view of Zhang, so that this combination “greatly enhances the interaction between the computer and the user as compared to prior approaches.” (Gao ¶24). This enhancement of interaction increases the accuracy of the machine learning algorithm (Gao ¶64 “each generated sentence with a score (the larger the better) is rated as described in Subsection 2 below, which gives a more fine-grained evaluation”).

With respect to claim 6,
Arditi in view of Zhang discloses:
generating multiple change detection criteria for determining whether to update the sensor-based reference map. (Arditi ¶47 “the computing system may compare the map data associated with the location (e.g., x, y coordinates) with the object detected… The confidence score may be based on, for example, a similarity comparison of the measured size, dimensions, classification, and/or location of the detected object with known objects in the map data.” These compared attributes constitute multiple change detection criteria. Arditi ¶27 “the CNN 450 may also be configured to receive object classification information 445 generated based on any of the sensor data, since object classification information may be used in an HD map to identify objects of interest (e.g., curbs, walls, dividers, walls, etc.)”)
Arditi in view of Zhang fails to disclose:
wherein generating the multiple change detection criteria for determining whether to update the sensor-based reference map comprises self-supervised learning based on training data that further includes sensor-based questions and answers.
However, Gao, from the same field of endeavor, discloses:
wherein generating the multiple change detection criteria for determining whether to update the sensor-based reference map comprises self-supervised learning based on training data that further includes sensor-based questions and answers. (Gao ¶39 “the mQA model receives an image input in step 505 and extracts a visual representation out of the image input in step 510. The mQA model also receives a question input in a natural language sentence related to the image input in step 515 and encodes the natural language sentence into a dense vector representation in step 520. It shall be noted that the mQA model may receive the image input and question input in different order or concurrently instead of receiving the image input first. In step 525, the mQA model extracts a representation of a current word in the answer and its linguistic context. In step 530, a next word in the answer is generated using a fusion comprising the dense vector representation, the visual representation, and the representation of the current word.”; Gao ¶71 “For the first variant (i.e., "mQA-avg-question"), the first LSTM component 210 of the model (i.e., the LSTM to extract the question embedding) was replaced with the average embedding of the words in the question using word2vec. This was used to show the effectiveness of the LSTM as a question embedding learner and extractor. For the second variant (i.e. "mQA-same-LSTMs"), two shared-weights LSTMs were used to model question and answer. This was used to show the effectiveness of the decoupling strategy of the weights of the LSTM(Q) and the LSTM(A) in embodiments of the model.”)
Accordingly, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement the sensor-based questions and answers of Gao, in the system of Arditi in view of Zhang, in order to improve the quality, speed, and ease-of-use of the machine learned model. (Gao ¶36 “using separate different LSTMs enables a more "authentic" answer which provides a more natural and engaging interface for a user to use the image-question-answer model. On the other hand, by sharing weight matrix between these two components, the overall answer generation process may be streamlined and simplified, which allows faster computation time without scarifying the "authentic" quality of the answer generated.”)

With respect to claim 14,
Arditi in view of Zhang, further in view of Gao discloses:
wherein generating the multiple change detection criteria for determining whether to update the sensor-based reference map comprises self-supervised learning based on training data that further includes sensor-based questions and answers. (Gao ¶39 “the mQA model receives an image input in step 505 and extracts a visual representation out of the image input in step 510. The mQA model also receives a question input in a natural language sentence related to the image input in step 515 and encodes the natural language sentence into a dense vector representation in step 520. It shall be noted that the mQA model may receive the image input and question input in different order or concurrently instead of receiving the image input first. In step 525, the mQA model extracts a representation of a current word in the answer and its linguistic context. In step 530, a next word in the answer is generated using a fusion comprising the dense vector representation, the visual representation, and the representation of the current word.”; Gao ¶71 “For the first variant (i.e., "mQA-avg-question"), the first LSTM component 210 of the model (i.e., the LSTM to extract the question embedding) was replaced with the average embedding of the words in the question using word2vec. This was used to show the effectiveness of the LSTM as a question embedding learner and extractor. For the second variant (i.e. "mQA-same-LSTMs"), two shared-weights LSTMs were used to model question and answer. This was used to show the effectiveness of the decoupling strategy of the weights of the LSTM(Q) and the LSTM(A) in embodiments of the model.”; Arditi ¶47 “the computing system may compare the map data associated with the location (e.g., x, y coordinates) with the object detected… The confidence score may be based on, for example, a similarity comparison of the measured size, dimensions, classification, and/or location of the detected object with known objects in the map data.” These compared attributes constitute multiple change detection criteria. Arditi ¶27 “the CNN 450 may also be configured to receive object classification information 445 generated based on any of the sensor data, since object classification information may be used in an HD map to identify objects of interest (e.g., curbs, walls, dividers, walls, etc.)”)

Claims 7 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Arditi in view of Zhang, further in view of Gao, and further in view of Kwangyang (US20200043478A1)(hereinafter “Kwangyang”).
With respect to claim 7, 
Arditi in view of Zhang, further in view of Gao, discloses:
update the sensor-based reference map using self-supervised learning based on training data that further includes sensor-based questions and answers. (Gao ¶39 “the mQA model receives an image input in step 505 and extracts a visual representation out of the image input in step 510. The mQA model also receives a question input in a natural language sentence related to the image input in step 515 and encodes the natural language sentence into a dense vector representation in step 520. It shall be noted that the mQA model may receive the image input and question input in different order or concurrently instead of receiving the image input first. In step 525, the mQA model extracts a representation of a current word in the answer and its linguistic context. In step 530, a next word in the answer is generated using a fusion comprising the dense vector representation, the visual representation, and the representation of the current word.”; Gao ¶71 “For the first variant (i.e., "mQA-avg-question"), the first LSTM component 210 of the model (i.e., the LSTM to extract the question embedding) was replaced with the average embedding of the words in the question using word2vec. This was used to show the effectiveness of the LSTM as a question embedding learner and extractor. For the second variant (i.e. "mQA-same-LSTMs"), two shared-weights LSTMs were used to model question and answer. This was used to show the effectiveness of the decoupling strategy of the weights of the LSTM(Q) and the LSTM(A) in embodiments of the model.”)
Arditi in view of Zhang, further in view of Gao, fails to disclose:
wherein the sensor-based questions and answers include questions and answers related to point cloud data indicative of three-dimensional features of registered objects located at various map locations in an environment. 
However, Kwangyang, from the same field of endeavor, discloses:
wherein the sensor-based questions and answers include questions and answers related to point cloud data indicative of three-dimensional features of registered objects located at various map locations in an environment. (Kwangyang ¶120 “The XR device 100c may analyzes three-dimensional point cloud data or image data acquired from various sensors or the external devices, generate position data and attribute data for the three-dimensional points, acquire information about the surrounding space or the real object, and render to output the XR object to be output.”; Kwangyang ¶121 “The XR device 100c may perform the above described operations by using the learning model composed of at least one artificial neural network. For example, the XR device 100c may recognize the real object from the three-dimensional point cloud data or the image data by using the learning model”; Kwangyang ¶40 “The supervised learning may refer to a method of learning an artificial neural network in a state in which a label for learning data is given, and the label may mean the correct answer (or result value) that the artificial neural network must infer when the learning data is input to the artificial neural network.”; Kwangyang ¶185 “the speech-act step refers to a step of determining the intention of a sentence such as whether the user asks a question, makes a request, or expresses simple emotion.”; Kwangyang ¶186 “The dialog processing step refers to a step of determining whether to answer the user's utterance, respond to the user's utterance or question about more information”; Kwangyang ¶188 “For example, when the artificial intelligence apparatus 100 supports the speech-to-text conversion function, the artificial intelligence apparatus 100 may convert the speech data into the text data and transmit the converted text data to the NLP server 20.”; Arditi ¶27 “the CNN 450 may also be configured to receive object classification information 445 generated based on any of the sensor data, since object classification information may be used in an HD map to identify objects of interest (e.g., curbs, walls, dividers, walls, etc.)”)
Accordingly, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement sensor-based questions and answers that include questions and answers related to point cloud data indicative of three-dimensional features of registered objects located at various map locations in an environment, as taught by Kwangyang, in the system of Arditi in view of Zhang, further in view of Gao, in order to improve the effectiveness of a natural language machine learning model (Kwangyang ¶249 “it is possible to improve a natural language understanding ability for a speech command, by specifying an object referred to by anaphora included in the speech command.”).

With respect to claim 15,
Arditi in view of Zhang, further in view of Gao, discloses:
update the sensor-based reference map using self-supervised learning based on training data that further includes sensor-based questions and answers. (Gao ¶39 “the mQA model receives an image input in step 505 and extracts a visual representation out of the image input in step 510. The mQA model also receives a question input in a natural language sentence related to the image input in step 515 and encodes the natural language sentence into a dense vector representation in step 520. It shall be noted that the mQA model may receive the image input and question input in different order or concurrently instead of receiving the image input first. In step 525, the mQA model extracts a representation of a current word in the answer and its linguistic context. In step 530, a next word in the answer is generated using a fusion comprising the dense vector representation, the visual representation, and the representation of the current word.”; Gao ¶71 “For the first variant (i.e., "mQA-avg-question"), the first LSTM component 210 of the model (i.e., the LSTM to extract the question embedding) was replaced with the average embedding of the words in the question using word2vec. This was used to show the effectiveness of the LSTM as a question embedding learner and extractor. For the second variant (i.e. "mQA-same-LSTMs"), two shared-weights LSTMs were used to model question and answer. This was used to show the effectiveness of the decoupling strategy of the weights of the LSTM(Q) and the LSTM(A) in embodiments of the model.”)
Arditi in view of Zhang, further in view of Gao, fails to disclose:
wherein the sensor-based questions and answers include questions and answers related to point cloud data indicative of three-dimensional features of registered objects located at various map locations in an environment. 
However, Kwangyang, from the same field of endeavor, discloses:
wherein the sensor-based questions and answers include questions and answers related to point cloud data indicative of three-dimensional features of registered objects located at various map locations in an environment. (Kwangyang ¶120 “The XR device 100c may analyzes three-dimensional point cloud data or image data acquired from various sensors or the external devices, generate position data and attribute data for the three-dimensional points, acquire information about the surrounding space or the real object, and render to output the XR object to be output.”; Kwangyang ¶121 “The XR device 100c may perform the above described operations by using the learning model composed of at least one artificial neural network. For example, the XR device 100c may recognize the real object from the three-dimensional point cloud data or the image data by using the learning model”; Kwangyang ¶40 “The supervised learning may refer to a method of learning an artificial neural network in a state in which a label for learning data is given, and the label may mean the correct answer (or result value) that the artificial neural network must infer when the learning data is input to the artificial neural network.”; Kwangyang ¶185 “the speech-act step refers to a step of determining the intention of a sentence such as whether the user asks a question, makes a request, or expresses simple emotion.”; Kwangyang ¶186 “The dialog processing step refers to a step of determining whether to answer the user's utterance, respond to the user's utterance or question about more information”; Kwangyang ¶188 “For example, when the artificial intelligence apparatus 100 supports the speech-to-text conversion function, the artificial intelligence apparatus 100 may convert the speech data into the text data and transmit the converted text data to the NLP server 20.”; Arditi ¶27 “the CNN 450 may also be configured to receive object classification information 445 generated based on any of the sensor data, since object classification information may be used in an HD map to identify objects of interest (e.g., curbs, walls, dividers, walls, etc.)”)
Accordingly, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement sensor-based questions and answers that include questions and answers related to point cloud data indicative of three-dimensional features of registered objects located at various map locations in an environment, as taught by Kwangyang, in the system of Arditi in view of Zhao, further in view of Gao, in order to improve the effectiveness of a natural language machine learning model (Kwangyang ¶249 “it is possible to improve a natural language understanding ability for a speech command, by specifying an object referred to by anaphora included in the speech command.”).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Carville A. Hollingsworth IV whose telephone number is (571) 272-9812. The examiner can normally be reached Mon-Fri, 7:30am-5pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Faris Almatrahi, can be reached on 313-446-4821. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/CARVILLE ALBERT HOLLINGSWORTH IV/ Examiner, Art Unit 3667
/KENNETH J MALKOWSKI/ Primary Examiner, Art Unit 3667