DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Status of Claims
The After-Final amendment filed 2/24/22 has been reviewed and is entered. Claims 1-18 are pending and have been examined below.

Response to Arguments
Applicant's amendments with respect to the claim objections have been considered and are persuasive. The objections are withdrawn.

Applicant's arguments with respect to 35 USC 103, specifically Applicant's assertion that Karras fails to qualify as prior art, have been considered and are persuasive. The claims are rejected under 35 USC 103, as detailed below, in view of newly found prior art.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 USC 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5, 7, 11, 13 and 17 are rejected under 35 USC 103 as being unpatentable over US20210261146 ("Inoue") in view of US20200272162 ("Hasselgren").

Claim 1
Inoue discloses a method for operating a self-driving vehicle (abstract, 0001, 0039 automatic operation for automatically controlling acceleration and deceleration control and/or steering control), comprising: 
receiving one or more images having a plurality of objects (0023 object recognition unit 16 executes a recognition process of a specific object based on image data imaged by the vehicle exterior imaging cameras 13, a distance to a specific object detected by the onboard sensors 14, and map data stored in the map database 15. That is, the object recognition unit 16 uses known techniques such as deep neural network DNNs that utilize deep learning methods to recognize a variety of objects, such as vehicles, bikes, bicycles, pedestrians, signal machines, intersections, crossroads, street trees); 
receiving a notification from an occupant of the self-driving vehicle (0019 voice recognition unit 12 performs a voice recognition process using a known method such as a statistical method from voice data inputted to the in-vehicle sound collection microphone, 0020 The word indicating the motion of the vehicle refer to a word relating to driving control, particularly steering operation, such as "turn left", "stop", "refuel/charge", "move to the right lane", "get on the expressway", "overtake" and "track", and this word indicating the motion of the vehicle is extracted from voice data. The modifier of an object is a word indicating a color, a shape, a size, specification, or the like of a specific object, and can be exemplified by "a red car", "a round building", "a large intersection", "a taxi", "a one-box car"); 
generating an attention map highlighting at least some portions of the plurality of objects (0044 In image data shown in FIG. 6, image data of the vehicles V1 and V2 may be added, or image data of the corner convenience store C1 may be added. FIG. 8 is a diagram illustrating further another exemplary image data created by the object feature estimation unit 18 and displayed on the display. FIG. 8 is image data captured by the vehicle exterior imaging camera 13 to be displayed as it is for the utterance of the user "turn left in front of the taxi" and a solid line surrounding the taxi V1 which is an estimated object, the arrow display R1 representing the left turn, and dotted lines surrounding the other recognized vehicles V2, V3, V4 are overlapped with this image data., Fig. 6, Fig. 7, Fig. 8); and 
providing at least one of a steering control or a velocity control to operate the self-driving vehicle based on the attention map and the notification (0046 when the estimated object displayed on the display is correct, the user presses the "Yes" touch button or touches the displayed estimated object itself. In step S39, if the estimated object is the object intended by the user, the process proceeds to step S40, and the route change identification unit 19 changes a traveling route set preliminarily to a traveling route based on an action intended by the user (travel control) in step S40. For example, when the user utters "turn left the road in front of the taxi", the traveling route is changed to the traveling route for tuning left the road in front of the estimated taxi, although the preliminarily set traveling route is straight ahead., 0043 illustrates the subject vehicle V0, the vehicles V1 and V2 traveling ahead, the steering operation information R1 relating to steering operation of the subject vehicle extracted from the language (character string data) indicating the action (travel control), and the convenience store C1 located at the corner of the road., Fig. 6, Fig. 7, Fig. 8). 
Inoue fails to disclose generating a plurality of visually-descriptive latent vectors based on the one or more images; generating a latent vector based on the notification; and that the generation of the attention map highlighting the plurality of objects is based on at least one of the plurality of visually-descriptive latent vectors and the latent vector. However, Inoue does disclose generating an attention map highlighting the plurality of objects based on at least one of the one or more images and the notification (0044 In image data shown in FIG. 6, image data of the vehicles V1 and V2 may be added, or image data of the corner convenience store C1 may be added. FIG. 8 is a diagram illustrating further another exemplary image data created by the object feature estimation unit 18 and displayed on the display. FIG. 8 is image data captured by the vehicle exterior imaging camera 13 to be displayed as it is for the utterance of the user "turn left in front of the taxi" and a solid line surrounding the taxi V1 which is an estimated object, the arrow display R1 representing the left turn, and dotted lines surrounding the other recognized vehicles V2, V3, V4 are overlapped with this image data., Fig. 6, Fig. 7, Fig. 8), as well as the use of a deep neural network for the object recognition (0023 object recognition unit 16 uses known techniques such as deep neural network DNNs that utilize deep learning methods to recognize a variety of objects). Furthermore, Hasselgren teaches generating a plurality of visually-descriptive latent vectors based on the one or more images (0023); generating a latent vector based on the notification (0023); and that the generation of the attention map highlighting the plurality of objects is based on at least one of the plurality of visually-descriptive latent vectors and the latent vector (0023 the input 102 can be a data signal such as an audio signal, video signal, analog signal, digital signal, and/or variations thereof. 
In an embodiment, the autoencoder is used to sharpen, enhance, or denoise an input image. In an embodiment, the processed image produced by the autoencoder in input into a vision-based control system such as an autonomous vehicle, robotic control system, or image recognition system. Various embodiments of the quantizing autoencoder may be used to process image data in systems that are battery powered or otherwise have limited processing capabilities. In an embodiment, the quantizing autoencoder may be used to process images on a mobile phone. Other variations are also considered as being within the scope of the present disclosure). For clarity, Examiner notes that Inoue already discloses the use of a neural network for the object recognition, and based on the context, for the purpose of generating the attention map (0023, 0044). It is known in the art to use an autoencoder, a specific type of neural network, in machine learning applications as a way of learning. Inoue discloses the use 
	Inoue and Hasselgren both disclose systems of training an autonomous vehicle system based on image and audio data via a neural network. Thus, it would have been obvious to one having ordinary skill in the art before the effective filing date of Applicant's invention to modify the system in Inoue to include the teaching of Hasselgren in order to enhance the efficiency of the learning in the system in Inoue through the use of the autoencoder. The combination of Inoue and Hasselgren would have made obvious and resulted in the subject matter of the claimed invention, specifically generating a plurality of visually-descriptive latent vectors based on the one or more images; generating a latent vector based on the notification; and that the generation of the attention map highlighting the plurality of objects is based on at least one of the plurality of visually-descriptive latent vectors and the latent vector.

Claim 5
 In image data shown in FIG. 6, image data of the vehicles V1 and V2 may be added, or image data of the corner convenience store C1 may be added. FIG. 8 is a diagram illustrating further another exemplary image data created by the object feature estimation unit 18 and displayed on the display. FIG. 8 is image data captured by the vehicle exterior imaging camera 13 to be displayed as it is for the utterance of the user "turn left in front of the taxi" and a solid line surrounding the taxi V1 which is an estimated object, the arrow display R1 representing the left turn, and dotted lines surrounding the other recognized vehicles V2, V3, V4 are overlapped with this image data., Fig. 6, Fig. 7, Fig. 8).  

Claims 7, 11, 13 and 17
Claims 7, 11, 13 and 17 recite subject matter similar to that of claims 1, 5, 1 and 5, respectively, and are rejected under the same grounds.

Claims 2, 8 and 14 are rejected under 35 USC 103 as being unpatentable over Inoue in view of Hasselgren, in further view of US20200175326 ("Shen").

Claim 2
Inoue fails to explicitly disclose preprocessing the one or more images by performing at least one of a down-sampling, resizing, or normalization. However, Inoue does disclose processing the image data (0044). Furthermore, Shen teaches an autonomous vehicle system with object detection, including preprocessing the one or more images by receiving one or more pieces of data relating to a high resolution image; determining a field of view (FOV) based on the pieces of data; cropping the FOV to generate a high resolution crop of the image; downsampling the rest of the image to the size of the cropped region to generate a low resolution image; sending a batch of the high resolution crop and the low resolution image to a detector; and processing the images via the detector to generate an output of detected objects.
	Inoue and Shen both disclose autonomous vehicle systems with image processing. Thus, it would have been obvious to one having ordinary skill in the art before the effective filing date of Applicant's invention to modify the system in Inoue to include the teaching of Shen since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable. The combination of Inoue and Shen would have made obvious and resulted in the subject matter of the claimed invention, specifically preprocessing the one or more images by performing at least one of a down-sampling, resizing, or normalization.

Claims 8 and 14
Claims 8 and 14 recite subject matter similar to that of claim 2 and are rejected under the same grounds.

Claims 3, 9 and 15 are rejected under 35 USC 103 as being unpatentable over Inoue in view of Hasselgren, in further view of US20190126942 ("Goto").

Claim 3
Inoue fails to explicitly disclose wherein the steering control includes controlling a steering angle of a steering wheel of the self-driving vehicle and the velocity control includes a controlling a force applied to an accelerator of the self-driving vehicle. Such means of control of a vehicle were well-known in the art at the effective filing date of the invention. Nevertheless, Inoue does disclose general control of the vehicle along a trajectory, which would require steering and velocity control (0046). Furthermore, Goto teaches a system of controlling an autonomous vehicle, wherein the steering control includes controlling a steering angle of a steering wheel of the self-driving vehicle and the velocity control includes a controlling a force applied to an accelerator of the self-driving vehicle (0048 The vehicle control ECU 40 includes a steering ECU which performs steering control, a power unit control ECU which performs acceleration and deceleration control, and a brake ECU. The vehicle control ECU 40 acquires a detection signal output from each of sensors mounted on the vehicle HV, such as an accelerator position sensor, a brake pedal force sensor, a steering angle sensor, and a wheel speed sensor and outputs a control signal to each of traveling control devices such as an electronic control throttle, a brake actuator, and an electric power steering (EPS) motor.).
	Inoue and Goto both disclose autonomous vehicle control. Thus, it would have been obvious to one having ordinary skill in the art before the effective filing date of 

Claims 9 and 15
Claims 9 and 15 recite subject matter similar to that of claim 3 and are rejected under the same grounds.

Claims 4, 10 and 16 are rejected under 35 USC 103 as being unpatentable over Inoue in view of Hasselgren, in further view of US20070005609 ("Breed").

Claim 4
Inoue fails to explicitly disclose applying the one or more images to a cellular neural network to identify the plurality of objects. However, Inoue does disclose applying the one or more images to a deep neural network to identify the plurality of objects (0023 object recognition unit 16 executes a recognition process of a specific object based on image data imaged by the vehicle exterior imaging cameras 13, a distance to a specific object detected by the onboard sensors 14, and map data stored in the map database 15. That is, the object recognition unit 16 uses known techniques such as deep neural network DNNs that utilize deep learning methods to recognize a variety of objects, such as vehicles, bikes, bicycles, pedestrians, signal machines, intersections). Furthermore, Breed teaches an autonomous vehicle object detection system including applying the one or more images to a cellular neural network to identify the plurality of objects (0092 A "combination neural network" as used herein will generally apply to any combination of two or more neural networks that are either connected together or that analyze all or a portion of the input data. A combination neural network can be used to divide up tasks in solving a particular object sensing and identification problem. For example, one neural network can be used to identify an object occupying a space at the side of an automobile and a second neural network can be used to determine the position of the object or its location with respect to the vehicle, for example, in the blind spot. In another case, one neural network can be used merely to determine whether the data is similar to data upon which a main neural network has been trained or whether there is something significantly different about this data and therefore that the data should not be analyzed. Combination neural networks can sometimes be implemented as cellular neural networks.).
	Inoue and Breed both disclose vehicle systems that detect objects by applying images to a neural network. Thus, it would have been obvious to one having ordinary skill in the art before the effective filing date of Applicant's invention to modify the system in Inoue to include the teaching of Breed since the claimed invention is merely a combination of old elements, and in the combination each element merely would have 

Claims 10 and 16
Claims 10 and 16 recite subject matter similar to that of claim 4 and are rejected under the same grounds.

Claims 6, 12 and 18 are rejected under 35 USC 103 as being unpatentable over Inoue in view of Hasselgren and Breed, in further view of US9715233 ("Mandeville-Clarke").

Claim 6
Inoue fails to disclose prior to generating the attention map, performing a speech-to-text conversion to generate a textual representation of the notification. However, Inoue does disclose generating the attention map (0044, Fig. 7). Furthermore, Mandeville-Clarke discloses an autonomous vehicle control system, including performing a speech-to-text conversion to generate a textual representation of the notification (col. 24 lines 25-35 The natural language converter may comprise a speech-to-text conversion module, which converts electrical signals generated as a direct result of a recorded speech pattern into a text or image or motion-image format recognizable by the first set of processor readable programmatic instructions and/or the intermediate aspect and/or an appropriate substantially autonomous vehicle controller.). One of ordinary skill would have found it obvious, by the combination of Inoue and Mandeville-Clarke, prior to generating the attention map, to perform a speech-to-text conversion to generate a textual representation of the notification.
	Inoue and Mandeville-Clarke both disclose autonomous vehicle control systems utilizing speech recognition systems in the control. Thus, it would have been obvious to one having ordinary skill in the art before the effective filing date of Applicant's invention to modify the system in Inoue to include the teaching of Mandeville-Clarke since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable. The combination of Inoue and Mandeville-Clarke would have made obvious and resulted in the subject matter of the claimed invention, specifically prior to generating the attention map, performing a speech-to-text conversion to generate a textual representation of the notification.

Claims 12 and 18
Claims 12 and 18 recite subject matter similar to that of claim 6 and are rejected under the same grounds.

Contact Information
US20200293041 ("Palanisamy") discloses the use of an autoencoder in a vehicle control system. Palanisamy clarifies how latent vectors based on input data are generated (Likewise, a second low-dimensional embedding can be represented as E.sub.2(O; θ.sub.2), where O is the observed vehicle state, and the θ.sub.2 represents the parameters (e.g., weights) used for mapping the observed vehicle state to a low-dimensional embedding for the second encoder module 204-2. In at least some embodiments, the encoder modules 204-1 to 204-N are used to map the observed vehicle state O (indicated at 202) to a feature space or latent vector Z, which is represented by the low-dimensional embeddings. The feature space or latent vector Z (referred to herein as feature space Z) can be constructed using various techniques, including encoding as a part of a deep autoencoding process or technique. Thus, in one embodiment, the low-dimensional embeddings E.sub.1(O; θ.sub.1) to E.sub.N(O; θ.sub.N) are each associated with a latent vector Z.sub.1 to Z.sub.N that is the output of the encoder modules 204-1 to 204-N.)
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Examiner KRISHNAN RAMESH whose telephone number is (571)272-6407. The examiner can normally be reached Monday-Friday 8:30am-5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/KRISHNAN RAMESH/
Primary Examiner, Art Unit 3663