Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
This action is in response to Applicant’s filing on May 4th, 2021.  Claims 1 to 20 are pending and examined below.

Claim Interpretation
	Claim 5 recites the element of an “audio event”.  The Examiner is interpreting this term as it is defined by example in ¶[0059] of the specification.
	Claims 6 and 15 are worded in a more complicated manner than the underlying idea requires.  The Examiner interprets the first part of claims 6 and 15 as amounting to a determination of whether the background noise or soundscape is distracting.  The Examiner interprets the second part as amounting to a determination of the sentiment data based on this degree of distraction.
	Claims 2, 12, and 18 recite the element of “recommendation data”.  The Examiner interprets this to refer to “route recommendation data”, which is the term that the specification almost always uses.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1 to 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite the mental process of identifying emotions and creating routes based on the emotions. This judicial exception is not integrated into a practical application because the claims amount to merely using a computer as a tool to perform an abstract idea, as discussed in MPEP § 2106.05(f). The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because there are no significant additional elements.
	The Examiner offers the following analysis of the independent claims, based on the subject matter eligibility test of MPEP § 2106, to support this rejection.
Step 1: Is the claim to a process, machine, manufacture or composition of matter?
Yes, claim 1 is to a process (“method…”), and claims 11 and 17 are to machines (“system…” and “non-transitory machine-readable medium…”).
Step 2A: Is the claim directed to a law of nature, a natural phenomenon (product of nature), or an abstract idea?
Step 2A, prong one: Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes.  The claims recite an abstract idea, specifically the mental process of identifying emotions and creating routes based on the emotions.  This mental process involves evaluation (“determining … sentiment data …”) and judgment (“generating … a navigation route …”).
Step 2A, prong two: Does the claim recite additional elements that integrate the judicial exception into a practical application?
No.  The practical application is navigation.  The additional element is “extracting … features of sensor data captured by a sensor associated with a vehicle”, which is just an obfuscated way of saying that some unspecified form of data is collected or measured.  This additional element amounts to mere data gathering and is a form of insignificant extra-solution activity as described in MPEP § 2106.05(g).  As such, there are no significant additional elements that can integrate the judicial exception into a practical application.  The claims amount to merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea, as discussed in MPEP § 2106.05(f).  Mere instructions to apply an exception likewise do not integrate the judicial exception into a practical application.
Step 2B: Does the claim recite additional elements that amount to significantly more than the judicial exception?
No, there are no significant additional elements.
This analysis does not change for the dependent claims.
Claims 2, 12, and 18 recite the transmission and receipt of data, elements which are well-understood, conventional and routine in this art area.  The analysis does not change.
Claims 3, 7, 13, and 19 specify more detail about the nature of the sensors.  The additional elements in these claims amount to mere data gathering, a form of insignificant extra-solution activity.  The analysis does not change.
Claims 4 to 6, 8, 10, 14 to 15, and 20 merely add additional mental process steps.  The analysis does not change.
Claims 9 and 16 introduce either a “mobile device” (claim 9) or a “wearable device” (claim 16), elements which are well-understood, conventional and routine in this art area.  The analysis does not change.
In conclusion, the claims are rejected under 35 U.S.C. 101.


Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1, 3, 11, 13, 17, and 19 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Sekizawa et al. (US 20180172464 A1), hereinafter known as Sekizawa, or Shintani et al. (US 20200309548 A1), hereinafter known as Shintani.
	The Examiner notes that Sekizawa and Shintani each independently disclose these claims.  The citations from both references are presented together below solely for the purposes of compact prosecution and to save space.  Each reference anticipates the claims on its own; no combination of references is needed, and the Examiner is not combining these references despite any superficial appearance to the contrary.
Regarding claim 1, Sekizawa or Shintani disclose a method, comprising:
extracting, by a system comprising a processor, features of sensor data captured by a sensor associated with a vehicle, wherein the sensor data is representative of a subject selected from a group of subjects comprising an occupant of the vehicle and an environment in which the vehicle is located, resulting in extracted features; (Sekizawa, ¶[0010] to ¶[0013], "The in-vehicle device according to the first aspect of the disclosure may further include an in-vehicle camera configured to image the inside of a vehicle cabin of the host vehicle. The feeling state estimation unit may be configured to detect a user from an image captured by the in-vehicle camera.  [¶] In the in-vehicle device according to the first aspect of the disclosure, the biological information may include face information of the user. The feeling state estimation unit may be configured to recognize a facial expression of the detected user and estimate a feeling state of the host vehicle user based on a feature of the recognized facial expression.  [¶] In the in-vehicle device according to the first aspect of the disclosure, the biological information may include a gesture of the user. The feeling state estimation unit may be configured to recognize a gesture of the detected user and estimate a feeling state of the host vehicle user based on a feature of the recognized gesture.  [¶] The in-vehicle device according to the first aspect of the disclosure may further include a microphone configured to detect sound inside a vehicle cabin of the host vehicle. The biological information may include speech of the user. 
The feeling state estimation unit may be configured to recognize speech from sound inside the vehicle cabin detected by the microphone and estimate a feeling state of the host vehicle user based on a feature of the recognized speech."; Shintani, ¶[0041], "An input device 93 is a switch group that is arranged at a position where the driver can perform an operation, is used to issue an instruction to the vehicle 1, and may also include a voice input device such as a microphone."; and Shintani, ¶[0048], "The device control unit 206 controls devices connected to the control unit 200. For example, the device control unit 206 controls a speaker 215 and a microphone 216 to make them output a predetermined voice message such as a message for a warning or navigation or detect a voice signal uttered by the occupant in the vehicle and acquire voice data.")
determining, by the system, sentiment data representative of an emotional condition of the occupant of the vehicle based on an analysis of the extracted features; and (Sekizawa, ¶[0045] to ¶[0047], "The feeling estimation unit 108 of the navigation device 100 estimates a feeling of the user based on a facial expression, a gesture, and a tone of the user who is on the vehicle 10.  [¶] Specifically, the feeling estimation unit 108 detects a user (detects at least a face area of the user) from an image captured by the in-vehicle camera 12 and recognizes a facial expression of the detected user. The feeling estimation unit 108 calculates the degree of each of a plurality of feelings (for example, “neutral”, “happy”, “anger”, “fear”, “fatigue”, and the like) based on a feature (for example, a feature of a shape of each of both eyes, an eyebrow, and a mouth) of the recognized facial expression. The feeling estimation unit 108 detects a user from an image captured by the in-vehicle camera 12 and recognizes a gesture (that is, motion) of the detected user. The feeling estimation unit 108 calculates the degree of each of the feelings based on a feature (for example, a facial expression, a positional relationship between a face and a hand, a line of sight or a face direction, or the like) of the recognized gesture.  [¶] The feeling estimation unit 108 recognizes speech from sound inside the vehicle cabin detected by the microphone 11 and calculates the degree of each of the feelings based on a feature (for example, a frequency distribution or the like) of the recognized speech (that is, a tone). It is desirable that the microphone 11 is a directional microphone. Then, it is desirable that a plurality of microphones 11 are provided in the vehicle 10. 
With such a configuration, since it is possible to specify a generation source (that is, a sound source) of speech from the directivity of the microphones, even in a case where a plurality of people is on the vehicle 10 or a case where a car audio is operated, it is possible to recognize the speech of the user."; and Shintani, ¶[0052], "A vehicle information analysis unit 304 acquires vehicle information, for example, GPS position information and speed information from the vehicle 104, and analyzes the behavior. A voice recognition unit 305 performs voice recognition processing based on voice data obtained by converting a voice signal uttered by the occupant of the vehicle 104 and transmitting it. For example, the voice recognition unit 305 classifies words uttered by the occupant of the vehicle 104 into feelings such as joy, anger, grief, and pleasure, and stores the classification result as a voice recognition result 320 (voice information) of user information 319 in association with a result of analysis (the position, time, and the like of the vehicle 104) by the vehicle information analysis unit 304.")
generating, by the system, a navigation route for the vehicle from an origin point to a destination point based on the sentiment data. (Sekizawa, ¶[0008], "It is possible to estimate a road that the host vehicle user will prefer and a road that the host vehicle user should avoid from a distribution of positive feelings (for example, “joy”, “pleasure”) indicated by the first feeling map and a distribution of negative feelings (for example, “anger”, “grief”) indicated by the first feeling map. Furthermore, it is possible to estimate a road that the host vehicle user will generally prefer and a road that the host vehicle user should avoid from a distribution of positive feelings indicated by the second feeling map and a distribution of negative feelings indicated by the second feeling map. For this reason, the traveling route from the current location to the destination searched using the first feeling map and the second feeling map is expected to be a traveling route preferable to the host vehicle user. Accordingly, with the in-vehicle device, it is possible to present an appropriate traveling route to the host vehicle user."; and Shintani, ¶[0054] to ¶[0055], "A user information analysis unit 308 performs various kinds of analysis for the user information 319 stored in a storage unit 314. For example, based on the voice recognition result 320 and the image recognition result 321 of the user information 319, the user information analysis unit 308 acquires the contents of an utterance from the occupant concerning the neighborhood (for example, a seaside roadway) of the traveling route of the vehicle 104 or a place (a destination or a way point) that the vehicle 104 has visited, or analyzes the feeling of the occupant from the tone or tempo of a conversation, the facial expression of the occupant, and the like. 
In addition, for example, based on the contents that the occupant has uttered concerning the neighborhood of the traveling route of the vehicle 104 or the place that the vehicle 104 has visited, and a feeling acquired from the voice recognition result 320 and the image recognition result 321 at that time, the user information analysis unit 308 analyzes the taste (the tendency of the taste) of the user, for example, that the user has satisfied the place that the user has visited or traveled. The analysis result obtained by the user information analysis unit 308 is stored as the user information 319 and used for, for example, selection of a destination or learning after the end of the navigation service.  [¶] A route generation unit 309 generates a route for traveling of the vehicle 104. A navigation information generation unit 310 generates navigation display data to be displayed on the navigation device 218 of the vehicle 104 based on the route generated by the route generation unit 309. For example, the route generation unit 309 generates a route from the current point to the destination based on the destination acquired from the vehicle 104. In this embodiment, for example, when a destination is input to the navigation device 218 in the place of departure, for example, a route passing along by a sea, on which the taste of the occupant of the vehicle 104 is reflected, is generated. For example, if it is estimated during the movement to the destination that the vehicle cannot arrive at the destination in time because of traffic congestion or the like, an alternate route to the destination is generated. For example, if a fatigue state of the occupant of the vehicle 104 is recognized during the movement of the destination, a rest place is searched for, and a route to the rest place is generated.")
Claims 11 and 17 are substantially similar to claim 1 and are rejected via substantially the same arguments as used for claim 1.
Regarding claim 3, Sekizawa or Shintani disclose the method of claim 1, wherein the sensor comprises an audio sensor, and wherein the sensor data comprises audio data captured by the audio sensor. (Sekizawa, ¶[0013], "The in-vehicle device according to the first aspect of the disclosure may further include a microphone configured to detect sound inside a vehicle cabin of the host vehicle. The biological information may include speech of the user. The feeling state estimation unit may be configured to recognize speech from sound inside the vehicle cabin detected by the microphone and estimate a feeling state of the host vehicle user based on a feature of the recognized speech."; Shintani, ¶[0041], "An input device 93 is a switch group that is arranged at a position where the driver can perform an operation, is used to issue an instruction to the vehicle 1, and may also include a voice input device such as a microphone."; and Shintani, ¶[0048], "The device control unit 206 controls devices connected to the control unit 200. For example, the device control unit 206 controls a speaker 215 and a microphone 216 to make them output a predetermined voice message such as a message for a warning or navigation or detect a voice signal uttered by the occupant in the vehicle and acquire voice data.")
Claims 13 and 19 are substantially similar to claim 3 and are rejected via substantially the same arguments as used for claim 3.

Claim(s) 4, 14, and 20 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Shintani.
Regarding claim 4, Shintani discloses the method of claim 3,
wherein the audio data comprises data representative of speech originating from the occupant of the vehicle, and (Shintani, ¶[0041], "An input device 93 is a switch group that is arranged at a position where the driver can perform an operation, is used to issue an instruction to the vehicle 1, and may also include a voice input device such as a microphone."; and Shintani, ¶[0048], "The device control unit 206 controls devices connected to the control unit 200. For example, the device control unit 206 controls a speaker 215 and a microphone 216 to make them output a predetermined voice message such as a message for a warning or navigation or detect a voice signal uttered by the occupant in the vehicle and acquire voice data.")
wherein extracting the features of the sensor data comprises determining a property of the speech, the property being selected from a group of properties comprising voice tone and speech content. (Shintani, ¶[0052], "A vehicle information analysis unit 304 acquires vehicle information, for example, GPS position information and speed information from the vehicle 104, and analyzes the behavior. A voice recognition unit 305 performs voice recognition processing based on voice data obtained by converting a voice signal uttered by the occupant of the vehicle 104 and transmitting it.  For example, the voice recognition unit 305 classifies words uttered by the occupant of the vehicle 104 into feelings such as joy, anger, grief, and pleasure, and stores the classification result as a voice recognition result 320 (voice information) of user information 319 in association with a result of analysis (the position, time, and the like of the vehicle 104) by the vehicle information analysis unit 304."; and Shintani, ¶[0054], "A user information analysis unit 308 performs various kinds of analysis for the user information 319 stored in a storage unit 314. For example, based on the voice recognition result 320 and the image recognition result 321 of the user information 319, the user information analysis unit 308 acquires the contents of an utterance from the occupant concerning the neighborhood (for example, a seaside roadway) of the traveling route of the vehicle 104 or a place (a destination or a way point) that the vehicle 104 has visited, or analyzes the feeling of the occupant from the tone or tempo of a conversation, the facial expression of the occupant, and the like.")
Claims 14 and 20 are substantially similar to claim 4 and are rejected via substantially the same arguments as used for claim 4.

Claim(s) 7 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Sekizawa.
Regarding claim 7, Sekizawa discloses the method of claim 1, wherein the sensor comprises a video sensor, and wherein the sensor data comprises video data captured by the video sensor. (Sekizawa, ¶[0010] to ¶[0013], "The in-vehicle device according to the first aspect of the disclosure may further include an in-vehicle camera configured to image the inside of a vehicle cabin of the host vehicle. The feeling state estimation unit may be configured to detect a user from an image captured by the in-vehicle camera.  [¶] In the in-vehicle device according to the first aspect of the disclosure, the biological information may include face information of the user. The feeling state estimation unit may be configured to recognize a facial expression of the detected user and estimate a feeling state of the host vehicle user based on a feature of the recognized facial expression.  [¶] In the in-vehicle device according to the first aspect of the disclosure, the biological information may include a gesture of the user. The feeling state estimation unit may be configured to recognize a gesture of the detected user and estimate a feeling state of the host vehicle user based on a feature of the recognized gesture.  [¶] The in-vehicle device according to the first aspect of the disclosure may further include a microphone configured to detect sound inside a vehicle cabin of the host vehicle. The biological information may include speech of the user. The feeling state estimation unit may be configured to recognize speech from sound inside the vehicle cabin detected by the microphone and estimate a feeling state of the host vehicle user based on a feature of the recognized speech.")

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 2, 9, 12, 16, and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Sekizawa or Shintani as applied to claims 1, 11, and 17 above, and further in view of Nakamura et al. (US 20170205240 A1), hereinafter known as Nakamura.
Regarding claim 9, Sekizawa or Shintani do not teach the particular limitations of claim 9 but Nakamura teaches the following: The method of claim 1, further comprising: obtaining, by the system, device data from a mobile device associated with the occupant of the vehicle, and wherein determining the sentiment data comprises determining the sentiment data further based on the device data. (Nakamura, ¶[0141], "Thereafter, the clients 3 acquires the data for estimating the feeling of the user, in step S506. Specifically, in the case of the client 3b configured with the smartphone illustrated in FIG. 1 for example, the acquisition is performed by a feeling sensor unit that contacts with the hand of the user that holds the client 3b and detects the perspiration, the temperature, the pulse, or the like of the user, and an in-camera provided toward inside to capture an image of the face of the user that browses photographs and videos.")
The user of Nakamura is analogous to the occupant of the vehicle, since both are traveling a route.
It would have been obvious to a person having ordinary skill in the art to combine the method of Sekizawa or Shintani with the mobile device of Nakamura, since a mobile device may have more detailed data about a particular user in more situations and therefore using mobile device data may improve the sentiment predictions.
Regarding claim 16, Sekizawa or Shintani do not teach the particular limitations of claim 16 but Nakamura teaches the following: The system of claim 11, wherein the operations further comprise: obtaining occupant data from a wearable device associated with the occupant of the vehicle, wherein generating the condition data comprises generating the condition data based on the occupant data.  (Nakamura, ¶[0002], "In recent years, a smartphone, a mobile phone terminal, a tablet terminal, a digital camera, or the like which has a camera function becomes widespread, and photographs captured by these are uploaded to servers in large amounts. Also, metadata that includes position information (image capturing site information) is generally attached to the uploaded photograph data. Also, a wearable device, such as a watch terminal, a list band terminal, and an eyeglass HMD, is also start becoming widespread, and it becomes easy to acquire action logs (also referred to as life logs) on a day-to-day basis, and a large amount of acquired action logs are utilized variously."; and Nakamura, ¶[0141], "Thereafter, the clients 3 acquires the data for estimating the feeling of the user, in step S506. Specifically, in the case of the client 3b configured with the smartphone illustrated in FIG. 1 for example, the acquisition is performed by a feeling sensor unit that contacts with the hand of the user that holds the client 3b and detects the perspiration, the temperature, the pulse, or the like of the user, and an in-camera provided toward inside to capture an image of the face of the user that browses photographs and videos.")
It would have been obvious to a person having ordinary skill in the art to combine the system of Sekizawa or Shintani with the wearable device of Nakamura, since a wearable device may have more detailed data about a particular user in more situations and therefore using wearable device data may improve the sentiment predictions.
Regarding claim 18, Sekizawa or Shintani do not teach the particular limitations of claim 18 but Nakamura teaches the following: The non-transitory machine-readable medium of claim 17, wherein the operations further comprise:
transmitting the sentiment data to a remote server via a communication network; (Nakamura, ¶[0075], "The environment information acquired by the environment information acquisition unit is transmitted to the server 2, together with the data (face image, biometric information, etc.) for estimating the feeling of the user. As described above, in the present embodiment, the data collection clients 3 transmits the environment information to the server 2 together with the data (the face image, the biometric information, etc.) for estimating the feeling of the user, so that the server 2 side performs the feeling estimation of the user, and the association between the estimated feeling and the environment information (the generation and the learning of the environment-feeling DB).")
receiving, from the remote server, recommendation data generated by the remote server based on the sentiment data; and (Nakamura, ¶[0080], "The communication control unit 10b controls data transmission from the communication unit 11. The communication control unit 10b transmits the present position information acquired by the position information acquisition unit 15 from the communication unit 11 to the server 2 for example, and requests the feeling navigation that leads the user to the predetermined feeling, to the server 2. In response to such a request, the server 2 side generates the guiding information for guiding the user to the site and the route in which the predetermined feeling is associated on the basis of the present position information of the user and the feeling map, and returns the generated guiding information to the client 1."; and Nakamura, ¶[0090], "Also, the communication control unit 10b according to the present embodiment may transmit, to the server 2, the content data of the content (photograph, video, music, etc.) that the user views at the present moment or the present surrounding environment information, and request the transmission of the guiding information for leading the user to the predetermined feeling.  For example, when the content data is transmitted, the server 2 side generates the guiding information that recommends another content that leads the user to the predetermined feeling in response to the present feeling of the user which is estimated on the basis of the content data, and returns the generated guiding information to the client 1 Also, when the environment information is transmitted, the server 2 side generates the guiding information that proposes improvement to another environment information that leads the user to the predetermined feeling in response to the present feeling of the user which is estimated on the basis of the environment information, and returns the generated guiding information to the client 1.")
preparing the route data further based on the recommendation data. (Nakamura, ¶[0102] to ¶[0106], "Thereafter, the client 1 recognizes the route search execution instruction, in step S206. The route search execution instruction is performed by gesture input, audio input, or tap operation by the user, for example.  [¶] Thereafter, in step S209, the client 1 transmits the information of the present position and the destination place to the server 2 and performs the route search request to the destination place.  [¶] Subsequently, in step S212, the guiding information generation unit 20b of the server 2 searches for a plurality of routes to the destination place, on the basis of the information of the received present position and the destination place.  [¶] Thereafter, in step S215, the guiding information generation unit 20b decides the route via the site and the area in which the degree of happiness is high, among a plurality of searched routes, with reference to the plurality of searched routes and the happiness map stored in the feeling map DB 24, and generates the guiding information to the destination place.  [¶] Thereafter, in step S218, the supply unit 20d executes control to transmit the guiding information generated by the guiding information generation unit 20b from the communication unit 21 to the client 1.")
It would have been obvious to a person having ordinary skill in the art to combine the method of Sekizawa or Shintani with the servers of Nakamura, since the servers may allow more complicated computations to be performed in a shorter time than a local system may allow.
Claims 2 and 12 are substantially similar to claim 18 and are rejected via substantially the same arguments as used for claim 18.

Claim(s) 5 is/are rejected under 35 U.S.C. 103 as being unpatentable over Sekizawa or Shintani as applied to claim 3 above, and further in view of Mishra et al. (US 20180144746 A1), hereinafter known as Mishra.
Regarding claim 5, Sekizawa or Shintani do not teach the particular limitations of claim 5 but Mishra teaches the following: The method of claim 3, wherein extracting the features of the sensor data comprises detecting an audio event present in the audio data, and wherein determining the sentiment data comprises: classifying the audio event, resulting in an audio event classification; and determining the sentiment data based on the audio event classification. (Mishra, ¶[0018], "Video data analysis and audio signal analysis can be used to accomplish improved facial analysis for cognitive content. Audio signal analysis can serve a variety of purposes such as differentiation between music and speech, voice identification, audio events such as thunderstorms or car horns, and so on. The audio signal analysis is based audio classifiers which can be learned based on analyzing a face within the video data. By capturing audio data as well as the video data, the audio data can be synchronized with the video data. The synchronization or association of the audio data and the video data augments the analysis of the cognitive content. A video of the face of a person while yawning, while occluded by a hand of the person covering their mouth, can be augmented by non-speech sounds such as inhalation, sighs, and so on."; Mishra, ¶[0043], "In the disclosed techniques, video data including images of one or more people are obtained. The image data can include video, frames from a video, still images, or another medium suitable for image capture. The video data can include a plurality of images that can include a plurality of people.  Audio data corresponding to the video data is obtained. The audio data can be obtained using a microphone, an audio transducer, or another audio capture technique. A face within the video data is identified. The identifying can be accomplished using one or more classifiers. 
The identifying can be performed using a remote server, a cloud-based server, a personal electronic device, and so on. A first voice, from the audio data, can be associated with the face within the video data. The first voice can be synchronized with video data. The face within the video data can be analyzed for emotional content. The emotional content can include detection of one or more of sadness, stress, happiness, anger, frustration, confusion, disappointment, hesitation, cognitive overload, focusing, engagement, attention, boredom, exploration, confidence, trust, delight, disgust, skepticism, doubt, satisfaction, excitement, laughter, calmness, curiosity, humor, depression, envy, sympathy, embarrassment, poignancy, fatigue, drowsiness, or mirth. An audio classifier can be learned based on the analyzing of the face within the video data. The audio classifier can be used for analysis of further audio for emotional content within that audio."; and Mishra, ¶[0048], "The flow 100 includes learning an audio classifier 160, on a third computing device, based on the analyzing of the face within the video data. The computing device can include a handheld electronic device, a wearable device, a laptop computer, a server, and so on. In embodiments, the second computing device and the third computing device are a common device. The audio classifier can be used for analyzing audio data for cognitive content. The learning the audio classifier is based on analyzing a plurality of faces 162 within the video data. The learning can be based on language analysis. The learning can be dependent on language content. The language content can include keywords, key phrases, syntactic and semantic parses as well as other triggers that can be used to direct the learning. In embodiments, the learning is independent of language content. The learning can be based on the presence or absence of sounds. The learning can be accomplished using deep learning with unlabeled data.  
The unlabeled data can be uploaded by a user, downloaded from the Internet, publicly available, and so on. Deep learning can include algorithms for modeling high level abstractions of data. Deep learning can comprise various types of deep neural networks including components of artificial neural nets (ANNs), convolutional neural nets (CNNs), recurrent neural nets (RNNs), various combinations of these, and so on. The learning can be accomplished using supervised learning with labeled data. The labeled data can include test data, known good data, and so on. The supervised learning can be used for various purposes including training a support vector machine (SVM) as well as other types of machine learning techniques. Once trained, results of these machine learning techniques can be used to classify audio data, video data, and so on. In embodiments, the learning further encompasses a plurality of audio classifiers. The plurality of audio classifiers can be used to determine emotional content of audio data, video data, etc. In the flow 100, the learning further encompasses learning a second audio classifier 164. The second audio classifier can be applied to a second audio feature, to cognitive content, etc.")
In effect, Mishra describes a process that trains an audio classifier using the emotion identified from the corresponding video data.  Once trained, the classifier can identify the emotion from the audio data alone, without the need for any video data.
It would have been obvious to a person having ordinary skill in the art to combine the method of Sekizawa or Shintani with the audio classifier of Mishra, since this classifier may help predict the user's response to external stimuli without requiring more complicated calculations related to voice recognition or emotional analysis from voice signals, or from video analysis.  The classifier may provide a fuller picture of all the stimuli present and provide an alternative means of ascertaining their effect on the user.

Claim(s) 6 and 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Sekizawa or Shintani as applied to claims 3 and 13 above, and further in view of Biswal et al. (US 20150160019 A1), hereinafter known as Biswal, and Erickson et al. (US 20180061237 A1), hereinafter known as Erickson.
Regarding claim 6, Sekizawa or Shintani do not teach the following limitations of claim 6 but Biswal teaches the following: The method of claim 3, wherein extracting the features of the sensor data comprises comparing an amount of audio activity present in the audio data to a defined baseline amount of audio activity, resulting in an audio activity comparison value, and wherein determining the sentiment data comprises: determining an estimated level of distraction associated with the vehicle based on the audio activity comparison value. (Biswal, ¶[0034], "A microphone 302 may be included in the in-vehicle computing system 300 to receive voice commands from a user and/or to measure ambient noise in the vehicle, and a speech processing unit 304 may process the received voice commands. In some embodiments, in-vehicle computing system 300 may also be able to receive voice commands and sample ambient vehicle noise using a microphone included in an audio system 332 of the vehicle. "; Biswal, ¶[0037], "The specific data requests may include requests for determining where the user is geographically located, an ambient noise level and/or music genre at the user's location, an ambient weather condition (temperature, humidity, etc.) at the user's location, etc. Mobile device application 344 may send control instructions to components (e.g., microphone, etc.) or other applications (e.g., navigational applications) of mobile device 342 to enable the requested data to be collected on the mobile device. Mobile device application 344 may then relay the collected information back to in-vehicle computing system 300."; and Biswal, ¶[0051], "The method 500 includes determining if a cognitive load of the driver (e.g., based on the driver state determined at 512) is greater than a cognitive load threshold, as indicated at 518. The cognitive load threshold may be predetermined and/or customized for the particular driver based on historical data. 
For example, the cognitive load and associated threshold may be defined as a number of distractions for a driver (e.g., assistance from a navigational system, phone calls, audio output, detected ambient noise indicating a conversation with a passenger, inclement weather, etc.), a total intensity of distractions (e.g., an intensity level of each distraction may be determined and aggregated to define the total intensity), and/or any other suitable attribute for defining cognitive load and an associated cognitive load threshold. The evaluation of the cognitive load of the driver may be performed by the in-vehicle computing system, the mash-up server, and/or any combination of computing systems. For example, since navigation data may include real-time information, the in-vehicle computing system may determine the cognitive load of the driver based on information generated by the navigational system within the in-vehicle computing system. The in-vehicle computing system may store and/or generate data such as the cognitive load threshold or receive the cognitive load threshold from the mash-up server (e.g., in response to sending a request for the threshold data). Accordingly, the in-vehicle computing system may compare the determined cognitive load of the driver to the generated or received cognitive load threshold at 512.")
It would have been obvious to a person having ordinary skill in the art to combine the method of Sekizawa or Shintani with the audio activity of Biswal, since establishing the baseline audio activity may help identify distracting environments, which are often loud or at least intermittently loud or have distracting noises.
Sekizawa or Shintani in view of Biswal do not teach the following limitations of claim 6 but Erickson teaches the following: wherein determining the sentiment data comprises: determining the sentiment data based on the estimated level of distraction. (Erickson, ¶[0052] to ¶[0057], "Other features of nearby drivers may include moment-to-moment behavior that enables prediction of how much attention those drivers are devoting to the task of driving (for example, a highly skilled driver who is distracted is likely to perform differently than a highly skilled driver who is not distracted).  Features of the driver and environment that can contribute to analysis of attention focus include: [¶] Observations of driving characteristics, particularly changes in variance of driving behaviors (for example, a large variance in reaction time to stoplights may indicate a distracted state); [¶] Observations of driver non-driving behaviors, such as talking, gesturing, using a cell phone, looking in many directions, not looking at the road, operating other functions of a vehicle (for example, adjusting the rear view mirror); [¶] Observing proximal human behavior that may distract the driver, such as a companion who is speaking or gesturing, or a group of children near the side of the road that may distract the driver from watching vehicles ahead; [¶] Detecting proximal events that may distract the driver such as an accident site, a public disturbance, or a beautiful view; and [¶] Sensing of physiological characteristics of drivers which can affect their level of attention such as signs of drowsiness, intoxication, emotional arousal."; Erickson, ¶[0084], "The level of the driver's attention to driving is estimated based on any of: high variance of reaction time to particular events; non-driving behaviors such as talking, looking around, adjusting a mirror; detection of proximal human behavior such as a companion who is talking or gesturing; detection of potentially distracting 
proximal events such as a roadside accident; physiological signs of attentional deficits such as drowsiness, intoxication, etc."; and Erickson, ¶[0105], "An example of a further embodiment, which can be referred to as item 16, is the method of item 1, where the one or more features comprises a level of the driver's attention to driving, estimated upon at least one of the following: high variance of reaction time to particular events; non-driving behaviors such as talking, looking around, adjusting a mirror; detection of proximal human behavior such as a companion who is talking or gesturing; detection of potentially distracting proximal events such as a roadside accident; and physiological signs of attentional deficits such as drowsiness, intoxication, etc.")
It would have been obvious to a person having ordinary skill in the art to combine the method of Sekizawa or Shintani in view of Biswal with the emotional determination of Erickson, since distracting noises may lead to emotional issues in humans, and attempting to determine these effects would help better establish how to route vehicles to avoid such issues.
Claim 15 is substantially similar to claim 6 and is rejected via substantially the same arguments as used for claim 6.

Claim(s) 8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Sekizawa as applied to claim 7 above, and further in view of Rahal-Arabi et al. (US 20180164108 A1), hereinafter known as Rahal-Arabi.
Regarding claim 8, Sekizawa does not teach all of the particular limitations of claim 8 but Rahal-Arabi (with some elements from Sekizawa) teaches the following: The method of claim 7,
wherein the video data comprises a depiction of the occupant of the vehicle, wherein extracting the features of the sensor data comprises extracting a property of the video data from the depiction of the occupant in the video data, (Sekizawa, ¶[0010] to ¶[0013], "The in-vehicle device according to the first aspect of the disclosure may further include an in-vehicle camera configured to image the inside of a vehicle cabin of the host vehicle. The feeling state estimation unit may be configured to detect a user from an image captured by the in-vehicle camera.  [¶] In the in-vehicle device according to the first aspect of the disclosure, the biological information may include face information of the user. The feeling state estimation unit may be configured to recognize a facial expression of the detected user and estimate a feeling state of the host vehicle user based on a feature of the recognized facial expression.  [¶] In the in-vehicle device according to the first aspect of the disclosure, the biological information may include a gesture of the user. The feeling state estimation unit may be configured to recognize a gesture of the detected user and estimate a feeling state of the host vehicle user based on a feature of the recognized gesture.  [¶] The in-vehicle device according to the first aspect of the disclosure may further include a microphone configured to detect sound inside a vehicle cabin of the host vehicle. The biological information may include speech of the user. The feeling state estimation unit may be configured to recognize speech from sound inside the vehicle cabin detected by the microphone and estimate a feeling state of the host vehicle user based on a feature of the recognized speech.")
the property being selected from a group of properties comprising movement of the occupant and posture of the occupant, and wherein determining the sentiment data comprises determining the sentiment data based on the property of the video data. (Rahal-Arabi, ¶[0030], "Potential stress indicators from dynamic sensor measurements from vehicle sensors 108 can include, for example, driver pulse and blood pressure (measured by a wearable device, through the steering wheel, or a seat based biometric sensor, for example), driver hand position (for example, measured by steering wheel proximity sensors identifying things such as one handed driving which implies less stress than two handed driving, driver with hands at 10 and 2 positions implies more stress than one handed driving or two hands is more casual locations, etc.), time history and angle of the steering wheel, driver posture (for example, measured by seat pressure sensors and/or infrared or IR cabin sensor identifying conditions such as where nervous drivers leans forward and/or sits upright more than non-nervous drivers), fluidity of steering (for example, measured by a steering wheel position indicator identifying situations such as jerky steering movements that imply driver stress), cabin audio detection, cameras, and/or driver eyesight direction monitors identifying changes using gaze/eye tracking cameras in the cabin in order to obtain an indication of stress, etc. Additionally, in some embodiments, potential stress indicators can be measured using other current or future emerging stress measurement devices (for example, using EKG, etc.).")
It would have been obvious to a person having ordinary skill in the art to combine the method of Sekizawa with the movement and posture information of Rahal-Arabi, since the movement and posture information may help identify different emotions that are unclear under other methods like audio or voice analysis.

Claim(s) 10 is/are rejected under 35 U.S.C. 103 as being unpatentable over Sekizawa or Shintani as applied to claim 1 above.

Regarding claim 10, the language of the claim amounts to merely repeating or iterating the steps of claim 1.  Mere “duplication of parts” (here, repetition of steps) is something that case law has found to be obvious, according to MPEP 2144.04, and it forms the obviousness rationale for rejecting claim 10.  It would have been obvious to a person having ordinary skill in the art to combine the method of Sekizawa or Shintani with the repetition or iteration of the existing steps in claim 1, since repetition and iteration are common practices that the courts have held normally require only ordinary skill in the art and hence are considered routine expedients as per MPEP 2144.04.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Aizawa et al. (US 20200309549 A1)
Bender et al. (US 20170370732 A1)
Chun et al. (US 20140218187 A1)
Colby (US 20190049261 A1)
French et al. (US 20120150430 A1)
Gallagher et al. (US 20190332902 A1)
Glasgow et al. (US 20170186315 A1)
Miyajima (US 20170370744 A1)
Penilla et al. (US 20160104486 A1)
Penilla et al. (US 20200152197 A1)
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANDREW JAMES TRETTEL whose telephone number is (571)272-6576. The examiner can normally be reached M-F 8-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vivek Koppikar, can be reached at (571)272-5109. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ANDREW JAMES TRETTEL/Examiner, Art Unit 3667

/VIVEK D KOPPIKAR/Supervisory Patent Examiner, Art Unit 3667
November 2, 2022