DETAILED ACTION

Introduction
This office action is in response to Applicant’s submission filed on 12/28/2020. Claims
1-20 are pending in the application. As such, claims 1-20 have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) is acknowledged. The prior-filed application is Provisional Application No. 62/955,872, filed on 12/31/2019.

Drawings
The drawings filed on 12/28/2020 have been accepted and considered by the Examiner.

Claim Objections
Claims 2 and 10 are objected to because of the following informalities:
In claim 2, line 6, “by the another user” should read “by another user.”
In claim 10, line 3, “by the another user” should read “by another user.”
Appropriate correction is required.


Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA  as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA . A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b). 
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA /25, or PTO/AIA /26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-2, 4, 10, and 13-14 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-2, 4 and 16 of copending Application No. 17/134,912. Although the claims at issue are not identical, they are not patentably distinct from each other because they both claim a computer-implemented method and system for training a machine learning model for detection of inappropriate behavior comprising: receiving an audio segment comprising a portion of audio captured by a microphone located within the vehicle; converting the audio segment to a text segment; providing at least the text segment to a trained text classification model to obtain an inappropriate behavior prediction; and determining that a user is being subjected to inappropriate behavior by another user in the vehicle based at least in part on the inappropriate behavior prediction.
This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.
The combination of claims 1 and 2 of the instant application is similar to claim 1 of 17/134,912.
Claim 4 of the instant application is similar to claim 2 of 17/134,912.
Claim 10 of the instant application is similar to claim 4 of 17/134,912.
The combination of claims 13 and 14 of the instant application is similar to claim 16 of 17/134,912.
Current App. (17/135,338)
1. A computer-implemented method for detecting a request for contact information in a vehicle, the computer-implemented method comprising: receiving an audio segment comprising a portion of audio captured by a microphone located within the vehicle; converting the audio segment to a text segment; providing at least the text segment to a trained text classification model to obtain an inappropriate behavior prediction; and determining that a user is being subjected to inappropriate behavior by another user in the vehicle based at least in part on the inappropriate behavior prediction.

2. The computer-implemented method of Claim 1, further comprising: providing the audio segment to an emotion detector to obtain a detected emotion of a speaking user that made an utterance included in the audio segment; and determining based at least in part on the inappropriate behavior prediction and the detected emotion that a user is being subjected to inappropriate behavior by another user.

Co-pending App. (17/134,912)
1. A computer-implemented method of predicting an occurrence of harassment of a user of a ride-sharing application, the computer-implemented comprising: as implemented by an interactive computing system comprising one or more hardware processors and configured with specific computer-executable instructions, receiving an audio segment comprising a portion of audio captured by a microphone located within a vehicle providing a ride to a user of a ride- sharing application associated with a ride-sharing service; converting the audio segment to a text segment; accessing a prediction model associated with verbal harassment detection; providing at least the text segment to the prediction model to obtain a harassment prediction; providing the audio segment to an emotion detector to obtain a detected emotion of a speaking user that made an utterance included in the audio segment; and determining based at least in part on the harassment prediction and the detected emotion that the user is being harassed.


Current App. (17/135,338)
4. The computer-implemented method of Claim 1, wherein the trained text classification model comprises one of a trained hierarchical attention network (HAN) or a trained convolutional neural network (CNN) model.

10. The computer-implemented method of Claim 1, further comprising causing a countermeasure to be initiated in response to the determination that the user is being subjected to the inappropriate behavior by  another user.

13. A system comprising: a data store comprising a trained text classification model; and a processor in communication with the data store, the processor configured with computer-executable instructions that, when executed, cause the processor to: obtain an audio segment comprising a portion of audio captured by a microphone located within a vehicle; convert the audio segment to a text segment from the data store; retrieve the trained text classification mode; provide at least the text segment to the trained text classification model to obtain an inappropriate behavior prediction; and determine that a user is being subjected to inappropriate behavior by another user in the vehicle based at least in part on the inappropriate behavior prediction.

14. The system of Claim 13, wherein the computer-executable instructions, when executed, further cause the processor to: provide the audio segment to an emotion detector to obtain a detected emotion of a speaking user that made an utterance included in the audio segment; and determine based at least in part on the inappropriate behavior prediction and the detected emotion that a user is being subjected to inappropriate behavior by another user.
Co-pending App. (17/134,912)
2. The computer-implemented method of claim 1, wherein the prediction model comprises at least one of a hierarchical attention network model, a fastText model, or a convolutional neural network model.


4. The computer-implemented method of claim 1, further comprising initiating an intervention process upon determining that the user is being harassed.


16. A system configured to predict an occurrence of harassment of a user of a ride-sharing application, the system comprising: a non-volatile storage configured to store one or more prediction models useable to predict the occurrence of the harassment of the user; and a hardware processor of an interactive computing system in communication with the non-volatile storage, the hardware processor configured to execute specific computer-executable instructions to at least: receive an audio segment comprising a portion of audio captured by a microphone located within a vehicle; convert the audio segment to a text segment; access a prediction model associated with verbal harassment detection from the non-volatile storage; provide at least the text segment to the prediction model to obtain a harassment prediction; provide the audio segment to an emotion detector to obtain a detected emotion of a speaking user that made an utterance included in the audio segment; and determine based at least in part on the harassment prediction and the detected emotion that the user is being harassed.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-3, 10-13, 15, and 18-19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Independent claims 1, 13, and 18 recite “receiving or obtain an audio segment comprising a portion of audio captured by a microphone located within the vehicle; converting the audio segment to a text segment; providing at least the text segment to a trained text classification model to obtain an inappropriate behavior prediction; and determining that a user is being subjected to inappropriate behavior by another user in the vehicle based at least in part on the inappropriate behavior prediction.”
The limitations of “receiving or obtain…”, “converting…”, “providing…”, and “determining…”, as drafted, cover a mental process that “can be performed in the human mind or by a human using a pen and paper.” More specifically, the limitations read on a person listening to audio from a speaker/user, writing down what is said in increments, mentally determining whether what is said constitutes some form of inappropriate content based on what is covered in a guideline/handbook, and then making a determination/decision that a person is being subjected to inappropriate behavior or conduct.
This judicial exception is not integrated into a practical application. In particular, independent claims 1, 13 and 18 recite the additional elements of “processor”, and/or “memory and/or computer-readable storage media”, “trained text classification model”, “microphone” and “data store”. For example, [00145] of the as-filed specification describes using a general purpose computer. As such, a general purpose computer would contain a processor, memory and computer-readable storage media. Regarding providing an already trained text classification model, it can be considered a generic computer model. For example, [0030] of the as-filed specification recites “(e.g., a hierarchical attention network (HAN), a convolutional neural network (CNN), a machine learning model, a neural network, or any other type of artificial intelligence model)”. Regarding the microphone, the specification provides no specific details about it; hence a general or conventional microphone could be used. For example, [00121] of the as-filed specification states, “For example, the microphone that captures the audio may be the microphone of a user device operated by a passenger or the microphone of a user device operated by a driver in the vehicle.” Paragraph [0034] states, “For example, the user devices 102 can correspond to a computing device, such as a smart phone, tablet, laptop, smart watch, or any other device that can communicate over the network 110 with the server 130.” Regarding the data store, it can be considered data stored on a hard disk in the general purpose computer mentioned earlier. For example, Brown et al. (US Patent Application Publication No. 20020099730 A1), hereinafter Brown, states in para [0053], “The various modules described with reference to FIG. 1 can be implemented as routines in software and the data stores can comprise conventional storage media such as a hard disk, floppy disk, or CD ROM.” Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Thus, the claims are directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of using a computer amounts to no more than a general purpose computer. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Similarly, the additional element of using a trained text classifier or generic computer model does not provide an inventive concept. Likewise, the same applies to the microphone, whether it is located in a car and/or in a cellphone as described. Finally, the additional element of a data store, which equates to saving files/data to a hard disk, also does not provide an inventive concept. Further, the additional limitations in the claims noted above are directed towards insignificant extra-solution activity. Thus, the claims are not patent eligible.
With respect to claims 2 and 19, the claims relate to providing the audio segment to an emotion detector to obtain a detected emotion of a speaking user that made an utterance included in the audio segment; and determining based at least in part on the inappropriate behavior prediction and the detected emotion that a user is being subjected to inappropriate behavior by another user. This reads on a human listening to audio, trying to understand and sense the emotion of the speaker who made the utterance, and making a determination or judgment call, based on what is heard and sensed in the audio, that the speaker is being subjected to inappropriate behavior. No additional limitations are present.
With respect to claims 3 and 15, the claims relate to wherein the inappropriate behavior comprises a request for contact information of the user. This reads on a human hearing one person asking for the contact information of another in an inappropriate or unwarranted manner or situation. No additional limitations are present.
Regarding claim 10, the claim relates to causing a countermeasure to be initiated in response to the determination that the user is being subjected to the inappropriate behavior by another user. This reads on a human calling 911, the police, or the rideshare company to get help when abuse or unlawful activity is determined in the human’s mind. No additional limitations are present.
Regarding claims 11 and 12, the claims relate to wherein a user device operated by a passenger/driver in the vehicle comprises the microphone. This reads on a rider/driver using or having access to a smart phone while in a vehicle. No other limitations are present.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 4, 10-14, 16, 18 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Hodge et al. (US Patent Application Publication No: US 20200349666 A1) hereinafter as Hodge, in view of Xu et al. (CN 109256150 A) with reference to English machine translation provided, hereinafter as Xu, and further in view of Li et al. (US Patent No: US 11354900 B1) hereinafter as Li.

Regarding claim 1, Hodge discloses: A computer-implemented method for detecting a request for contact information in a vehicle, the computer-implemented method comprising: receiving an audio segment comprising a portion of audio captured by a microphone located within the vehicle ([0069] In addition, user input may be received through one or more microphones 212. In one embodiment, microphone 212 is a digital microphone connected to audio module 206 to receive user spoken input, such as user instructions or commands. Microphone 212 may also be used for other functions, such as user communications, audio component of video recordings, or the like.);
converting the audio segment to a text segment ([0080]  The user's utterance is processed by a speech-to-text algorithm and the resulting text is stored as metadata associated with the video clip.);
Hodge does not explicitly disclose, but Xu discloses: audio segment ([pg. 2, 5th para] cut-off sentence module, for receiving the recording data transmitted from the recording module, the recording data is cut into sections according to relevant characteristic of phonetic;)
Hodge and Xu are considered analogous art because both are in the related art of speech recognition. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Hodge with the teaching of Xu to incorporate providing an audio segment. Combining the disclosures would improve the accuracy and stability of the prediction process, and can be used to predict characteristics, as suggested by Xu (pg. 3, 4th para).
Hodge in view of Xu does not explicitly disclose, but Li discloses: providing at least the text segment to a trained text classification model to obtain an inappropriate behavior prediction; and determining that a user is being subjected to inappropriate behavior by another user in the vehicle based at least in part on the inappropriate behavior prediction ([col. 20, lines 62-67 to col. 21, lines 1-4] The detected words 220 may be input into a text classifier 228 trained to detect and/or classify one or more text features 230 in the detected words 220. For instance, the text classifier 228 may be configured to ... utterances included in the audio signal of the item of content 202 by analyzing the transcription 226 of the utterances, ... In some cases, the text classifier 228 may determine a likelihood of offensiveness associated with the semantic meaning of the detected words 220,... [col. 21, lines 10-25] In some examples, the text classifier 228 outputs ... and classified based on a likelihood that the detected words 220 correspond to a text feature type. ... output likelihoods for the different text features 230 on a scale of 0 to 1, ... scale may be used (e.g., 0 to 10, 0 to 100, etc.). The text features 230 indicate that the item of content 202 likely does not include bullying (e.g., likelihood of 0.2) but is relatively likely to include hate speech (e.g., likelihood 0.7), spam (e.g., likelihood of 0.7), and sexual references (e.g., likelihood 0.7). ... the text classifier 228 may be trained to detect and/or classify any number of text feature types.).
Hodge, Xu and Li are considered analogous art because they are all in the related art of speech recognition. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Hodge, in view of Xu, with the teaching of Li to incorporate providing a text segment to a trained text classifier model to obtain an inappropriate behavior prediction, and determining that a user is being subjected to inappropriate behavior. The combination would have been obvious because it may provide accurate and reliable content classification in identifying offensive content before the offensive content is shared with other users, as suggested by Li (col. 3, lines 13-15).
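
For illustration only, the sequence of limitations mapped above (audio segment, text segment, classifier prediction, determination) can be sketched as follows. This is a minimal sketch and is not taken from the application or from Hodge, Xu, or Li: the names `AudioSegment`, `transcribe`, `classify_text`, and `determine_inappropriate`, the phrase list, and the 0.5 threshold are all hypothetical, and a simple phrase scorer stands in for a trained text classification model. The 0-to-1 likelihood scale mirrors the scale quoted from Li above.

```python
from dataclasses import dataclass


@dataclass
class AudioSegment:
    # Assumption: the segment carries its own transcript so the sketch runs
    # without a real speech-to-text engine.
    transcript_hint: str


def transcribe(segment: AudioSegment) -> str:
    """Stub speech-to-text step: returns the lower-cased text of the segment."""
    return segment.transcript_hint.lower()


# Hypothetical phrases; a trained classifier would replace this phrase scorer.
FLAGGED_PHRASES = ("phone number", "where do you live", "your address")


def classify_text(text: str) -> float:
    """Return a likelihood in [0, 1] that the text reflects inappropriate behavior."""
    hits = sum(phrase in text for phrase in FLAGGED_PHRASES)
    return min(1.0, hits / 2)


def determine_inappropriate(segment: AudioSegment, threshold: float = 0.5) -> bool:
    """Make the determination based at least in part on the prediction."""
    return classify_text(transcribe(segment)) >= threshold
```

Under these assumptions, `determine_inappropriate(AudioSegment("Can I get your phone number?"))` evaluates to `True`, while a segment containing only driving directions evaluates to `False`.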

Regarding claim 2, Hodge in view of Xu, and further in view of Li discloses: The computer-implemented method of Claim 1, 
Hodge further discloses: and determining based at least in part on the inappropriate behavior prediction ([0104] According to another embodiment, client device 101 records video and audio of the driver and how he or she interacts with the passenger during a ride. Client device 101 continuously monitors the driver (as well as the passenger, as described above) for any uncomfortable actions and conversations towards the passenger, including any threats or sexual comments, swearing, smoking, or similar actions. Audio and image recognition algorithms continuously analyze the audio and video from he cabin-facing camera as further described above. If any recognizable events occur, client device can announce to the driver that the inappropriate behavior is being recorded and stored in the cloud server and cannot be erased . . . and that such behavior should stop.) 
Xu further discloses: further comprising: providing the audio segment to an emotion detector to obtain a detected emotion of a speaking user that made an utterance included in the audio segment ([pg. 2, 4th-7th para] A voice emotion identification system based on machine learning, comprising a recording module, sentence breaking module, speaker recognition module, a characteristic extracting module and emotion identification module, wherein, recording module for obtaining the recording data, …; cut-off sentence module, for receiving the recording data transmitted from the recording module, the recording data is cut into sections according to relevant characteristic of phonetic; speaker recognition module for receiving the cut-off sentence transmitted from the module segments using a machine learning algorithm classifies the segment, and identifying the speaker according to the classification; a characteristic extracting module for receiving a segment transmitted from the cut-off sentence module, … emotional recognition module, feature extraction module for receiving the generated segment features through machine learning algorithm training the sentiment prediction model, and using integrated algorithm to integrate the prediction result of each model.);
Li further discloses: and the detected emotion that a user is being subjected to inappropriate behavior by another user ([col. 21, lines 64-67, col. 22, line 1] As discussed above, the meta-classifier 236 may output one or more content classifications 238 based at least in part on the scores associated with the likelihoods included in the acoustic events 210, the visual features 214, and/or the text features 230, along with the user features 234. Also see Fig. 2, the path from detected words to content classifications.). [detected emotion is covered by the Xu reference, please see above]
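
For illustration only, a determination based at least in part on both the inappropriate behavior prediction and the detected emotion can be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of the application or of any cited reference: the function `combine_prediction_and_emotion`, the emotion labels, the 0.2 boost, and the 0.6 threshold are all hypothetical.

```python
# Hypothetical emotion labels treated as supporting an inappropriate-behavior finding.
NEGATIVE_EMOTIONS = {"fear", "anger", "distress"}


def combine_prediction_and_emotion(
    prediction: float,
    emotion: str,
    threshold: float = 0.6,  # assumption: illustrative decision threshold
    boost: float = 0.2,      # assumption: illustrative weight for a negative emotion
) -> bool:
    """Decide using both the classifier likelihood (0 to 1) and the detected emotion."""
    score = prediction + (boost if emotion in NEGATIVE_EMOTIONS else 0.0)
    return min(score, 1.0) >= threshold
```

With these hypothetical values, a borderline prediction of 0.5 leads to a positive determination only when the detected emotion is one of the negative labels.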

Regarding claim 4, Hodge in view of Xu, further in view of Li discloses: The computer-implemented method of Claim 1,
Li additionally discloses: wherein the trained text classification model comprises one of a trained hierarchical attention network (HAN) or a trained convolutional neural network (CNN) model ([col. 11, lines 23-46] Although specific machine-learned models are described above, other types of machine-learned models can additionally or alternatively be used. For example, machine learning algorithms can include, but are not limited to, …  Convolutional Neural Network (CNN), ...).

Regarding claim 10, Hodge in view of Xu, further in view of Li discloses: The computer-implemented method of Claim 1,
Hodge additionally discloses: further comprising causing a countermeasure to be initiated in response to the determination that the user is being subjected to the inappropriate behavior by another user ([0104] Audio and image recognition algorithms continuously analyze the audio and video from he cabin-facing camera as further described above. If any recognizable events occur, client device can announce to the driver that the inappropriate behavior is being recorded and stored in the cloud server and cannot be erased . . . and that such behavior should stop. Client device 101 can summon a police officer if the inappropriate behavior continues, or distressed responses and telling gestures from the passenger are detected.).

Regarding claim 11, Hodge in view of Xu, further in view of Li discloses: The computer-implemented method of Claim 1,
Hodge additionally discloses: wherein a user device operated by a passenger in the vehicle comprises the microphone ([0037] “According to another embodiment, a passenger can connect his or her smartphone wirelessly to a vehicle-mounted client device during a ride for live-streaming video.” Also see Fig. 8, a user holding a cell phone in the vehicle.).  [Microphone is an inherent feature of the cell phone.]

Regarding claim 12, Hodge in view of Xu, further in view of Li discloses: The computer-implemented method of Claim 1,
Hodge additionally discloses: wherein a user device operated by a driver of the vehicle comprises the microphone ([0069] In one embodiment, client device 101 also includes a touchscreen 211. In alternative embodiments, other user input devices (not shown) may be used, such a keyboard, mouse, stylus, or the like. Touchscreen 211 may be a capacitive touch array controlled by touchscreen module 208 to receive touch input from a user. Other touchscreen technology may be used in alternative embodiments of touchscreen 211, such as for example, force sensing touch screens, resistive touchscreens, electric-field tomography touch sensors, radio-frequency (RF) touch sensors, or the like. In addition, user input may be received through one or more microphones 212.).

Regarding claim 13, Hodge discloses: A system comprising: a processor in communication with the data store, the processor configured with computer-executable instructions that, when executed, cause the processor to: (See Fig. 2, which shows a computer system comprising a hardware processor, memory, etc.)
obtain an audio segment comprising a portion of audio captured by a microphone located within a vehicle ([0069] In addition, user input may be received through one or more microphones 212. In one embodiment, microphone 212 is a digital microphone connected to audio module 206 to receive user spoken input, such as user instructions or commands. Microphone 212 may also be used for other functions, such as user communications, audio component of video recordings, or the like.);
convert the audio segment to a text segment from the data store ([0080] The user's utterance is processed by a speech-to-text algorithm and the resulting text is stored as metadata associated with the video clip.) [data store is discussed in the Li reference; see below];
Hodge does not explicitly disclose, but Xu discloses: audio segment ([pg. 2, 5th para] cut-off sentence module, for receiving the recording data transmitted from the recording module, the recording data is cut into sections according to relevant characteristic of phonetic;)
Hodge and Xu are considered analogous art because both are in the related art of speech recognition. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Hodge with the teaching of Xu to incorporate providing an audio segment. Combining the disclosures would improve the accuracy and stability of the prediction process, and can be used to predict characteristics, as suggested by Xu (pg. 3, 4th para).
Hodge in view of Xu does not explicitly disclose, but Li discloses: a data store comprising a trained text classification model ([col. 13, lines 53-58] “In particular examples, one or more servers of the social networking system 106 may be authorization/privacy servers for enforcing privacy settings. In response to a request from the user 102(1) (or other entity) for a particular object stored in a data store, the social networking system 106 may send a request to the data store for the object.” [col. 20, lines 62-67 to col. 21, lines 1-4] “The detected words 220 may be input into a text classifier 228 trained to detect and/or classify one or more text features 230 in the detected words 220. For instance, the text classifier 228 may be configured to ... utterances included in the audio signal of the item of content 202 by analyzing the transcription 226 of the utterances, ... In some cases, the text classifier 228 may determine a likelihood of offensiveness associated with the semantic meaning of the detected words 220, ...” Also see Fig. 2, which displays a text classifier (228));
retrieve the trained text classification mode ([col. 22, lines 14-16] In some examples, the content classification(s) 238 may be used by the social networking system to output search results when a user searches for specific content.);
provide at least the text segment to the trained text classification model to obtain an inappropriate behavior prediction; and determine that a user is being subjected to inappropriate behavior by another user in the vehicle based at least in part on the inappropriate behavior prediction ([col. 20, lines 62-67- col 21, lines 1-4] “The detected words 220 may be input into a text classifier 228 trained to detect and/or classify one or more text features 230 in the detected words 220. For instance, the text classifier 228 may be configured to ... utterances included in the audio signal of the item of content 202 by analyzing the transcription 226 of the utterances, ... In some cases, the text classifier 228 may determine a likelihood of offensiveness associated with the semantic meaning of the detected words 220, ...”   [col. 21, lines 10-25] “In some examples, the text classifier 228 outputs ... and classified based on a likelihood that the detected words 220 correspond to a text feature type. ... output likelihoods for the different text features 230 on a scale of 0 to 1, ... scale may be used (e.g., 0 to 10, 0 to 100, etc.). The text features 230 indicate that the item of content 202 likely does not include bullying (e.g., likelihood of 0.2) but is relatively likely to include hate speech (e.g., likelihood 0.7), spam (e.g., likelihood of 0.7), and sexual references (e.g., likelihood 0.7). ... the text classifier 228 may be trained to detect and/or classify any number of text feature types.”).
Hodge, Xu and Li are considered analogous art because they are all in the related art of speech recognition. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Hodge, in view of Xu, with the teaching of Li to incorporate providing a text segment to a trained text classifier model to obtain an inappropriate behavior prediction, and determining that a user is being subjected to inappropriate behavior. The combination would have been obvious because it may provide accurate and reliable content classification in identifying offensive content before the offensive content is shared with other users, as suggested by Li (col. 3, lines 13-15).

Regarding claim 14, the claim recites the elements of computer-implemented method claim 2 as a system. Thus, the analysis in rejecting claim 2 is equally applicable to claim 14.
Regarding claim 16, the claim recites the elements of computer-implemented method claim 4 as a system. Thus, the analysis in rejecting claim 4 is equally applicable to claim 16.

Regarding claim 18, Hodge discloses: 18. Non-transitory, computer-readable storage media comprising computer executable instructions for detecting a request for contact information in a vehicle, wherein the computer-executable instructions, when executed by a computing system, cause the computing system to: ([0061] Reference will now be made in detail to several embodiments, examples of which are illustrated in the accompanying figures. The above and other needs are met by the disclosed methods, a non-transitory computer-readable storage medium storing executable code, and systems for streaming and playing back immersive video content.)
obtain an audio segment comprising a portion of audio captured by a microphone located within the vehicle ([0069] In addition, user input may be received through one or more microphones 212. In one embodiment, microphone 212 is a digital microphone connected to audio module 206 to receive user spoken input, such as user instructions or commands. Microphone 212 may also be used for other functions, such as user communications, audio component of video recordings, or the like.);
converting the audio segment to a text segment from the data store ([0080]  The user's utterance is processed by a speech-to-text algorithm and the resulting text is stored as metadata associated with the video clip.) [data store is discussed in the Li reference below];
Hodge does not explicitly, but Xu discloses: audio segment ([pg. 2, 5th para] cut-off sentence module, for receiving the recording data transmitted from the recording module, the recording data is cut into sections according to relevant characteristic of phonetic;)
Hodge and Xu are considered analogous art because they are both in the related art of speech recognition. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Hodge with the teaching of Xu, to incorporate providing an audio segment. One would have been motivated to combine the disclosures because doing so improves the accuracy and stability of the prediction process and can be used to predict characteristics, as suggested by Xu (pg. 3, 4th para).
Hodge in view of Xu does not explicitly, but Li discloses: data store ([col. 13, lines 53-58] In particular examples, one or more servers of the social networking system 106 may be authorization/privacy servers for enforcing privacy settings. In response to a request from the user 102(1) (or other entity) for a particular object stored in a data store, the social networking system 106 may send a request to the data store for the object.)
provide at least the text segment to a trained text classification model to obtain an inappropriate behavior prediction; and determining that a user is being subjected to inappropriate behavior by another user in the vehicle based at least in part on the inappropriate behavior prediction ([col. 20, lines 62-67- col 21, lines 1-4] The detected words 220 may be input into a text classifier 228 trained to detect and/or classify one or more text features 230 in the detected words 220. For instance, the text classifier 228 may be configured to ... utterances included in the audio signal of the item of content 202 by analyzing the transcription 226 of the utterances, ... In some cases, the text classifier 228 may determine a likelihood of offensiveness associated with the semantic meaning of the detected words 220,...   [col. 21, lines 10-25] In some examples, the text classifier 228 outputs ... and classified based on a likelihood that the detected words 220 correspond to a text feature type. ... output likelihoods for the different text features 230 on a scale of 0 to 1, ... scale may be used (e.g., 0 to 10, 0 to 100, etc.). The text features 230 indicate that the item of content 202 likely does not include bullying (e.g., likelihood of 0.2) but is relatively likely to include hate speech (e.g., likelihood 0.7), spam (e.g., likelihood of 0.7), and sexual references (e.g., likelihood 0.7). ... the text classifier 228 may be trained to detect and/or classify any number of text feature types.).
Hodge, Xu and Li are considered analogous art because they are all in the related art of speech recognition. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Hodge, in view of Xu, with the teaching of Li, to incorporate providing the text segment to a trained text classification model to obtain an inappropriate behavior prediction, and determining that a user is being subjected to inappropriate behavior. One would have been motivated to combine the disclosures because doing so may provide accurate and reliable content classification in identifying offensive content before the offensive content is shared with other users, as suggested by Li (col. 3, lines 13-15).

Regarding claim 19, the claim recites the elements of computer-implemented method claim 2 as non-transitory, computer-readable storage media. Thus, the analysis in rejecting claim 2 is equally applicable to claim 19.

Claims 3 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Hodge in view of Xu, further in view of Li, and furthermore in view of Kotake et al. (US Patent Application Publication No: US 20190376803 A1), hereinafter Kotake.

Regarding claim 3, Hodge in view of Xu, further in view of Li discloses: The computer-implemented method of Claim 1,
Hodge in view of Xu, further in view of Li does not explicitly, but Kotake discloses: wherein the inappropriate behavior comprises a request for contact information of the user ([0007] receiving first information which is information requesting provision of contact information indicating a contact address of a second user, transmitted from a first user terminal which is a terminal used by a first user;).
Hodge, Xu, Li and Kotake are considered analogous art because they are all in the related art of speech recognition. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Hodge, in view of Xu, and further in view of Li, with the teaching of Kotake, to incorporate a request for contact information of the user. One would have been motivated to combine the disclosures because doing so may enable the two parties to coordinate and exchange information, as suggested by Kotake (Summary).

Regarding claim 15, the claim recites the elements of computer-implemented method claim 3 as a system. Thus, the analysis in rejecting claim 3 is equally applicable to claim 15.

Claims 5, 6, 17 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Hodge in view of Xu, further in view of Li, and furthermore in view of Epstein et al. (US Patent Application Publication No: US 20150309987 A1), hereinafter Epstein.

Regarding claim 5, Hodge in view of Xu, further in view of Li discloses: The computer-implemented method of Claim 1,
Hodge further discloses: receiving a second audio segment comprising a portion of second audio associated with a ride-share event ([0069] In addition, user input may be received through one or more microphones 212. In one embodiment, microphone 212 is a digital microphone connected to audio module 206 to receive user spoken input, such as user instructions or commands. Microphone 212 may also be used for other functions, such as user communications, audio component of video recordings, or the like.);  [audio segment already covered with the Xu reference in claim 1]
converting the second audio segment to a second text segment ([0080]  The user's utterance is processed by a speech-to-text algorithm and the resulting text is stored as metadata associated with the video clip.);
Hodge in view of Xu, further in view of Li does not explicitly, but Epstein discloses: obtaining one or more patterns associated with inappropriate behavior detection; determining that the second text segment matches at least one of the one or more patterns ([0057] Generally, the process 200 can train the classifier by analyzing patterns in various non-context context information associated with text samples in the training set to determine which pieces of information tend to be associated with text samples that are labeled as being offensive and which pieces of information tend to be associated with text samples that are labeled as being non-offensive.);
labeling the second text segment as corresponding to inappropriate behavior ([0057] Generally, the process 200 can train the classifier by analyzing patterns in various non-context context information associated with text samples in the training set to determine which pieces of information tend to be associated with text samples that are labeled as being offensive and which pieces of information tend to be associated with text samples that are labeled as being non-offensive.);
pre-training a text classification model using at least in part the labeled second text segment ([0043] In some implementations, the first set of text samples and their respective labels can be used as a starting set to initially train the classifier. The first set of text samples may be used by a training engine to determine initial probabilities for particular signals that indicate whether a potentially offensive term in a given text sample is or is not offensive in that text sample.);
obtaining manually labeled data associated with inappropriate behavior detection ([0025] Some implementations of the techniques described herein may achieve one or more of the following advantages. A classifier that labels text samples having one or more potentially offensive terms can be trained with a relatively small number of pre-labeled text samples. In some implementations where the pre-labeled text samples have been manually evaluated and labeled by users, the training techniques described in this paper can be used to train a highly accurate offensive words classifier with a minimal number of manually labeled text samples.);
and training the pre-trained text classification model using at least in part the manually labeled data to form the trained text classification model ([0046] The initial training iteration may be limited by the size of the first set of text samples in some implementations. For example, the first set of text samples may be manually labeled by human users. Manual labeling of the first set of text samples may allow users to train the classifier initially based on labels that were determined based on sophisticated reasoning rooted in human judgment and experience. In some implementations, supervised machine learning techniques using the manually labeled first set of text samples may be used to initially train the offensive words classifier.).
Hodge, Xu, Li and Epstein are considered analogous art because they are all in the related art of speech recognition. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Hodge, in view of Xu, and further in view of Li, with the teaching of Epstein, to incorporate obtaining patterns, manually labeling text segments, pre-training a text classification model, and training the text classification model to detect inappropriate behavior. One would have been motivated to combine the disclosures because doing so may improve the relevancy and accuracy of the text classification model, as suggested by Epstein (Summary).

Regarding claim 6, Hodge in view of Xu, further in view of Li, and furthermore in view of Epstein discloses: The computer-implemented method of Claim 5,
Epstein additionally discloses: wherein the one or more patterns each comprise one or more rules that, if satisfied, indicate that inappropriate behavior has occurred ([0046] At stage 208, an offensive words classifier is trained using the labeled first set of text samples. Stage 208 can be the first of multiple training iterations in training the classifier. In this first iteration, initial rules and signals may be determined so as to configure the classifier to be able to recognize one or more signals (or features) associated with a text sample and to generate an offensiveness label for the text sample.).

Regarding claim 17, the claim recites the elements of computer-implemented method claim 5 as a system. Thus, the analysis in rejecting claim 5 is equally applicable to claim 17.
Regarding claim 20, the claim recites the elements of computer-implemented method claim 5 as non-transitory, computer-readable storage media. Thus, the analysis in rejecting claim 5 is equally applicable to claim 20.

Claims 7 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Hodge in view of Xu, further in view of Li, and furthermore in view of Penilla et al. (US Patent No: US 10453453 B2), hereinafter Penilla.

Regarding claim 7, Hodge in view of Xu, further in view of Li, discloses: The computer-implemented method of Claim 1,
Hodge in view of Xu, further in view of Li does not explicitly, but Penilla discloses: further comprising filtering noise from the audio segment prior to converting the audio segment to the text segment ([col. 4, lines 56-60] Optionally, the captured audio sample can be processed to remove noise, such as ambient noise, voice noise of other passengers, music playing in the vehicle, tapping noises, road noise, wind noise, etc. The audio sample is then processed to produce an audio signature.).
Hodge, Xu, Li and Penilla are considered analogous art because they are all in the related art of speech recognition. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Hodge, in view of Xu, and further in view of Li, with the teaching of Penilla, to incorporate an audio filter to filter the background noise. One would have been motivated to combine the disclosures because removing the ambient noise would make the audio samples easier to analyze, as suggested by Penilla (Summary).

Regarding claim 8, Hodge in view of Xu, further in view of Li, and furthermore in view of Penilla, discloses: The computer-implemented method of Claim 7,
Penilla further discloses: wherein filtering noise from the audio segment further comprises filtering, from the audio segment, at least one of a non- utterance, audio related to a navigation system, or audio uttered by a user other than a user present inside the vehicle ([col. 4, lines 56-60] Optionally, the captured audio sample can be processed to remove noise, such as ambient noise, voice noise of other passengers, music playing in the vehicle, tapping noises, road noise, wind noise, etc. The audio sample is then processed to produce an audio signature.).

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Hodge in view of Xu, further in view of Li, furthermore in view of Penilla, and furthermore in view of Rose (DE 10311587 A1), with reference to the English machine translation provided, hereinafter Rose.

Regarding claim 9, Hodge in view of Xu, further in view of Li, furthermore in view of Penilla does not explicitly, but Rose discloses: wherein filtering noise from the audio segment further comprises filtering, from the audio segment, audio associated with spoken directions based on a known output from a navigation application ([pg. 2 last para to pg.3 first para] Now there is the interior of a moving automobile a variety of noise sources, which makes the detection of the language of vehicle occupants very difficult. Many of these noise sources are vehicle-specific and in particular usage-dependent.  By exploiting vehicle-specific information, background noise can be better filtered out. This vehicle-specific information is for example the knowledge about The type of vehicle, ie the interior acoustics of the motor vehicle, - the status of the engine whose noise is dependent on engine speed, gear, etc. The status of the navigation device, ie a navigation announcement is currently being made, which may be formed from stored announcement texts or a synthetic speech taking into account the influence of the position of the microphone, ...).
Hodge, Xu, Li, Penilla and Rose are considered analogous art because they are all in the related art of speech recognition. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Hodge, in view of Xu, further in view of Li, and furthermore in view of Penilla, with the teaching of Rose, to incorporate an audio filter to filter the background noise, including the spoken navigation directions. One would have been motivated to combine the disclosures because removing background noise will enable higher voice quality and improved speech recognition, as suggested by Rose (pg. 2, 2nd para).



Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Chen et al. (US Patent Application Publication No: US 20210191398 A1), hereinafter Chen. Chen discloses a method and system to protect drivers and passengers during rideshare. “One example aspect of the present disclosure is directed to a computer-implemented method. The method can include obtaining sensor data associated with an interior of an autonomous vehicle. The method can include determining using the sensor data that the interior of the autonomous vehicle contains one or more passengers. The method can include, in response to determining that the interior of the autonomous vehicle contains one or more passengers, analyzing the sensor data to determine whether the one or more passengers are violating one or more passenger policies. The method can include, in response to determining that the one or more passengers are violating one or more passenger policies, automatically initiating a remote assistance session with a remote operator.” (Chen, [0005]). See also paragraphs [0018]-[0023], [0040], and [0118], and Figs. 1, 4 and 7 for more details.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Phillip H Lam whose telephone number is (571)272-1721. The examiner can normally be reached 10 AM-6 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at (571) 272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/PHILIP H LAM/ Examiner, Art Unit 2656
/EDGAR X GUERRA-ERAZO/ Primary Examiner, Art Unit 2656