DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 07/25/2022 has been entered.

Response to Arguments
Regarding the remarks filed 07/25/2022, applicant’s arguments with respect to claims 1-5, 7-20, 22-35 and 37-45 have been considered but are moot because of the new ground of rejection in view of Bilac and Wexler for claims 1, 2, 4, 5, 7-9, 11, 12, 14, 16, 17, 19, 20, 22-24, 26, 27, 29, 31, 32, 34, 35, 37-39, 41, 42, 44 and 46-48; and Bilac, Wexler and Kang for claims 13, 28 and 43; and Bilac, Wexler and Baker for claims 6, 21 and 36.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 4, 5, 7-9, 11, 12, 14, 16, 17, 19, 20, 22-24, 26, 27, 29, 31, 32, 34, 35, 37-39, 41, 42, 44 and 46-48 are rejected under 35 U.S.C. 103 as being unpatentable over Bilac (US PG Pub 20210056966) in view of Wexler (US PG Pub 20200296521).

As per claims 1, 16 and 31, Bilac discloses a method, system and non-transitory computer-readable medium storing one or more instructions, which, when executed by one or more processors of an electronic device (Bilac; p. 0099 - The logic device 601 may include one or more processors configured to execute software instructions), cause the device to perform a method comprising:
receiving, via a microphone, an audio signal, wherein the audio signal comprises voice activity (Bilac; p. 0033 - The robot 122 is provided with a microphone 124, by means of which the utterance may be captured and rendered in a processable form);
classifying, via a first classifier, audio data corresponding to the audio signal (Bilac; p. 0038 - classifying user's intention to relinquish the conversational floor);
determining whether the audio signal comprises a pause in the voice activity (Bilac; p. 0034 - An energy intensity threshold may also be defined, where sound input levels below this threshold are considered to belong to a pause period);
responsive to determining that the audio signal comprises a pause in the voice activity, based on the classifying of the audio data, determining whether the pause in the voice activity corresponds to an end point of the voice activity (Bilac; p. 0035 - the utterance ends with a period of silence 130, which enables the processor 121 to identify the termination of the utterance; p. 0038 - classifying user's intention to relinquish the conversational floor); and
responsive to determining that the pause in the voice activity corresponds to an end point of the voice activity, presenting a response to a user based on the voice activity (Bilac; p. 0069 - If it is determined at step 320 that the first intention indicator and the second intention indicator taken together are consistent with the human interlocutor ceding control of said dialog, the method proceeds to step 325 at which the material may be injected into the dialog).
Bilac, however, fails to disclose that the microphone is a microphone of a head-wearable device; receiving, via one or more sensors of the head-wearable device, non-verbal sensor data corresponding to a user of the head-wearable device; and wherein determining whether the pause in the voice activity corresponds to an end point of the voice activity comprises: determining a probability of interest based on the classifying of the audio data and based further on applying the non-verbal sensor data as input to a machine learning process, determining whether the probability of interest exceeds a threshold, in accordance with a determination that the probability of interest exceeds the threshold, determining that the pause in the voice activity corresponds to the end point of the voice activity, and in accordance with a determination that the probability of interest does not exceed the threshold, forgoing determining that the pause in the voice activity corresponds to the end point of the voice activity.
Wexler does teach that the microphone is a microphone of a head-wearable device (Wexler; p. 0117 - a user 100 wearing an apparatus 110 that is physically connected (or integral) to glasses 130, consistent with the disclosed embodiments; p. 0141 - apparatus 110, including image sensor 220 and one or more microphones 443); receiving, via one or more sensors of the head-wearable device, non-verbal sensor data corresponding to a user of the head-wearable device (Wexler; p. 0141 - apparatus 110, including image sensor 220 and one or more microphones 443); and wherein determining whether the pause in the voice activity corresponds to an end point of the voice activity comprises: determining a probability of interest based on the classifying of the audio data (Wexler; p. 0518 - Database access module 4106 may use the determined audioprints to obtain at least one indicator of a type associated with each of the plurality of sound-emanating objects. The at least one indicator may be indicative of a level of interest user 100 has with the type associated with each of the plurality of sound-emanating objects; p. 0547) and based further on applying the non-verbal sensor data as input to a machine learning process (Wexler; p. 0544 - hearing aid system 4400 may use image data captured by wearable camera 4402 during time period T to determine the importance of the audio signals. For example, hearing aid system 4400 may determine from the image data that user 100 is sitting in his office and use this information to identify the woman based on her voice as his supervisor; p. 0578 - Accordingly, identifying the sound emanating object may be based on an output of a trained neural network associated with database 4705. The trained neural network may be continuously improved as hearing aid system 4700 continues to identify objects. For example, user 100 may confirm or manually edit the identity of objects identified by processor 4703 and the neural network may be adjusted or further developed based on the feedback from user 100), determining whether the probability of interest exceeds a threshold (Wexler; p. 0551 - the hearing aid system may determine if the importance level is greater than a threshold based on retrieved information), in accordance with a determination that the probability of interest exceeds the threshold, determining that the pause in the voice activity corresponds to the end point of the voice activity, and in accordance with a determination that the probability of interest does not exceed the threshold, forgoing determining that the pause in the voice activity corresponds to the end point of the voice activity (Wexler; p. 0485 - In some embodiments, when the first individual pauses for up to a predetermined length, processor 3803 may determine it as an end of a speech by the first individual and attempt to detect speech of other individuals).
Therefore, it would have been obvious to one of ordinary skill in the art to modify the method, system and non-transitory computer-readable medium of Bilac to include that the microphone is a microphone of a head-wearable device; receiving, via one or more sensors of the head-wearable device, non-verbal sensor data corresponding to a user of the head-wearable device; and wherein determining whether the pause in the voice activity corresponds to an end point of the voice activity comprises: determining a probability of interest based on the classifying of the audio data and based further on applying the non-verbal sensor data as input to a machine learning process, determining whether the probability of interest exceeds a threshold, in accordance with a determination that the probability of interest exceeds the threshold, determining that the pause in the voice activity corresponds to the end point of the voice activity, and in accordance with a determination that the probability of interest does not exceed the threshold, forgoing determining that the pause in the voice activity corresponds to the end point of the voice activity, as taught by Wexler, in order to provide useful information to users of wearable apparatuses by leveraging information gathered by the apparatuses such as images and audio (Wexler; p. 0509).
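
For illustration only, the threshold determination recited in claims 1, 16 and 31 can be sketched in Python as follows. This is a minimal sketch and is not drawn from Bilac, Wexler or the application: the names `decide_end_point`, `audio_score`, `sensor_score` and `audio_weight` are hypothetical, `sensor_score` stands in for the output of whatever machine learning process receives the non-verbal sensor data, and the weighted average used to combine the two scores is an assumption chosen only for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EndpointDecision:
    is_end_point: bool
    probability: Optional[float]  # None when no pause was detected


def decide_end_point(
    pause_detected: bool,
    audio_score: float,    # hypothetical classifier output in [0, 1]
    sensor_score: float,   # hypothetical ML output for non-verbal data in [0, 1]
    threshold: float = 0.5,
    audio_weight: float = 0.5,
) -> EndpointDecision:
    """Decide whether a detected pause is treated as an end point."""
    # Without a pause, the end-point determination is never reached.
    if not pause_detected:
        return EndpointDecision(False, None)

    # Assumed combination rule: a weighted average of the two scores.
    probability = audio_weight * audio_score + (1.0 - audio_weight) * sensor_score

    # Exceeds the threshold: the pause is an end point.
    # Otherwise: forgo that determination and keep listening.
    return EndpointDecision(probability > threshold, probability)


if __name__ == "__main__":
    print(decide_end_point(True, 0.9, 0.7, threshold=0.6))
    print(decide_end_point(True, 0.2, 0.4, threshold=0.6))
```

In this sketch a response would be presented only when `is_end_point` is true; in every other case the caller continues to monitor the audio signal.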

As per claims 2, 17 and 32, Bilac in view of Wexler disclose the method, system and non-transitory computer-readable medium of claims 1, 16 and 31 further comprising: responsive to determining that the pause in the voice activity does not correspond to an end point of the voice activity, forgoing presenting a response to a user based on the voice activity (Bilac; p. 0069 - If it is determined that the first intention indicator and the second intention indicator are not together consistent with the human interlocutor ceding control of the dialog, the method reverts to step 305 of detecting the termination of an utterance from the human interlocutor, which in the present embodiment is reached via the step 305).

As per claims 3, 18 and 33, Bilac in view of Wexler disclose the method, system and non-transitory computer-readable medium of claims 1, 16 and 31 further comprising: responsive to determining that the audio signal does not comprise a pause in the voice activity, forgoing determining whether the pause in the voice activity corresponds to an end point of the voice activity (Bilac; p. 0033 - An utterance is determined to terminate only in a case where the duration of a pause in the utterance is detected to have exceeded a predetermined threshold duration).

As per claims 4, 19 and 34, Bilac in view of Wexler disclose the method, system and non-transitory computer-readable medium of claims 1, 16 and 31, wherein determining whether the audio signal comprises a pause in the voice activity comprises determining whether an amplitude of the audio signal falls below a threshold for a predetermined period of time (Bilac; p. 0034 - An energy intensity threshold may also be defined, where sound input levels below this threshold are considered to belong to a pause period; also see p. 0040 - a time window 131 of a predetermined duration at the end of an utterance (but during the utterance) may be assessed for the detection of such first intention indicators).
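
For illustration only, the pause test recited in claims 4, 19 and 34 (amplitude below a threshold for a predetermined period of time) can be sketched in Python as follows. This is a minimal sketch and is not drawn from Bilac or the application: the function name `has_pause`, the frame-based representation of the audio signal, and the parameter `min_frames` (the predetermined period expressed as a count of consecutive frames) are all hypothetical.

```python
from typing import Sequence


def has_pause(amplitudes: Sequence[float], threshold: float, min_frames: int) -> bool:
    """Return True if the amplitude magnitude stays below `threshold`
    for at least `min_frames` consecutive frames."""
    if min_frames < 1:
        raise ValueError("min_frames must be at least 1")

    quiet_run = 0  # length of the current run of below-threshold frames
    for amplitude in amplitudes:
        if abs(amplitude) < threshold:
            quiet_run += 1
            if quiet_run >= min_frames:
                return True
        else:
            quiet_run = 0  # any loud frame resets the run
    return False


if __name__ == "__main__":
    print(has_pause([0.5, 0.01, 0.02, 0.01, 0.6], threshold=0.05, min_frames=3))
    print(has_pause([0.5, 0.01, 0.6, 0.01, 0.02], threshold=0.05, min_frames=3))
```

The first call finds three consecutive quiet frames and reports a pause; the second does not, because a loud frame interrupts the quiet run before the required length is reached.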

As per claims 5, 20 and 35, Bilac in view of Wexler disclose the method, system and non-transitory computer-readable medium of claims 4, 19 and 34 further comprising: responsive to determining that the pause in the voice activity does not correspond to an end point of the voice activity, determining whether the audio signal comprises a second pause corresponding to the end point of the voice activity (Bilac; p. 0069 - If it is determined that the first intention indicator and the second intention indicator are not together consistent with the human interlocutor ceding control of the dialog, the method reverts to step 305 of detecting the termination of an utterance from the human interlocutor, which in the present embodiment is reached via the step 305; thus after the method reverts to step 305, a “second pause” would be detected in the first intention indicator).

As per claims 7, 22 and 37, Bilac in view of Wexler disclose the method, system and non-transitory computer-readable medium of claims 1, 16 and 31, wherein determining whether the audio signal comprises a pause in the voice activity comprises determining whether the audio signal comprises one or more verbal cues corresponding to an end point of the voice activity (Bilac; p. 0038 - The presentation of an utterance that is syntactically or conceptually complete may be taken as an indicator of the user's intention to relinquish the conversational floor. A given word or syllable may be pronounced more slowly at the end of a speaking turn).

As per claims 8, 23 and 38, Bilac in view of Wexler disclose the method, system and non-transitory computer-readable medium of claims 7, 22 and 37, wherein the one or more verbal cues comprise a characteristic of the user's prosody (Bilac; p. 0038 - The presentation of an utterance that is syntactically or conceptually complete may be taken as an indicator of the user's intention to relinquish the conversational floor. A given word or syllable may be pronounced more slowly at the end of a speaking turn).

As per claims 9, 24 and 39, Bilac in view of Wexler disclose the method, system and non-transitory computer-readable medium of claims 7, 22 and 37, wherein the one or more verbal cues comprise a terminating phrase (Bilac; p. 0039 - an analysis of filler sound from the human interlocutor, a detection of the pitch of sound from the human interlocutor, or a semantic component of the utterance).

As per claims 11, 26 and 41, Bilac in view of Wexler disclose the method, system and non-transitory computer-readable medium of claims 10, 25 and 40, wherein the non-verbal sensor data is indicative of the user's gaze (Bilac; p. 0045 - Gaze direction after the end of a speech utterance, may thus be considered to distinguish whether the interlocutor is trying to keep or relinquish the conversational floor).

As per claims 12, 27 and 42, Bilac in view of Wexler disclose the method, system and non-transitory computer-readable medium of claims 10, 25 and 40, wherein the non-verbal sensor data is indicative of the user's facial expression (Bilac; p. 0055 - Facial action units may comprise a component of the second intention indicator. For instance narrowing the eyes can be taken as a thinking behavior of the user which indicates he wants to keep the floor).

As per claims 14, 29 and 44, Bilac in view of Wexler disclose the method, system and non-transitory computer-readable medium of claims 1, 16 and 31, wherein determining whether the pause in the voice activity corresponds to an end point of the voice activity comprises identifying one or more interstitial sounds (Bilac; p. 0037 - while the utterance 111a may be determined to have terminated, it may also be determined that a first intention indicator in the form of filler speech occurred towards the end of the utterance).

As per claims 46-48, Bilac in view of Wexler disclose the method, system and non-transitory computer-readable medium of claims 1, 16 and 31, upon which claims 46-48 depend.
And further, Wexler discloses wherein the machine learning process comprises one or more of an artificial neural network, a genetic algorithm, a nearest neighbor interpolation, and a support vector machine (Wexler; p. 0578 - Accordingly, identifying the sound emanating object may be based on an output of a trained neural network associated with database 4705. The trained neural network may be continuously improved as hearing aid system 4700 continues to identify objects. For example, user 100 may confirm or manually edit the identity of objects identified by processor 4703 and the neural network may be adjusted or further developed based on the feedback from user 100).
Therefore, it would have been obvious to one of ordinary skill in the art to modify the method, system and non-transitory computer-readable medium of Bilac to include wherein the machine learning process comprises one or more of an artificial neural network, a genetic algorithm, a nearest neighbor interpolation, and a support vector machine, as taught by Wexler, in order to provide useful information to users of wearable apparatuses by leveraging information gathered by the apparatuses such as images and audio (Wexler; p. 0509).

Claims 13, 28 and 43 are rejected under 35 U.S.C. 103 as being unpatentable over Bilac in view of Wexler and further in view of Kang (US PG Pub 20200064921).

As per claims 13, 28 and 43, Bilac in view of Wexler disclose the method, system and non-transitory computer-readable medium of claims 10, 25 and 40, upon which claims 13, 28 and 43 depend.
And further, Kang discloses wherein the non-verbal sensor data is indicative of the user's heart rate (Kang; p. 0064 - The ECG signal is an electric signal generated when a heart contracts and expands and the most representative biological signal that may be easily and quickly measured on a body surface. The exercise of a heart is displayed as beats per minute (bpm) and the change of an autonomic nervous system may be known through the change heart rates. The ECG signal may be measured on user's face and the biological signal input unit 110 may receive an electric signal detected from an electrode attached at various parts as the EOG signal).
Therefore, it would have been obvious to one of ordinary skill in the art to modify the method, system and non-transitory computer-readable medium of Bilac and Wexler to include wherein the non-verbal sensor data is indicative of the user's heart rate, as taught by Kang, in order to improve user authentication using a voice signal of a user in noisy environments (Kang; p. 0006).

Claims 6, 21 and 36 are rejected under 35 U.S.C. 103 as being unpatentable over Bilac in view of Wexler and further in view of Baker (US PG Pub 20130204607).

As per claims 6, 21 and 36, Bilac in view of Wexler disclose the method, system and non-transitory computer-readable medium of claims 4, 19 and 34, upon which claims 6, 21 and 36 depend.
Bilac in view of Wexler, however, fail to disclose responsive to determining that the pause in the voice activity does not correspond to an end point of the voice activity, prompting the user to speak.
Baker does teach responsive to determining that the pause in the voice activity does not correspond to an end point of the voice activity, prompting the user to speak (Baker; p. 0096 - if no differential in volumes is apparent during initial sound volume comparisons, then a prompt may be used to elicit a response from the other party. This prompt may be used to then motivate the customer to provide a sample of volume differential).
Therefore, it would have been obvious to one of ordinary skill in the art to modify the method, system and non-transitory computer-readable medium of Bilac and Wexler to include, responsive to determining that the pause in the voice activity does not correspond to an end point of the voice activity, prompting the user to speak, as taught by Baker, in order to improve phone call efficiency by recognizing that a call has not been answered by a live person so that unanswered calls or recorded voices are not routed to agents for sales activity (Baker; p. 0003).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The prior art made of record and not relied upon includes:
Mistica (US PG Pub 20180358021) discloses that a biometric signal processor can derive contextual biometric information about the biometric signal based on stored information and received, user-provided information. The dialog system can be used to interface with a user to receive contextual data about the user's status or activities, and can be used to interact with the user to request information and provide the contextualized biometric information (Mistica; Abstract). 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Rodrigo A Chavez whose telephone number is (571)270-0139.  The examiner can normally be reached on Monday - Friday 9-6 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at (571)272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/RODRIGO A CHAVEZ/Examiner, Art Unit 2658
/RICHEMOND DORVIL/Supervisory Patent Examiner, Art Unit 2658