Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 06/08/2021 is considered by the examiner.
Drawings
The drawing submitted on 03/15/2021 is considered by the examiner.
Election/Restrictions
Restriction to one of the following inventions is required under 35 U.S.C. 121:
I. Claims 1-16, drawn to keyword recognition in a wearable device, classified in G10L21/0232.
II. Claims 17-20, drawn to training a keyword classifier in a wearable device, classified in G10L17/04.
During a telephone conversation with Mr. Kyle Rule on 11/29/2022, a provisional election was made with traverse to prosecute the invention of Group I, claims 1-16.  Affirmation of this election must be made by applicant in replying to this Office action.  Claims 17-20 are withdrawn from further consideration by the examiner under 37 CFR 1.142(b) as being drawn to a non-elected invention.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: “a classifier configured to” in claim 9; “an equalizer configured to process signals” in claim 13.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 13 recites the limitation "the second filtered signal" in 3.  There is insufficient antecedent basis for this limitation in the claim. For examination purposes, the examiner will interpret the limitation as “the second signal”.


Claim Rejections - 35 USC § 103
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-16 are rejected under 35 U.S.C. 103 as being unpatentable over Asada et al. (US 2018/0176681 A1) in view of Lesso (US 2019/0295554 A1).

Regarding Claim 1, Asada et al. teach:  A method for keyword recognition in a wearable device, the method comprising ([0018] According to the present technique, a microphone (the internal microphone) that collects emitted speech voice is located in a space that is substantially sealed off from outside and connects to an ear canal of the wearer (the speaker). As the microphone is located in a space sealed off from outside, influence of noise can be effectively reduced. As emitted speech voice that propagates through an ear canal of the wearer is collected, the emitted speech voice can be collected at a higher S/N ratio than that in a case where a conventional earpiece microphone (FIG. 16) is employed to collect speech voice that is emitted from the wearer and propagates in the external air.): generating an audio signal from a spoken word (speech voice) detected by a microphone ([0143] The sound collection system of the third embodiment differs from the sound collection system of the first embodiment in that an external microphone 1C is added to the attachment unit 1, and a signal processing unit 25 is provided in place of the signal processing unit 2. [0144] First, the external microphone 1C is a microphone that is installed to collect sound generated outside the housing of the attachment unit 1. [0149] The HPF 27 performs a high-pass filtering process on a sound collection signal that has been generated by the external microphone 1C and has been amplified by the microphone amplifier 26. [0157] The external microphone 1C collects speech voice emitted from the mouth of the wearer H through the outside (the external air). 
At the same time, the external microphone 1C collects environmental noise.); generating a vibration signal from the spoken word detected by a vibration sensor, the vibration signal having a frequency component below frequencies of the audio signal ((Abstract) With the microphone being located in the space sealed off from outside, emitted speech voice that propagates through the ear canal of the wearer is collected. In a sound collection signal obtained through the ear canal, the emitted speech voice component is dominant over the noise component particularly at low frequencies. Therefore, the S/N ratio of an emitted speech voice collection signal can be improved by extracting the low-frequency component of the sound collection signal with the use of a LPF, for example. [0052] First, the sound collection system of this embodiment is based on the premise that collection of speech voice is performed while the attachment unit 1 is attached to an ear of the wearer H. [0053] When the wearer H speaks while the attachment unit 1 is in an attached state, the vibrations accompanying the speaking are transmitted to the ear canal HA from the vocal cords of the wearer H via bones and the skin (as indicated by an arrow with a dashed line). [0054] Accordingly, the speech voice obtained via the ear canal HA of the wearer H as described above can be collected by the internal microphone IB. [0094] Accordingly, the S/N ratio of emitted speech voice collection signals can be further improved by performing a filtering process on sound collection signals generated by the internal microphone IB as described above, and extracting the low-frequency components of the sound collection signals (the components in the voice dominant band of the internal microphone IB). 
[0152] The adder 29 is provided so as to add a sound collection signal that has been generated by the internal microphone 1B and has been subjected to a low-pass filtering process by the LPF 14, to a sound collection signal that has been generated by the external microphone 1C and has been subjected to a high-pass filtering process by the HPF 27. [0156] As can be understood from the above description, in the third embodiment, the external microphone 1C is provided for the attachment unit 1, and a signal generated by performing a high-pass filtering process of the HPF 27 on a sound collection signal generated by the external microphone 1C is added, by the adder 29, to a sound collection signal that has been generated by the internal microphone IB and has passed through the LPF 14. [0158] The HPF 27 performs a high-pass filtering process on a sound collection signal generated by the external microphone 1C, because the emitted speech voice component in the sound collection signal generated by the external microphone 1C is dominant over the noise component at mid and high frequencies (in the mid- and high-frequency bands), which is the opposite of the case with a sound collection signal generated by the internal microphone IB.).
Asada et al., however, do not teach: determining whether a keyword was spoken by a wearer of the wearable device based on the audio signal and the vibration signal, wherein the keyword is rejected responsive to a determination the keyword was not spoken by the wearer of the wearable device.
Lesso teaches: determining whether a keyword (voice print or particular word or passphrase, password) was spoken by a wearer of the wearable device based on the audio signal and the vibration signal, wherein the keyword is rejected responsive to a determination the keyword was not spoken by the wearer of the wearable device ([0003] Authentication may be based on the particular word or phrase spoken during enrolment (text-dependent), or on speech which differs from that spoken during enrolment (text-independent). Authentication comprises the extraction of one or more biometric features from an input audio signal, and the comparison of those features with the stored voice prints. A determination that the acquired data match or are sufficiently close to a stored voice print results in successful authentication of the user. Successful authentication of a user may result in a user being permitted to carry out a restricted action or being granted access to a restricted area or device (for example). If the acquired features do not match or are not sufficiently close to a stored voice print, then the user is not authenticated and the authentication attempt is unsuccessful. An unsuccessful authentication attempt may prevent a user from being permitted to perform the restricted action or the user may be denied access to the restricted area or device. [0035] As the user speaks, his or her voice is carried through the air to the voice microphone 110 where it is detected. In addition, the voice signal is carried through part of the user's skeleton or skull, such as the jaw bone, and coupled to the ear canal. The microphones in the headphones 102, 106 thus detect a bone-conducted voice signal. [0047] The system 300 comprises a first microphone 302, which may belong to a personal audio device (i.e. as described above). The first microphone 302 may be configurable for placement within or adjacent to a user's ear in use, and is termed “ear microphone 302” hereinafter. 
The ear microphone 302 may be operable to detect bone-conducted voice signals from the user, as described above. [0049] The system 300 further comprises a second microphone 310, which may belong to the personal audio device 202 (i.e. as described above). The second microphone 310 may be configurable for placement external to the user's ear in use. The second microphone 310 is termed “voice microphone 310” hereinafter. The voice microphone 310 may be operable to detect air-conducted voice signals from the user, as described above. [0050] The output of the ADC 304 (i.e. the bone-conducted audio signal) is passed to an enablement module 306. The output of the ADC 312 (i.e. the air-conducted audio signal) is optionally also passed to the enablement module 306. [0061]  The enable module 306 may generate an output control signal to the biometric module 316, as described above, when both the air-conducted audio signal and the bone-conducted audio signal contain a voice. In this embodiment, it will be appreciated that the control signal may be generated when portions of the air-conducted audio signal and the bone-conducted audio signal which overlap in time (or are concurrent) both contain a voice. In this way, it may be assumed that the voice in the bone-conducted audio signal and the voice in the air-conducted audio signal both originate from the same person (i.e. the user). [0064] In one embodiment, the authentication module 320 comprises, or is the same as, the biometric module 316. Thus, the system 300 may be utilized to authenticate a user based on the air-conducted audio signal. The biometric module 316 performs a biometric authentication algorithm on the air-conducted audio signal, and compares one or more features extracted from the air-conducted audio signal to a stored voiceprint for an authorized user. On the basis of that comparison, an output is generated which is indicative of a decision as to whether the user of the system 300 is the authorized user or not. 
This output may be used generally by the system 300 or the personal audio device to permit one or more restricted actions. [0065] Additionally or alternatively, the authentication module 320 may comprise one or more alternative authentication mechanisms. For example, the authentication module 320 may implement authentication based on one or more alternative biometrics, such as ear biometrics, fingerprints, iris or retina scanning. For example, the authentication module 320 may implement an input-output mechanism for accepting and authorizing the user based on a passphrase, password, or pin number entered by the user and associated with the authorized user.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Asada et al. to include the teaching of Lesso above in order to determine that the voice in the bone-conducted audio signal and the voice in the air-conducted audio signal both originate from the same person/user, and further to perform a biometric authentication that compares features extracted from the air-conducted audio signal to a stored voiceprint and yields a decision as to whether the user of the system is the authorized user.
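
For illustration only, the concurrency check that Lesso describes at [0061] (a control signal is generated when time-overlapping portions of the air-conducted and bone-conducted signals both contain a voice) can be sketched as follows. This is a minimal sketch and not the implementation of either reference: the frame-energy voice detector, the function names, and the values of frame_len, threshold, and min_frames are all hypothetical assumptions.

```python
def frame_energies(samples, frame_len):
    """Mean energy of each non-overlapping frame of frame_len samples."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def spoken_by_wearer(air, bone, frame_len=160, threshold=0.01, min_frames=3):
    """Return True when voice is present in both signals during the same frames.

    A simple energy threshold stands in for a real voice activity detector
    (an assumption; the references do not specify this detector).
    """
    air_active = [e > threshold for e in frame_energies(air, frame_len)]
    bone_active = [e > threshold for e in frame_energies(bone, frame_len)]
    # Count frames in which both the air and bone paths are active at once.
    concurrent = sum(1 for a, b in zip(air_active, bone_active) if a and b)
    return concurrent >= min_frames
```

Under these assumptions, a keyword heard only on the air-conducted path (for example, speech from a bystander) would produce no concurrent frames and would be rejected.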

Regarding Claim 2, Asada et al. teach: The method of claim 1, wherein generating the audio signal includes filtering an output of the microphone using a high-pass filter and generating the vibration signal includes filtering the output of the vibration sensor using a low-pass filter, the method further comprising combining the audio signal and the vibration signal prior to determining whether the keyword was spoken by the wearer (See rejection of claim 1 and specifically Asada et al. [0166] As described above, in the third embodiment, the adder 29 adds a sound collection signal that has passed through the HPF 27, to a sound collection signal that has passed through the LPF 14. That is, the band in which emitted speech voice is dominant is selected for each of the output signals from the external and internal sound collection microphones, and the components in the selected bands are combined. [0167] With the above described configuration as the third embodiment, usable information not only in the low frequency band but also in the mid- and high-frequency bands of emitted speech voice can be added as an emitted speech voice collection signal, and as a result, the person at the other end of the line can hear emitted speech voice with higher sound quality.) .
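
For illustration only, the band-splitting and combining arrangement cited above (high-pass filtering of the external microphone signal, low-pass filtering of the internal signal, and summation by an adder) can be sketched as follows. This is a minimal sketch under stated assumptions and not Asada's implementation: the first-order filters, the function names, and the parameter names cutoff_hz and sample_rate_hz are hypothetical.

```python
import math


def low_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order (one-pole) low-pass filter."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = dt / (rc + dt)
    out, prev = [], 0.0
    for x in samples:
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out


def high_pass(samples, cutoff_hz, sample_rate_hz):
    """Complementary high-pass: the input minus its low-pass component."""
    low = low_pass(samples, cutoff_hz, sample_rate_hz)
    return [x - l for x, l in zip(samples, low)]


def combine(mic, vib, cutoff_hz, sample_rate_hz):
    """Add the high band of the microphone signal to the low band of the
    vibration-sensor signal, using one shared cutoff frequency."""
    hp = high_pass(mic, cutoff_hz, sample_rate_hz)
    lp = low_pass(vib, cutoff_hz, sample_rate_hz)
    return [h + l for h, l in zip(hp, lp)]
```

Because the two filters in this sketch are complementary, feeding the same signal to both inputs returns that signal essentially unchanged, which is one reason a shared cutoff is a natural design choice for such a combiner.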

Regarding Claim 3, Asada et al. teach: The method of claim 2, wherein generating the vibration signal further comprises processing the low-frequency component of the vibration signal with an equalizer (See rejection of claim 1 and [0061] Therefore, to correct the sound collection signal response characteristics in the lower bands, it is preferable to provide a signal processing means as an equalizer (EQ) as shown in FIG. 3A. [0062] Specifically, in the configuration shown in FIG. 3A, a collection sound signal generated by the internal microphone 1B is amplified by the microphone amplifier 10, and an equalizing process (a characteristics correction process) is then performed by an equalizer 11. [0067] As can be seen from FIG. 4A, the sound collection signal generated by the internal microphone 1B (■ & the dot-and-dash line) is actually higher in the lower bands than the sound collection signal generated by the microphone located outside (▲ & the dashed line). [0071] Specifically, a filter (or the equalizer 11) expressed by the transfer function shown in FIG. 4B is prepared, and the frequency characteristics of the sound collection signal of the internal microphone 1B are corrected by the filter. That is, the sound collection signal frequency characteristics of the internal microphone 1B are corrected by the equalizer 11 having high-frequency emphasizing (low-frequency suppressing) filter characteristics as shown in FIG. 4B. [0073] In FIG. 4A, the set of ● marks and a solid line indicates the frequency characteristics of the sound collection signal of the internal microphone 1B after correction performed by the equalizer 11 having the filter characteristics shown in FIG. 4B. 
[0155] Specifically, the equalizer 11 in this example is located between the microphone amplifier 10 and the LPF 14, and is designed to perform an equalizing process on a sound collection signal that has been generated by the internal microphone 1B and has been amplified by the microphone amplifier 10.).

Regarding Claim 4, Asada et al. teach: The method of claim 2, wherein the high-pass filter and the low-pass filter have a common cutoff frequency (See rejection of claim 1 and also see Fig. 11A (external microphone voice dominant band using the HPF) and Fig. 11B (internal microphone voice dominant band using the LPF), which show an overlapping common cutoff frequency region (315 Hz-1.3 kHz) between the HPF and the LPF. [0100] The cutoff frequency of the LPF 14 is appropriately set so as to extract the components in the “internal microphone voice dominant band” shown in FIGS. 5A and 5B. [0168] It should be noted the cutoff frequency of the HPF 27 is appropriately set so that the components in the mid- and high-frequency voice dominant bands shown in FIG. 11A can be extracted.).

Regarding Claim 5, Asada et al. teach: The method of claim 4, wherein the cutoff frequency is approximately 600 Hz (See rejection of claim 4 and the common cutoff frequency range of 315 Hz-1.3 kHz. Further, it is noted that the claimed value of approximately 600 Hz overlaps the range of 315 Hz-1.3 kHz taught by Asada et al. In the case where the claimed ranges “overlap or lie inside ranges disclosed by the prior art” a prima facie case of obviousness exists. In re Wertheim, 541 F.2d 257, 191 USPQ 90 (CCPA 1976); In re Woodruff, 919 F.2d 1575, 16 USPQ2d 1934 (Fed. Cir. 1990).
Note: the term “approximately” is broad and not exact, and could be interpreted as a frequency range from less than 600 Hz to more than 600 Hz. In this case, the examiner has interpreted it as a frequency range from less than 600 Hz to more than 600 Hz.).
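
For illustration only, the overlap comparison relied on above can be expressed as a simple interval check. This is a minimal sketch: the 10% tolerance used to turn “approximately 600 Hz” into a range is a hypothetical assumption (the Office action does not quantify it), and the function names are illustrative.

```python
def approx_range(value_hz, tolerance=0.10):
    """Treat 'approximately value_hz' as a band of +/- tolerance (assumed)."""
    return (value_hz * (1.0 - tolerance), value_hz * (1.0 + tolerance))


def ranges_overlap(a, b):
    """True when closed intervals a = (lo, hi) and b = (lo, hi) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


claimed = approx_range(600.0)   # assumed reading of "approximately 600 Hz"
prior_art = (315.0, 1300.0)     # 315 Hz-1.3 kHz region cited from Asada et al.
overlaps = ranges_overlap(claimed, prior_art)
```

Under this assumed tolerance the claimed band lies entirely inside the cited range, so the check holds for any reasonable reading of “approximately”.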

Regarding Claim 6:  The method of claim 1, wherein determining whether a keyword (voice print or particular word or passphrase, password) was spoken by the wearer of the wearable device further comprises using a classification model (See rejection of claim 1 and Lesso teaching [0003] Voice biometric systems authenticate a user based on the user's speech. Before using a voice biometric system for authentication, a user first enrolls with the system. During enrolment, the voice biometric system acquires biometric data that are characteristic of the user's voice and stores the data as a voice model or voice print. Authentication may be based on the particular word or phrase spoken during enrolment (text-dependent), or on speech which differs from that spoken during enrolment (text-independent). Authentication comprises the extraction of one or more biometric features from an input audio signal, and the comparison of those features with the stored voice prints.).

Regarding Claim 7: The method of claim 6, wherein the classification model is trained using a negative training set comprising speech samples simulating non-wearers of the device(See rejection of claim 1. Also see Asada et al. teaching of “[0018] According to the present technique, a microphone (the internal microphone) that collects emitted speech voice is located in a space that is substantially sealed off from outside and connects to an ear canal of the wearer (the speaker).”; and  Lesso teaching: [0038] The personal audio device 202 may be any device which is suitable for, or configured to detect bone-conducted and air-conducted voice signals from a user. The bone-conducted voice signals, by their nature, originate essentially from a single user (i.e. the user of the personal audio device). The personal audio device may be wearable, and comprise headphones for each of the user's ears. Alternatively, the personal audio device may be operable to be carried by the user, and held adjacent to the user's ear or ears during use. The personal audio device may comprise headphones or a mobile phone handset, as described above with respect to any of FIGS. 1a to 1f. [0042] Some examples of suitable biometric processes include biometric enrolment and biometric authentication. Enrolment comprises the acquisition and storage of biometric data which is characteristic of an individual. In the present context, such stored data may be known as a “voice print”. Authentication comprises the acquisition of biometric data from an individual, and the comparison of that data to the stored data of one or more enrolled or authorized users. A positive comparison (i.e. the acquired data matches or is sufficiently close to a stored voice or ear print) results in the individual being authenticated. For example, the individual may be permitted to carry out a restricted action, or granted access to a restricted area or device. A negative comparison (i.e. 
the acquired data does not match or is not sufficiently close to a stored voice or ear print) results in the individual not being authenticated. For example, the individual may not be permitted to carry out the restricted action, or granted access to the restricted area or device. [0064] On the basis of that comparison, an output is generated which is indicative of a decision as to whether the user of the system 300 is the authorized user or not.).

Regarding Claim 8: The method of claim 1, wherein the keyword is a trigger keyword (voice print or particular word or passphrase, password), the method further comprising sending a control signal to a processing circuit responsive to the determination the keyword was spoken by the wearer (See rejection of claim 1).

Regarding Claim 9, Asada et al. teach: A wearable apparatus comprising([0017] The earhole-wearable sound collection device also includes either a low-frequency extraction filter unit that performs a filtering process on a sound collection signal from the internal microphone to extract a low-frequency component, or an equalizing unit that performs an equalizing process of a high-frequency emphasizing type on the sound collection signal from the internal microphone. [0018] According to the present technique, a microphone (the internal microphone) that collects emitted speech voice is located in a space that is substantially sealed off from outside and connects to an ear canal of the wearer (the speaker). As the microphone is located in a space sealed off from outside, influence of noise can be effectively reduced. As emitted speech voice that propagates through an ear canal of the wearer is collected, the emitted speech voice can be collected at a higher S/N ratio than that in a case where a conventional earpiece microphone (FIG. 16) is employed to collect speech voice that is emitted from the wearer and propagates in the external air.): a microphone (Fig.10, external microphone 1C) configured to measure acoustic signals from the air([0143] The sound collection system of the third embodiment differs from the sound collection system of the first embodiment in that an external microphone 1C is added to the attachment unit 1, and a signal processing unit 25 is provided in place of the signal processing unit 2. [0144] First, the external microphone 1C is a microphone that is installed to collect sound generated outside the housing of the attachment unit 1. [0149] The HPF 27 performs a high-pass filtering process on a sound collection signal that has been generated by the external microphone 1C and has been amplified by the microphone amplifier 26. 
[0157] The external microphone 1C collects speech voice emitted from the mouth of the wearer H through the outside (the external air). At the same time, the external microphone 1C collects environmental noise.); a vibration sensor configured to measure vibration signals from the body of a user of the apparatus (Fig.10, internal microphone 1B); and a combiner configured to (Fig.10, adder 29): receive a first signal from the microphone and a second signal from the vibration sensor, the second signal comprising frequencies below frequencies of the first signal; combine the first signal and the second signal to generate a processed speech signal ((Abstract) With the microphone being located in the space sealed off from outside, emitted speech voice that propagates through the ear canal of the wearer is collected. In a sound collection signal obtained through the ear canal, the emitted speech voice component is dominant over the noise component particularly at low frequencies. Therefore, the S/N ratio of an emitted speech voice collection signal can be improved by extracting the low-frequency component of the sound collection signal with the use of a LPF, for example. [0052] First, the sound collection system of this embodiment is based on the premise that collection of speech voice is performed while the attachment unit 1 is attached to an ear of the wearer H. [0053] When the wearer H speaks while the attachment unit 1 is in an attached state, the vibrations accompanying the speaking are transmitted to the ear canal HA from the vocal cords of the wearer H via bones and the skin (as indicated by an arrow with a dashed line). [0054] Accordingly, the speech voice obtained via the ear canal HA of the wearer H as described above can be collected by the internal microphone IB. 
[0094] Accordingly, the S/N ratio of emitted speech voice collection signals can be further improved by performing a filtering process on sound collection signals generated by the internal microphone IB as described above, and extracting the low-frequency components of the sound collection signals (the components in the voice dominant band of the internal microphone IB). [0152] The adder 29 is provided so as to add a sound collection signal that has been generated by the internal microphone 1B and has been subjected to a low-pass filtering process by the LPF 14, to a sound collection signal that has been generated by the external microphone 1C and has been subjected to a high-pass filtering process by the HPF 27. [0156] As can be understood from the above description, in the third embodiment, the external microphone 1C is provided for the attachment unit 1, and a signal generated by performing a high-pass filtering process of the HPF 27 on a sound collection signal generated by the external microphone 1C is added, by the adder 29, to a sound collection signal that has been generated by the internal microphone IB and has passed through the LPF 14. [0158] The HPF 27 performs a high-pass filtering process on a sound collection signal generated by the external microphone 1C, because the emitted speech voice component in the sound collection signal generated by the external microphone 1C is dominant over the noise component at mid and high frequencies (in the mid- and high-frequency bands), which is the opposite of the case with a sound collection signal generated by the internal microphone IB.). 
Asada et al. do not teach: a classifier configured to: determine whether a keyword was spoken by the user of the apparatus based on the processed speech signal.
Lesso teaches: a classifier (Fig. 3, authentication module 320) configured to: receive a first signal from the microphone and a second signal from the vibration sensor; combine (cross-correlate) the first signal and the second signal to generate a processed speech (features extracted from the air-conducted audio signal) signal and determine whether a keyword (voice print or particular word or passphrase, password) was spoken by the user of the apparatus based on the processed speech signal ([0003] Authentication may be based on the particular word or phrase spoken during enrolment (text-dependent), or on speech which differs from that spoken during enrolment (text-independent). Authentication comprises the extraction of one or more biometric features from an input audio signal, and the comparison of those features with the stored voice prints. A determination that the acquired data match or are sufficiently close to a stored voice print results in successful authentication of the user. Successful authentication of a user may result in a user being permitted to carry out a restricted action or being granted access to a restricted area or device (for example). If the acquired features do not match or are not sufficiently close to a stored voice print, then the user is not authenticated and the authentication attempt is unsuccessful. An unsuccessful authentication attempt may prevent a user from being permitted to perform the restricted action or the user may be denied access to the restricted area or device. [0035] As the user speaks, his or her voice is carried through the air to the voice microphone 110 where it is detected. In addition, the voice signal is carried through part of the user's skeleton or skull, such as the jaw bone, and coupled to the ear canal. The microphones in the headphones 102, 106 thus detect a bone-conducted voice signal. [0047] The system 300 comprises a first microphone 302, which may belong to a personal audio device (i.e. 
as described above). The first microphone 302 may be configurable for placement within or adjacent to a user's ear in use, and is termed “ear microphone 302” hereinafter. The ear microphone 302 may be operable to detect bone-conducted voice signals from the user, as described above. [0049] The system 300 further comprises a second microphone 310, which may belong to the personal audio device 202 (i.e. as described above). The second microphone 310 may be configurable for placement external to the user's ear in use. The second microphone 310 is termed “voice microphone 310” hereinafter. The voice microphone 310 may be operable to detect air-conducted voice signals from the user, as described above. [0050] The output of the ADC 304 (i.e. the bone-conducted audio signal) is passed to an enablement module 306. The output of the ADC 312 (i.e. the air-conducted audio signal) is optionally also passed to the enablement module 306. [0061]  The enable module 306 may generate an output control signal to the biometric module 316, as described above, when both the air-conducted audio signal and the bone-conducted audio signal contain a voice. In this embodiment, it will be appreciated that the control signal may be generated when portions of the air-conducted audio signal and the bone-conducted audio signal which overlap in time (or are concurrent) both contain a voice. In this way, it may be assumed that the voice in the bone-conducted audio signal and the voice in the air-conducted audio signal both originate from the same person (i.e. the user). [0062] Additionally, or alternatively, the enable module 306 may cross-correlate the bone-conducted audio signal with the air-conducted audio signal. 
Upon a determination that the bone-conducted audio signal comprises a voice, the enable module 306 may cross-correlate the bone-conducted audio signal (and particularly that portion of the bone-conducted audio signal comprising the voice) with the air-conducted audio signal (particularly that portion of the air-conducted audio signal which is concurrent with the portion of the bone-conducted audio signal comprising the voice), to determine a level of correlation between the two signals. Responsive to a determination that the two signals correlate (e.g. the correlation exceeds a threshold value), the enable module 306 may output a control signal to the biometric module 316 enabling updates to the stored voice model.  [0063] The decision to enable updates to the stored voice model may further be based on authentication of the user of the personal audio device 202 as an authorized user. Thus, in the illustrated embodiment, the system 300 further comprises an authentication module 320 coupled to the enable module 306. [0064] In one embodiment, the authentication module 320 comprises, or is the same as, the biometric module 316. Thus, the system 300 may be utilized to authenticate a user based on the air-conducted audio signal. The biometric module 316 performs a biometric authentication algorithm on the air-conducted audio signal, and compares one or more features extracted from the air-conducted audio signal to a stored voiceprint for an authorized user. On the basis of that comparison, an output is generated which is indicative of a decision as to whether the user of the system 300 is the authorized user or not. This output may be used generally by the system 300 or the personal audio device to permit one or more restricted actions. [0065] Additionally or alternatively, the authentication module 320 may comprise one or more alternative authentication mechanisms. 
For example, the authentication module 320 may implement authentication based on one or more alternative biometrics, such as ear biometrics, fingerprints, iris or retina scanning. For example, the authentication module 320 may implement an input-output mechanism for accepting and authorizing the user based on a passphrase, password, or pin number entered by the user and associated with the authorized user.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Asada et al. to include the teaching of Lesso above in order to determine that the voice in the bone-conducted audio signal and the voice in the air-conducted audio signal both originate from the same person/user, and further to perform biometric authentication by comparing features extracted from the air-conducted audio signal to a stored voiceprint, yielding a decision as to whether the user of the system is the authorized user.
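The correlation check quoted above from Lesso paragraph [0062] (a control signal is output when the correlation between the bone-conducted and air-conducted signals exceeds a threshold value) might be sketched as below. Lesso does not specify the correlation measure or the threshold, so the zero-lag normalized correlation, the 0.5 threshold, and the function names here are all assumptions made for illustration.

```python
import math


def normalized_correlation(a, b):
    """Zero-lag normalized cross-correlation of two equal-length frames, in [-1, 1]."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and equal in length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    energy = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if energy == 0.0:
        return 0.0  # a silent or constant frame carries no correlation evidence
    return sum(x * y for x, y in zip(da, db)) / energy


def enable_update(bone_frame, air_frame, threshold=0.5):
    """Return True when the two concurrent frames correlate above the assumed threshold."""
    return normalized_correlation(bone_frame, air_frame) > threshold
```

A practical system would likely search over a range of lags to allow for the different propagation paths of the two signals; the zero-lag form is used here only to keep the sketch short.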

Regarding Claim 10, Asada et al. teach: The apparatus of claim 9, wherein the apparatus is a device configured to be worn in or near the ear of the user (See rejection of claim 9).

Regarding Claim 11, Asada et al. teach: The apparatus of claim 10, wherein the vibration sensor is configured to measure vibrations from the inside of the ear of the user (See rejection of claim 9).

Regarding Claim 12, Asada et al. teach: The apparatus of claim 9 further comprising: a high-pass filter (HPF) coupled to an output of, and configured to process signals from, the microphone; a low-pass filter (LPF) coupled to an output of, and configured to process signals from, the vibration signal; and a digital signal processor implementing classification of the processed speech signal (See rejection of claim 9 and Asada et al. teaching: [0088] In this embodiment, the various kinds of signal processing on sound collection signals may be performed by an analog electrical circuit, or may be performed by digital signal processing via an ADC (A/D converter). Lesso teaching: [0045] The system 300 comprises processing circuitry 324, which may comprise one or more processors, such as a central processing unit or an applications processor (AP), or a digital signal processor (DSP). The system 300 further comprises memory 326, which is communicably coupled to the processing circuitry 324. The memory 326 may store instructions which, when carried out by the processing circuitry 324, cause the processing circuitry to carry out one or more methods as described below (see FIG. 4 for example).).

Regarding Claim 13, Asada et al. teach: The apparatus of claim 12, further comprises an equalizer (equalizer (EQ) as shown in FIG. 3A), the equalizer coupled to the output of, and configured to process signals from, the low-pass filter, wherein the equalizer changes the amplitude of one or more frequency bands in the second filtered signal (See rejection of claim 9 and [0017] The earhole-wearable sound collection device also includes either a low-frequency extraction filter unit that performs a filtering process on a sound collection signal from the internal microphone to extract a low-frequency component, or an equalizing unit that performs an equalizing process of a high-frequency emphasizing type on the sound collection signal from the internal microphone. [0058] However, in a case where the sealability is relatively high as in a case with a conventional canal-type earphone, for example, gain (response) in the ear canal HA becomes greater in lower bands than in a normal free space. [0061] Therefore, to correct the sound collection signal response characteristics in the lower bands, it is preferable to provide a signal processing means as an equalizer (EQ) as shown in FIG. 3A. [0062] Specifically, in the configuration shown in FIG. 3A, a collection sound signal generated by the internal microphone 1B is amplified by the microphone amplifier 10, and an equalizing process (a characteristics correction process) is then performed by an equalizer 11. [0067] As can be seen from FIG. 4A, the sound collection signal generated by the internal microphone 1B (■ & the dot-and-dash line) is actually higher in the lower bands than the sound collection signal generated by the microphone located outside (▲ & the dashed line). [0071] Specifically, a filter (or the equalizer 11) expressed by the transfer function shown in FIG. 4B is prepared, and the frequency characteristics of the sound collection signal of the internal microphone 1B are corrected by the filter. That is, the sound collection signal frequency characteristics of the internal microphone 1B are corrected by the equalizer 11 having high-frequency emphasizing (low-frequency suppressing) filter characteristics as shown in FIG. 4B. [0073] In FIG. 4A, the set of ● marks and a solid line indicates the frequency characteristics of the sound collection signal of the internal microphone 1B after correction performed by the equalizer 11 having the filter characteristics shown in FIG. 4B. [0155] Specifically, the equalizer 11 in this example is located between the microphone amplifier 10 and the LPF 14, and is designed to perform an equalizing process on a sound collection signal that has been generated by the internal microphone 1B and has been amplified by the microphone amplifier 10.).

Regarding Claim 14: The apparatus of claim 12, wherein the digital signal processor is configured to send a control signal to an application processor responsive to a determination the keyword was spoken by the user of the apparatus (See rejection of claim 12, specifically Lesso teaching: [0003] Authentication comprises the extraction of one or more biometric features from an input audio signal, and the comparison of those features with the stored voice prints. A determination that the acquired data match or are sufficiently close to a stored voice print results in successful authentication of the user. Successful authentication of a user may result in a user being permitted to carry out a restricted action or being granted access to a restricted area or device (for example). [0042] Some examples of suitable biometric processes include biometric enrolment and biometric authentication. Enrolment comprises the acquisition and storage of biometric data which is characteristic of an individual. In the present context, such stored data may be known as a “voice print”. Authentication comprises the acquisition of biometric data from an individual, and the comparison of that data to the stored data of one or more enrolled or authorized users. A positive comparison (i.e. the acquired data matches or is sufficiently close to a stored voice or ear print) results in the individual being authenticated. For example, the individual may be permitted to carry out a restricted action, or granted access to a restricted area or device. A negative comparison (i.e. the acquired data does not match or is not sufficiently close to a stored voice or ear print) results in the individual not being authenticated. For example, the individual may not be permitted to carry out the restricted action, or granted access to the restricted area or device.).

Regarding Claim 15, Asada et al. teach: The apparatus of claim 12, wherein, the high-pass filter and the low-pass filter have a common cutoff frequency (See rejection of claim 1 and also see Asada et al. Fig. 11A (external microphone voice dominant band using HPF) and Fig. 11B (internal microphone voice dominant band using LPF), which show an overlapping common cutoff frequency portion (315 Hz-1.3 kHz) between the HPF and the LPF. [0100] The cutoff frequency of the LPF 14 is appropriately set so as to extract the components in the “internal microphone voice dominant band” shown in FIGS. 5A and 5B. [0168] It should be noted the cutoff frequency of the HPF 27 is appropriately set so that the components in the mid- and high-frequency voice dominant bands shown in FIG. 11A can be extracted.).

Regarding Claim 16, Asada et al. teach: The apparatus of claim 15, wherein the cutoff frequency is approximately 600 Hz (See rejection of claim 4 and the common cutoff frequency range of 315 Hz-1.3 kHz. Further, it is noted that the claimed value of approximately 600 Hz overlaps the range taught by Asada et al., 315 Hz-1.3 kHz. In the case where the claimed ranges “overlap or lie inside ranges disclosed by the prior art” a prima facie case of obviousness exists. In re Wertheim, 541 F.2d 257, 191 USPQ 90 (CCPA 1976); In re Woodruff, 919 F.2d 1575, 16 USPQ2d 1934 (Fed. Cir. 1990).
Note: the term “approximately” is broad and not exact, and could be interpreted as a frequency range from less than 600 Hz to more than 600 Hz. In this case, the examiner interpreted it as a frequency range from less than 600 Hz to more than 600 Hz.).
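The overlap comparison described above reduces to a simple interval check, sketched here for illustration. The 315 Hz-1.3 kHz band comes from the text above; the ±10% tolerance used to turn “approximately 600 Hz” into a range is purely an assumption for the example and is not a construction adopted in this action, and the function names are hypothetical.

```python
def approx_range(center_hz, tolerance=0.10):
    """Turn an 'approximately X' value into a (low, high) range; tolerance is assumed."""
    return (center_hz * (1.0 - tolerance), center_hz * (1.0 + tolerance))


def ranges_overlap(claimed, prior_art):
    """Return True when two closed (low, high) ranges share at least one point."""
    low_c, high_c = claimed
    low_p, high_p = prior_art
    return max(low_c, low_p) <= min(high_c, high_p)


PRIOR_ART_BAND_HZ = (315.0, 1300.0)  # band cited above from Asada et al.
claimed_band_hz = approx_range(600.0)

print(ranges_overlap(claimed_band_hz, PRIOR_ART_BAND_HZ))  # prints True
```

Any tolerance narrower than the distance from 600 Hz to the nearer band edge (315 Hz) gives the same result, so the outcome of the sketch does not depend on the particular tolerance assumed.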

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The prior art of record, Lesso (US 2020/0074055 A1), teaches: responsive to detection of a trigger event indicative of a user interaction with the electronic device, generating an audio probe signal to play through an audio transducer of the electronic device; receiving a first audio signal comprising a response of the user's ear to the audio probe signal; receiving a second audio signal comprising speech of the user; and applying an ear biometric algorithm to the first audio signal and a voice biometric algorithm to the second audio signal to authenticate the user as an authorized user.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMAD K ISLAM whose telephone number is (571)270-5878. The examiner can normally be reached Monday-Friday, EST (IFP).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MOHAMMAD K ISLAM/Primary Examiner, Art Unit 2656