Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). The certified copy has been filed in parent Application No. PCT/CA2018/051369, filed on 10/29/2018.
Drawings
The drawing submitted on 04/27/2020 has been considered by the examiner.
Response to Amendment
Claims 1-20 are currently pending in the application. Claims 1 and 13 are independent claims, and claims 1, 2, and 13 have been amended.
Response to Arguments
Applicant's arguments filed 04/27/2022 have been fully considered but they are not persuasive. Applicant’s arguments and the examiner’s responses to them follow:

Applicant’s Argument 1: The Applicant submits that amended claim 13 does not recite limitations that recite function without reciting sufficient structure. Finally, the Applicant acknowledges the Examiner's interpretation under 35 USC §112(f) of the limitation as recited in claim 19.
Examiner’s Response 1: The examiner would like to withdraw the interpretation of claim 13 under 112(f) based on applicant’s statement that the terms used in the claim (i.e., “sampler for”) are structure and should not be interpreted under 112(f), even though the terms are not well-known structures in the art and are generic placeholders coupled with functional language.
However, applicant’s acceptance of the 112(f) interpretation of claim 19 is confusing. If the interpretation is withdrawn for claim 13, it should also be withdrawn for claim 19, since the terms used in claim 19 (i.e., “audio data storage module configured to”) are likewise generic placeholders coupled with functional language and, like the terms of claim 13, are part of the system shown in the figures of the disclosure. The disclosure does not describe these terms in sufficient detail to show that they are structure (except [0013], which briefly states that the terms used in claims 13 and 19 are part of a system for training a classification module), and the same generic terms are used in the figures, in the detailed description of the specification, and in the claims. Since applicant asserts that the generic terms of claim 13 are structure, the generic terms of claim 19 should be structure as well, because they belong to the same system.
The examiner will therefore maintain the interpretation of claims 13 and 19 under 112(f) until applicant provides a clear statement that claims 13 and 19 should not be interpreted under 112(f). The examiner will thereafter decide whether to issue, in a future Office action, a rejection under 112, second paragraph (lack of structure supporting the terms interpreted under 112(f)).

Applicant’s Argument 2: Applicant has amended claim 1 to now recite "capturing an in-ear sound pressure present in an occluded ear canal". Support for these amendments may be found at least within paragraphs [0086] and [0087] of the specification, as published. The Applicant respectfully submits that Shalon does not teach or suggest occluding the ear of the user and therefore fails to disclose "capturing an in-ear sound pressure present in an occluded ear canal" as recited in amended claim 1.

Examiner’s Response 2: The examiner respectfully disagrees with applicant’s narrow reading of both applicant’s specification and the teaching of Shalon et al.
In the remarks, applicant cites [0086] and [0087] of applicant’s specification for support, which recite an “in-ear microphone 56 or pressure sensor [being] located inside the ear canal or at the vicinity of the ear canal and that measure the sounds present in the open or occluded human ear-canal.”
The examiner does not see how this differs from the teaching of Shalon et al. cited below, since the presence of applicant’s microphone or sensor inside the ear canal or at the vicinity of the ear canal does not mean that it fully occludes, blocks, or obstructs the ear canal. Further, applicant’s specification does not clearly teach how the ear canal is occluded: whether by the microphone or sensor, by another element, or by a natural cause, e.g., ear wax or natural gravitational pressure. If the occlusion is caused by the microphone or sensor, neither the drawings nor the disclosure supports that.
Applicant’s specification uses the phrase “occluded ear canal,” which, in view of the specification and drawings, can be read on an ear canal that is partially occluded rather than fully occluded or obstructed. Also, a microphone 56 or pressure sensor located at the vicinity of the ear canal cannot fully occlude it, since “vicinity” can broadly be interpreted as farther out from the ear canal, near the ear canal, or anywhere on the outside surface of the ear or ear canal, and such a sensor therefore may not fully occlude the ear canal. Figs. 1-6 show the ear canal partially occluded by the microphone 56 or pressure sensor, not fully occluded or fully obstructed. Fig. 13, which is described in [0086], does not show occlusion at all, and [0086] refers to Fig. 5 for the occlusion. Fig. 5 shows the microphone or sensor occluding the ear canal partially rather than fully. The paragraph states that the microphone or sensor is located inside the ear canal or at the vicinity of the ear canal. Therefore, applicant’s specification does not conclusively support that the ear canal is fully occluded by the microphone or sensor, if that is what applicant argues throughout the remarks.
Further, the statement in the specification that the microphone 56 or pressure sensor measures sounds present in the open or occluded ear canal does not establish that this is what the amended limitation recites. The word “occluded” can broadly mean partially, not fully, obstructed, just as in Shalon et al.’s teaching of “not occluded,” where the microphone in the space-filling element or flexible member is also occluding, since the sensor or microphone partially obstructs or occludes the path of the ear canal. Further, natural occlusion of the ear canal by ear wax or gravitational pressure depends on the specific situation, may or may not be present during use of the system, and needs to be clearly disclosed in order to be claimed. A naturally occluded ear canal would moreover be considered partially occluded, since the person can still hear sound from the environment.
The cited paragraph of applicant’s specification can be read to mean that the microphone or pressure sensor in the ear canal measures sound present in the ear canal without fully occluding it. The ear canal could be partially occluded by the presence of the sensor or microphone, or not occluded at all when the sensor or microphone is located at the vicinity of the ear canal to measure sound pressure, since the specification also recites an open ear canal, i.e., one that is not occluded.
Therefore, applicant’s conclusion that the ear canal is fully occluded is not supported by applicant’s disclosure. Further, applicant cannot claim a fully occluded ear canal, since the specification nowhere teaches that locating the microphone or pressure sensor inside the ear canal fully obstructs or blocks the ear canal; full occlusion of the ear canal therefore cannot be derived from the teaching.
In addition, applicant admits that Shalon et al. teaches that “deformation of the flexible member may produce noise that might be detectable by a microphone.” The examiner therefore asks: is the noise inside the ear canal not sound pressure detected by the microphone or sensor on the flexible member or space-filling element inside the ear canal? Applicant’s claim recites “capturing an in-ear sound pressure present in an occluded ear canal.” Noise is also sound and creates pressure inside the ear canal, which is exactly what Shalon et al. teaches in the paragraphs cited below. Shalon et al. further teaches a microphone capturing speech inside the ear canal, and speech is also sound.
Neither applicant’s specification nor Shalon et al. teaches that the ear canal is fully or completely occluded or obstructed by the presence of the pressure sensor or microphone.
Therefore, the examiner believes that Shalon et al. clearly teaches the amended limitation “capturing an in-ear sound pressure present in an occluded ear canal,” in the same way that applicant’s specification describes it (specifically at [0086]-[0087]), in the following paragraphs: [0071] Acoustic energy generated …can be monitored through one or more sensors (e.g. microphones) positioned in or around the ear area,…[0074] The system may also have a sound tube to convey the sound to a sound processing unit elsewhere on the body. Such a sound tube can be, by way of example, a hollow plastic tube connected to a mounting means in the ear region and a sensing means such as a microphone in proximity to the sound processing unit. [0096] …The deformations of the ear canal can be detected using microphones or mechanical sensors such as accelerometer, piezo or strain gage technology known in the art, and/or changes in volume or pressure of a space-filling element in the ear canal, and/or optical means. The system can be designed with flexible members to universally fit into all ear canals,… [0119] Bone conduction technology could relay sound signals or synthesized voice messages through… cranial bones to the ear ... [0394] In a preferred embodiment, the sensor can sense speech, jaw motion, ear deformation, and/or other signals in an unobtrusive package that does not occlude hearing and/or completely block the ear canal. [0395] … The sensor can be small enough to allow room in the canal for sound and air circulation, so the wearer does not feel plugged by the device, and external sound is allowed to get to the tympanic membrane essentially unattenuated. [0409] The deformations of the ear canal can be detected using sensors described above, including microphones, accelerometers, piezo or strain gage technology known in the art, changes in volume or pressure of a space-filling element in the ear canal, and/or optical sensors.
The system can be designed with flexible members to universally fit into all ear canals, or a limited number of standard sizes to fit most ear canals, or lastly custom fit for each individual based on his or her ear canal geometry. [0422] The sensor can be disposed in the ear canal, or uses a second sound tube to sense sound in the ear canal. [0434] The in-ear sensor (e.g., sensors 904, 1800, 200, 301, and 311) described above is also able to derive a cardiac pulse of the wearer from deformations or vibrations of the ear canal by appropriate filtering of its signal.

Applicant further argues that Shalon et al. teaches “not occluding,” referring to [0394-0395] and [0421]. However, the referenced paragraphs state that “the in-canal sensor can sense speech, jaw motion, ear deformation and/or other signals in an unobtrusive package that does not occlude hearing and/or completely block the ear canal” and that “the in-canal sensor is designed so as to be non-occluding”. Shalon et al. is therefore clearly stating that the in-canal sensor is occluding, but not fully: the sensor is designed so that it does not fully obstruct or occlude the ear canal and allows room in the canal for sound and air circulation, so the wearer does not feel plugged by the device. Therefore, “non-occluding” in Shalon et al. should be read as partially occluding, just as applicant’s specification teaches partial occlusion; the only difference is that applicant’s specification does not recite it as clearly as Shalon et al. does. See also Figs. 1A and 1D, block 24, and Fig. 16, block 904, which show the sensor partially occluding the ear canal.
Therefore, applicant’s argument is not persuasive. The rejection is updated to address the amended limitation, and the grounds of rejection for all claims remain the same.



Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “a sampler for,” “a feature extractor for,” “a nonverbal audio event definer for,” and “a trainer for” in claim 13; and “data storage module configured to” in claim 19.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.




Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-2 and 5-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Shalon et al. (US 2011/0125063 A1).

Regarding Claim 1, Shalon et al. teach: A method for training a classification module of nonverbal audio events, the method comprising ([0062] To monitor such activities, the system of the present invention includes a sensor unit mountable on or in a body region of the subject. The sensor is selected capable of sensing mechanical (e.g. jaw motion), thermal (e.g. body temperature), electrical (e.g. EKG or EMG) or acoustic activities. Acoustic activity is preferably, non-verbal (i.e. does not result from vocal chord vibrations) acoustic energy at a frequency of 0.001 Hz to 100 kHz, which is generated from mechanically-induced vibrations or motion. Further description of activities and suitable sensors for monitoring such activities is provided hereinbelow. [0063] In some embodiments, a wireless headset system of the present invention comprises several sub-systems: a sensor sensitive to jaw motion, unvoiced mouth sounds (teeth clicks, for example)… [0079] The system can detect the external and/or the intra-body sounds generated during urination. By measuring the duration and frequency of urination throughout the day, the system can keep track of the hydration level of the user.): capturing an in-ear sound pressure present in an occluded ear canal (deformation sound and/or other event based sound pressure in space-filling element in the ear canal) ([0063] In some embodiments, a wireless headset system of the present invention comprises several sub-systems: a sensor sensitive to jaw motion, unvoiced mouth sounds (teeth clicks, for example), and speech sounds…[0071] Acoustic energy generated …can be monitored through one or more sensors (e.g. microphones) positioned in or around the ear area,…[0074] The system may also have a sound tube to convey the sound to a sound processing unit elsewhere on the body. 
Such a sound tube can be, by way of example, a hollow plastic tube connected to a mounting means in the ear region and a sensing means such as a microphone in proximity to the sound processing unit. [0096] …The deformations of the ear canal can be detected using microphones or mechanical sensors such as accelerometer, piezo or strain gage technology known in the art, and/or changes in volume or pressure of a space-filling element in the ear canal, and/or optical means. The system can be designed with flexible members to universally fit into all ear canals,… [0119] Bone conduction technology could relay sound signals or synthesized voice messages through… cranial bones to the ear ... [0394] In a preferred embodiment, the sensor can sense speech, jaw motion, ear deformation, and/or other signals in an unobtrusive package that does not occlude hearing and/or completely block the ear canal. [0395] … The sensor can be small enough to allow room in the canal for sound and air circulation, so the wearer does not feel plugged by the device, and external sound is allowed to get to the tympanic membrane essentially unattenuated. [0422] The sensor can be disposed in the ear canal, or uses a second sound tube to sense sound in the ear canal.); associating at least one nonverbal audio event to the captured in-ear audio signal ([0101] The system of the present invention can preferably gather data from one or more of the sensors described above continuously, whenever a sensor detects such event, or when the user indicates the occurrence of an eating event. Any of the sensor's output may be interpreted by a universal algorithm, by asking the user to perform certain calibration tasks (e.g., eat food of known weight and consistency, swallow water, etc.) or by monitoring the patterns over time and adjusting the interpretation algorithms. 
[0107] The data generated by the system for each subject can be integrated and/or aggregated into a flat or relational database containing the behavior related activity signatures collected from a plurality of subjects over a time period. Such a database would be useful for establishing norms, averages, trends, classifications, calibrations, historical behaviors, reference sets, training data, statistical tests, clinical trials, targeted marketing, third party interventions, and relative scores in a stand-alone manner as pure data or as an integral part of the system used by individual subjects. [0194] The extracted acoustic energy patterns are then mapped into food intake events. Three preferred ways to measure food intake are by the type and number of bites and chews, and/or swallows and/or by the volume of the stomach. The volume of food per swallow is relatively constant, and averaged over the course of a day or a week, the caloric content of the total number of swallows is fairly constant as well. The Examples section which follows provides further description of such signal processing.); sampling the in-ear audio signal ([0208] A preprocessing module detects the presence of eating activity and automatically conditions the signal using automatic gain control on the analog signal prior to being digitized. The normalized signal is digitized using an analog-to-digital converter with a precision of 16 bits and a sampling rate of 8,000 Hz. [0209] A signal processing module extracts a sequence of salient features from the digitized signal. The digitized signal can be parameterized using short-term analysis frames. 
The frame rate is 1/100 ms and the analysis window is 500 ms.); extracting audio features of each sample of the in-ear audio signal; validating the extracted audio features; associating the validated and extracted audio features to the at least one nonverbal audio event ([0194] The bone conduction microphone is designed to sense the acoustic energy generated within the mouth during eating. The microphone's analogue electrical output is transmitted to processing unit 14 for signal processing. A preprocessing stage filters out noise, normalizes the energy level, and segments the sampled sound into analysis frames. Features are then extracted from the signal using spectral signature analysis to identify waveforms with eating microstructure events (signatures). The extracted components are then evaluated by a statistical classifier that combines the observed data (the features) with prior information about the patterns to segment the input data into specific event categories such as chews, sips, and speech. [0206] Sensor unit 12 (bone conduction microphone in this case) records the sounds made by chewing, swallowing, biting, sipping, and drinking. The salient acoustic features are extracted using a statistical-based pattern recognition system to classify the sounds into specific events. The output of the recognizer can be a hypothesized event sequence that can be used to track the flow of ingested food. The accuracy of the hypothesized output can be validated using a database of sounds annotated by a panel of human expert listeners. [0209] A signal processing module extracts a sequence of salient features from the digitized signal. The digitized signal can be parameterized using short-term analysis frames. The frame rate is 1/100 ms and the analysis window is 500 ms. [0210] To classify the input signal into an optimal sequence of eating events processing unit 14 can search a state graph that represents all admissible state sequences. 
Once the HMM parameters are estimated it is possible to search for the most likely state sequence given the observed data. A panel of human expert listeners can annotate training data efficiently using a multimedia recording of the subjects that may include audio, video and position of the mouth or jaw. Once the HMM parameters are estimated it is possible to search for the most likely state sequence given the observed data. The sequence of states is used to hypothesize the sequence of bites, chews, and swallows. Alternative methods of classifying the data include neural networks classifier. Another alternative is to detect chew events using a sliding window. For each window offset a score is computed using a Gaussian mixture model. Alternatively, wavelet methods can be used to classify the relevant chew features.); and training the classification module according to the validated and extracted audio features associated to the at least one nonverbal event ([0106] The system can detect the eating patterns of the user over time, thereby building a database of ingestion behavior and "learning" and customizing the performance of the system to the user. [0107] By way of a second example, an aggregated database of the ingestion related motion or acoustic energy patterns detected for a plurality of users can be used to train the algorithms used to convert these patterns into classifications of ingestion behaviors. [0211] Features such as chew count, chew, duration, and chew energy are computed as part of "eating" events. The features are fed to a mapping algorithm that estimates the weight of the ingested food based on a calibration of the known food weight reference available database used to train system 10. Alternative methods of estimating the ingested food weight include HMM's, neural network, and regression trees.).
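For illustration only, the sampling and framing parameters quoted above from Shalon et al. at [0208]-[0209] (8,000 Hz sampling rate, 500 ms analysis window, frame rate of 1/100 ms) can be sketched as follows. The sketch is not taken from Shalon et al. or from the application. The helper names `frame_signal` and `log_energy`, the reading of "frame rate is 1/100 ms" as one new frame every 100 ms, the log-energy feature, and the synthetic test tone are all assumptions made for the example.

```python
import math

SAMPLE_RATE_HZ = 8000  # sampling rate quoted from Shalon et al. [0208]
WINDOW_MS = 500        # analysis window quoted from Shalon et al. [0209]
HOP_MS = 100           # assumption: "frame rate is 1/100 ms" read as one frame per 100 ms


def frame_signal(samples, sample_rate=SAMPLE_RATE_HZ, window_ms=WINDOW_MS, hop_ms=HOP_MS):
    """Split samples into overlapping analysis frames (hypothetical helper)."""
    window = sample_rate * window_ms // 1000  # 4000 samples per frame
    hop = sample_rate * hop_ms // 1000        # 800 samples between frame starts
    return [samples[start:start + window]
            for start in range(0, len(samples) - window + 1, hop)]


def log_energy(frame):
    """Illustrative per-frame feature: log of the mean squared amplitude."""
    mean_square = sum(x * x for x in frame) / len(frame)
    return math.log(mean_square + 1e-12)  # small offset avoids log(0) on silence


# One second of a synthetic 440 Hz tone standing in for a digitized in-ear signal.
signal = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE_HZ) for n in range(SAMPLE_RATE_HZ)]
frames = frame_signal(signal)
features = [log_energy(f) for f in frames]
```

Under these assumptions, one second of input yields six overlapping frames of 4,000 samples each, and each frame contributes one feature value that a downstream classifier could consume.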

Regarding Claim 2, Shalon et al. teach: The method of claim 1, wherein the capturing of the in-ear audio signal is performed when the ear canal is at least partially occluded or partially isolated from an external environment to provide an occlusion effect (See rejection of claim 1 and Fig. 1, [0096] Chewing events can also be sensed via mechanical deformation of the ear canal that occurs during chewing. The deformations of the ear canal can be detected using microphones or mechanical sensors such as accelerometer, piezo or strain gage technology known in the art, and/or changes in volume or pressure of a space-filling element in the ear canal, and/or optical means. The system can be designed with flexible members to universally fit into all ear canals, or a limited number of standard sizes to fit most ear canals, or lastly custom fit for each individual based on his or her ear canal geometry. [0394] In a preferred embodiment, the sensor can sense speech, jaw motion, ear deformation, and/or other signals in an unobtrusive package that does not occlude hearing and/or completely block the ear canal. [0395] … The sensor can be small enough to allow room in the canal for sound and air circulation, so the wearer does not feel plugged by the device, and external sound is allowed to get to the tympanic membrane essentially unattenuated.).

Regarding Claim 5, Shalon et al. teach: The method of claim 1, wherein validating the extracted audio features further comprises comparing the extracted audio features with a plurality of samples of the in-ear audio signal (See rejection of claim 1 and [0206] Sensor unit 12 (bone conduction microphone in this case) records the sounds made by chewing, swallowing, biting, sipping, and drinking. The salient acoustic features are extracted using a statistical-based pattern recognition system to classify the sounds into specific events. The output of the recognizer can be a hypothesized event sequence that can be used to track the flow of ingested food. The accuracy of the hypothesized output can be validated using a database of sounds annotated by a panel of human expert listeners. [0209] A signal processing module extracts a sequence of salient features from the digitized signal. The digitized signal can be parameterized using short-term analysis frames. The frame rate is 1/100 ms and the analysis window is 500 ms. [0210] The sequence of states is used to hypothesize the sequence of bites, chews, and swallows. [0211] Features such as chew count, chew, duration, and chew energy are computed as part of "eating" events. The features are fed to a mapping algorithm that estimates the weight of the ingested food based on a calibration of the known food weight reference available database used to train system 10.).

Regarding Claim 6, Shalon et al. teach:  The method of claim 1, wherein validating the extracted audio features further comprises testing the classification module with the extracted audio features (See rejection of claim 5, specifically [0211] Features such as chew count, chew, duration, and chew energy are computed as part of "eating" events. The features are fed to a mapping algorithm that estimates the weight of the ingested food based on a calibration of the known food weight reference available database used to train system 10.).

Regarding Claim 7, Shalon et al. teach:  The method of claim 6, wherein testing the classification module further comprises capturing another in-ear audio signal for a same at least one nonverbal audio event (See rejection of claim 5, specifically [0206] Sensor unit 12 (bone conduction microphone in this case) records the sounds made by chewing, swallowing, biting, sipping, and drinking. The output of the recognizer can be a hypothesized event sequence that can be used to track the flow of ingested food. The accuracy of the hypothesized output can be validated using a database of sounds annotated by a panel of human expert listeners).

Regarding Claim 8, Shalon et al. teach:  The method of claim 1, wherein the at least one nonverbal audio event is a user induced nonverbal audio event (See rejection of claim 5, specifically [0206] Sensor unit 12 (bone conduction microphone in this case) records the sounds made by chewing, swallowing, biting, sipping, and drinking.).

Regarding Claim 9, Shalon et al. teach:   The method of claim 8, wherein the user induced nonverbal audio event is selected from the group consisting of teeth clicking, tongue clicking, blinking, eye closing, teeth grinding, throat clearing, saliva noise, swallowing, coughing, talking, yawning with inspiration, yawning with expiration, respiration, heartbeat and head or body movement, earpiece manipulation, and any combination thereof (See rejection of claim 5 and [0071] Acoustic energy generated by chewing, swallowing, biting, sipping, drinking, teeth grinding, teeth clicking, tongue clicking, tongue movement, jaw muscles or jaw bone movement, spitting, clearing of the throat, coughing, sneezing, snoring, breathing rate, breathing depth, nature of the breath, heartbeat, digestion, motility to or through the intestines, tooth brushing, smoking, screaming, user's voice or speech, other user generated sounds, and ambient noises in the user's immediate surroundings can be monitored through one or more sensors (e.g. microphones) positioned in or around the ear area, on the skull, neck, throat, chest, back or abdomen regions. The preferred area allows for such sounds to be analyzed in order to determine the nature of the bolus swallowed (type of food, number of chews, hard versus soft chews, crunchy versus soft food, ingestion of liquid etc.). Microphones in different positions or orientations can be tuned to detect sounds originating within the user's body as opposed to ambient sounds surrounding the user. Software can be used to select which microphone is given priority for data collection and analysis based on the situation. Each microphone can be optimized to receive a specific range of sound frequencies corresponding to the signal to be measured. The sensing element can be designed to be sensitive to a wide range of frequencies of the acoustic energy generated in the head region, ranging from approximately 0.001 hertz up to approximately 100 kilohertz. 
The sensing element can be sensitive to just a narrow range of frequencies and a multiplicity of sensing elements used to cover a broader range of frequencies. The sensing element can receive the acoustic energy via air transmission, tissue or bone conduction.).

Regarding Claim 10, Shalon et al. teach:   The method of claim 1, wherein the at least one nonverbal audio event is a mechanically-induced event that is external to the user (See rejection of claim 9 and [0071] … ambient noises in the user's immediate surroundings can be monitored through one or more sensors (e.g. microphones) positioned in or around the ear area, on the skull, neck, throat, chest, back or abdomen regions. [0075] The ambient noise or the noise of the speaker can be cancelled out from the microphone input using passive and active means such as a focusing diaphragm, …).

Regarding Claim 11, Shalon et al. teach: The method of claim 1, wherein training the classification module further comprises generating instructions configured to identify a non-verbal audio of a captured audio signal based on the validated and extracted audio features associated to the at least one nonverbal event, the classification module being configured to execute the generated instructions (See rejection of claim 1, specifically [0106] The system can detect the eating patterns of the user over time, thereby building a database of ingestion behavior and "learning" and customizing the performance of the system to the user. [0107] By way of a second example, an aggregated database of the ingestion related motion or acoustic energy patterns detected for a plurality of users can be used to train the algorithms used to convert these patterns into classifications of ingestion behaviors. [0211] Features such as chew count, chew, duration, and chew energy are computed as part of "eating" events. The features are fed to a mapping algorithm that estimates the weight of the ingested food based on a calibration of the known food weight reference available database used to train system 10. Alternative methods of estimating the ingested food weight include HMM's, neural network, and regression trees.).

Regarding Claim 12, Shalon et al. teach: The method of claim 1, wherein training the classification module further comprises adding to the classification module the validated and extracted features associated to the at least one nonverbal event (See rejection of claim 1, specifically [0106] The system can detect the eating patterns of the user over time, thereby building a database of ingestion behavior and "learning" and customizing the performance of the system to the user. [0107] The data generated by the system for each subject can be integrated and/or aggregated into a flat or relational database containing the behavior related activity signatures collected from a plurality of subjects over a time period. By way of example, a database of the physical activity patterns or ingestion patterns of many subjects can be cross referenced to a database of their health, medical, exercise, drug use and/or weight records. By way of a second example, an aggregated database of the ingestion related motion or acoustic energy patterns detected for a plurality of users can be used to train the algorithms used to convert these patterns into classifications of ingestion behaviors. [0206] Sensor unit 12 (bone conduction microphone in this case) records the sounds made by chewing, swallowing, biting, sipping, and drinking. The salient acoustic features are extracted using a statistical-based pattern recognition system to classify the sounds into specific events. The output of the recognizer can be a hypothesized event sequence that can be used to track the flow of ingested food. The accuracy of the hypothesized output can be validated using a database of sounds annotated by a panel of human expert listeners. [0209] A signal processing module extracts a sequence of salient features from the digitized signal. The digitized signal can be parameterized using short-term analysis frames. The frame rate is 1/100 ms and the analysis window is 500 ms. 
[0210] To classify the input signal into an optimal sequence of eating events processing unit 14 can search a state graph that represents all admissible state sequences. Once the HMM parameters are estimated it is possible to search for the most likely state sequence given the observed data. A panel of human expert listeners can annotate training data efficiently using a multimedia recording of the subjects that may include audio, video and position of the mouth or jaw. Once the HMM parameters are estimated it is possible to search for the most likely state sequence given the observed data. The sequence of states is used to hypothesize the sequence of bites, chews, and swallows. Alternative methods of classifying the data include neural networks classifier. Another alternative is to detect chew events using a sliding window. For each window offset a score is computed using a Gaussian mixture model. Alternatively, wavelet methods can be used to classify the relevant chew features.).

Regarding Claim 13, Shalon et al. teach: A system for training a classification module of nonverbal audio events, the system comprising: an electronic earpiece having an in-ear microphone for capturing sound pressure present within an occluded ear canal; a sampler for sampling the stored audio signal present in a data source; a feature extractor for extracting a plurality of audio features from the sampled audio signal and validating the extracted audio features; a nonverbal audio event definer for receiving a nonverbal audio event definition corresponding to the captured audio signal; a trainer for training the classification module by associating the validated plurality of audio features to the received nonverbal audio event definition (See the rejection of claim 1).

Regarding Claim 14, Shalon et al. teach:  The system of claim 13, wherein the trainer is further configured to train the classification module to detect at least one of a health indicator, mood indicator, biosignal indicator, artefact indicator, command indicator, non-user induced event indicator, user induced event indicator (See rejection of claim 1, specifically [0066] The system can also be used to monitor and modify behaviors associated with eating disorders such as bulimia and anorexia, as well as to other behaviors including snoring, sleep apnea, bruxism, smoking, alcohol consumption, drug addiction, exercise and physical training, stuttering, panic disorders, attention deficit, hyperactivity disorders, or other disorders that have unique physiological, sound or motion characteristics (i.e. activities) that can be identified and monitored. [0087] The system can also measure the user's heart rate, heart rate coherence, breathing rate or breathing depth patterns or galvanic skin response to assess their stress, fear or anger level and then provide feedback to reduce the stress by, for example, talking the user through breathing exercises. This could be useful in proactively reducing violent activity and impulsive behavior. The galvanic skin response can also be correlated to the general mood of the user and the system can provide encouraging or funny verbal feedback to improve the user's mood. [0107] The data generated by the system for each subject can be integrated and/or aggregated into a flat or relational database containing the behavior related activity signatures collected from a plurality of subjects over a time period. By way of example, a database of the physical activity patterns or ingestion patterns of many subjects can be cross referenced to a database of their health, medical, exercise, drug use and/or weight records. 
By way of a second example, an aggregated database of the ingestion related motion or acoustic energy patterns detected for a plurality of users can be used to train the algorithms used to convert these patterns into classifications of ingestion behaviors. [0255] System 10 can utilize any physical activity indicator alone or a combination of base metabolic rate, physical activity, gender, age, height, weight, medical records, medical background, and health status of the individual to calculate the caloric expenditure and full energy balance of the user.).

Regarding Claim 15, Shalon et al. teach: The system of claim 13, wherein the trainer is further configured to train the classification module to determine a nonverbal audio event according to an in-ear audio signal captured by a health monitoring system (See rejection of claim 14).

Regarding Claim 16, Shalon et al. teach: The system of claim 13, wherein the trainer is further configured to train the classification module to determine a nonverbal audio event according to an in-ear audio signal captured by an artefact removal system (See rejection of claim 14).

Regarding Claim 17, Shalon et al. teach: The system of claim 13, wherein the trainer is further configured to train the classification module to determine a nonverbal audio event according to an in-ear audio signal captured by a biosignal monitoring system (See rejection of claim 14).

Regarding Claim 18, Shalon et al. teach: The system of claim 13, wherein the trainer is further configured to train the classification module to determine a nonverbal audio event according to an in-ear audio signal captured by a silent interface (See rejection of claim 14 and also see [0071] … ambient noises in the user's immediate surroundings can be monitored through one or more sensors (e.g. microphones) positioned in or around the ear area, on the skull, neck, throat, chest, back or abdomen regions. [0075] The ambient noise or the noise of the speaker can be cancelled out from the microphone input using passive and active means such as a focusing diaphragm, …).

Regarding Claim 19, Shalon et al. teach: The system of claim 13, the system further comprising an audio signal data storage module configured to store the captured audio signal (See rejection of claim 1 and [0107] The data generated by the system for each subject can be integrated and/or aggregated into a flat or relational database containing the behavior related activity signatures collected from a plurality of subjects over a time period. By way of example, a database of the physical activity patterns or ingestion patterns of many subjects can be cross referenced to a database of their health, medical, exercise, drug use and/or weight records. By way of a second example, an aggregated database of the ingestion related motion or acoustic energy patterns detected for a plurality of users can be used to train the algorithms used to convert these patterns into classifications of ingestion behaviors.).

Regarding Claim 20, Shalon et al. teach: The system of claim 13, the trainer being configured to generate instructions configured to identify a non-verbal audio event of a captured audio signal based on the validated plurality of audio features associated to the received nonverbal audio event definition, the classification module being configured to execute the generated instructions (See rejection of claim 11.).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 3-4 are rejected under 35 U.S.C. 103 as being unpatentable over Shalon et al.
Regarding Claim 3, Shalon et al. teach:  See rejection of claim 1, and [0062] The sensor is selected capable of sensing mechanical (e.g. jaw motion), thermal (e.g. body temperature), electrical (e.g. EKG or EMG) or acoustic activities. Acoustic activity is preferably, non-verbal (i.e. does not result from vocal chord vibrations) acoustic energy at a frequency of 0.001 Hz to 100 kHz, which is generated from mechanically-induced vibrations or motion.  [0209] A signal processing module extracts a sequence of salient features from the digitized signal. The digitized signal can be parameterized using short-term analysis frames. The frame rate is 1/100 ms and the analysis window is 500 ms. [0211] Features such as chew count, chew, duration, and chew energy are computed as part of "eating" events.
Shalon et al. do not explicitly teach: wherein the sampling further comprises sampling a frame having a duration ranging between 200 milliseconds and 1200 milliseconds.
However, the frame duration taught by Shalon et al. above (a 500 ms analysis window, [0209]) overlaps or lies inside the claimed range.
In the case where the claimed ranges “overlap or lie inside ranges disclosed by the prior art” a prima facie case of obviousness exists. In re Wertheim, 541 F.2d 257, 191 USPQ 90 (CCPA 1976); In re Woodruff, 919 F.2d 1575, 16 USPQ2d 1934 (Fed. Cir. 1990).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Shalon et al. to sample a frame having a duration ranging between 200 milliseconds and 1200 milliseconds in order to compute a non-verbal event.
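
For illustration only, the numerical comparison underlying this rejection can be sketched as follows. This is a minimal sketch under stated assumptions: it reads the quoted "frame rate is 1/100 ms" as one new frame every 100 ms, which is an interpretation rather than something the reference states outright, and the helper names `frame_bounds` and `within_claimed_range` are hypothetical.

```python
def frame_bounds(total_ms, window_ms=500, hop_ms=100):
    """Return (start, end) times in ms of every full analysis frame."""
    return [(start, start + window_ms)
            for start in range(0, total_ms - window_ms + 1, hop_ms)]


def within_claimed_range(window_ms, low_ms=200, high_ms=1200):
    """Check whether a frame duration lies inside the claimed 200-1200 ms range."""
    return low_ms <= window_ms <= high_ms


# One second of signal, 500 ms window, assumed 100 ms hop.
frames = frame_bounds(1000)
print(len(frames), frames[0], frames[-1])  # 6 (0, 500) (500, 1000)

print(within_claimed_range(500))  # True: the 500 ms window of [0209]
print(within_claimed_range(400))  # True: the 400 ms frame recited in claim 4
```

The check only shows that 500 ms and 400 ms both fall between 200 ms and 1200 ms; it takes no position on the legal conclusion drawn from that fact.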

Regarding Claim 4, Shalon et al. teach: The method of claim 3 wherein the sampling further comprises sampling a 400 milliseconds frame of the in-ear audio signal (See rejection of claim 3, In the case where the claimed ranges “overlap or lie inside ranges disclosed by the prior art” a prima facie case of obviousness exists. In re Wertheim, 541 F.2d 257, 191 USPQ 90 (CCPA 1976); In re Woodruff, 919 F.2d 1575, 16 USPQ2d 1934 (Fed. Cir. 1990).).

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The prior art of record, Abeyratne et al. (US 10098569 B2), teaches: (Abstract) A method of operating a computational device to process patient sounds, the method comprises the steps of: extracting features from segments of said patient sounds; and classifying the segments as cough or non-cough sounds based upon the extracted features and predetermined criteria; and presenting a diagnosis of a disease related state on a display under control of the computational device based on segments of the patient sounds classified as cough sounds.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMAD K ISLAM whose telephone number is (571) 270-5878. The examiner can normally be reached Monday-Friday, EST (IFP).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta can be reached on 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MOHAMMAD K ISLAM/Primary Examiner, Art Unit 2656