DETAILED ACTION
Introduction
This office action is in response to Applicant’s submission filed on 05/16/2022. Claims 1-20 are pending in the application and have been examined.
Notice of Pre-AIA or AIA Status
The present application is being examined under the pre-AIA first to invent provisions.
Response to Amendment
The response filed on 05/16/2022 has been accepted and considered in this Office Action. Claims 1-20 have been examined. Applicant’s amendments to claims 1, 8 and 15, reciting that the embedding model extracts the audio embeddings in order to correlate the one or more audio events with a physiological structure of the target subject, with support in paragraphs [0094]-[0096] of Applicant’s Specification, have been considered in this Office Action.
Response to Arguments
Applicant's arguments filed 05/16/2022 have been fully considered as follows:
Applicant’s arguments with respect to amended claim 1 (based on previous claim 3) on page 11 state that
“….whether the audio signal or portion thereof corresponds with a cough sound. Patel, paras. [0051]- [0052]. However, a cough is not a physiological structure of a subject (e.g., a lung size, a lung capacity, an airway opening area, etc.). Instead, a cough is an action by a subject that generates a sound....”
	
The examiner respectfully disagrees. Patel teaches “Extracted features which are classified as a cough (e.g., determined to correspond to a cough sound) may be stored, as shown in box 250. The extracted features may correspond to a representation of the audio signal or portion thereof identified as corresponding to a cough sound” (Patel, [0053]). The broadest reasonable interpretation of correlating “one or more audio events with a physiological structure of the target subject” includes identifying respiratory or pulmonary issues. Coughing is a type of breathing sound that is used to identify respiratory or pulmonary issues, and persons skilled in the art will recognize that changes in cough sounds can be used to detect other lung conditions, as identified in Patel, [0058]. Accordingly, the rejection of amended claim 1 (previously claim 3) under 35 U.S.C. 103 is sustained and further updated.
With respect to the rejections under 35 U.S.C. 103 of the remaining independent claims 8 and 15 and of the dependent claims, to the extent those claims are argued on the same rationale presented in the Remarks filed 05/16/2022, Examiner respectfully notes as follows. For completeness, should the mentioned claims be likewise traversed for reasons similar to those for independent claim 1, Examiner respectfully directs Applicant to the reasons provided above in the response directed to claim 1. For at least those same reasons, Examiner likewise respectfully disagrees; Applicant's arguments have been fully considered but are not persuasive.
Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1-3, 5-10, 12-17 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over J. H. L. Hansen and T. Hasan, "Speaker Recognition by Machines and Humans: A tutorial review," in IEEE Signal Processing Magazine, vol. 32, no. 6, pp. 74-99, Nov. 2015 in view of Patel et al., US Patent Application Publication 2020/0029929 (referenced in Applicant’s IDS of 8/20/2020) further in view of Lu, L., Liu, L., Hussain, M. J., & Liu, Y. (2017). I sense you by breath: Speaker recognition via breath biometrics. IEEE Transactions on Dependable and Secure Computing, 17(2), 306-319.

Regarding claim 1, Hansen teaches a method, comprising: obtaining, by an electronic device, an audio segment comprising one or more audio events of a target subject (see Hansen, pg. 83, fig. 4, col 2 In automatic speaker recognition, computer programs designed to operate independently with minimum human intervention identify a speaker's voice. The system user may adjust the design parameters, but to make the comparison between speech segments, all the user needs to do is provide the system with the audio recordings); extracting, by the electronic device, audio embeddings from the one or more audio events using an embedding model, the embedding model comprising a trained machine learning model (see Hansen, fig. 4 and pg. 84, col 1 Predefined feature parameters are first extracted from the audio recordings that are designed to capture the idiosyncratic characteristics of a person's speech in mathematical parameters. These features obtained from an enrollment speaker are used to build/train mathematical models that summarize their speaker-dependent properties; mathematical models are interpreted as the embedding model); comparing, by the electronic device, the extracted audio embeddings with a match profile of the target subject, the match profile generated during an enrollment stage (see Hansen, pg. 84, col 1 For an unknown test segment, the same features are then extracted, and they are compared against the model of the enrollment/claimed speaker. The models are designed so that such a comparison provides a score (a scalar value) indicating whether the two utterances are from the same speaker. If this score is higher (or lower) than a predefined threshold then the system accepts (or rejects) the test speaker).
However, Hansen fails to teach wherein the embedding model extracts the audio embeddings in order to correlate the one or more audio events with a physiological structure of the target subject; generating, by the electronic device, a label for the audio segment based on whether or not the extracted audio embeddings match the match profile, wherein the label enables correlation of the audio segment with the target subject for monitoring a health condition of the target subject.  
However, Patel teaches wherein the embedding model extracts the audio embeddings in order to correlate the one or more audio events with a physiological structure of the target subject (see Patel, [0035] when executed by the processing unit(s) 120 (e.g., by one or more processors), may cause the cough detecting device 102 to extract cough features from an audio signal as described herein; cough detection is interpreted as extracting audio embeddings to correlate the one or more audio events with a physiological structure of the target subject).
Hansen and Patel are considered to be analogous to the claimed invention because they relate to speech recognition. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Hansen, on using a trained model to identify and process audio recordings, with the teachings of Patel, on classifying coughs in an audio stream, in order to process a user’s respiratory symptoms without compromising user privacy (see Patel, [0013]).
However, Hansen in view of Patel fails to teach generating, by the electronic device, a label for the audio segment based on whether or not the extracted audio embeddings match the match profile, wherein the label enables correlation of the audio segment with the target subject for monitoring a health condition of the target subject.  

However, Lu teaches generating, by the electronic device, a label for the audio segment based on whether or not the extracted audio embeddings match the match profile, wherein the label enables correlation of the audio segment with the target subject for monitoring a health condition of the target subject (see Lu, pg. 313, sect. 4.3 During verification step, breath samples from unknown speaker are matched to stored and reference models, and a similarity score is calculated. In analogy to training phase, we evaluate GMM, HMM, SVM, ANN and KNN algorithms, and a simple similarity based scheme (we term it as “Decision Maker”, labelled as BreathID in Figs. 7c and 7d)).
Hansen, Patel and Lu are considered to be analogous to the claimed invention because they relate to speaker recognition using neural networks. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Hansen and Patel, on using a trained model to identify and process audio recordings and to classify coughs, with the breath fingerprint processing teachings of Lu, in order to process and verify a user’s identity based on breath features (see Lu, pg. 306, sect. 1).
	Regarding claim 2, Hansen in view of Patel further in view of Lu teach the method of Claim 1. Hansen further teaches performing the enrollment stage by the electronic device, comprising: obtaining enrollment audio data of the target subject, the enrollment audio data comprising samples of one or more enrollment audio events of the target subject (see Hansen, pg. 84, col 1 Predefined feature parameters are first extracted from the audio recordings that are designed to capture the idiosyncratic characteristics of a person's speech in mathematical parameters); extracting enrollment audio embeddings associated with the target subject from the one or more enrollment audio events using the embedding model (see Hansen, pg. 84, col 1, These features obtained from an enrollment speaker are used to build/train mathematical models that summarize their speaker-dependent properties; mathematical models are interpreted as the embedding model); and creating the match profile of the target subject using the extracted enrollment audio embeddings (see Hansen, pg. 84, col. 1, The models are designed so that such a comparison provides a score (a scalar value) indicating whether the two utterances are from the same speaker).
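For illustration only, the enrollment and comparison flow to which claims 1 and 2 are mapped above can be sketched as follows. This is a minimal sketch under stated assumptions and is not Hansen's implementation: the embeddings are assumed to be plain lists of floats, the match profile is assumed to be their element-wise mean, the comparison is assumed to be a squared Euclidean distance, and the function names and threshold are hypothetical.

```python
from statistics import fmean


def create_match_profile(enrollment_embeddings):
    """Average the enrollment embeddings element-wise into one profile vector.

    Assumption: a mean vector stands in for the trained speaker model.
    """
    if not enrollment_embeddings:
        raise ValueError("at least one enrollment embedding is required")
    # zip(*rows) groups the values of each dimension together.
    return [fmean(column) for column in zip(*enrollment_embeddings)]


def verify(profile, test_embedding, threshold):
    """Accept the test embedding when its squared distance to the profile
    is at or below the (hypothetical) threshold."""
    if len(profile) != len(test_embedding):
        raise ValueError("profile and test embedding must have the same length")
    distance = sum((p - t) ** 2 for p, t in zip(profile, test_embedding))
    return distance <= threshold
```

Because the score in this sketch is a distance, a lower value indicates a closer match, which corresponds to the "higher (or lower) than a predefined threshold" wording quoted from Hansen.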
Regarding claim 3, Hansen in view of Patel further in view of Lu teach the method of Claim 2. Patel further teaches wherein: the samples are processed to extract one or more spectral audio features (see Patel, [0056] FIG. 3 is a flowchart of an example of cough reconstruction in accordance with an example of the present invention. In box 310, extracted features from an audio signal may be received. The extracted features may correspond to those, for example, stored in box 250 of FIG. 2. The extracted features may include the mean, normalization constant, and/or phase of the spectrogram in addition to projection scores, which may be used in reconstruction); and the embedding model transforms the spectral audio features to the enrollment audio embeddings in order to correlate the enrollment audio data with the physiological structure of the target subject (see Patel, [0051] So, for example, the frequency-based representation of the audio signal or portion of audio signal may be compared with the cough model including principal components indicative of coughs. The lesser-dimensional matrix provided in box 230 may include a score for each of the principal components of the audio signal or portion thereof based on the vectors of the cough model. A plurality of scores (e.g. one score per eigenvalue) may be obtained in box 230 for use in determining whether or not the audio signal or portion thereof corresponds with a cough; the cough model is interpreted as the embedding model that transforms the spectral features in order to correlate the audio data with the physiological structure of the target subject).

Regarding claim 5, Hansen in view of Patel further in view of Lu teach the method of Claim 1. Hansen further teaches, wherein comparing the extracted audio embeddings with the match profile of the target subject comprises evaluating one or more distancing metrics (see Hansen, pg. 91, col. 2, In [79], the cosine similarity measure-based scoring was proposed for speaker verification. In this measure, the match score between a target and test i-vector w_target and w_test is computed as their normalized dot product).
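As a minimal sketch of the cosine scoring quoted from Hansen in the rejection of claim 5, the normalized dot product can be computed as shown below. The i-vectors are assumed to be plain lists of floats of equal length, and the function name is hypothetical and not taken from Hansen.

```python
import math


def cosine_score(w_target, w_test):
    """Return the normalized dot product of two equal-length vectors."""
    if len(w_target) != len(w_test):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(w_target, w_test))
    norm_target = math.sqrt(sum(a * a for a in w_target))
    norm_test = math.sqrt(sum(b * b for b in w_test))
    if norm_target == 0.0 or norm_test == 0.0:
        raise ValueError("cosine score is undefined for a zero vector")
    return dot / (norm_target * norm_test)
```

The score lies between -1 and 1 and depends only on the angle between the two vectors, not on their magnitudes, so a scaled copy of the target vector scores the same as the target itself.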

Regarding claim 6, Hansen in view of Patel further in view of Lu teach the method of Claim 1. Hansen further teaches, wherein the audio segment comprises audio data received from multiple devices (see Hansen, Fig.1, pg. 76, to consider variability, Figure 1 highlights a range of factors that can contribute to mismatch for speaker recognition. These can be partitioned based on three broad classes: 1) speaker based, 2) conversation based, and 3) technology based. Technology-or external-based variability sources: these include how and where the audio is captured; Fig. 1 depicts audio captured from various devices).
Regarding claim 7, Hansen in view of Patel further in view of Lu teach the method of Claim 1. Patel further teaches, wherein the one or more audio events comprise at least one of a cough, a sneeze, or a speech of the target subject (see Patel [0053] Extracted features which are classified as a cough (e.g., determined to correspond to a cough sound) may be stored, as shown in box 250. The extracted features may correspond to a representation of the audio signal or portion thereof identified as corresponding to a cough sound).
Claim 8 is directed to an electronic device claim corresponding to the method claim presented in claim 1 and is rejected under the same grounds stated above regarding claim 1.
Claim 9 is directed to an electronic device claim corresponding to the method claim presented in claim 2 and is rejected under the same grounds stated above regarding claim 2.
Claim 10 is directed to an electronic device claim corresponding to the method claim presented in claim 3 and is rejected under the same grounds stated above regarding claim 3.
Claim 12 is directed to an electronic device claim corresponding to the method claim presented in claim 5 and is rejected under the same grounds stated above regarding claim 5.
Claim 13 is directed to an electronic device claim corresponding to the method claim presented in claim 6 and is rejected under the same grounds stated above regarding claim 6.
Claim 14 is directed to an electronic device claim corresponding to the method claim presented in claim 7 and is rejected under the same grounds stated above regarding claim 7.
Claim 15 is directed to a non-transitory computer readable medium claim corresponding to the method claim presented in claim 1 and is rejected under the same grounds stated above regarding claim 1.
Claim 16 is directed to a non-transitory computer readable medium claim corresponding to the method claim presented in claim 2 and is rejected under the same grounds stated above regarding claim 2.
Claim 17 is directed to a non-transitory computer readable medium claim corresponding to the method claim presented in claim 3 and is rejected under the same grounds stated above regarding claim 3.
Claim 19 is directed to a non-transitory computer readable medium claim corresponding to the method claim presented in claim 5 and is rejected under the same grounds stated above regarding claim 5.
Claim 20 is directed to a non-transitory computer readable medium claim corresponding to the method claim presented in claim 7 and is rejected under the same grounds stated above regarding claim 7.
Claims 4, 11 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over J. H. L. Hansen and T. Hasan, "Speaker Recognition by Machines and Humans: A tutorial review," in IEEE Signal Processing Magazine, vol. 32, no. 6, pp. 74-99, Nov. 2015 in view of Lu, L., Liu, L., Hussain, M. J., & Liu, Y. (2017). I sense you by breath: Speaker recognition via breath biometrics. IEEE Transactions on Dependable and Secure Computing, 17(2), 306-319, further in view of Patel et al., US Patent Application Publication 2020/0029929 (referenced in Applicant’s IDS of 8/20/2020) further in view of Gerl et al., US Patent Application Publication 2009/0119103.
Regarding claim 4, Hansen in view of Patel further in view of Lu teach the method of Claim 3 but fail to teach the match profile of the target subject comprises a first match profile; and performing the enrollment stage further comprises: creating a second match profile by transforming the first match profile, wherein the first match profile corresponds to a first audio event of the target subject and the second match profile corresponds to a second audio event of the target subject. However, Gerl teaches the match profile of the target subject comprises a first match profile (see Gerl, [0063] At 908, a speaker identification component 810 may identify a speaker. In this method, the segment of the current received utterance is processed to determine likelihood functions with respect to each speaker model within the speaker model set. At start-up, the speaker model set may include the UBM. In time, additional speaker models will be created and used to identify speech; speaker model interpreted as first match profile); and performing the enrollment stage further comprises: creating a second match profile by transforming the first match profile, wherein the first match profile corresponds to a first audio event of the target subject and the second match profile corresponds to a second audio event of the target subject (see Gerl, [0037], [0069] a model adaptation may compare a speaker model that is a member of the speaker model set before and after a potential change. The comparison may determine the divergence or distances between each of the speaker models prior to or after the adaptation. Some systems may determine a Kullback-Leibler entropy. Other systems may execute a cross-correlation. By these exemplary analyses additional processes may be processed with the predetermined criterion to identify a match; speaker model adaptation is interpreted as creating second match profile).
Hansen, Patel, Lu and Gerl are considered to be analogous to the claimed invention because they relate to speech processing. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Hansen, Patel and Lu, on using a trained model to identify and process audio recordings, with the teachings of Gerl on automatically retraining speaker models, in order to reduce the training and storing of numerous voice files (see Gerl, [0006]).
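For illustration only, the kind of model-to-model comparison for which Gerl is cited in the rejection of claim 4 (a Kullback-Leibler measure between speaker models before and after adaptation) can be sketched as follows. This is a simplified sketch under a stated assumption: each speaker model is reduced to a single univariate Gaussian so that the divergence has a closed form, which is not how Gerl's speaker models are described. The function name and parameters are hypothetical.

```python
import math


def gaussian_kl(mean_p, std_p, mean_q, std_q):
    """Kullback-Leibler divergence KL(P || Q) between two univariate
    Gaussians P = N(mean_p, std_p^2) and Q = N(mean_q, std_q^2).

    Assumption: one Gaussian per speaker model, for illustration only.
    """
    if std_p <= 0.0 or std_q <= 0.0:
        raise ValueError("standard deviations must be positive")
    return (
        math.log(std_q / std_p)
        + (std_p ** 2 + (mean_p - mean_q) ** 2) / (2.0 * std_q ** 2)
        - 0.5
    )
```

A divergence of zero indicates that the model is unchanged by the adaptation; larger values indicate that the adapted model has moved further from the original.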
Claim 11 is directed to an electronic device claim corresponding to the method claim presented in claim 4 and is rejected under the same grounds stated above regarding claim 4.
Claim 18 is directed to a non-transitory computer readable medium claim corresponding to the method claim presented in claim 4 and is rejected under the same grounds stated above regarding claim 4.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Sun, X., Lu, Z., Hu, W., & Cao, G. (2015, September). SymDetector: detecting sound related respiratory symptoms using smartphones. In Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing (pp. 97-108) teaches an embedding model that extracts the audio embeddings in order to correlate the one or more audio events with a physiological structure of the target subject (pg. 97, col 2, SymDetector detects four types of respiratory symptoms (i.e., sneeze, cough, sniffle and throat clearing)).
Baughman et al., US Patent 8,589,167 teaches comparison of different models to determine speaker authentication (see Baughman, col 12, lines 32-47).
M. Zhang, Y. Chen, L. Li and D. Wang, "Speaker recognition with cough, laugh and "Wei"," 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2017, pp. 497-501 proposes a speaker recognition task with speech events, such as cough and laugh (see Zhang, abstract).
      Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NANDINI SUBRAMANI whose telephone number is (571)272-3916. The examiner can normally be reached Monday - Friday 12:00pm - 5:00 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh M Mehta can be reached on (571)272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/NANDINI SUBRAMANI/Examiner, Art Unit 2656                                                                                                                                                                                                        
/BHAVESH M MEHTA/Supervisory Patent Examiner, Art Unit 2656