DETAILED ACTION
Introduction
This office action is in response to Applicant’s submission filed on 03/02/2021. Claims 1-10 are pending in the application and have been examined.
	
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claims 4 and 10 are objected to because they include reference characters which are not enclosed within parentheses.
Reference characters corresponding to elements recited in the detailed description of the drawings and used in conjunction with the recitation of the same element or group of elements in the claims should be enclosed within parentheses so as to avoid confusion with other numbers or characters which may appear in the claims.  See MPEP § 608.01(m).
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: 
sound acquisition module in claims 1, 6 and 10;
sound audio feature extraction module in claims 1, 8 and 10;
neural network module in claims 1, 2, 3 and 5.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5 and 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over Alam et al., US Patent Application Publication 2020/0046244 in view of Kiranyaz, Serkan, et al. "Real-time pcg anomaly detection by adaptive 1d convolutional neural networks." (2019).
Regarding claim 1, Alam teaches a system for offline embedded abnormal sound fault detection, comprising an embedded end system, wherein the embedded end system comprising a sound acquisition module, a sound audio feature extraction module, and a neural network module (see Alam, [0028] comprises the neural network which when executed by the one or more hardware processors 104 perform the methodology (or methodologies));
the sound acquisition module converting sound from a sound source to be detected into an audio digital signal, and then transmitting the audio digital signal to the sound audio feature extraction module (see Alam, [0029] In an embodiment of the present disclosure, at step 202, the one or more hardware processors 104 receive, via the neural network architecture system 100, a plurality of heart sound signals specific to one or more users; interpreted as sound acquisition module);
the sound audio feature extraction module processing the audio digital signal in a frequency domain to obtain an audio frequency sample as an input of the neural network module (see Alam, [0036] Referring to step 206, in an embodiment of the present disclosure, the one or more hardware processors 104 extract a set of frequency domain based spectrogram features from each of the plurality of windows);
the neural network module consisting of at least one CNN feature extraction layer and at least one fully connected layer and at least one classification layer (see Alam, [0065] The Classifier may classify machine operation based on a classifier (1006). A machine learning model and multi-label classification may be accessed to estimate machine statuses from sound features. For example, a convolutional neural network (CNN) may be utilized for the machine learning model. In the case of a 1D CNN, each label may have binary values. To train the 1D CNN model, maximizing F1 score was used which is effective to multi-label classification);
after the at least one CNN feature extraction layer performs feature extraction on the audio frequency sample, the at least one fully connected layer and the at least one classification layer selecting one of a plurality of anomaly types as an anomaly detection result to complete anomaly classification (see Alam, [0054] In an embodiment of the present disclosure, at step 214, the plurality of heart sound signals are classified as one of a normal sound signal or a murmur sound signal based on the concatenated behavioral set. This classification is performed using the fully connected layers (e.g., 1st fully connected layer and 2nd fully connected layer) and a softmax layer associated with the neural network architecture system 100 depicted in FIG. 4);
working parameters of the neural network module being determined by an abnormal sound detection model (see Alam, [0050-0053] at step 210, the first Deep Neural Network learns a temporal behavior based on the set of time based Mel Frequency Cepstral Coefficient (MFCC) features and the second Deep Neural Network learns a spatial behavior based on the set of frequency based Spectrogram features. At step 212, the learned temporal behavior and the learned spatial behavior are concatenated (or linked together) to obtain a concatenated behavioral set; the behavioral set is interpreted as the working parameters determined by an abnormal sound detection model);
the number of the anomaly types being determined by the abnormal sound detection model (see Alam, [0054] At step 214, the plurality of heart sound signals are classified as one of a normal sound signal or a murmur sound signal based on the concatenated behavioral set; normal or murmur is interpreted as the anomaly types).
However, Alam fails to teach the number of network layers of the at least one CNN feature extraction layer being dynamically adjustable; and a network structure of the at least one fully connected layer and the at least one classification layer being determined according to the number of the anomaly types and being dynamically variable, and the anomaly types to be outputted comprising N types of anomalies, unrecognized anomaly, and no anomaly.
However, Kiranyaz teaches the number of network layers of the at least one CNN feature extraction layer being dynamically adjustable (see Kiranyaz, sect III pg. 3, we used an adaptive 1D CNN at the core of the system that is trained by recordings as N, A, or too noisy to evaluate; the adaptive 1D CNN is interpreted as being dynamically adjustable); and a network structure of the at least one fully connected layer and the at least one classification layer being determined according to the number of the anomaly types and being dynamically variable, and the anomaly types to be outputted comprising N types of anomalies, unrecognized anomaly, and no anomaly (see Kiranyaz, sect III pg. 3, we used an adaptive 1D CNN at the core of the system that is trained by recordings as N, A, or too noisy to evaluate. Fig. 2 Classification Abnormal (Anomaly), Normal (no Anomaly) or bad (unrecognized anomaly)).
Alam and Kiranyaz are considered to be analogous to the claimed invention because they relate to anomaly detection compared to normal sound using Deep Neural Networks. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Alam on processing and classifying the sound signals using the fully connected deep neural networks with the adaptive 1D Convolutional Neural Network (CNN) teachings of Kiranyaz to achieve a real-time processing ability with significantly lower delay and computational complexity (see Kiranyaz pg. 1, sect I).
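For illustration only, the relationship recited in claim 1 between the number of anomaly types and the structure of the classification layers may be sketched as follows. This is a minimal sketch and is not taken from Alam, Kiranyaz, or Applicant's specification. The function name build_layer_spec, the dictionary layout, the default of 64 fully connected units, and the use of a softmax output are assumptions made for this example. It shows only that the output width follows from N anomaly types plus an unrecognized anomaly class and a no anomaly class, and that the number of CNN feature extraction layers is a parameter.

```python
def build_layer_spec(num_cnn_layers, anomaly_types, fc_units=64):
    """Return (layers, classes) for a hypothetical CNN classifier description.

    num_cnn_layers: adjustable count of CNN feature extraction layers.
    anomaly_types:  names of the N known anomaly types.
    fc_units:       assumed width of the fully connected layer.
    """
    if num_cnn_layers < 1:
        raise ValueError("at least one CNN feature extraction layer is required")

    # Output classes: the N anomaly types, plus the two extra outcomes
    # named in claim 1 (unrecognized anomaly and no anomaly).
    classes = list(anomaly_types) + ["unrecognized_anomaly", "no_anomaly"]

    layers = [{"type": "conv1d", "index": i} for i in range(num_cnn_layers)]
    layers.append({"type": "fully_connected", "units": fc_units})
    # The classification layer width is derived from the class list.
    layers.append({"type": "softmax", "units": len(classes)})
    return layers, classes
```

Under these assumptions, calling build_layer_spec(3, ["bearing", "belt"]) describes three convolutional layers, one fully connected layer, and a four-way classification layer.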
Regarding claim 5, Alam in view of Kiranyaz teaches the system for offline embedded abnormal sound fault detection according to claim 1. Alam further teaches wherein the input of the neural network module is sample information formed by splicing audio frequency samples of a current frame and consecutive N frames prior to the current frame (see Alam, [0034] In other words, each heart sound signal is split/divided into one or more windows (e.g., time analysis windows or time windows). Each window from the plurality of windows is of a fixed time duration (e.g., say 4 seconds window)).
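For illustration only, the splicing of a current frame with N consecutive prior frames recited in claim 5 may be sketched as follows. This is one possible reading of the claim language and is not taken from Alam or from Applicant's specification. The class name FrameSplicer and the choice to return None until enough frames have arrived are assumptions made for this example.

```python
from collections import deque


class FrameSplicer:
    """Concatenate the current frame with the N frames that preceded it."""

    def __init__(self, n_prior):
        if n_prior < 0:
            raise ValueError("n_prior must be non-negative")
        # The buffer holds N prior frames plus the current frame.
        self.buffer = deque(maxlen=n_prior + 1)

    def push(self, frame):
        """Add a frame; return the spliced sample, or None while warming up."""
        self.buffer.append(list(frame))
        if len(self.buffer) < self.buffer.maxlen:
            return None
        # Oldest frame first, current frame last.
        return [value for f in self.buffer for value in f]
```

With n_prior set to 2, the third frame pushed yields the first spliced sample, and each later frame slides the context forward by one frame.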
Regarding claim 8, Alam in view of Kiranyaz teaches the system for offline embedded abnormal sound fault detection according to claim 1. Alam further teaches wherein the sound audio feature extraction module extracts the audio frequency sample by fast Fourier transform (see Alam, [0038] 2. On each window, Fast Fourier Transform (FFT) is applied with a Hamming window of the length of ‘z’ (e.g., where value of ‘z’ is 128)).
Regarding claim 9, Alam in view of Kiranyaz teaches the system for offline embedded abnormal sound fault detection according to claim 8. Alam further teaches wherein the fast Fourier transform is 512-point fast Fourier transform (see Alam, [0042] 2. On each window Discrete Fourier Transform (DFT) D.sub.i(k) (k∈[1, K], where K is length of DFT are applied with Hamming window of length of ‘c’ (e.g., value of ‘c’ is 50); K is assumed to be 512).
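For illustration only, a Hamming-windowed 512-point fast Fourier transform of the kind discussed for claims 8 and 9 may be sketched as follows. This is a minimal sketch using a textbook radix-2 FFT in pure Python and is not the implementation of Alam or of Applicant. The function names, the zero padding of shorter frames up to 512 points, and the return of only the non-negative frequency bins are assumptions made for this example.

```python
import cmath
import math


def hamming(n):
    """Symmetric Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def magnitude_spectrum(frame, n_fft=512):
    """Window a frame, zero-pad it to n_fft points, and return |X[k]|
    for the bins k = 0 .. n_fft / 2."""
    if n_fft < 2 or n_fft & (n_fft - 1):
        raise ValueError("n_fft must be a power of two")
    if not 2 <= len(frame) <= n_fft:
        raise ValueError("frame length must be between 2 and n_fft")
    window = hamming(len(frame))
    padded = [s * w for s, w in zip(frame, window)]
    padded += [0.0] * (n_fft - len(frame))
    return [abs(v) for v in fft(padded)[: n_fft // 2 + 1]]
```

A 512-point transform yields 257 non-negative frequency bins. A frame of 128 samples, matching the example window length cited from Alam [0038], is zero-padded to 512 points under the assumption stated above.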
Claims 2-3 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Alam et al., US Patent Application Publication 2020/0046244 in view of Kiranyaz, Serkan, et al. "Real-time pcg anomaly detection by adaptive 1d convolutional neural networks." (2019) further in view of Salekin et al., US 2021/0005067.
Regarding claim 2, Alam in view of Kiranyaz teaches the system for offline embedded abnormal sound fault detection according to claim 1. Alam further teaches wherein the neural network module further comprises a long short-term memory (LSTM) layer (see Alam, [0049] Referring to step 208, in an embodiment of the present disclosure, the one or more hardware processors 104 concurrently input the set of MFCC features to two parallel deep neural networks are proposed, namely, Recurrent Neural Network based Bidirectional Long Short-Term Memory (BiLSTM) and Convolutional Neural Network (CNN)). However, Alam in view of Kiranyaz fails to teach the LSTM layer processes output of the at least one CNN feature extraction layer, performs screening of time dimension information, and then sends output to the at least one fully connected layer and the at least one classification layer.
However, Salekin teaches the LSTM layer processes output of the at least one CNN feature extraction layer, performs screening of time dimension information, and then sends output to the at least one fully connected layer and the at least one classification layer (see Salekin, [0065] FIG. 6 shows a logical flow diagram illustrating the operations of the BLSTM classifier model 38 of the audio event detection program 30. The DCNN audio tagging model 34 receives as an input the sequence of audio vector representations v.sub.1, . . . , v.sub.N for the individual audio clip 102. The BLSTM classifier model 38 is configured to determine for each window segment S.sub.i whether it includes the target audio event or does not include the target audio event. In this way, the BLSTM classifier model 38 determines the boundaries in time of the target audio event within the individual audio clip 102; BLSTM classifier interpreted as screening of time dimension information).
Alam, Kiranyaz and Salekin are considered to be analogous to the claimed invention because they relate to audio anomaly detection using Deep Neural Networks. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Alam and Kiranyaz on processing and classifying the sound signals using the fully connected deep neural networks with the LSTM and CNN combination for audio event detection teachings of Salekin with the goal of understanding the environment and detecting events and anomalies, which can be useful in a variety of applications such as smart homes and smart cars (see Salekin [0005]).
Regarding claim 3, Alam in view of Kiranyaz further in view of Salekin teaches the system for offline embedded abnormal sound fault detection according to claim 2. Kiranyaz further teaches wherein the neural network module further comprises a trigger decision layer (see Kiranyaz, Fig. 4 Final decision layer); and the trigger decision layer performs final classification for the output of the at least one fully connected layer and the at least one classification layer to eliminate generalization errors (see Kiranyaz, pg. 4 sect III 1D CNN can be used directly or alternatively can be processed by a majority rule to obtain the final class decision of the entire stream, e.g., the stream is classified as A if more than 25% of the beats are abnormal).
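For illustration only, the majority rule quoted from Kiranyaz (the stream is classified as A if more than 25% of the beats are abnormal) may be sketched as follows. This is a minimal sketch and is not code from Kiranyaz. The function name classify_stream, the label strings, and the handling of an empty stream are assumptions made for this example.

```python
def classify_stream(beat_labels, abnormal_label="A", normal_label="N",
                    abnormal_fraction=0.25):
    """Return the stream-level class from per-beat class labels.

    The stream is abnormal when strictly more than abnormal_fraction
    of its beats carry the abnormal label.
    """
    if not beat_labels:
        raise ValueError("at least one beat label is required")
    n_abnormal = sum(1 for label in beat_labels if label == abnormal_label)
    if n_abnormal / len(beat_labels) > abnormal_fraction:
        return abnormal_label
    return normal_label
```

Because the comparison is strict, a stream in which exactly one beat in four is abnormal is still classified as normal under this sketch.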
Regarding claim 10, Alam in view of Kiranyaz further in view of Salekin teaches the system for offline embedded abnormal sound fault detection according to claim 3. Alam further teaches a method, comprising the following steps of:
step 201) using the sound acquisition module to collect the sound from the sound source to be detected to obtain the audio digital signal (see Alam, [0029] In an embodiment of the present disclosure, at step 202, the one or more hardware processors 104 receive, via the neural network architecture system 100, a plurality of heart sound signals specific to one or more users; interpreted as sound acquisition module);
step 202) using the sound audio feature extraction module to process the audio digital signal in the frequency domain to obtain the audio frequency sample of the audio digital signal (see Alam, [0036] Referring to step 206, in an embodiment of the present disclosure, the one or more hardware processors 104 extract a set of frequency domain based spectrogram features from each of the plurality of windows);
step 203.1) using the at least one CNN feature extraction layer to perform convolution on the audio frequency sample to complete the feature extraction (see Alam, [0065] The Classifier may classify machine operation based on a classifier (1006). A machine learning model and multi-label classification may be accessed to estimate machine statuses from sound features. For example, a convolutional neural network (CNN) may be utilized for the machine learning model. In the case of a 1D CNN, each label may have binary values. To train the 1D CNN model, maximizing F1 score was used which is effective to multi-label classification);
step 203.2) using the LSTM layer to screen time dimension information of the feature extracted (see Alam, [0051] the BiLSTM learns the temporal trend of MFCC's sequences of the heart sound signals; temporal interpreted as screen time dimension information);
step 203.3) using the at least one fully connected layer and the at least one classification layer to complete anomaly classification (see Alam, [0054] at step 214, the plurality of heart sound signals are classified as one of a normal sound signal or a murmur sound signal based on the concatenated behavioral set. This classification is performed using the fully connected layers (e.g., 1st fully connected layer and 2nd fully connected layer) and a softmax layer associated with the neural network architecture system 100 depicted in FIG. 4); and
step 204) using the trigger decision layer to perform final classification, eliminating generalization errors, and obtaining an anomaly detection result (see Alam, [0057], In the first convolutional layer, the input was convolved with 4 filters of size 3×3, in an example embodiment. Batch normalization followed by Rectified Linear Unit (ReLU) activation function was applied on the output of the convolutional filter. First max-pooling layer summarizes and reduces the size of filters using 2×2 kernel. Similarly, two subsequent convolutional layers convolve output of max-pooling layers using 3×3 filter followed by batch normalization and ReLU activations. Final activation output is then flattened and fed to fully connected layer with 128 units. To reduce over-fitting, L2-Regularization was used over all the layers in CNN).
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Alam et al., US Patent Application Publication 2020/0046244 in view of Kiranyaz, Serkan, et al. "Real-time pcg anomaly detection by adaptive 1d convolutional neural networks." (2019) further in view of Salekin et al., US 2021/0005067 further in view of Huang et al., US Patent Application Publication 2020/0293653.
Regarding claim 4, Alam in view of Kiranyaz further in view of Salekin teaches the system for offline embedded abnormal sound fault detection according to claim 3. Alam teaches step 101) obtaining an abnormal sound detection result of the at least one fully connected layer and the at least one classification layer (see Alam, [0054] At step 214, the plurality of heart sound signals are classified as one of a normal sound signal or a murmur sound signal based on the concatenated behavioral set). However, Alam in view of Kiranyaz further in view of Salekin fails to teach proceeding to step 102 if the abnormal sound detection result is one of the N types of anomalies or the unrecognized anomaly, otherwise proceeding to step 105; step 102) incrementing a counter and proceeding to step 103; step 103) if, in L frames, a number of times that the abnormal sound detection result is a same anomaly is greater than or equal to a threshold, proceeding to step 104, otherwise proceeding to step 105; step 104) resetting the counter, reporting an anomaly, and ending the workflow; and step 105) resetting the counter, and ending the workflow without reporting an anomaly.
However, Huang teaches proceeding to step 102 if the abnormal sound detection result is one of the N types of anomalies or the unrecognized anomaly, otherwise proceeding to step 105 (see Huang, [0082] A difference between the predicted system call and the observed system call may be considered an anomaly); step 102) incrementing a counter and proceeding to step 103 (see Huang, [0082] In some illustrative embodiments, the probability comparison and alert generation logic 450 may maintain a count of detected anomalies over a period of time); step 103) if, in L frames, a number of times that the abnormal sound detection result is a same anomaly is greater than or equal to a threshold, proceeding to step 104, otherwise proceeding to step 105 (see Huang, [0082-0083] when the count reaches a threshold number of anomalies, an alert notification may be generated and output/logged); step 104) resetting the counter, reporting an anomaly, and ending the workflow (see Huang, [0103] Based on the identified anomalies, alerts indicating the anomalies are generated and logged/transmitted in order to inform appropriate personnel of a potential attack on the computing system resources of the monitored computing environment (step 790). The operation then terminates); and step 105) resetting the counter, and ending the workflow without reporting an anomaly (see Huang, [0082] The predicted system call sequence 440 may be compared to the actual observed system call sequence 405 to determine if there are differences. As a result, probability comparison and alert generation logic 450 may operate to determine whether to generate an alert/log of an anomaly or not).
Alam, Kiranyaz, Salekin and Huang are considered to be analogous to the claimed invention because they relate to audio anomaly detection using Deep Neural Networks. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Alam and Kiranyaz on processing and classifying the sound signals using the fully connected deep neural networks with the alert notification teachings of Huang with the goal of generating an alert notification that may be logged and/or transmitted to a system administrator or other authorized personnel so that they may investigate whether the anomaly is indeed part of an attack or not and perform appropriate corrective actions (see Huang [0082]).
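For illustration only, the counting workflow of steps 101 through 105 recited in claim 4 may be sketched as follows. This is one possible reading of the claim language and is not taken from Huang or from Applicant's specification. The class name TriggerDecision, the label string "no_anomaly", and the use of a sliding window over the last L frame results as the counter are assumptions made for this example.

```python
from collections import Counter, deque

NO_ANOMALY = "no_anomaly"  # assumed label for the no anomaly class


class TriggerDecision:
    """Report an anomaly only when the same anomaly label occurs at least
    `threshold` times within the last `window_frames` frame results."""

    def __init__(self, window_frames, threshold):
        if threshold < 1 or window_frames < threshold:
            raise ValueError("need 1 <= threshold <= window_frames")
        self.threshold = threshold
        self.recent = deque(maxlen=window_frames)  # last L frame results

    def update(self, label):
        """Process one frame result; return the reported anomaly or None."""
        # Step 101: obtain the per-frame detection result.
        self.recent.append(label)
        if label == NO_ANOMALY:
            # Step 105 path: nothing is reported for this frame.
            return None
        # Steps 102 and 103: count occurrences of this anomaly in the window.
        if Counter(self.recent)[label] >= self.threshold:
            # Step 104: reset the count and report the anomaly.
            self.recent.clear()
            return label
        return None
```

With a window of 5 frames and a threshold of 3, the frame results "a", "no_anomaly", "a", "a" cause "a" to be reported on the fourth frame, after which counting starts again from an empty window.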
Claims 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over Alam et al., US Patent Application Publication 2020/0046244 in view of Kiranyaz, Serkan, et al. "Real-time pcg anomaly detection by adaptive 1d convolutional neural networks." (2019) further in view of Salonidas et al., US Patent 9,892,744.
Regarding claim 6, Alam in view of Kiranyaz teaches the system for offline embedded abnormal sound fault detection according to claim 1. However, Alam in view of Kiranyaz fails to teach wherein the sound acquisition module uses a digital microphone as an acquisition device.
However, Salonidas teaches wherein the sound acquisition module uses a digital microphone as an acquisition device (see Salonidas col. 6 lines 26-28. The sensing unit 220 may include an acoustic sensor, such as, for example, a microphone).
Alam, Kiranyaz and Salonidas are considered to be analogous to the claimed invention because they relate to audio anomaly detection using Deep Neural Networks. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Alam and Kiranyaz on processing and classifying the sound signals using the fully connected deep neural networks with the monitoring of multiple machines teachings of Salonidas to identify degradation or impending issues in a machine's operation performance (see Salonidas col. 1 lines 50-55).
Regarding claim 7, Alam in view of Kiranyaz further in view of Salonidas teaches the system for offline embedded abnormal sound fault detection according to claim 6. Salonidas further teaches wherein an audio sampling frequency of the digital microphone is 48 KHz (see Salonidas, col. 6 lines 33-37 For example, the sampling rate and dynamic range characteristics of the sensing unit 220 may be chosen so as to allow for sampling of received signals at sampling rates of, for example, between approximately 20 Hz-48 kHz, for acoustic signals ranging in magnitude).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Yelchuru et al., US Patent 10,475,468 teaches monitoring the equipment using audio may provide an early indication of a fault occurring in the equipment (see Yelchuru, Fig. 2).
Koizumi et al., US Patent Application Publication 2019/0376840 teaches training/detection flow for unsupervised anomalous sound detection (see Koizumi, Fig. 7).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NANDINI SUBRAMANI whose telephone number is (571)272-3916. The examiner can normally be reached Monday - Friday, 12:00 pm - 5:00 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh M Mehta can be reached on (571)272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/NANDINI SUBRAMANI/Examiner, Art Unit 2656
/BHAVESH M MEHTA/Supervisory Patent Examiner, Art Unit 2656