Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION

Priority
Claims 4-7, 15, 18, 21 are not entitled to the provisional filing date of 6/22/18 because the provisional application fails to provide necessary support for the use of a neural network for beat detection. At best, the provisional application asserts that a neural network has the potential to operate as a classifier and that such an implementation is not practical due to computational complexity. As such, claims 4-7, 18 are afforded the priority date of the instant non-provisional application, 6/3/19.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-3, 12-14, 16, 17, 22-25 are rejected under 35 U.S.C. 103 as being unpatentable over Theill: 20170180875 (published 6/21/2017 as EP3182729) and further in view of LaBoeuf: 20110075851 hereinafter LaB.
Regarding claim 1, 24
Theill teaches:
A music classifier for an audio device, hearing aid, etc. (Theill: Abstract) comprising: 
a signal conditioning unit or stage configured to receive and transform a digitized, time-domain audio signal into a corresponding frequency domain signal including a plurality of frequency bands (Theill: ¶ 19; Fig 1: microphone receives an input audio signal; an A/D converter digitizes input audio into a digital domain representation thereof; filterbank divides the digital representation of the input signal or signals into a plurality of input frequency band signals); 
a plurality of decision making units operating in parallel that are each configured to evaluate one or more of the plurality of frequency bands to determine a plurality of feature 
the plurality of decision making units including 
a modulation activity tracking unit, configured to output a feature score for modulation activity based on a first value of an averaged wideband energy of the plurality of frequency bands and a second value of the averaged wideband energy of the plurality of frequency bands (Theill: ¶ 19-30, 35-42, 46-69; Fig 2: the recited modulation activity tracking must be considered as any tracking of energy values of an incoming signal sufficient to establish and/or distinguish changes, divergence, etc. of tonality, beat and/or energy such as the figure 6 activity measurement module with relation to the pattern decision module; in Theill, filterbank 102 in concert with classifier 104 and feature extractor 201 therein, operate to track modulation activity of an input in the form of cepstral coefficient(s) output from the feature extractor to the classifier and comprising a moving average and taken as a value stream represent summed sub bands and comprise a measure of wideband energy used to track at least a minimum and a maximum energy value of groups of values which result in a measurement of pure tones in the signal collectively optimized such that a set of most recently identified symbols is persisted in a circular buffer and used to track a modulation activity among the cepstral coefficients by computing a running probability estimate using the circular buffer to optimize processing of the input signal using the tracked values; Theill further discloses the utility of averaging the probability estimates for at least the purpose of further smoothing the signal values; the Theill base class classifier 204 operates to output scores representative of at least quiet, urban noise, 
a tone detection unit configured to output feature scores for tone in each frequency band based on (i) an amount of energy in the frequency band (Theill: ¶ 19-30, 45-50; 55-69, 72-83; Fig 1-3: filterbank 102 in concert with classifier 104, feature extractor 201 therein, operates to detect tone and output MFCC values which comprise a representation of spectral energy in a plurality of frequency bands) 
a combination and music detection unit configured to combine the plurality of feature scores over a period of time to determine if the audio signal includes music (Theill: ¶ 19-30, 45-50; 55-69, 72-83; Fig 1-3: plurality of provided feature vectors provides multiple base classes for the input the base classes comprising a probability of the input comprising a particular class of sound; the classes including a likelihood that the input comprises music).
Theill strongly suggests but does not explicitly teach tracking a ratio of first, second, etc. wideband energy values. However, Examiner has taken official notice, which Applicant has failed to timely traverse, and it is thus accepted as Applicant’s Admitted Prior Art (AAPA: please see MPEP 2144.03) that the tracking of a ratio such as by the Theill taught maximum and minimum values would have comprised an obvious inclusion. Indeed this must be considered the essence of operation of the base class classifier which determines mixtures of input numerical valued MFCC features to represent particular classes such as quiet, urban noise, transportation noise, party noise, music, quiet speech, urban noise and speech, transportation noise and speech, party noise, etc.
Theill strongly suggests but does not explicitly teach a tone detection unit operable to determine (ii) a variance of the energy in the frequency band based on a first order differentiation.
In a related field of endeavor LaB teaches the well-known utility of filtering audio using a set of extracted features for performing classification, labelling, and/or other learning processes on audio (LaB: Abstract: ¶ 10, 18-23, 26-42, 47-55) wherein the features include mean and variance of statistical features of the audio as well as first and/or second derivatives of cepstral coefficients which describe the variance of the change in spectral energy at a plurality of frequencies (LaB: Abstract: ¶ 10, 18-23, 26-42, 47-55). It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to utilize well known metadata parameters such as those described by LaB within the Theill system and method. The average skilled practitioner would have been motivated to do so for the purpose of clustering determined features into base classes, genre, etc. and/or determining partitions of feature spaces in a classifier. The average skilled practitioner would have expected only predictable results from such an inclusion.
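For illustration only, the recited tone features, (i) an amount of energy in a band and (ii) a variance of that energy based on a first order differentiation, may be sketched as follows. This is a minimal sketch and is not taken from Theill, LaB, or the instant application; the function names and the choice of population variance over frame-to-frame differences are assumptions made for the example.

```python
from statistics import pvariance


def first_difference(values):
    """First order differentiation: difference between consecutive frames."""
    return [b - a for a, b in zip(values, values[1:])]


def tone_features(band_energies):
    """Compute illustrative tone features for ONE frequency band.

    band_energies: per-frame energy values of the band (hypothetical input).
    Returns the mean energy and the variance of the first-order difference.
    """
    if len(band_energies) < 2:
        raise ValueError("need at least two frames to differentiate")
    diffs = first_difference(band_energies)
    return {
        "mean_energy": sum(band_energies) / len(band_energies),
        # Assumption: population variance of the frame-to-frame change.
        "delta_variance": pvariance(diffs),
    }
```

Under these assumptions a steady band (for example a sustained pure tone) yields a delta variance near zero, while a fluctuating band yields a larger value.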
 
Regarding claim 16
Theill teaches or suggests:
A method for music detection in an audio signal, the method comprising: 
receiving an audio signal (Theill: ¶ 19; Fig 1: microphone receives an input audio signal); 
digitizing the audio signal to obtain a digitized audio signal (Theill: ¶ 19; Fig 1: an A/D converter digitizes input audio into a digital domain representation thereof); 

applying the plurality of frequency bands to a plurality of decision making units operating in parallel (Theill: ¶ 19-30; Fig 1-3: output of filterbank provided to parallel decision making units 201, 202, 203 output of feature extractor 201 applied to parallel classifier unit 204), 
the plurality of decision making units including a modulation activity tracking unit, configured to output a feature score for modulation activity based on a first value of an averaged wideband energy of the plurality of frequency bands and a second value of the averaged wideband energy of the plurality of frequency bands (Theill: ¶ 19-30, 35-42, 46-69; Fig 2: the recited modulation activity tracking must be considered as any tracking of energy values of an incoming signal sufficient to establish and/or distinguish changes, divergence, etc. of tonality, beat and/or energy such as the figure 6 activity measurement module with relation to the pattern decision module; in Theill, filterbank 102 in concert with classifier 104 and feature extractor 201 therein, operate to track modulation activity of an input in the form of cepstral coefficient(s) output from the feature extractor to the classifier and comprising a moving average and taken as a value stream represent summed sub bands and comprise a measure of wideband energy used to track at least a minimum and a maximum energy value of groups of values which result in a measurement of pure tones in the signal collectively optimized such that a set of most recently identified symbols is persisted in a circular buffer and used to track a modulation activity among the cepstral coefficients by computing a running probability estimate using the circular buffer to optimize processing of the input signal using the tracked values; Theill further discloses the utility of averaging the probability estimates for at least the purpose 
a tone detection unit configured to output feature scores for tone in each frequency band based on (i) an amount of energy in the frequency band (Theill: ¶ 19-30, 45-50; 55-69, 72-83; Fig 1-3: filterbank 102 in concert with classifier 104, feature extractor 201 therein, operates to detect tone and output MFCC values which comprise a representation of spectral energy in a plurality of frequency bands) 
obtaining a feature score from each of the plurality of decision making units (Theill: ¶ 19-30; Fig 1-3: output of parallel decision making units 201, 202, 203 comprise features output to parallel classifier unit 204 and/or final classifier 205);
 the feature score from each decision making unit corresponding to a probability that a particular music characteristic is included in the audio signal (Theill: ¶ 19-30, 45-50; 55-69, 72-83; Fig 1-3: plurality of provided feature vectors provides multiple base classes for the input the base classes comprising a probability of the input comprising a particular class of sound the classes including a likelihood that the input comprises music); and 
combining the feature scores to detect music in the audio signal  (Theill: ¶ 19-30, 45-50; 55-69, 72-83; Fig 1-3: plurality of provided feature vectors provides multiple base classes for the input; the base classes comprising a probability of the input comprising a particular class of sound the classes including a likelihood that the input comprises music). 
Theill strongly suggests but does not explicitly teach tracking a ratio of first, second, etc. wideband energy values. However, Examiner has taken official notice which Applicant has failed 
Theill strongly suggests but does not explicitly teach a tone detection unit operable to determine (ii) a variance of the energy in the frequency band based on a first order differentiation.
In a related field of endeavor LaB teaches the well-known utility of filtering audio using a set of extracted features for performing classification, labelling, and/or other learning processes on audio (LaB: Abstract: ¶ 10, 18-23, 26-42, 47-55) wherein the features include mean and variance of statistical features of the audio as well as first and/or second derivatives of cepstral coefficients which describe the variance of the change in spectral energy at a plurality of frequencies (LaB: Abstract: ¶ 10, 18-23, 26-42, 47-55). It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to utilize well known metadata parameters such as those described by LaB within the Theill system and method. The average skilled practitioner would have been motivated to do so for the purpose of clustering determined features into base classes, genre, etc. and/or determining partitions of feature spaces in a classifier. The average skilled practitioner would have expected only predictable results from such an inclusion.

Regarding claim 2

A music classifier for the audio device according to claim 1, wherein the plurality of decision making units include a beat detection unit (Theill: ¶ 19-42, 82-100; Fig 2, 4: feature extractor operable in concert with classifiers 204, 205 as a beat detection unit includes beat detection functionality in a set of lowest frequency bands).

Regarding claim 3, 17
Theill in view of LaB teaches or suggests:
The music classifier and method for the audio device according to claim 2, 16, wherein the beat detection unit is configured to detect a repeating beat pattern in a first frequency band that is the lowest of the plurality of frequency bands (Theill: ¶ 19-42, 82-100; Fig 2, 4: feature extractor includes beat detection functionality in a set of lowest frequency bands to detect successive rising edges comprising a pattern; the set of lowest frequency bands considered as a singular frequency band with a bandwidth encompassing the plurality).
Theill discusses correlation with regard to the detection of tonality and strongly suggests determining beat in such a way that changes in value among frequency bands are associated with changes in value among adjacent frequency bands, but does not explicitly teach detection and/or determination of a beat based on correlation. However, Examiner has taken official notice, which Applicant has failed to timely traverse, and it is thus accepted as Applicant’s Admitted Prior Art (AAPA: please see MPEP 2144.03) that the detection of beat based on correlation values among signal bands would have comprised an obvious inclusion. The average skilled practitioner would have been motivated to do so for the purpose of detecting a beat in a computationally effective manner and would have expected predictable results therefrom.


Regarding claim 12
Theill in view of LaB teaches or suggests:
A music classifier for the audio device according to claim 1, wherein the combination and music detection unit is configured to apply a weight to each feature score to obtain weighted feature scores and to sum the weighted feature scores to obtain a music score (Theill: ¶ 19-35, 42-50; 55-69, 72-83; Fig 1-3: a loudness estimator tracks and sums the frequency band energy based on a relationship with a first and second threshold to determine hysteresis relationships between maxima and minima of the summed frequency band signal levels).

Regarding claim 13
Theill in view of LaB teaches or suggests:
A music classifier for the audio device according to claim 12, wherein the combination and music detection unit is further configured to accumulate music scores for a plurality of frames, to compute an average of the music scores for the plurality of frames, and to compare the average to a threshold (Theill: ¶ 19-35, 51-69: music is framed into overlapping windows of weighted coefficient values, summed and compared to at least a first and second threshold).

Regarding claim 14
Theill in view of LaB teaches or suggests:
A music classifier for the audio device according to claim 13, wherein the combination and music detection unit is further configured to apply a hysteresis control to a music or no music output of the threshold (Theill: ¶ 19-35, 42-50; 55-69, 72-83; Fig 1-3: a loudness estimator tracks and sums the frequency band energy based on a relationship with a first and second 

Regarding claim 22
Theill in view of LaB teaches or suggests:
A method for music detection according to claim 16, wherein the combining comprises: multiplying the feature score from each of the plurality of decision making units with a respective weight factor to obtain a weighted score from each of the plurality of decision making units (Theill: ¶ 19-35, 51-69: music is framed into overlapping windows of weighted coefficient values, the weighting at least comprising multiplying by a scalar, normalization, etc., summed and compared to at least a first and second threshold); summing the weighted scores from the plurality of decision making units to obtain a music score (Theill: ¶ 19-35, 42-50; 55-69, 72-83; Fig 1-3: a loudness estimator tracks and sums the frequency band energy based on a relationship with a first and second threshold to determine hysteresis relationships between maxima and minima of the summed frequency band signal levels); accumulating music scores over a plurality of frames of the audio signal (Theill: ¶ 19-35, 51-69: music is framed into overlapping windows of weighted coefficient values, summed and compared to at least a first and second threshold); averaging the music scores from the plurality of frames of the audio signal to obtain an average music score (Theill: ¶ 19-35, 51-69: application of averaging techniques yields estimates of the audio classification confidence); comparing the average music score to a threshold to detect music in the audio signal (Theill: ¶ 19-35, 51-69: music is framed into overlapping windows of weighted coefficient values, summed and compared to at least a first and second threshold). 
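For illustration only, the combining steps recited in claim 22 (weighting, summing, accumulating over frames, averaging, and comparing to a threshold) may be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce any reference of record; the function names, the weights, and the threshold value are hypothetical.

```python
def music_score(feature_scores, weights):
    """Multiply each unit's feature score by its weight and sum the results."""
    if len(feature_scores) != len(weights):
        raise ValueError("one weight is required per feature score")
    return sum(w * s for w, s in zip(weights, feature_scores))


def detect_music(frames, weights, threshold):
    """Accumulate per-frame music scores, average them, compare to a threshold.

    frames: list of per-frame feature score lists, one score per decision
            making unit (hypothetical layout).
    Returns (music_detected, average_music_score).
    """
    if not frames:
        raise ValueError("at least one frame is required")
    scores = [music_score(frame, weights) for frame in frames]
    average = sum(scores) / len(scores)
    return average > threshold, average
```

For example, with two units weighted 0.5 each, frames scoring [1.0, 0.0] and [1.0, 1.0] give music scores of 0.5 and 1.0, an average of 0.75, and a music decision against an assumed threshold of 0.6.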

Regarding claim 23
Theill in view of LaB teaches or suggests:
A method for music detection in an audio signal according to claim 16, further comprising: modifying the audio signal based on the music detection; and transmitting the audio signal (Theill: ¶ 19-35, 42-50; 55-69, 72-83; Fig 1-3: hearing aid system parameters set based on a determined classification of the input audio).

Regarding claim 25
Theill in view of LaB teaches or suggests:
A hearing aid according to claim 24, wherein the hearing aid includes an audio signal modifying stage coupled to the signal conditioning stage and to the music classifier, the audio signal modifying stage configured to process the plurality of frequency bands differently when a music signal is received than when a no-music signal is received (Theill: ¶ 19-35, 42-50; 55-69, 72-83; Fig 1-3: hearing aid system parameters set based on a determined classification of the input audio).

Claims 4, 5, 15, 18, 21 are rejected under 35 U.S.C. 103 as being unpatentable over Theill in view of LaB as applied to claims 1-3, 12-14, 16, 17, 22-25 supra and further in view of MaCallum: 20200074982 hereinafter Mac.

Regarding claim 4
Theill in view of LaB teaches or suggests:
A music classifier and method, wherein the beat detection unit is configured to detect a repeating beat pattern, based on an output of a beat detection (BD) classifier (Theill: ¶ 19-30, 
Theill in view of LaB does not explicitly teach the utility of the recited neural network operative to identify music using a neural network to provide beat detection. 
In a related field of endeavor Mac teaches a beat detection neural network suitable to determine beats in relationship to particular frequencies and use the features thereby extracted to activate a beat detection neural network and thereby identify music and salient features thereof such as genre, mood, etc. (Mac: ¶ 18-28; Fig 1-3, 5). It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to utilize a beat detection neural network such as that taught by Mac within the Theill in view of LaB system and method. The average skilled practitioner would have been motivated to do so for the purpose of generating filter coefficients for a hearing aid by appropriately characterizing and identifying music segments within the input audio signal and would have expected predictable results therefrom.

Regarding claim 5
Theill in view of LaB in view of Mac teaches or suggests:
A music classifier and method, wherein the beat detection unit is configured to select one or more frequency bands from the plurality of frequency bands and is configured to extract a plurality of features from each selected frequency band (Theill: ¶ 19-30, 86-100; Fig 1-3; claim 1, 2, 11: beat detection operative on a plurality of particular frequency bands and compared within selected active frequency bands).

Mac discloses that a beat detection neural network operates to determine beats in relationship to particular frequencies and use the features thereby extracted to activate a beat detection neural network and thereby identify music and salient features thereof such as genre, mood, etc. (Mac: ¶ 18-28; Fig 1-3, 5). It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to utilize a beat detection neural network such as that taught by Mac within the Theill in view of LaB system and method. The average skilled practitioner would have been motivated to do so for the purpose of generating filter coefficients for a hearing aid by appropriately characterizing and identifying music segments within the input audio signal and would have expected predictable results therefrom.

Regarding claim 15, 21
Theill in view of LaB teaches or suggests:
A music classifier and method, for the audio device according to claim 1, 16 wherein the combination and music detection unit is a neural network. Theill in view of LaB does not explicitly teach the utility of the recited neural network operative to identify music using a neural network to provide beat detection. 
In a related field of endeavor Mac teaches a beat detection neural network suitable to determine beats in relationship to particular frequencies and use the features thereby extracted to activate a beat detection neural network and thereby identify music and salient features thereof such as genre, mood, etc. (Mac: ¶ 18-28; Fig 1-3, 5). It would have been obvious to one 

Regarding claim 18
Theill in view of LaB in view of Mac teaches or suggests:
A music classifier and method, wherein the decision making units include a beat detection unit, and wherein: obtaining a feature score from the beat detection unit includes: detecting, based on a neural network, a repeating beat pattern in the plurality of frequency bands (Theill: ¶ 19-30, 86-100; Fig 1-3; claim 1, 2, 11: beat detection operative on a plurality of particular frequency bands and compared within selected active frequency bands); (Mac: ¶ 18-28; Fig 1-3, 5: established operation of a beat detection neural network).

Claims 6, 7 are rejected under 35 U.S.C. 103 as being unpatentable over Theill in view of LaB and further in view of MaCallum as applied variously to claims 1-5 supra and further in view of Gondi: 20140379352.

Regarding claim 6
Theill in view of LaB in view of Mac teaches or suggests:
A music classifier and method, wherein the plurality of features extracted from each selected frequency band form a feature set including an energy mean, an energy standard 
In a related field of endeavor Gondi discusses a system for extracting meaning from parameters of an input audio signal using a Hidden Markow (sic) Model Toolkit, said toolkit operable to create Mel Frequency spectral coefficients from input audio and extract therefrom energy means, standard deviation, kurtosis, skewness, minimum and maximum values, relative positions, and range as well as two linear regression coefficients and mean square error (MSE) from each of a plurality of separate bands of the input audio signal (Gondi: ¶ 60-72; Figs 3). It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to utilize the Gondi taught features to determine music and/or speech parameters within the Theill in view of LaB in view of Mac system and method. The average skilled practitioner would have been motivated to do so for the purpose of augmenting a hearing aid device to provide increased audio context to a user and would have expected predictable results therefrom.
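For illustration only, a subset of the per-band statistics listed above (mean, standard deviation, skewness, kurtosis, minimum, maximum, and range) may be computed as sketched below. This is a minimal sketch and is not Gondi's toolkit or the applicant's implementation; the function name, the use of population moments, and the non-excess definition of kurtosis are assumptions. The regression coefficients and mean square error named in the paragraph above are omitted.

```python
import math


def band_statistics(energies):
    """Illustrative feature set for one selected frequency band.

    energies: per-frame energy values of the band (hypothetical input).
    """
    n = len(energies)
    if n == 0:
        raise ValueError("at least one energy value is required")
    mean = sum(energies) / n
    variance = sum((e - mean) ** 2 for e in energies) / n  # population variance
    std = math.sqrt(variance)
    if std == 0.0:
        # Assumption: report zero for the shape statistics of a constant band.
        skewness = kurtosis = 0.0
    else:
        skewness = sum(((e - mean) / std) ** 3 for e in energies) / n
        kurtosis = sum(((e - mean) / std) ** 4 for e in energies) / n
    return {
        "mean": mean,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,  # non-excess kurtosis (a normal distribution gives 3)
        "min": min(energies),
        "max": max(energies),
        "range": max(energies) - min(energies),
    }
```

A feature set of this kind, computed for each selected band, is the sort of input vector that a classifier such as the recited BD neural network could receive.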

Regarding claim 7
Theill in view of LaB in view of Mac in view of Gondi teaches or suggests:
A music classifier and method, wherein the BD neural network receives the feature set for each selected band as a plurality of inputs (Gondi: ¶ 60-72; Figs 3). 




Claims 11, 20 are rejected under 35 U.S.C. 103 as being unpatentable over Theill: 20170180875 (published 6/21/2017 as EP3182729) and further in view of LaBoeuf: 20110075851 hereinafter LaB as applied to claims 1-3, 12-14, 16, 17, 22-25 supra and further in view of Eronen: 20150094835.
Regarding claim 11
Theill in view of LaB teaches or suggests:
The music classifier and method for the audio device according to claim 10, wherein a second value corresponds to a minimum of the averaged energy and the first value corresponds to a maximum of the averaged energy (Theill: ¶ 19-35, 42-50; 55-69, 72-83; Fig 1-3: a loudness estimator tracks and sums the frequency band energy based on a relationship with a first and second threshold to determine hysteresis relationships between maxima and minima of the supplied frequency band signal levels; Theill further discloses averaging the summed probability estimates for at least the purpose of further smoothing the signal values). 
Theill in view of LaB does not explicitly teach determining values representative of a minimum of the averaged wideband energy and a maximum of the averaged wideband energy, wherein the averaged wideband energy corresponds to an average of a sum of the energy in each of the plurality of frequency bands.
In a related field of endeavor Eronen teaches a system and method for tracking wideband as well as frequency dependent features (Eronen: Abstract: ¶ 86-109; Fig 3) for the purpose of performing learning upon a signal thereby determining particular features of the signal (Eronen: Abstract: ¶ 86-109; Fig 3). It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to utilize the Theill in view of LaB metadata parameters such as maximum, minimum, etc. in concert with the Eronen taught wideband parameter determination. The average skilled practitioner would have been 

Regarding claim 20
Theill in view of LaB in view of Eronen teaches or suggests:
The music classifier and method for the audio device according to claim 16, wherein obtaining a feature score from the modulation activity tracking unit includes: tracking a minimum averaged energy of a sum of the plurality of frequency bands as the second value and a maximum averaged energy of the sum of the plurality of frequency bands as the first value. (Eronen: Abstract: ¶ 86-109; Fig 3)
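For illustration only, the tracking recited in claim 20 (a minimum and a maximum of the averaged energy of a sum of the frequency bands) and the ratio of those values discussed above may be sketched as follows. This is a minimal sketch and does not reproduce Theill, Eronen, or the instant application; the exponential averaging, the smoothing factor `alpha`, and the function names are assumptions made for the example.

```python
def averaged_wideband_energy(frames, alpha=0.5):
    """Sum the band energies of each frame, then smooth the sums over time.

    frames: list of per-frame lists of band energies (hypothetical layout).
    alpha:  assumed smoothing factor of a first-order exponential average.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    wideband = [sum(bands) for bands in frames]
    averaged = []
    current = wideband[0]
    for value in wideband:
        current = alpha * value + (1.0 - alpha) * current
        averaged.append(current)
    return averaged


def modulation_activity(frames, alpha=0.5, eps=1e-12):
    """Ratio of the tracked maximum (first value) to the minimum (second value)."""
    averaged = averaged_wideband_energy(frames, alpha)
    first_value = max(averaged)   # maximum averaged wideband energy
    second_value = min(averaged)  # minimum averaged wideband energy
    return first_value / (second_value + eps)  # eps guards against division by zero
```

Under these assumptions a steady input gives a ratio near 1, while an input whose wideband energy swings between frames gives a larger ratio.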

Response to Arguments
Applicant’s arguments with respect to claim(s) 1-9, 11-25 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Applicant’s arguments in concert with amended claims, see Remarks and Claims, filed 11/30/20, with respect to the rejection(s) of claim(s) 1-3, 12-14, 16, 17, 22-25 under 35 USC 103 over Theill and Zhang have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of Theill and LaBoeuf.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
MFCC_components c2017 details the utility of differentiation and other measures of variance in MFCC coding
20170330540 mean and variance of derivative, second derivative comprise well known spectral analysis parameters
20140180673 utility of variance, first derivative for beat detection, learning, etc.
                                                                                                                                                                                                   
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL C MCCORD whose telephone number is (571)270-3701.  The examiner can normally be reached on 730-630 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, VIVIAN CHIN can be reached on 5712727848.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 






/PAUL C MCCORD/Primary Examiner, Art Unit 2654