DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Receipt is acknowledged of papers submitted under 35 U.S.C. 119(a)-(d), which papers have been placed of record in the file.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 04/30/2020 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Drawings
The drawings were submitted on 04/30/2020.  These drawings are reviewed and accepted by the examiner.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-7 and 12-13 are rejected under 35 U.S.C. 103 as being unpatentable over Seroussi et al. (US 20180308494 A1) in view of Li et al. (“Background-foreground information based bit allocation algorithm for surveillance video on high efficiency video coding (HEVC)”, 2016).

Regarding claims 1 and 13, Seroussi teaches:
“receiving an audio signal to be encoded, the audio signal comprising a plurality of successive audio frames” (par. 0054; ‘Divide the input signal into frames, each frame containing a fixed number of audio samples.’);
“for each successive audio frame of the audio signal: representing the audio frame in a frequency domain with respect to a plurality of frequency sub-bands” (par. 0058; ‘For each frame, partition the vector X into M bands B.sub.i, according to:’).
Seroussi teaches encoding each successive audio frame of the audio signal, wherein a number of bits is allocated for each frequency sub-band of the audio frame, wherein the number of bits allocated for a frequency sub-band is higher if the audio frame is perceptually important (par. 0042; ‘For bands in which there is relatively little frequency content, the encoder can allocate a relatively small number of bits to represent the frequency content. In general, the higher the number of bits allocated for a particular band, the more accurate the representation of the frequencies in that particular band. The encoder can strike a balance between accuracy, which drives the bit allocation upward, and data rate, which can provide an upper limit to the number of bits allocated per frame.’; par. 0077; ‘As explained above, the encoder allocates bits
However, Seroussi does not expressly teach background or foreground classification, as in:
“classifying the audio frame in each frequency sub-band as either background or foreground using a background model specific to the frequency sub-band”; and
“encoding each successive audio frame of the audio signal, wherein a number of bits is allocated for each frequency sub-band of the audio frame, wherein the number of bits allocated for a frequency sub-band is higher if the audio frame is classified as foreground in the frequency sub-band than if the audio frame is classified as background in the frequency sub-band.”
Li teaches:
“classifying the audio frame in each frequency sub-band as either background or foreground using a background model specific to the frequency sub-band” (pg. 1, right col., “These algorithms…”; ‘Basically, BFIBA classifies a LCU into a background LCU (BLCU) or a foreground LCU (FLCU) by utilizing the background and foreground information (BFI) and then allocates bits for frames and LCUs based on the classification information.’);
“encoding each successive audio frame of the audio signal, wherein a number of bits is allocated for each frequency sub-band of the audio frame, wherein the number of bits allocated for a frequency sub-band is higher if the audio frame is classified as foreground in the frequency sub-band than if the audio frame is classified as 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to modify Seroussi’s bit allocation method by incorporating Li’s Background-Foreground Information based Bit Allocation algorithm in order to encode each successive audio frame of the audio signal in a similar manner. The combination would improve rate control performance. (Li: pg. 2, left col., “Based on the….”)
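
For illustration only, the allocation rule recited in claims 1 and 13 (more bits for a sub-band classified as foreground than for one classified as background) can be sketched as below. This is a minimal sketch under stated assumptions, not the method of Seroussi, Li, or the applicant: the function name allocate_bits and the weights fg_weight and bg_weight are hypothetical, and the per-sub-band foreground flags are assumed to be supplied by a separate classifier.

```python
def allocate_bits(is_foreground, total_bits, fg_weight=3.0, bg_weight=1.0):
    """Split one frame's bit budget across its frequency sub-bands.

    is_foreground: list of bools, one per sub-band (assumed to be given).
    fg_weight, bg_weight: hypothetical relative weights, fg_weight > bg_weight.
    """
    if not is_foreground:
        return []
    weights = [fg_weight if fg else bg_weight for fg in is_foreground]
    total_weight = sum(weights)
    # Floor each share so the sum never exceeds the frame budget.
    return [int(total_bits * w / total_weight) for w in weights]


# Four sub-bands: the first and last are classified as foreground.
print(allocate_bits([True, False, False, True], total_bits=800))
# prints [300, 100, 100, 300]
```

With these hypothetical weights, every foreground sub-band receives three times the bits of a background sub-band while the frame total stays within the budget.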

Regarding claim 2 (dep. on claim 1), the combination of Seroussi in view of Li further teaches:
“wherein the number of bits allocated for encoding a background classified frequency sub-band of the audio frame is dependent on a frequency range of the background classified frequency sub-band of the audio frame; and/or the number of bits allocated for encoding a foreground classified frequency sub-band of the audio frame is dependent on a frequency range of the foreground classified frequency sub-band of the audio frame” (Seroussi: par. 0090; ‘The encoder can form a bit-allocation curve 614 for a particular frame, which represents how many bits are allocated for each band in the particular frame.’).

Regarding claim 3 (dep. on claim 1), the combination of Seroussi in view of Li further teaches:
“wherein the audio signal is encoded such that the number of bits allocated to a background classified first frequency sub-band of a first audio frame is higher if the same first frequency sub-band in an audio frame preceding the first audio frame was classified as foreground compared to if the same first frequency sub-band in the audio frame preceding the first audio frame was classified as background” (Seroussi: par. 0042; ‘For bands in which there is relatively little frequency content, the encoder can allocate a relatively small number of bits to represent the frequency content. In general, the higher the number of bits allocated for a particular band, the more accurate the representation of the frequencies in that particular band. The encoder can strike a balance between accuracy, which drives the bit allocation upward, and data rate, which can provide an upper limit to the number of bits allocated per frame.’; par. 0090; ‘The encoder can form a bit-allocation curve 614 for a particular frame, which represents how many bits are allocated for each band in the particular frame.’).
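
For illustration only, the limitation of claim 3 can be read as a hangover rule: a sub-band that is background in the current frame still receives extra bits if it was foreground in the preceding frame. The sketch below assumes this reading; the function name bits_for_band and all bit counts are hypothetical, and neither cited reference is asserted to implement the rule this way.

```python
def bits_for_band(is_fg_now, was_fg_prev, fg_bits=48, hangover_bits=24, bg_bits=8):
    """Bits for one sub-band given its current and previous classification.

    All bit counts are hypothetical; only their ordering matters:
    fg_bits > hangover_bits > bg_bits.
    """
    if is_fg_now:
        return fg_bits
    # Background now: spend more if the band was foreground one frame earlier.
    return hangover_bits if was_fg_prev else bg_bits


# Classification of a single sub-band over five successive frames.
history = [False, True, False, False, True]
previous = [False] + history[:-1]
print([bits_for_band(now, prev) for now, prev in zip(history, previous)])
# prints [8, 48, 24, 8, 48]
```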

Regarding claim 4 (dep. on claim 1), the combination of Seroussi in view of Li further teaches:
“wherein the number of bits allocated for encoding a frequency sub-band of the audio frame further depends on a psychoacoustic model” (Seroussi: par. 0095; ‘The encoder 700 can employ data from any available sources 710, including psychoacoustic models and others, and perform bit-allocation 712 to produce a bit-allocation curve 714.’).

Regarding claim 5 (dep. on claim 2), the combination of Seroussi in view of Li further teaches:
“wherein the number of bits allocated for encoding a frequency sub-band of the audio frame is dependent on the frequency range of the frequency sub-band of the audio frame according to a psychoacoustic model” (Seroussi: par. 0095; ‘The encoder 700 can employ data from any available sources 710, including psychoacoustic models and others, and perform bit-allocation 712 to produce a bit-allocation curve 714.’).

Regarding claim 6 (dep. on claim 1), the combination of Seroussi in view of Li further teaches:
“wherein the number of bits allocated for encoding a background classified frequency sub-band of the audio frame is independent of a frequency range that the background classified frequency sub-band of the audio frame represents and wherein the number of bits allocated for encoding a foreground classified frequency sub-band of the audio frame is independent of a frequency range that the foreground classified frequency sub-band of the audio frame belongs to” (Seroussi: par. 0122; ‘In some examples, at least one target parameter can include a reference number of bits allocatable for each band. In some examples, the method 1100 can optionally further include: setting the estimated number of bits allocatable for each band to equal the reference number of bits allocatable for each band, for multiple frames in the digital audio signal; and encoding data representing the reference number of bits allocatable for each band into the bit stream.’).

Regarding claim 7 (dep. on claim 1), the combination of Seroussi in view of Li further teaches:
“for an audio frame of the audio signal: for a frequency sub-band of the audio frame; updating the background model specific to the frequency sub-band which corresponds to the frequency sub-band of the audio frame based on a frequency content of the frequency sub-band of the audio frame” (Li: pg. 2, Fig. 2; ‘Background modeling’. Updating a background model is well-known in the art. It would have been obvious to update the background model specific to the frequency sub-band which corresponds to the frequency sub-band of the audio frame based on a frequency content of the frequency sub-band of the audio frame.).
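
For illustration only, one common way to update a per-sub-band background model from the frequency content of each new frame is an exponentially weighted running mean and variance. The sketch below assumes a single Gaussian per sub-band and a hypothetical learning rate; the name update_background is an assumption, and this is not asserted to be the background modeling shown in Li, Fig. 2.

```python
def update_background(mean, var, energy, rate=0.05):
    """One exponential-forgetting update of a single-Gaussian background model.

    mean, var: current model of one sub-band's energy level.
    energy: energy level observed in that sub-band for the new frame.
    rate: hypothetical learning rate in (0, 1).
    """
    diff = energy - mean
    new_mean = mean + rate * diff
    # Standard incremental form of the exponentially weighted variance.
    new_var = (1.0 - rate) * (var + rate * diff * diff)
    return new_mean, new_var


# One (mean, variance) pair per sub-band, updated with the new frame's energies.
models = [(0.0, 1.0), (0.0, 1.0)]
frame_energies = [2.0, -1.0]
models = [update_background(m, v, e) for (m, v), e in zip(models, frame_energies)]
print(models)
```

Each sub-band keeps its own model, so a persistent change in one sub-band is absorbed into that sub-band's background without affecting the others.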

Regarding claim 12, the combination of Seroussi in view of Li further teaches:
“A computer program product comprising a non-transitory computer-readable medium storing computer-readable instructions which, when executed on a processor, will cause the processor to perform the method according to claim 1” (Seroussi: par. 0162; ‘Further, one or any combination of software, programs, computer program products that embody some or all of the various embodiments of the encoding and decoding system and method described herein, or portions thereof, may be stored, received, transmitted, or read from any desired combination of computer or machine readable media or storage devices and communication media in the form of computer executable instructions or other data structures.’).
Claims 8-10 are rejected under 35 U.S.C. 103 as being unpatentable over Seroussi in view of Li as applied to claim 1 above, and further in view of Chu et al. (“A semi-supervised learning approach to online audio background detection”, 2009).

Regarding claim 8 (dep. on claim 1), Seroussi in view of Li does not expressly teach a Gaussian Mixture Model, as in “wherein the background model specific to the frequency sub-band includes a Gaussian Mixture Model, GMM, the GMM comprising a plurality of Gaussian distributions, each of which representing a probability distribution for energy levels in the frequency sub-band.”
Chu teaches:

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to modify the background model taught by Seroussi in view of Li by incorporating the Gaussian mixture models used for background/foreground classification as taught by Chu in order to be able to understand and predict ambient context surrounding an agent, both human and machine. (Chu: abstract)

Regarding claim 9 (dep. on claim 8), the combination of Seroussi in view of Li and Chu further teaches:
“wherein a frequency sub-band of the audio frame is classified as background if an energy level of the frequency sub-band of the audio frame lies within a predetermined number of standard deviations around a mean of one of the Gaussian distributions of the GMM of the background model specific to the frequency sub-band, and if a weight of said Gaussian distribution is above a threshold, wherein the weight represents a probability that an energy level of the frequency sub-band of the audio frame will be within the predetermined number of standard deviations around the mean of said Gaussian distribution” (Chu: pg. 1630, right col., “The history…”; ‘The Kth component is viewed as a match if xt is within 2.5 standard deviations from the mean of
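
For illustration only, the matching test recited in claim 9 can be sketched as below. The 2.5 standard deviation window follows the passage quoted from Chu; the function name is_background, the weight threshold of 0.2, and the example component values are hypothetical.

```python
import math


def is_background(energy, components, k=2.5, weight_threshold=0.2):
    """Classify one sub-band's energy level against that sub-band's GMM.

    components: list of (weight, mean, variance) tuples for one sub-band.
    k: number of standard deviations that counts as a match (2.5 per the
       quoted passage); weight_threshold is a hypothetical value.
    """
    for weight, mean, var in components:
        within = abs(energy - mean) <= k * math.sqrt(var)
        if within and weight > weight_threshold:
            return True
    return False


# Hypothetical GMM for one sub-band: a dominant quiet component and a rare loud one.
gmm = [(0.7, -60.0, 4.0), (0.05, -20.0, 4.0)]
print(is_background(-58.0, gmm))  # matches the dominant component
print(is_background(-21.0, gmm))  # matches only the low-weight component
```

An energy level that matches only a low-weight component is treated as foreground, because that component has rarely explained the sub-band's past frames.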

Regarding claim 10 (dep. on claim 8), the combination of Seroussi in view of Li and Chu further teaches:
“wherein the energy level is a power spectral density, PSD, measurement” (power spectral density is well-known in the art, as evidenced by Shug et al. (US 20140188488 A1) (par. 0079; ‘Furthermore, the bit allocation process determines a power spectral density (PSD) distribution and a frequency-domain masking curve (based on a psychoacoustic model) for each channel. The PSD distribution and the frequency-domain masking curve are used to determine a substantially optimal distribution of the available bits to the different normalized mantissas 314 of the audio frame.’)).
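
For illustration only, a per-sub-band PSD measurement for one audio frame can be sketched with a simple periodogram. The function name band_psd, the half-open bin ranges in band_edges, and the test tone are hypothetical, and a naive DFT is used so the sketch needs only the standard library; this is not asserted to be the PSD computation of Shug.

```python
import cmath
import math


def band_psd(frame, sample_rate, band_edges):
    """Average periodogram PSD of one frame in each frequency sub-band.

    band_edges: list of (lo_bin, hi_bin) half-open DFT bin ranges (assumed).
    """
    n = len(frame)
    # Naive DFT for bins 0..n/2, adequate for short illustrative frames.
    spectrum = [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]
    # Periodogram estimate |X[k]|^2 / (fs * N); the one-sided factor of 2
    # is omitted for brevity.
    psd = [abs(c) ** 2 / (sample_rate * n) for c in spectrum]
    return [sum(psd[lo:hi]) / (hi - lo) for lo, hi in band_edges]


# A 32-sample frame holding a tone that falls exactly on DFT bin 4.
frame = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
levels = band_psd(frame, sample_rate=8000, band_edges=[(0, 2), (2, 8), (8, 17)])
print(levels.index(max(levels)))  # prints 1
```

The resulting per-sub-band values are the kind of energy level that a sub-band-specific background model could be compared against.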

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Seroussi in view of Li as applied to claim 1 above, and further in view of Gurijala et al. (US 10043527 B1).

Regarding claim 11 (dep. on claim 1), Seroussi in view of Li does not expressly teach metadata, as in “transmitting the encoded audio frames of the audio signal together with metadata, wherein the metadata indicates the classification of the frequency sub-bands of the audio frames.”

Gurijala teaches:
“transmitting the encoded audio frames of the audio signal together with metadata, wherein the metadata indicates the classification of the frequency sub-bands of the audio frames” (col. 18, lines 31-35; ‘The input to the embedding system of FIG. 5 includes the message payload 800 to be embedded in an audio segment, the audio segment, and metadata about the audio segment (802) obtained from classifier modules, to the extent available.’).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to modify the encoding taught by Seroussi in view of Li by incorporating Gurijala’s embedding system in order to provide audio classification parameters to facilitate embedding.

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Seroussi in view of Li, further in view of Visser et al. (US 20170278519 A1).

Regarding claim 14, the combination of Seroussi in view of Li further teaches:
“wherein the receiver is configured to receive an audio signal to be encoded, the audio signal comprising a plurality of successive audio frames, and; wherein the one or more processors are configured to: for each successive audio frame of the audio signal: represent the audio frame in a frequency domain with respect to a plurality of frequency sub-bands; classify the audio frame in each frequency sub-band as either background or foreground using a background model specific to the frequency sub-band; encode each successive audio frame of the audio signal, wherein a number of bits is allocated for each frequency sub-band of the audio frame, wherein the number of bits allocated for a frequency sub-band is higher if the audio frame is classified as foreground in the frequency sub-band than if the audio frame is classified as background in the frequency sub-band” (see claim 1).
Seroussi in view of Li does not explicitly teach a microphone, as in:
“a microphone configured to record an audio signal”;
“an encoder configured to receive the audio signal from the microphone and encode the audio signal with variable bitrate, the encoder for encoding an audio signal with variable bitrate, the encoder comprising a receiver and one or more processors.”
Visser teaches:

“an encoder configured to receive the audio signal from the microphone and encode the audio signal with variable bitrate, the encoder for encoding an audio signal with variable bitrate, the encoder comprising a receiver and one or more processors” (par. 0047; ‘Determining the proximity of the sound sources 122, 124, 126 may enable the processor 104 to encode audio signals from closer sound sources (e.g., foreground audio signals) at higher bit-rates and audio signal from sound sources farther away (e.g., background audio signals) at lower bit-rates for encoding efficiency.’).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to modify the audio input method taught by Seroussi in view of Li by incorporating the microphone taught by Visser in order to capture audio.

Conclusion
Other pertinent prior art is listed in the PTO-892 for consideration.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARK VILLENA whose telephone number is (571)270-3191. The examiner can normally be reached 10 am - 6 pm EST Monday through Friday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached on (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

MARK VILLENA
Examiner
Art Unit 2658



/MARK VILLENA/Examiner, Art Unit 2658