DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(4) because reference character “S205” has been used to designate both “generating enhancement metadata based on the first input” in Fig. 3 and “generating enhancement metadata based on the first/second input” in Fig. 4.  Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Objections
Claim 2 is objected to because of the following informalities: line 2 reads “in determining the suitability”, which should be removed or reworded. Appropriate correction is required.
Claim 17 is objected to because of the following informalities: lines 2-4 refer to “the bitstream parameters”, while claim 15 recites “one or more bitstream parameters”. Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-27 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
The term “low” in claims 1 and 19 is a relative term which renders the claim indefinite. The term “low” is not defined by the claims, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
Claim 17 recites the limitation "the Generator" in the last two lines.  There is insufficient antecedent basis for this limitation in the claim, because the “and/or” language in the claim includes the possibility that “or” is chosen, in which case there is no antecedent Generator.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim 1 is rejected under 35 U.S.C. 102(a)(1) as being anticipated by Parkkinen et al. (US 7,072,366 B2), hereinafter referred to as Parkkinen.

Regarding claim 1, Parkkinen teaches:
A method of low-bitrate coding of audio data and generating enhancement metadata for controlling audio enhancement of the low-bitrate coded audio data in a decoder at a decoder side, including: 
core encoding original audio data at a low bitrate to obtain encoded audio data (Fig. 5 element 210, col. 2 lines 11-18, where the core encoder encodes the audio, and lines 51-58, where the core encoder operates at lower rates between 4.75 and 12.2 kbit/s); 
generating, in an encoder, enhancement metadata to be transmitted to the decoder for controlling a type and/or amount of audio enhancement in the decoder after core decoding the encoded audio data (Fig. 5 elements 230, 421, 422, col. 2 lines 18-38, col. 7 lines 49-57, where the enhancement information is encoded based on the preference information); and 
outputting the encoded audio data and the enhancement metadata to the decoder (Fig. 1 elements 120, 130, Fig. 5 elements 110, 104, 520, col. 1 lines 56-66, where the multiplexed streams are sent to a decoder), wherein generating enhancement metadata includes: 
core decoding the encoded audio data to obtain core decoded raw audio data (Fig. 5 element 220, col. 2 lines 24-26, where the encoded signal is decoded); 
inputting the core decoded raw audio data into an audio enhancer for processing the core decoded raw audio data based on candidate enhancement metadata for controlling the type and/or amount of audio enhancement of audio data that is input to the audio enhancer (Fig. 5 elements 230, 422, col. 2 lines 18-38, col. 7 lines 49-57, where the enhancement information is encoded based on the preference information); 
obtaining, as an output from the audio enhancer, enhanced audio data (Fig. 5 element 103, col. 2 lines 32-38, where the enhancement data stream is produced);
determining a suitability of the candidate enhancement metadata based on the enhanced audio data (Fig. 5 elements 421, 422, col. 7 lines 57-67, col. 8 lines 14-27, where the bitrate levels are set by the preference information); and 
generating the enhancement metadata based on a result of the determination (Fig. 5 elements 421, 422, col. 7 lines 57-67, col. 8 lines 14-27, where the bitrate levels are used in adjusting the encoder for generating the enhancement data stream).

Claims 1, 6-11, 14, 18, and 26 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Ashley et al. (US 8,639,519 B2), hereinafter referred to as Ashley.

Regarding claim 1, Ashley teaches:
A method of low-bitrate coding of audio data and generating enhancement metadata for controlling audio enhancement of the low-bitrate coded audio data in a decoder at a decoder side, including: 
core encoding original audio data at a low bitrate to obtain encoded audio data (Fig. 2 element 104, col. 4 lines 17-36, where a signal is core encoded, and col. 3 line 60 - col. 4 line 8, where overall data rate is reduced); 
generating, in an encoder, enhancement metadata to be transmitted to the decoder for controlling a type and/or amount of audio enhancement in the decoder after core decoding the encoded audio data (Fig. 2 elements 202, 204, col. 4 lines 17-36, where the metadata is the selection signal, which indicates which type of enhancement layer encoder is used); and 
outputting the encoded audio data and the enhancement metadata to the decoder (Fig. 2 elements 106, 204, 208, col. 4 lines 17-36, where the encoded signal, the selection signal, and the enhanced signal are all transmitted to the decoder), wherein generating enhancement metadata includes: 
core decoding the encoded audio data to obtain core decoded raw audio data (Fig. 2 elements 112, 110, col. 4 lines 17-36, where the encoded signal is passed through a core decoder); 
inputting the core decoded raw audio data into an audio enhancer for processing the core decoded raw audio data based on candidate enhancement metadata for controlling the type and/or amount of audio enhancement of audio data that is input to the audio enhancer (Fig. 2 elements 202, 206, col. 4 lines 17-36, 43-57, where the output of the decoder is sent to the comparator/selector and the enhancement encoders, and where the selection signal is determined and sent to the enhancement encoders); 
obtaining, as an output from the audio enhancer, enhanced audio data (Fig. 2 element 208, col. 4 lines 43-57, where an enhancement layer encoded signal is produced);
determining a suitability of the candidate enhancement metadata based on the enhanced audio data (Fig. 10 elements 1012-1016, col. 7 lines 29-53, where the comparator determines whether there is a good match, and col. 7 line 65 - col. 8 line 11, where the comparator receives input from enhancement layers); and 
generating the enhancement metadata based on a result of the determination (Fig. 10 elements 1014-1016, col. 7 lines 29-53, where the enhancement layer encoded signal is produced based on the comparison).
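
As an illustrative sketch only, the encoder-side steps recited in claim 1 can be read as a search over candidate enhancement metadata. The code below is not taken from Ashley, Parkkinen, or the application: the coarse quantizer standing in for the core codec, the gain-based enhancer, the squared-error suitability measure, and every function name are assumptions chosen to keep the example self-contained.

```python
from typing import Dict, List, Sequence, Tuple

Metadata = Dict[str, float]

STEP = 0.5  # assumed quantizer step; stands in for a low-bitrate core codec


def core_encode(samples: Sequence[float]) -> List[int]:
    """Hypothetical core encoder: coarse uniform quantization."""
    return [round(s / STEP) for s in samples]


def core_decode(codes: Sequence[int]) -> List[float]:
    """Hypothetical core decoder: inverse of the quantizer above."""
    return [c * STEP for c in codes]


def enhance(decoded: Sequence[float], metadata: Metadata) -> List[float]:
    """Hypothetical enhancer: applies a gain controlled by metadata['amount']."""
    gain = 1.0 + metadata["amount"]
    return [gain * s for s in decoded]


def suitability(original: Sequence[float], enhanced: Sequence[float]) -> float:
    """Assumed suitability measure: negative squared error (higher is better)."""
    return -sum((o - e) ** 2 for o, e in zip(original, enhanced))


def generate_enhancement_metadata(
    original: Sequence[float], candidates: Sequence[Metadata]
) -> Tuple[List[int], Metadata]:
    """Core encode, core decode, try each candidate, keep the most suitable one."""
    codes = core_encode(original)
    decoded = core_decode(codes)
    best = max(candidates, key=lambda m: suitability(original, enhance(decoded, m)))
    # Both outputs would be sent to the decoder: the coded audio and the metadata.
    return codes, best
```

In this reading, the selected candidate plays the role of the selection signal: it is chosen on the encoder side by comparing enhanced output against the original and is then transmitted alongside the encoded audio.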

Regarding claim 6, Ashley teaches:
The method of claim 1, wherein the enhancement metadata include one or more items of enhancement control data (Fig. 2 elements 202, 204, col. 4 lines 17-36, where the metadata is the selection signal, which indicates which type of enhancement layer encoder is used).

Regarding claim 7, Ashley teaches:
The method of claim 6, wherein the enhancement control data include information on one or more types of audio enhancement, the one or more types of audio enhancement including one or more of speech enhancement, music enhancement and applause enhancement (col. 6 line 45 - col. 7 line 28, where the selection signal selects the best encoder for the frame, for enhancing speech or music).  

Regarding claim 8, Ashley teaches:
The method of claim 7, wherein the enhancement control data further include information on respective allowabilities of the one or more types of audio enhancement (Fig. 10 elements 1012-1016, col. 7 lines 29-53, where the comparator determines whether there is a good match, interpreted as the allowability).

Regarding claim 9, Ashley teaches:
The method of claim 6, wherein the enhancement control data further include information on an amount of audio enhancement (col. 6 line 45 - col. 7 line 28, where the selection includes a number of frames by each enhancement encoder).  

Regarding claim 10, Ashley teaches:
The method of claim 6, wherein the enhancement control data further include information on an allowability as to whether audio enhancement is to be performed by an automatically updated audio enhancer at the decoder side (Fig. 10 elements 1012-1016, col. 7 lines 29-53, where the comparator determines whether there is a good match, interpreted as the allowability, and Fig. 2 element 210, col. 4 line 58 - col. 5 line 5, where the enhancement decoder is updated by the selection signal).

Regarding claim 11, Ashley teaches:
The method of claim 6, wherein processing the core decoded raw audio data based on the candidate enhancement metadata is performed by applying one or more predefined audio enhancement modules, and wherein the enhancement control data further include information on an allowability of using one or more different enhancement modules at decoder side that achieve the same or substantially the same type of enhancement (Fig. 2 element 206, col. 4 lines 17-36, 43-57, where the decoded audio is passed through the enhancement layer encoder based on the selection signal, and element 210, col. 4 line 58 - col. 5 line 5, where the selection signal is also passed to the decoder side for the enhancement layer decoding).  

Regarding claim 14, Ashley teaches:
The method of claim 1, wherein the enhancement metadata include at least an indication of an encoding quality of the original audio data (Fig. 2 element 202, col. 4 lines 17-36, where the selection is based on comparison of the original and reconstructed signals).  

Regarding claim 18, Ashley teaches:
An encoder for generating enhancement metadata for controlling enhancement of low-bitrate coded audio data, wherein the encoder includes one or more processors configured to perform the method according to claim 1 (Fig. 2, col. 3 lines 1-20, where a processor is used to implement the encoder).  

Regarding claim 26, Ashley teaches:
A computer program product comprising a computer-readable storage medium with instructions adapted to cause a device to carry out the method according to claim 1 when executed on a device having processing capability (col. 3 lines 1-20, where program instructions are stored for implementing the method).  

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2-5 are rejected under 35 U.S.C. 103 as being unpatentable over Ashley in view of Maling III et al. (US 9,823,892 B2), hereinafter referred to as Maling.

Regarding claim 2, Ashley teaches:
The method of claim 1
Ashley does not teach:
wherein determining the suitability of the candidate enhancement metadata determining the suitability includes presenting the enhanced audio data to a user and receiving a first input from the user in response to the presenting, and wherein generating the enhancement metadata based on the result of the determination is based on the first input.
Maling teaches:
wherein determining the suitability of the candidate enhancement metadata determining the suitability includes presenting the enhanced audio data to a user and receiving a first input from the user in response to the presenting, and wherein generating the enhancement metadata based on the result of the determination is based on the first input (col. 10 lines 19-35, where audio is played to the user, lines 36-54, where the user selects a control for enhancing the audio, which results in outputting parameter adjustment controls).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Ashley by including the user input of Maling (Maling col. 10 lines 36-54) in the speech enhancement of Ashley (Ashley Fig. 2 element 206), in order to give the user control of the enhancement process and to allow the user to see the current settings and make changes accordingly (Maling col. 11 lines 3-20).

Regarding claim 3, Ashley in view of Maling teaches:
The method of claim 2, wherein the first input from the user includes an indication of whether the candidate enhancement metadata are accepted or declined by the user (Maling col. 10 lines 36-54, where adjusting parameters is interpreted as declining the candidate enhancement metadata).  

Regarding claim 4, Ashley in view of Maling teaches:
The method of claim 3, wherein, in case of the user declining the candidate enhancement metadata, a second input indicating a modification of the candidate enhancement metadata is received from the user and generating the enhancement metadata based on the result of the determination is based on the second input (Maling col. 10 lines 36-54, where adjusting parameters is interpreted as declining the candidate enhancement metadata, resulting in outputting parameter adjustment controls).  

Regarding claim 5, Ashley in view of Maling teaches:
The method of claim 3, wherein, in case of the user declining the candidate enhancement metadata, operations of inputting the core decoded raw audio data, obtaining the enhanced audio data, determining the suitability and generating the enhancement metadata based on the result of the determination are repeated (Ashley col. 7 lines 16-28, where the process is performed on multiple frames, and Maling Fig. 2, col. 10 lines 19-54, where the user makes adjustments based on the currently playing audio).  
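
As an illustrative sketch only, the accept/decline/modify/repeat sequence recited in claims 2-5 can be written as a review loop. Nothing below comes from Maling or Ashley: the `ask_user` callback, its return convention, the percent-based `amount` field, and the `max_rounds` cap are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Metadata = Dict[str, int]
# Hypothetical callback: receives the enhanced audio and the candidate metadata,
# returns (accepted, modified_candidate_or_None).
AskUser = Callable[[List[float], Metadata], Tuple[bool, Optional[Metadata]]]


def enhance(decoded: Sequence[float], metadata: Metadata) -> List[float]:
    """Hypothetical enhancer: gain set by metadata['amount'], given in percent."""
    gain = 1.0 + metadata["amount"] / 100.0
    return [gain * s for s in decoded]


def review_with_user(
    decoded: Sequence[float],
    candidate: Metadata,
    ask_user: AskUser,
    max_rounds: int = 10,
) -> Metadata:
    """Present enhanced audio, take the user's verdict, repeat until accepted.

    If the user never accepts within max_rounds, the last candidate is returned.
    """
    for _ in range(max_rounds):
        enhanced = enhance(decoded, candidate)            # enhance with current candidate
        accepted, modified = ask_user(enhanced, candidate)  # first input: accept or decline
        if accepted:
            return candidate
        if modified is not None:                          # second input: modified candidate
            candidate = modified
    return candidate
```

A scripted `ask_user` that declines and raises `amount` by 10 until it reaches 30 would drive the loop through four presentations before the candidate is accepted.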

Claims 12-13, 19-22, 25, and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Ashley in view of Pascual et al. (Pascual, S., Bonafonte, A., & Serra, J. (2017). SEGAN: Speech enhancement generative adversarial network. arXiv preprint arXiv:1703.09452.), hereinafter referred to as Pascual.

Regarding claim 12, Ashley teaches:
The method of claim 1
Ashley does not teach:
wherein the audio enhancer is a Generator trained in a Generative Adversarial Network setting.
Pascual teaches:
wherein the audio enhancer is a Generator trained in a Generative Adversarial Network setting (Page 2 Fig. 2, section 3, where a GAN is used for speech enhancement).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Ashley by using the GAN of Pascual (Pascual Fig. 2, section 3) as the enhancer of Ashley (Ashley Fig. 2 elements 202, 206) to provide a quick enhancement process with a simple and generalizable system (Pascual page 1 col. 2 bullet points).

Regarding claim 13, Ashley in view of Pascual teaches:
The method of claim 12, wherein, during training in the Generative Adversarial Network, obtaining the enhanced audio data as output of the Generator is conditioned based on the enhancement metadata (Pascual page 2 Fig. 2, second column, where training the GAN includes using the input and vector z to determine the output, and Ashley col. 4 lines 17-36, where the output depends on the selection signal).

Regarding claim 19, Ashley teaches:
A method for generating in a decoder enhanced audio data from low-bitrate coded audio data based on enhancement metadata, wherein the method includes: 
receiving audio data encoded at a low bitrate and enhancement metadata from an encoder (Fig. 2 elements 204', 208', 112', col. 4 line 58 - col. 5 line 5, where the signals are received); 
core decoding the encoded audio data to obtain core decoded raw audio data (Fig. 2 element 120, col. 4 line 58 - col. 5 line 5, where core decoding is performed); 
inputting the core decoded raw audio data into an audio enhancer for processing the core decoded raw audio data based on enhancement metadata (Fig. 2 elements 210, 118, 204', col. 4 line 58 - col. 5 line 5, where the decoded core signal and selection signal are input to the enhancement decoder); 
obtaining, as an output from the audio enhancer, enhanced audio data (Fig. 2 elements 210, 212, col. 4 line 58 - col. 5 line 5, where the enhanced audio data is decoded); and
outputting the enhanced audio data (Fig. 2 elements 210, 212, col. 4 line 58 - col. 5 line 5, where the enhanced audio data is output),
Ashley does not teach:
wherein the audio enhancer is a Generator trained in a Generative Adversarial Network (GAN) setting.
Pascual teaches:
wherein the audio enhancer is a Generator trained in a Generative Adversarial Network (GAN) setting (Page 2 Fig. 2, section 3, where a GAN is used for speech enhancement).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Ashley by using the GAN of Pascual (Pascual Fig. 2, section 3) as the enhancer of Ashley (Ashley Fig. 2 elements 202, 206) to provide a quick enhancement process with a simple and generalizable system (Pascual page 1 col. 2 bullet points).
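
As an illustrative sketch only, the decoder-side steps recited in claim 19 reduce to core decoding followed by a metadata-conditioned enhancer. The code below is not an implementation of Pascual's network or of Ashley's decoder: `toy_generator` is a hypothetical smoothing filter used purely as a placeholder where a trained Generator would be called, and the quantizer step and `amount` field are likewise assumptions.

```python
from typing import Callable, Dict, List, Sequence

Metadata = Dict[str, float]
Generator = Callable[[List[float], Metadata], List[float]]

STEP = 0.5  # assumed quantizer step of a hypothetical core codec


def core_decode(codes: Sequence[int]) -> List[float]:
    """Hypothetical core decoder: rescales quantizer indices to samples."""
    return [c * STEP for c in codes]


def toy_generator(raw: List[float], metadata: Metadata) -> List[float]:
    """Placeholder for a trained Generator (NOT a GAN).

    Blends each sample with the mean of its immediate neighbourhood; the blend
    weight metadata['amount'] (0.0 to 1.0) controls the amount of enhancement.
    """
    amount = metadata["amount"]
    out = []
    for i, sample in enumerate(raw):
        window = raw[max(0, i - 1): i + 2]
        local_mean = sum(window) / len(window)
        out.append((1.0 - amount) * sample + amount * local_mean)
    return out


def decode_and_enhance(
    codes: Sequence[int], metadata: Metadata, generator: Generator
) -> List[float]:
    """Receive coded audio and metadata, core decode, enhance, and output."""
    raw = core_decode(codes)
    return generator(raw, metadata)
```

Passing the enhancer in as a callable keeps the decode path unchanged when the placeholder is swapped for a different enhancement module.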

Regarding claim 20, Ashley in view of Pascual teaches:
The method of claim 19, wherein processing the core decoded raw audio data based on the enhancement metadata is performed by applying one or more audio enhancement modules in accordance with the enhancement metadata (Ashley Fig. 2 element 210, col. 4 line 58 - col. 5 line 5, where the audio data is decoded based on the selection signal).  

Regarding claim 21, Ashley in view of Pascual teaches:
The method of claim 19, wherein, during training in the Generative Adversarial Network, obtaining the enhanced audio data as output of the Generator is conditioned based on the enhancement metadata (Pascual page 2 Fig. 2, second column, where training the GAN includes using the input and vector z to determine the output, and Ashley col. 4 lines 17-36, where the output depends on the selection signal).

Regarding claim 22, Ashley in view of Pascual teaches:
The method of claim 19, wherein the enhancement metadata include at least an indication of an encoding quality of the original audio data (Ashley Fig. 2 element 202, col. 4 lines 17-36, where the selection is based on comparison of the original and reconstructed signals).  

Regarding claim 25, Ashley in view of Pascual teaches:
A decoder for generating enhanced audio data from low-bitrate coded audio data based on enhancement metadata, wherein the decoder includes one or more processors configured to perform the method of claim 19 (Ashley Fig. 2, col. 3 lines 1-20, where a processor is used to implement the decoder).  

Regarding claim 27, Ashley in view of Pascual teaches:
A computer program product comprising a computer-readable storage medium with instructions adapted to cause a device to carry out the method according to claim 19 when executed on a device having processing capability (Ashley col. 3 lines 1-20, where program instructions are stored for implementing the method).

Claims 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Ashley in view of Atti et al. (US 2019/0103118 A1), hereinafter referred to as Atti.

Regarding claim 15, Ashley teaches:
The method of claim 1
Ashley does not teach:
wherein the enhancement metadata include one or more bitstream parameters.
Atti teaches:
wherein the enhancement metadata include one or more bitstream parameters (para [0100], where the header or metadata includes the bitrate, supported in para [0099] of the provisional application).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Ashley by using the bitrate of Atti (Atti para [0100]) in the metadata of Ashley (Ashley col. 4 lines 17-36), in order to determine whether or not to encode portions of the data (Atti para [0081]).

Regarding claim 16, Ashley in view of Atti teaches:
The method of claim 15, wherein the one or more bitstream parameters include one or more of a bitrate, a scale factor values related to AAC-based codecs and Dolby AC-4 codec and a Global Gain related to AAC-based codec (Atti para [0100], where the bitrate is the parameter).  

Regarding claim 17, Ashley in view of Atti teaches:
The method of claim 15, wherein the bitstream parameters are used to guide enhancement of original audio data in a Generator trained in a Generative Adversarial Network setting and/or wherein the bitstream parameters include an indication on whether to enhance the decoded raw audio data by the Generator (Atti para [0113], where metadata includes a similarity value, which is used in para [0074] to skip encoding of a frame, and Ashley Fig. 2 elements 202, 204, col. 4 lines 17-36, where the metadata is the selection signal, which indicates which type of enhancement layer encoder is used).
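
As an illustrative sketch only, enhancement metadata that carries a bitstream parameter (here a bitrate) together with an indication of whether to enhance could be serialized as a small fixed-size header. The three-byte layout, the field names, and the class below are hypothetical and are not drawn from Atti, Ashley, or the application.

```python
import struct
from dataclasses import dataclass

# Assumed layout: big-endian unsigned 16-bit bitrate (kbit/s) + 1-byte flag.
_HEADER_FORMAT = ">HB"


@dataclass(frozen=True)
class EnhancementMetadata:
    bitrate_kbps: int  # bitstream parameter carried in the metadata
    enhance: bool      # indication of whether the decoder should enhance


def pack_metadata(md: EnhancementMetadata) -> bytes:
    """Serialize the metadata into the assumed three-byte header."""
    return struct.pack(_HEADER_FORMAT, md.bitrate_kbps, int(md.enhance))


def unpack_metadata(header: bytes) -> EnhancementMetadata:
    """Parse the assumed three-byte header back into metadata."""
    bitrate_kbps, flag = struct.unpack(_HEADER_FORMAT, header)
    return EnhancementMetadata(bitrate_kbps=bitrate_kbps, enhance=bool(flag))
```

A decoder reading such a header could skip the enhancer whenever `enhance` is false and could otherwise pass `bitrate_kbps` to the enhancer as a guiding parameter.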

Claims 23-24 are rejected under 35 U.S.C. 103 as being unpatentable over Ashley in view of Pascual, and further in view of Atti.

Regarding claim 23, Ashley in view of Pascual teaches:
The method of claim 19
Ashley in view of Pascual does not teach:
wherein the enhancement metadata include one or more bitstream parameters.
Atti teaches:
wherein the enhancement metadata include one or more bitstream parameters (para [0100], where the header or metadata includes the bitrate, supported in para [0099] of the provisional application).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Ashley in view of Pascual by using the bitrate of Atti (Atti para [0100]) in the metadata of Ashley in view of Pascual (Ashley col. 4 lines 17-36), in order to determine whether or not to encode portions of the data (Atti para [0081]).

Regarding claim 24, Ashley in view of Pascual and Atti teaches:
The method of claim 23, wherein the one or more bitstream parameters include one or more of a bitrate, a scale factor values related to AAC-based codecs and Dolby AC-4 codec and a Global Gain related to AAC-based codec (Atti para [0100], where the bitrate is the parameter).  

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. US 2016/0225387 A1 para [0237] teaches user input for specifying a preference for certain types of speech enhancement operations.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRYAN S BLANKENAGEL whose telephone number is (571)270-0685. The examiner can normally be reached 8:00am-5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at 571-272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BRYAN S BLANKENAGEL/
Primary Examiner, Art Unit 2658