DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is in response to the Applicant’s amendment filed on February 23, 2022, in which the Applicant amended claims 1, 12-13, and 19-20.
By virtue of this communication, claims 1-20 are currently pending in this Office Action.
With respect to the objection of claims 12-13, 19-20 due to formality issues, as set forth in the previous Office Action, the Applicant’s amendment, and argument, see paragraph 3 of page 6 in Remarks filed on February 23, 2022, have been fully considered and the argument is persuasive. Therefore, the objection of claims 12-13, 19-20 due to the formality issues, as set forth in the previous Office Action, has been withdrawn.
With respect to the rejection of claims 1-2, 4-20 under 35 USC §112(b), as set forth in the previous Office Action, the Applicant’s amendment, and argument, see paragraph 5 of page 6 in Remarks filed on February 23, 2022, have been fully considered and the argument is persuasive. Therefore, the rejection of claims 1-2, 4-20 under 35 USC § 112(b), as set forth in the previous Office Action, has been withdrawn. The rejection of claim 3 under 35 USC §112(b) is maintained as set forth below. 
The Examiner appreciates the explanation of the amendment and the analysis of the prior art; however, although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993) and MPEP 2145.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 3-7 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which applicant regards as the invention.
Claim 3 is rejected, as set forth in the previous Office Action under the heading above, because the recited “the audio object gains” has insufficient antecedent basis in claim 3.
Claim 4 is rejected, as set forth in the previous Office Action under the heading above, because the recited “any of claim 1” is unclear.
Claim 5 is rejected for at least the same reason as claim 4 above, because claim 5 recites the same deficient feature as claim 4. Claims 6-7 are rejected due to their dependency on claim 5.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over US 20160219391 A1 (hereinafter Ward) in view of Kuech et al. (US 20160240204 A1, hereinafter Kuech).
Claim 1: Ward in this embodiment teaches a method (title and abstract, ln 1-21, and method implemented at an audio decoder 100 in fig. 5, details in fig. 1A), performed by a downstream audio rendering stage in an end-to-end audio processing chain (elements in fig. 1A; in an encoder-decoder chain in fig. 5), comprising: 
receiving and decoding a coded bitstream (receiving an encoded audio signal 102 via the DEMUX in fig. 5 and the encoded audio signal 102 in fig. 1A) generated by an upstream audio processor (produced by the MUX in the audio encoder 150 and through the content creator in the element 150 in fig. 5; output as 102 from the audio signal encoder 158 in fig. 1B), wherein the coded bitstream is encoded with audio content (through the NGC encoder of the element 150 in fig. 5, encoded by audio signal encoder 158, etc., in fig. 1B, para [0091]; media content including individual portions of contents such as frames, blocks, para [0040], having programs, commercials, and transitions between the program and the commercial, etc., para [0049], para [0090], an entire program covering a movie, a TV program, a radio broadcast, etc., para [0063], dialogue, non-dialogue audio content, para [0091]) and audio metadata corresponding to the audio content (downmix related metadata including sets of gains including DRC gains, etc., computed at the audio encoder, with respect to audio blocks, frames, para [0040], being associated with dialogue, non-dialogue contents, para [0091], programs and commercials, para [0041], para [0090], provided by the audio encoder, detailed in fig. 3, para [0040], para [0100]); 
wherein the audio content includes first audio objects corresponding to a first media content type of two consecutive media content types (e.g., one of several programs and the one or more commercials as audio objects, para [0090]; with the program, the commercial, and the loudness level transition between them, para [0049], and thus the two consecutive media content types are inherent for experiencing loudness level transitions; a portion of an entire program, e.g., a movie, a TV program, a radio broadcast, etc., as audio objects with different content types, para [0063]; individual portion of the audio content as audio objects, para [0040]) and second audio objects corresponding to a second media content type of the two consecutive media content types (another of the several programs and the one or more commercials with loudness level transitions between them, para [0049] and the discussion above; other portion of the entire program above, para [0063]; other individual portion of the audio content, para [0040]); 
wherein first and second audio object gains are respectively for the first and second audio objects (DRC gains, smoothed DRC gains produced from the audio encoder in fig. 3, para [0040], para [0100], corresponding to audio contents with different types including dialogue/non-dialogue content types, para [0091], etc., and the discussion above, using Huffman coding, differential coding, to encode the gains, which are inherently transmitted to the decoder for application), generated at least in part based on a first fading curve of the first media content type (DRC curves, figs. 2A/2B, used to derive the DRC gains from input loudness levels, para [0098], e.g., defined upon different audio content types including music, film, speech and having parameters including attack threshold, release threshold, fast/slow attack time, fast/slow release time, holdoff period, etc., in table 1, para [0098]-[0099], e.g., one of the content types of music, speech, etc.) and a second fading curve of the second media content type, respectively (the discussion above, e.g., another one of the content types of music, speech, etc., in table 1, the gains including smoothing, DRC gains, gain limiting, para [0100]-[0106]; the calculation can be in the encoder or decoder, para [0100]);
applying the first and second audio object gains generated at least in part based on the first and second fading curves to the first and second audio objects, respectively (applying the DRC gains, gain limiting and gain smoothing determined by at least the dialogue loudness levels and DRC curves, to the audio contents with different content types defined in table 1 in fig. 3, para [0100]-[0105]; applying the loudness related gain to the audio content, para [0106]; via the part of the audio renderer 108 in fig. 1A); 
rendering a sound field represented by the first audio objects with the applied first audio object gain and the second audio object with the applied second audio object gain (generating channel-specific audio data after applying the gains as determined based on DRC, gain limiting, gain smoothing, etc., to the input audio data extracted from the encoded audio signal 102 and driving the speakers, headphones, represented in the speaker configuration, para [0085]). 
However, Ward does not explicitly teach wherein the audio metadata includes the disclosed first and second audio object gains, although Ward teaches encoding the gains (by using Huffman coding, differential coding, etc., para [0040]).
Kuech teaches an analogous field of endeavor by disclosing a method, performed by a downstream audio rendering stage in an end-to-end audio processing chain (title and abstract, ln 1-17 and fig. 2, details in fig. 3, an audio encoder and an audio decoder chain in figs. 1-2), comprising:
receiving and decoding a coded bitstream generated by an upstream audio processor (audio decoder for decoding an audio bitstream and a metadata bitstream related to the audio bitstream, abstract), wherein the coded bitstream is encoded with audio content (provided by a content creator, para [0020], para [0123]) and audio metadata corresponding to the audio content (provided by DRC gain sequences by DRC & gCP metadata encoder in fig. 1); and wherein the audio metadata includes first and second audio object gains, respectively for the first and second audio objects (at least two DRC gains related to different audio objects and metadata encoder including at least two DRC gains, para [0030]) for benefits of achieving an operation and sound quality improvement by integrating DRC, clipping prevention gain adjustment, and maximum limiter in a signal processing chain (para [0123], fig. 5).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied wherein the audio metadata includes first and second audio object gains, respectively for the first and second audio objects, as taught by Kuech, to the metadata and the first and second audio object gains in the method, as taught by Ward, for the benefits discussed above.
Claim 12 has been analyzed and rejected according to claim 1 above and the combination of Ward and Kuech further teaches a method performed by an upstream audio processor (Ward, title and abstract, ln 1-21, and method implemented at an audio encoder 150 in fig. 5 and details in fig. 1B) prior to a downstream audio rendering stage in an end-to-end audio processing chain (Ward, the processing chain in fig. 5; the decoder 100 as the downstream audio rendering stage detailed in fig. 1A, para [0088]), comprising: 
generating first audio object gains, for first audio objects of a first media content type of two consecutive media content types, based at least in part on a first fading curve for the first media content type (Ward, dialogue loudness levels from the element 154, DRC parameters from the DRC reference repository 156 in fig. 1B; through the DRC gains, limiting gains, and smoothing gains calculated at the encoder upon the at least dialogue loudness levels and DRC parameters in fig. 3, para [0100]; including attack time, release time, etc., with different content types in table 1, e.g., music content type in table 1, para [0098]-[0099] and the discussion in claim 1 above);
generating second audio object gains, for second audio objects of a second media content type of the two consecutive media content types, based at least in part on a second fading curve for the second media content type (Ward, the discussion above, e.g., speech content type in table 1, para [0098]-[0099], and the discussion in claim 1 above);
generating a coded bitstream encoded with audio content and audio metadata corresponding to the audio content (Ward, outputting encoded input signal 102 from the audio signal encoder 158 in fig. 1B, para [0062], para [0088]; by formatting the audio content into audio blocks/frames, and formatting the dialogue loudness levels, the DRC reference parameters etc., into metadata and encoding the audio data blocks/frames and the metadata into the encoded audio signal 102, para [0094]); 
wherein the audio content includes the first and second audio objects (Ward, e.g., music and speech defined in table 1; a set of programs and one or more commercials with loudness transitions, para [0090]; individual portions of the audio frames/blocks, para [0040]; audio only and audiovisual, para [0090]; an entire program including a movie, a TV program, a radio broadcast, etc., para [0063]; dialogue and non-dialogue contents, para [0091], and the discussion in claim 1 above); 
wherein the audio metadata includes the first and second audio object gains (Ward, including DRC gains with different content types in table 1 and DRC curves, and the discussion in claim 1 above, and Kuech, the metadata includes the at least two DRC gains for the audio objects, para [0030]); 
sending the coded bitstream to the downstream audio rendering stage (Ward, via the wireless, wire, one or more connections, etc., to the decoder in fig. 5, para [0095]). 
Claim 2: the combination of Ward and Kuech further teaches, according to claim 1 above, wherein the first and second audio object gains generated by the upstream audio processor (Ward, DRC gains are generated at audio encoder, para [0100] and Kuech, the DRC gains are generated and encoded into the metadata in the audio encoder in fig. 1) free the upstream audio processor from performing cross-fading operations on the first audio objects of the first media content type and the second audio objects of the second media content type (Ward, the application of the DRC gains implemented at the audio decoder in fig. 1A, and Kuech, the application of DRC gains with other gains is implemented at the audio decoder in figs. 3-5). 
Claim 3: the combination of Ward and Kuech further teaches, according to claim 1 above, wherein cross fading gain components in the audio object gains approximate the first and second fading curves. 
Claim 4: the combination of Ward and Kuech further teaches, according to claim 1 above, wherein the first fading curve and the second fading curve are linear. 
Claim 5: the combination of Ward and Kuech further teaches, according to claim 1 above, wherein the first fading curve and the second fading curve are determined by interpolating at least two consecutive first audio object gains and two consecutive second audio object gains, respectively (Ward, the DRC curves in figs. 2A/2B, defined upon the different content types such as music, speech, film, in table 1, para [0098]-[0099], which consist of piece-wise linear segments, para [0120], and thus each line segment inherently connects two points on the curve to obtain a continuous sequence of DRC gains in figs. 2A/2B). 
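For illustration only, the interpolation recited in claim 5 can be sketched as below. This is a minimal sketch assuming linear interpolation over time between consecutive gain updates; the function name `gains_per_sample` and the (sample index, gain) update format are hypothetical and are not taken from Ward, Kuech, or the application.

```python
def gains_per_sample(updates):
    """Expand sparse gain updates into one gain value per audio sample.

    `updates` is a hypothetical list of (sample_index, gain) pairs with
    strictly increasing sample indices. Gains between two consecutive
    updates are obtained by linear interpolation (an assumption).
    """
    if not updates:
        return []
    out = []
    for (n0, g0), (n1, g1) in zip(updates, updates[1:]):
        span = n1 - n0  # number of samples between the two updates
        for n in range(span):
            out.append(g0 + (g1 - g0) * n / span)
    out.append(updates[-1][1])  # gain at the last update position
    return out


# A linear ramp from gain 1.0 down to 0.0 across five samples.
ramp = gains_per_sample([(0, 1.0), (4, 0.0)])
```

With two updates the result is a single straight segment; with more updates it is a piece-wise linear curve passing through every update.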
Claim 6: the combination of Ward and Kuech further teaches, according to claim 5 above, wherein a minimum number of audio samples representing the audio content is used between metadata updates (Ward, using the histogram to control the speed of gain changes in loudness level transitions between a program and a commercial by modifying the time constants, para [0049] and Kuech, using a gain smoothing filter, para [0089]), wherein the metadata updates are associated to a particular audio sample (Ward, between the program and the commercial, and Kuech, using a look-ahead delay of the input signal with the smoothing filter, para [0089]). 
Claim 7: the combination of Ward and Kuech further teaches, according to claim 6 above, wherein the metadata updates (Ward, DRC gains per frame and block, para [0040], and subdivisions and components of an audio data frame, para [0040], and Kuech, a frame having one or more DRC gains, para [0003]).
However, the combination of Ward and Kuech does not explicitly teach a maximum of eight per frame. 
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have recognized that selecting seven, eight, nine, or another maximum number per frame is a matter of designer’s choice: for example, more metadata updates per frame allow real-time and accurate adaptation to changes in the audio data, while fewer metadata updates per frame give smoother behavior that avoids quick, accidental audio data spikes.
Claim 8: the combination of Ward and Kuech further teaches, according to claim 1 above, the method further comprising supporting, by the downstream audio rendering stage, long term and short-term cross fading (Ward, fast/slow release time and attack time with different audio content types in table 1). 
Claim 9: the combination of Ward and Kuech further teaches, according to claim 1 above, wherein the first and second object gains are generated at least in part based on selected dynamic range compression curves (Ward, the DRC gains are based upon the DRC curves with respect to the measured loudness level, para [0037], para [0057], para [0072]-[0075]) and wherein the method further comprises applying the first and second object gains to one or more loudness levels in one or more types of loudness levels represented by first and second audio objects to achieve a target loudness levels for a specific playback environment (Ward, the DRC gains are applied to the audio content with the loudness levels, to achieve the desired target, para [0057]-[0058]). 
Claim 10: the combination of Ward and Kuech further teaches, according to claim 9 above, wherein the selected dynamic range compression curves are smoothed with a time constant (Ward, smoothed DRC gain obtained from the DRC curve, para [0076]-[0077]). 
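For illustration only, the two operations relied upon in claims 9 and 10 (looking up a gain on a piece-wise linear DRC curve, then smoothing the gains with attack and release time constants) can be sketched as below. This is a minimal sketch under stated assumptions: the break points in `CURVE`, the one-pole smoothing filter, and all names are hypothetical and are not taken from Ward or Kuech.

```python
import math

# Hypothetical DRC curve: (input loudness in dB, gain in dB) break points,
# sorted by input loudness. The values are invented for illustration.
CURVE = [(-60.0, 12.0), (-40.0, 6.0), (-31.0, 0.0), (-20.0, 0.0), (0.0, -10.0)]


def drc_gain_db(level_db, curve=CURVE):
    """Return the gain in dB for an input loudness level, by linear
    interpolation between break points and clamping outside the curve."""
    if level_db <= curve[0][0]:
        return curve[0][1]
    if level_db >= curve[-1][0]:
        return curve[-1][1]
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if x0 <= level_db <= x1:
            t = (level_db - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    return curve[-1][1]  # not reached for a sorted curve


def smooth_gains(gains_db, attack_s, release_s, block_s):
    """Smooth a sequence of per-block gains with a one-pole filter.

    Assumption: a falling gain uses the attack time constant and a rising
    gain uses the release time constant; `block_s` is the block duration.
    """
    if not gains_db:
        return []
    a_att = math.exp(-block_s / attack_s)
    a_rel = math.exp(-block_s / release_s)
    state = gains_db[0]
    out = []
    for g in gains_db:
        coeff = a_att if g < state else a_rel
        state = coeff * state + (1.0 - coeff) * g
        out.append(state)
    return out


# Gains for three consecutive blocks, then smoothed over 10 ms blocks.
raw = [drc_gain_db(level) for level in (-31.0, 0.0, 0.0)]
smoothed = smooth_gains(raw, attack_s=0.01, release_s=0.1, block_s=0.01)
```

The smoothed sequence approaches the target gain gradually instead of jumping to it, which is the effect a time constant has on the gain trajectory.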
Claim 11: the combination of Ward and Kuech further teaches an audio object decoder comprising an audio rendering stage configured to perform the method according to claim 1 (Ward, audio decoder in fig. 1A). 
Claim 13 has been analyzed and rejected according to claims 12, 2 above.
Claim 14 has been analyzed and rejected according to claims 12, 9 above.
Claim 15 has been analyzed and rejected according to claims 14, 10 above.
Claim 16: the combination of Ward and Kuech further teaches, according to claim 1 above, wherein the audio content is position-less (Ward, one or more commercials, programs, TV, movie, etc., as discussed in claim 1 above, which inherently have no position assigned until rendered to channel-specific data to drive speakers, para [0085]).
Claim 17 has been analyzed and rejected according to claims 12, 16 above.
Claim 18: the combination of Ward and Kuech further teaches an encoder comprising an audio processor configured to perform the method of claim 12 (Ward, audio encoder in fig. 1B, and Kuech, audio encoder in fig. 1).
Claim 19: the combination of Ward and Kuech further teaches a non-transitory computer readable storage medium, comprising software instructions, which when executed by one or more processors cause the one or more processors to perform the method of claim 1 (Ward, non-transitory computer readable storage medium storing software instructions and executed by one or more processors to implement the method in claim 1 above, para [0227], and Kuech, non-transitory digital storage medium having computer program stored thereon to perform the method, para [0005]).
Claim 20 has been analyzed and rejected according to claims 12, 19 above.

Response to Arguments

Applicant's arguments filed on February 23, 2022 have been fully considered but are moot in view of the new ground(s) of rejection necessitated by the Applicant's amendment. Although a new ground of rejection has been used to address the limitations added to at least claims 1 and 12, a response is considered necessary for several of the Applicant’s arguments since references Ward and Kuech will continue to be used to meet several claimed limitations.
With respect to the prior art rejection of independent claim 1 under 35 USC §103(a), as set forth in the Office Action and for the claimed features “… generating first audio objects gains, for first audio objects of a first media content type of two consecutive media content types, based at least in part on a first fading curve for the first media content type; generating second audio object gains, for second audio objects of a second media content type of the two consecutive media content types, based at least in part on a second fading curve for the second media content type …”, the Applicant argued: “those with ordinary skill in the art would not equate DRC curves with fading curves, which are two completely different audio processes. DRC gains are generated from compression curves that are designed to adjust the range between loud and quiet portions of an audio signal. By contrast, fading curves are not generated to adjust the range between loud and quiet portions of an audio signal. For example, cross-fades, which are a type of a fading curve, are typically applied to two consecutive audio segments: at the end of the first audio segment and at the beginning of the second audio segment, irrespective relative amplitude considerations. 
The cross-fade curve can be any shape such as …” and thus, “Indeed, nowhere in Ward or Kuech are fading curves disclosed or suggested… Applicant’s Specification makes clear that DRC curves and fading curves are different by stating in reference to a particular embodiment: ‘By having the first and second object gains generated in part based on cross fading curves and in part based on selected dynamic range compression curves, it is possible to aggregate both cross fading aspects and DRC aspects in a single gain component, thereby …’ Paragraph [0129]”, and “Accordingly, equating DRC curves with fading curves is not supported by Applicant’s Specification”, as asserted in paragraph 4 of page 7 and paragraphs 1-3 of page 8 in Remarks filed on February 23, 2022.
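For illustration only, the cross-fade described in the quoted argument (a fade applied at the end of a first audio segment and at the beginning of a second) can be sketched as below. This is a minimal sketch assuming linear, complementary fade weights; the function name `crossfade` and the list-of-samples representation are hypothetical and are not taken from the Remarks, Ward, or Kuech.

```python
def crossfade(first, second, overlap):
    """Join two sample lists with a linear cross-fade.

    The last `overlap` samples of `first` fade out while the first
    `overlap` samples of `second` fade in; the two weights always sum
    to 1. The weights depend only on position in time, not on level.
    """
    if overlap < 1 or overlap > min(len(first), len(second)):
        raise ValueError("overlap must fit inside both segments")
    mixed = []
    for n in range(overlap):
        w = (n + 1) / (overlap + 1)  # fade-in weight, strictly between 0 and 1
        tail = first[len(first) - overlap + n]
        mixed.append((1.0 - w) * tail + w * second[n])
    return first[:-overlap] + mixed + second[overlap:]


# Four samples at 1.0 cross-faded into four samples at 0.0 over three samples.
joined = crossfade([1.0] * 4, [0.0] * 4, overlap=3)
```

The output is shorter than the two inputs combined by `overlap` samples, because the overlapping region is mixed into one.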
In response to the argument cited above, the Examiner respectfully disagrees because claim 1, like claim 12, broadly recites “a fading curve for the first media content type” and “a fading curve for the second media content type” with no recitation of the argued “cross-fade” (that is, fading in one audio medium while fading out another during the transition between the two consecutive media types); i.e., the argued feature is not recited in claims 1 and 12, and the argument above is moot. Note: the broadly claimed “fading curve” does not have to be interpreted as the argued “cross-fading”; e.g., a fading or fading curve could be caused by dynamic range control (DRC) (e.g., US 20140355786 A1 by Betbeder et al., para 25). The Applicant further pointed to the features described in the application’s specification (para [0129] above); however, it is well known in the art that “aggregating both cross fading aspects and DRC aspects in a single gain component” would exhibit “fading curve” behavior, albeit under dynamic range control (DRC), and therefore the argument above is moot.
Therefore, on the basis of the above analysis and the evidence from the prior art, the prior art rejection of independent claim 1 under 35 USC §103(a), as set forth in the Office Action, is maintained. For at least similar reasons to those discussed above, the prior art rejection of the other independent claim 12 and dependent claims 2-11, 13-20 is also maintained. 
In the response to this office action, the examiner respectfully requests that support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line numbers in the specification and/or drawing figure(s). This will assist the Examiner in prosecuting this application.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LESHUI ZHANG whose telephone number is (571)270-5589.  The examiner can normally be reached on Monday-Friday 6:30am-4:00pm EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vivian Chin can be reached on 571-272-7848.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LESHUI ZHANG/
Primary Examiner, Art Unit 2654