DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The certified copy has been filed in parent Application No. KR10-2019-0079377, filed on July 02, 2019.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on March 10, 2020 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claim 12 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The independent claim 12 recites “A method of encoding a high band of an audio, the method performed by an encoder, the method comprising: extracting a parameter extracted through a first neural network; and quantizing the extracted parameter, wherein the parameter is transmitted to a decoder, input into a third neural network together with side information extracted through a second neural network, and used to restore a high band of an audio.” 

This judicial exception is not integrated into a practical application. In particular, p. 3, line 26 recites the additional elements of “a decoder including a processor” as per the independent claims. For example, p. 17, lines 20-26 of the as-filed specification describe using a general-purpose computing environment or computing device, as also recited at p. 18, line 17. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of using a computer amounts to no more than a generic computing system architecture. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Further, the additional limitations in the claim noted above are directed towards insignificant extra-solution activity. The claim is not patent eligible.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6 and 9-17 are rejected under 35 U.S.C. 103 as being unpatentable over Schmidt et al. (US Pub. No. US 2020/0243102) in view of “Nonlinear Prediction with Deep Recurrent Neural Networks for Non-Blind Audio Bandwidth Extension” by Lin et al. (2018).
Regarding claims 1 and 13, Schmidt teaches a method of decoding a high band of an audio, the method performed by a decoder, the method comprising: identifying a parameter extracted through a first neural network (see [0031], wherein in one embodiment “all processing is done in the decoder without the need for transmitting extra bits. Parameters like spectral envelope parameters are estimated by a regressive convolutional deep neural network (CNN) with long short-term memory (LSTM).”)
As to claim 13, the decoder comprising a processor corresponds to the method of claim 1 as the apparatus configured to perform it, with each claimed element's function corresponding to a step of the claimed method. Accordingly, claim 13 is rejected under the same rationale as applied above with respect to the method claim. Furthermore, Schmidt teaches a processor (see [0013], “a neural network processor” and [0158], “a processing means”).
Schmidt does not teach identifying side information extracted through a second neural network; and restoring a high band of an audio by applying the parameter and the side information to a third neural network.
Lin teaches identifying side information (see p. 75, section 2.2, col. 1, where “the low band part of the audio spectrum provides information about the spectral shape of the high band part.”) extracted through a second neural network (see p. 75, section 2.2, col. 2, where “HF signal is not only associated with the LF signal of the current frame, but also associated with the LF signal of the front frame… HF construction can be derived from the LF signal of context dependent frames besides the current frame”; and see p. 77, section 3.1, col. 2, where the information provided by the LF signal of the past and current frames is extracted through a network: “the prediction function f(l) is a neural network model” and “y=f(l) is the prediction value of the model from decoded LF signal l.”)
Lin also teaches restoring a high band of an audio by applying the parameter and the side information to a third neural network (see p. 76, section 3.1, col. 2, where there is a third network: “a nonlinear mapping model to predict the coarse HF signal using the context dependent LF signal.”; see p. 74, section 1-2.1, col. 2, where parameters extracted in Fig. 1 are also applied to the neural network: “few parameters of HF are transmitted to the decoder side for reconstructing the high frequency signals…final HF signal is recreated using coarse HF signal and decoded HF parameters”; and see p. 78, section 3.2.1-2, col. 1-2, where the side information is used from contextual frames in the sequence of preceding and following frames: “RNNs are able to incorporate contextual information from previous input vectors…for our prediction system, because of the context dependency correlation phenomenon, we desire the model to have access to both past and future context...combining RNNs with LSTM gives rise to BLSTM, which can access long-range context in both directions.”)
Schmidt and Lin are combinable because they both propose methods for non-blind audio bandwidth extension that involve neural networks.
Therefore it would have been obvious for a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the spectral band replication and parametric encoding from the neural network laid out in Schmidt with the use of contextual frames as extracted side parameters in another neural network in Lin to restore a high band of audio in a final neural network. One would be motivated to do so because there is a correlation between the information in the HF and LF signals that improves HF construction in restoring a high band of audio (p. 75, section 2.2, col. 1).

Regarding claims 2 and 14, the combination of Schmidt and Lin teaches the method of claim 1 and the decoder of claim 13. Schmidt teaches wherein the first neural network is configured to extract the parameter of the high band from a first input based on a per-frame spectrum of the audio (see [0026], where “the neural network processor is not fed with the amplitude spectrum, but is fed with the power spectrum of the input audio signal. Furthermore, in this embodiment, the neural network processor outputs a parametric representation and, for example, spectral envelope parameters in a compressed domain such as a LOG domain, a square root domain”, and see [0022], where “the neural network is only used for providing a parametric representation of the high band”).
 
Regarding claims 3 and 15, the combination of Schmidt and Lin teaches the method of claim 1 and the decoder of claim 13. Schmidt teaches wherein the first input is determined to be a subset of a spectrum, and the spectrum includes a high-band coefficient and a low-band coefficient of a current frame (see [0086], where “the neural network processor 30 is configured to receive, at the input layer 32, a spectrogram derived from the input audio signal, the spectrogram comprising a time sequence of spectral frame”, and see [0056], where “The input audio signal frequency range can be a low band range or a full band (e.g., including low and high band) range but with smaller or larger spectral holes”)
and a high-band coefficient and a low-band coefficient of a previous frame (see [0027], where “the input layer into the neural network is a two-dimensional layer having the full frequency range of the input audio signal, and additionally, having certain number of preceding frames as well”).

Regarding claims 4 and 16, the combination of Schmidt and Lin teaches the method of claim 1 and the decoder of claim 13. Schmidt does not teach the second neural network is configured to extract the side information to restore the high band from a second input based on a per-frame spectrum of the audio. Lin teaches the second neural network is configured to extract the side information (see p. 75, section 2.2, col. 1, where “the low band part of the audio spectrum provides information about the spectral shape of the high band part.”, and see p. 75, section 2.2, col. 2, where “HF signal is not only associated with the LF signal of the current frame, but also associated with the LF signal of the front frame… HF construction can be derived from the LF signal of context dependent frames besides the current frame”; and see p. 77, section 3.1, col. 2, where the information provided by the LF signal of the past and current frames is extracted through a network: “y=f(l) is the prediction value of the model from decoded LF signal l” and “the prediction function f(l) is a neural network model”)
to restore the high band from a second input based on a per-frame spectrum of the audio (see p. 80, section 4.2, col. 2: “In our implementation, we generally use the previous 5 frames decoded LFs signal (e.g., side-information) to predict the current frame HFs coarse spectrum”). Schmidt and Lin are combinable because they both propose methods for non-blind audio bandwidth extension that involve neural networks. Therefore it would have been obvious for a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the spectral band replication and parametric encoding from the neural network laid out in Schmidt with the use of contextual frames as extracted side parameters in another neural network in Lin to restore a high band of audio in a final neural network. One would be motivated to do so because there is a correlation between the information in the HF and LF signals that improves HF construction in restoring a high band of audio (p. 75, section 2.2, col. 1).
 
Regarding claims 5 and 17, the combination of Schmidt and Lin teaches the method of claim 1 and the decoder of claim 13. Schmidt does not teach wherein the second input is determined to be a subset of a spectrum. Lin teaches wherein the second input is determined to be a subset of a spectrum, and the spectrum includes a high-band coefficient and a low-band coefficient of a previous frame, and a low-band coefficient of a current frame (see p. 75, section 2.2, col. 1, where “The motivation for all bandwidth expansion methods is the fact that the spectral envelope of the lower and higher frequency bands of the audio signal is dependent (e.g., where the envelope involves current frames and context-dependent frames, including previous ones), i.e., the low band part of the audio spectrum (e.g., a subset of a spectrum) provides information about the spectral shape of the high band part.”). Schmidt and Lin are combinable because they both propose methods for non-blind audio bandwidth extension that involve neural networks. Therefore it would have been obvious for a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the spectral band replication and parametric encoding from the neural network laid out in Schmidt with the use of coefficients of a previous and current frame from a subset of spectrum including high and low bands to restore a high band of audio in a final neural network. One would be motivated to do so because there is a correlation between the information in the HF and LF signals that improves HF construction in restoring a high band of audio (p. 75, section 2.2, col. 1).

Regarding claim 6, Schmidt teaches wherein a first input applied to the first neural network includes a high-band coefficient of the current frame (see [0057], where “a neural network processor 30 configured for generating a parametric representation 70 for the enhancement frequency range (e.g., first neural network) using the input audio signal frequency range of the input audio signal and using a trained neural network”, see [0056] and Figure 1, where the neural network receives “an input audio signal 50 having an input audio signal frequency range. The input audio signal frequency range can be a low band range or a full band (e.g., including high band) range but with smaller or larger spectral holes.”)
Schmidt does not teach wherein a decoding frame of the audio is a current frame and a second input applied to the second neural network includes a low-band coefficient of the current frame. Lin teaches wherein a decoding frame of the audio is a current frame (see p. 80, sections 3.2-4.2, where a DBLSTM-RNN is used “to predict the current frame HFs coarse spectrum”)
and a second input applied to the second neural network includes a low-band coefficient of the current frame (see p. 75, section 2.2, col. 1, where “the current frame LF signal is used to recreate the coarse HF” and see p. 77, section 3.1, col. 2, where the information provided by the LF signal of the past and current frames is extracted through a network: “y=f(l) is the prediction value of the model from decoded LF signal l” and “the prediction function f(l) is a neural network model”). Schmidt and Lin are combinable because they both propose methods for non-blind audio bandwidth extension that involve neural networks. Therefore it would have been obvious for a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the use of a high-band coefficient of a current frame for parametric encoding in the neural network laid out in Schmidt with the use of low-band coefficients of a current frame to restore a high band of audio in a final neural network. One would be motivated to do so because there is a correlation between the information in the HF and LF signals that improves HF construction in restoring a high band of audio (p. 75, section 2.2, col. 1).

Regarding claim 9, the combination of Schmidt and Lin teaches the method of claim 1. Schmidt teaches wherein the identifying of the parameter comprises identifying the parameter by dequantizing a quantized parameter received from an encoder (see [0143] for a description of the bandwidth extension procedure, where some models are used to “predict the vocal tract shape from features calculated on the speech signal (e.g., parameters)”; see [0144], where “a codebook of linear prediction coefficients (LPC) calculated on frames containing the upper band speech signal (e.g., parameters) is created by vector quantization. At decoder-side, features are calculated on the decoded speech signal and an HMM is used to model the conditional probability of a codebook entry (e.g., de-quantizing the parameter) given the features..”)

Regarding claim 10, the combination of Schmidt and Lin teaches the method of claim 1. Schmidt teaches wherein the identifying of the parameter comprises identifying the extracted parameter by randomly sampling an output of the first neural network (see [0060], where the “input audio signal 50 could be used and could be processed by some sort of non-linearity in the time domain”; see [0061], where the “trained neural network that outputs parametric data…These parametric data can be any parametric data describing the missing or bandwidth enhancement signal like….spectral envelope parameters… spectral envelope parameters or a kind of a "base line" parametric representation are spectral envelope parameters and, advantageously, absolute energies or powers for a number of bands.” and are sampled for the parametric representation: “i.e., ten parameters for exemplary ten bands.”; see [0144], where “a coarse description of the parametric representation is used as a first approximation”)

Regarding claim 11, Schmidt teaches wherein the restoring comprises using a high band of the current frame and a high band of at least one previous frame of the current frame (see [0020], where “the raw signal which is a spectrally whitened patched signal is further processed by the raw signal processor using the parametric representation provided from the neural network in order to obtain the processed raw signal having frequency components in the enhancement frequency range” and “refers to certain spectral holes between the maximum frequency and a certain minimum frequency that are filled by the intelligent gap filling procedures”; see [0086], where “the neural network processor 30 is configured to receive, at the input layer 32, a spectrogram derived from the input audio signal, the spectrogram comprising a time sequence of spectral frame”, and see [0056], where “The input audio signal frequency range can be a low band range or a full band (e.g., including low and high band) range but with smaller or larger spectral holes”; see [0027], where “the input layer into the neural network is a two-dimensional layer having the full frequency range of the input audio signal, and additionally, having certain number of preceding frames as well”)

Regarding claim 12, Schmidt teaches a method of encoding a high band of an audio, the method performed by an encoder, the method comprising: extracting a parameter extracted through a first neural network; and quantizing the extracted parameter (see [0030], where within the encoder “a parametric representation generated by the neural network processor is used as a first approximation (e.g., the first inventive processing of the neural network) which is refined, for example, in the parameter domain by some sort of data quantization controlled by a very small number of bits transmitted as additional side information. Thus, an extremely low bitrate guided extension is obtained that, however, relies on a neural network processing within the encoder”)
wherein the parameter is transmitted to a decoder (see [0030], where “the additional low bitrate side information” is transmitted to the decoder along with “the parametric representation from the input audio signal” and “this parametric representation is refined by the additional very low bitrate side information”).
Schmidt does not teach wherein the parameter is input into a third neural network together with side information extracted through a second neural network, and used to restore a high band of an audio. Lin teaches this limitation (see p. 76, section 3.1, col. 2, where there is “a nonlinear mapping model to predict the coarse HF signal using the context dependent LF signal.”; see p. 74, section 2.1, col. 2, where parameters are also applied to the neural network, and the “final HF signal is recreated using coarse HF signal and decoded HF parameters”; and see p. 78, section 3.2.1, col. 2, where “combining BRNNs with LSTM gives rise to BLSTM, which can access long-range context in both directions.”) Schmidt and Lin are combinable because they both propose methods for non-blind audio bandwidth extension that involve neural networks. Therefore it would have been obvious for a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the spectral band replication and parametric encoding from the neural network laid out in Schmidt with the use of contextual frames as extracted side parameters in another neural network in Lin to restore a high band of audio in a final neural network. One would be motivated to do so because there is a correlation between the information in the HF and LF signals that improves HF construction in restoring a high band of audio (p. 75, section 2.2, col. 1).
 
Allowable Subject Matter
Claims 7 and 8 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 103, set forth in this Office action. More specifically, the prior art of record, Schmidt, teaches a first input applied to the first neural network including a high-band coefficient of the current frame (see [0057], where “a neural network processor 30 configured for generating a parametric representation 70 for the enhancement frequency range (e.g., first neural network) using the input audio signal frequency range of the input audio signal and using a trained neural network”, see [0056] and Figure 1, where the neural network receives “an input audio signal 50 having an input audio signal frequency range. The input audio signal frequency range can be a low band range or a full band (e.g., including high band) range but with smaller or larger spectral holes.”; see [0027], where “the input layer into the neural network is a two-dimensional layer having the full frequency range of the input audio signal, and additionally, having certain number of preceding frames as well”) and Lin teaches a second input applied to the second neural network including a low-band coefficient of the previous frame, and a low-band coefficient of the current frame (see p. 75, section 2.2, col. 1, where “the current frame LF signal is used to recreate the coarse HF” and see p. 77, section 3.1, col. 2, where the information provided by the LF signal of the past and current frames is extracted through a network: “y=f(l) is the prediction value of the model from decoded LF signal l” and “the prediction function f(l) is a neural network model”). However, Lin and Schmidt do not teach a high-band coefficient as a second input for the previous frame or when a decoding frame of the audio is a previous frame.
Other prior art of record, such as “Exploiting time-frequency patterns with LSTM-RNNs for low-bitrate audio restoration” by Deng et al., teaches estimating a high-frequency spectral envelope but does not specify using high-band and low-band coefficients as a neural network input to extract side information. In “A Robust Frame-based Nonlinear Prediction System for Automatic Speech Coding”, Azar et al. teach the use of predictive coding to predict a sample for each frame of a speech signal, but employ different kinds of coefficients in the network.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SARVAJNA KALVA whose telephone number is (571) 272-4692. The examiner can normally be reached on Monday - Friday 9 to 6. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached at 571-272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppairmy.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SARVAJNA KALVA/ Examiner, Art Unit 2659

/Paras D Shah/ Primary Examiner, Art Unit 2659

01/03/2022