DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Response to Amendments and Arguments
Regarding an objection to an informality issue, applicant amended claim 2. The objection has been withdrawn. 

Regarding the rejection under 35 U.S.C. §112(b), applicant cancelled claim 5. The rejection under §112(b) has been withdrawn.

Regarding the rejection under 35 U.S.C. §101, applicant amended claim 1 by incorporating a limitation from dependent claim 2. Applicant stated (Remarks, page 6) that “The independent claim 1 as now amended complies with all requirements to be a part of the statutory subject matter under 101.”

Having reviewed amended claim 1 and newly presented claim 10, the examiner finds that both claims still recite limitations directed to a judicial exception (an abstract idea: mathematical operations) without additional elements that amount to significantly more. The rejection under §101 is maintained. See the detailed analysis in the section on 35 U.S.C. §101 below.

Regarding the rejection under 35 U.S.C. §103 over Chorowski (“Unsupervised speech representation learning using WaveNet autoencoders”, published 09/11/2019) in view of Garbacea et al. (US PG Pub. 2020/0234725), applicant stated (Remarks, page 8) that “Claim 1 is directed to, inter alia, for example, encoding and decoding by using association vector generated by concatenating the previous latent vector of the previous frame with the current latent vector of the current frame”.

	Applicant first discussed disclosure in the secondary reference to Garbacea and then alleged (Remarks, page 9) “Therefore, claim 1 is not taught or suggested by Chorowski or Garbacea, whether these references are considered individually or in combination. Withdrawal of the rejection and allowance of claim 1 is respectfully requested.”

	In response, the examiner first notes that “One cannot show non-obviousness by attacking references individually where the rejections are based on combinations of references. In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., Inc., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986).” (See MPEP 2145). 

	Primary reference Chorowski describes applying an autoencoding neural network for representing speech signals with latent representation (Abstract, Section II, unsupervised feature learning to obtain latent representation such as latent vectors). Chorowski further discloses combining latent vectors extracted at neighboring time steps using convolutions (Section III, page 4, “combines latent vectors extracted at neighboring time steps using convolutions”; Section IV, “The jittered latent sequence was passed through a single convolutional layer with filter length 3 and 128 hidden units to mix information across neighboring timesteps”). 

	Chorowski’s description of “combining latent vectors from neighboring time steps” meets the argued limitation: “generating a concatenation vector by concatenating a previous latent vector”. Chorowski’s description “the jittered latent sequence … to mix information across neighboring timesteps” also meets the argued limitation. 
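	For illustration only, and not taken from the cited disclosure: the arithmetic relationship between a length-3 convolution over a latent sequence and a single linear map applied to the concatenation of the neighboring latent vectors can be sketched as follows. The dimensions, weights, and helper names (`conv1d_at`, `stacked_weights`) are hypothetical toy values chosen for the sketch; they are assumptions, not Chorowski's actual layer (which is described as having filter length 3 and 128 hidden units).

```python
def concat(*vectors):
    """Concatenate several vectors into one flat vector."""
    out = []
    for v in vectors:
        out.extend(v)
    return out


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def conv1d_at(latents, taps, t):
    """Output of a length-3 convolution at timestep t.

    taps[0], taps[1], taps[2] are weight matrices applied to the latent
    vectors at t-1, t, and t+1 respectively (hypothetical layout).
    """
    out = [0.0] * len(taps[0])
    for k, tap in enumerate(taps):
        contribution = matvec(tap, latents[t + k - 1])
        out = [a + b for a, b in zip(out, contribution)]
    return out


def stacked_weights(taps):
    """Lay the tap matrices side by side so one matrix acts on the concatenation."""
    return [concat(*(tap[r] for tap in taps)) for r in range(len(taps[0]))]


# Toy 2-dimensional latent vectors at timesteps t-1, t, t+1 (assumed values).
latents = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# One output unit; one weight row per neighboring timestep (assumed values).
taps = [[[1.0, 0.0]], [[0.0, 1.0]], [[0.5, 0.5]]]

via_conv = conv1d_at(latents, taps, 1)
via_concat = matvec(stacked_weights(taps), concat(*latents))
print(via_conv, via_concat)
```

	In this toy setting both computations produce the same value, because a convolution output at one timestep is a weighted sum over the entries of the neighboring latent vectors.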

	The cited primary reference (Chorowski) thus meets the argued limitation, so the argument regarding the secondary reference (Garbacea) is not persuasive. The rejection under §103 based on the combined teaching of Chorowski in view of Garbacea is maintained. 

Regarding newly added claim 10, both Chorowski and Garbacea disclose a decoder (Chorowski, Section III, model description; Garbacea, Fig. 1A, #150, [00149-0052]). Newly added claim 10 is therefore also rejected over the combined teaching of Chorowski in view of Garbacea. 

	After performing an updated search, the examiner identified a new reference, Sung et al. (US PG Pub. 2019/0164052). Sung is a co-inventor of the instant application, and the Sung reference has the same assignee as the instant application. Sung discloses audio encoding / decoding using latent vectors (Sung, Fig. 7 and Fig. 9). Newly added claim 10 is rejected based on the combined teaching of Sung and another reference. 

Claim Rejections - 35 USC § 101
The Manual of Patent Examining Procedure (MPEP) provides detailed rules for determining subject matter eligibility for claims in §2106.  Those rules provide a basis for the analysis and finding of ineligibility that follows.

Claims 1 and 10 are rejected under 35 U.S.C. 101.  The claimed invention is directed to unpatentable subject matter because the claimed invention recites a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.

Although claims 1 and 10 are directed to one of the four statutory categories of invention (MPEP 2106.03(II)), the claims recite a number of steps (“generating a current vector…”; “generating a concatenation vector…”; “encoding and quantizing the concatenated vector...”; “reconstructing a current latent vector…”). These limitations fall within a judicial exception (“abstract idea”). 

In Prong One of the two-prong inquiry (Step 2A in the flowchart in MPEP 2106.04(II)(A)), the above limitations recited in the claims fall within at least one of the groupings of abstract ideas (MPEP 2106.04(a): “Mathematical concepts”, “Certain methods of organizing human activity”, “Mental processes”). It should be noted that these groupings are not mutually exclusive, i.e., some claims recite limitations that fall within more than one grouping or sub-grouping (MPEP 2106.04(a)(2)).

The mathematical concepts grouping is defined as mathematical relationships, mathematical formulas or equations, and mathematical calculations. It is important to note that a mathematical concept need not be expressed in mathematical symbols, because “[w]ords used in a claim operating on data to solve a problem can serve the same purpose as a formula.” In re Grams, 888 F.2d 835, 837 and n.1, 12 USPQ2d 1824, 1826 and n.1 (Fed. Cir. 1989). See MPEP 2106.04(a)(2)(I).

In light of the specification, the claimed “a current latent vector” or “a previous latent vector” are some numerical values. The claimed “a neural network” is a mathematical model. Therefore, claim limitations define some steps of inputting some numerical values into a mathematical model. In other words, the claims are directed to an abstract idea. 
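For illustration only: the recited steps can be sketched as operations on lists of numbers. The sketch below is not the applicant's implementation. The averaging function stands in for the claimed neural network, the uniform scalar quantizer stands in for the claimed encoding and quantizing, and all names, frame values, and the step size are assumptions introduced for the sketch.

```python
def reduce_dimension(frame, factor=2):
    """Hypothetical stand-in for the neural network: average groups of samples."""
    return [sum(frame[i:i + factor]) / factor for i in range(0, len(frame), factor)]


def quantize(vector, step=0.5):
    """Uniform scalar quantization to integer symbols (assumed scheme)."""
    return [round(x / step) for x in vector]


def dequantize(symbols, step=0.5):
    """Inverse of the assumed quantizer."""
    return [s * step for s in symbols]


# Toy audio frames (assumed values).
previous_frame = [0.0, 1.0, 2.0, 3.0]
current_frame = [4.0, 5.0, 6.0, 7.0]

# "generating a current latent vector" by reducing the frame dimension
previous_latent = reduce_dimension(previous_frame)
current_latent = reduce_dimension(current_frame)

# "generating a concatenation vector"
concatenation_vector = previous_latent + current_latent

# "encoding and quantizing the concatenated vector"
bit_stream_symbols = quantize(concatenation_vector)

# "reconstructing a current latent vector"
reconstructed = dequantize(bit_stream_symbols)
reconstructed_current = reconstructed[len(previous_latent):]
print(bit_stream_symbols, reconstructed_current)
```

Every intermediate quantity in the sketch is a list of numerical values, and every step is an arithmetic operation on those values.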

Because the claimed invention falls within a judicial exception according to the above analysis, the claims must be evaluated to determine whether they recite additional elements that integrate the judicial exception into a practical application (MPEP 2106.04(II)(A)(2)). In Prong Two, examiners evaluate whether the claim as a whole integrates the exception into a practical application of that exception. The Court in Gottschalk v. Benson “held that simply implementing a mathematical principle on a physical machine, namely a computer, was not a patentable application of that principle.” Accordingly, after determining that a claim recites a judicial exception in Step 2A Prong One, examiners should evaluate whether the claim as a whole integrates the recited judicial exception into a practical application of the exception in Step 2A Prong Two.

The instant claims do not include additional elements which provide an improvement to another technology or technical field, nor do they recite an improvement to the functioning of the computer itself.  See MPEP §2106.05(a).  The claims require no more than a generic computer to implement the abstract idea, which does not amount to significantly more than an abstract idea.  See MPEP §2106.05(f).  Because the claims only recite use of a generic computer, they do not apply the judicial exception with a particular machine.  See MPEP §2106.05(b).  For these reasons, the claims do not provide a practical application of the abstract idea, nor do they amount to significantly more than an abstract idea under step 2B of the subject matter eligibility analysis.  Using a generic computer to implement an abstract idea does not provide an inventive concept.  Therefore, the claims recite ineligible subject matter under 35 USC §101. 

	Furthermore, an element found to amount to insignificant extra-solution activity in Step 2A of the subject matter eligibility analysis must be evaluated in Step 2B to determine whether the element is well-understood, routine, and conventional. Step 2B asks: Does the claim recite additional elements that amount to significantly more than the judicial exception? The instant claims do not have additional elements. Therefore, the claimed invention is directed to an abstract idea without significantly more. 

Claim Rejections - 35 USC § 103
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Sung et al. (US PG Pub. 2019/0164052, referred to as Sung) in view of Prabhavalkar et al. (US PG Pub. 2020/0357387, referred to as Prabhavalkar). 

The Sung reference is by a co-inventor of the instant application and is assigned to the same assignee (“ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE”). Sung discloses audio encoding / decoding that uses latent vectors and inputs the latent vectors to encode / decode audio signals (Sung, Fig. 5, Fig. 7 and Fig. 9, and relevant sections in the disclosure). 

Regarding claim 10, Sung discloses a method for decoding an audio signal (Fig. 5, #520) comprising: 
reconstructing a current latent vector by decoding a current frame of a bit stream (Fig. 9, #901, [0148-0151]); and 
decoding an audio signal by inputting the latent vector into a decoding neural network (Fig. 9, #903, [0148-0151], [0165-0169]).

Sung discloses encoding an audio signal by generating latent vectors of reduced dimension and outputting a bitstream through a transmission channel (Fig. 5 and Fig. 7). On the receiving end, the latent vectors are restored from the received bitstream and a neural network is used to decode the audio signal (Fig. 5 and Fig. 9).

Sung does not explicitly disclose “generating a concatenation vector by concatenating a previous latent vector reconstructed from a previous frame of the bit stream with the current latent vector”.

Prabhavalkar discloses encoding a speech utterance by concatenating feature vectors obtained from context frames (i.e., previous words) with a current frame (Fig. 1, #234, [0044-0045], [0052], [0070], [0073], [0076]).  

Sung and Prabhavalkar both deal with encoding an audio / speech signal into vectors. It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Sung’s teaching with Prabhavalkar’s teaching to concatenate feature vectors from a previous frame with those of a current frame. One having ordinary skill in the art would have been motivated to make such a modification because a speech / audio signal contains a sequence of frames, and previous frames provide context information for the current frame. Combining the vectors could therefore improve encoding / decoding performance. In addition, all the claimed elements were known in the prior art and one skilled in the art could have combined the elements as claimed by known methods, and in the combination each element merely would have performed the same function as it did separately. “A combination of familiar elements according to known methods is likely to be obvious when it does no more than yield predictable results.” KSR, 550 U.S. ___, 82 USPQ2d at 1395 (2007). One of ordinary skill in the art would have recognized that the results of the combination were predictable.
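For illustration only: the decoding-side arrangement discussed above, in which a latent vector restored from the previous frame is concatenated with the latent vector restored from the current frame before being input to a decoding network, can be sketched as a loop that carries one vector of state. This is not code from Sung or Prabhavalkar. The function names, the zero-valued initial context, and the `toy_network` stand-in are assumptions made for the sketch.

```python
def decode_stream(frames, latent_dim, decoding_network):
    """Decode frame by frame, using the previous reconstructed latent as context."""
    # Assumption: an all-zero context is used before the first frame.
    previous_latent = [0.0] * latent_dim
    outputs = []
    for frame in frames:
        # Stand-in for reconstructing the current latent vector from the bit stream.
        current_latent = [float(symbol) for symbol in frame]
        # Concatenate the previous reconstructed latent with the current latent.
        concatenation_vector = previous_latent + current_latent
        outputs.append(decoding_network(concatenation_vector))
        # The current latent becomes the context for the next frame.
        previous_latent = current_latent
    return outputs


def toy_network(vector):
    """Hypothetical stand-in for a trained decoding neural network."""
    return sum(vector)


decoded = decode_stream([[1, 2], [3, 4]], latent_dim=2, decoding_network=toy_network)
print(decoded)
```

The second output depends on both frames, which is the sense in which the previous frame supplies context to the current one in this sketch.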

Claims 1-3 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Chorowski (Document ID: “Unsupervised speech representation learning using WaveNet autoencoders”) in view of Garbacea (Document ID: US-20200234725-A1).
Regarding claim 1, Chorowski teaches a method for encoding an audio signal, comprising: generating a current latent vector by reducing a dimension of a current frame of an audio signal (Fig. 1 and corresponding description show the dimensionality reduction performed from the encoder to the VQ-VAE stage, for example, 768 going to 64); and encoding and quantizing a vector to output a bit stream (Fig. 1 and corresponding description, and Page 2, Col. 2, Paragraph 4, lines 1-4, show the vector quantization process performed on the encoded latent vector; see also Fig. 2, Table I, and Page 10, Col. 1, Paragraphs 4-6, where the VQ-VAE used for quantization generates bits / token).
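
For illustration only: vector quantization of a latent vector against a codebook, and the resulting number of bits per token, can be sketched as below. This is a generic nearest-codeword lookup, not Chorowski's code. The codebook entries, the latent values, and the function name `nearest_code` are hypothetical.

```python
import math


def nearest_code(latent, codebook):
    """Return the index of the codebook entry closest to the latent vector."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: squared_distance(latent, codebook[i]))


# Toy codebook with four 2-dimensional entries (assumed values).
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [0.0, 2.0]]
latent = [0.9, 1.2]

token = nearest_code(latent, codebook)
# A codebook with K entries needs log2(K) bits to index one token.
bits_per_token = math.log2(len(codebook))
print(token, bits_per_token)
```

With four codebook entries, each token index can be written in two bits, which is how a quantized latent sequence maps to a bit stream in this sketch.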

Chorowski’s description of “combining latent vectors from neighboring time steps” (Section III, page 4) meets the claimed “generating a concatenation vector by concatenating a previous latent vector”. Similarly, Chorowski’s description of “the jittered latent sequence … to mix information across neighboring timesteps” (Section IV, page 5) also meets the argued limitation.

Some features are implicitly disclosed by Chorowski. MPEP 2144.01 states: “[I]n considering the disclosure of a reference, it is proper to take into account not only specific teachings of the reference but also the inferences which one skilled in the art would reasonably be expected to draw therefrom.” In re Preda, 401 F.2d 825, 826, 159 USPQ 342, 344 (CCPA 1968). To further show the claimed features, the examiner cites Garbacea, which discloses more details about speech / audio encoding / decoding using latent representations. 

	Garbacea teaches the claimed limitation of generating a concatenation vector using the current and previous samples (Fig. 1A and Paragraphs 0043-0044, where a discrete latent representation is generated using the current input audio and previous input audio). It would have been inherent to one skilled in the art to have used the discrete latent representation mentioned by Garbacea to provide a formal representation and implementation of the past samples that Chorowski mentions being considered during the encoding process (Chorowski, Page 4, Col. 1, lines 9-14). Garbacea is considered analogous to the claimed invention because it is also aimed towards audio coding and reconstruction. Therefore, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to have modified Chorowski to incorporate the discrete latent representation as taught by Garbacea to improve the fidelity of reconstructed speech (Paragraph 007).

The combined teaching further discloses that generating the current latent vector reduces the dimension of the current frame of the audio signal using a neural network (Garbacea, Fig. 1A and Paragraph 0042, showing an encoder neural network used to obtain the encoder output for the current input audio; a mean pooling process over the time dimension is also mentioned, which will inherently result in a reduced frame dimension).

	Regarding claim 2, Chorowski in view of Garbacea further teaches: wherein the neural network is trained according to a loss function of the current latent vector calculated by setting the previous latent vector as a conditional probability (Garbacea, Fig. 1A-B and Paragraphs 0097-0099, mentioning a system that determines a reconstruction loss to update the decoder and encoder network parameters, which includes finding the probability between the input audio and the decoder input, i.e., the discrete latent representation shown in Fig. 1A. Here, the discrete latent representation, as cited earlier, consists of previous input audio as well as current input audio). Garbacea is considered analogous to the claimed invention because it is also aimed towards audio coding and reconstruction. Therefore, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to have modified Chorowski to incorporate the loss calculation as taught by Garbacea to improve the fidelity of reconstructed speech (Garbacea, Paragraph 007).
Regarding claim 3, Chorowski in view of Garbacea further teaches: wherein the neural network is trained according to an entropy of the current latent vector calculated by setting the previous latent vector as a conditional probability (Garbacea, Fig. 1A-B and Paragraphs 0097-0099, mentioning a system that determines a reconstruction loss to update the decoder and encoder network parameters, which includes finding the probability between the input audio and the decoder input, i.e., the discrete latent representation shown in Fig. 1A. Here, the discrete latent representation, as cited earlier, consists of previous input audio as well as current input audio). The loss used to update the parameters of the encoder network can be equated to the entropy recited in the claim, as both are commonly used interchangeably in the field. Furthermore, the loss found by Garbacea can be seen as performing the same conditional probability function as recited in the claim. Garbacea is considered analogous to the claimed invention because it is also aimed towards audio coding and reconstruction. Therefore, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to have modified Chorowski to incorporate the loss calculation as taught by Garbacea to improve the fidelity of reconstructed speech (Garbacea, Paragraph 007).
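
For illustration only: an entropy of a current symbol conditioned on the previous symbol can be estimated from a sequence of quantized symbols as sketched below. This is a generic empirical estimate, not the loss of Garbacea and not the applicant's training objective. The function name `conditional_entropy_bits` and the example sequences are assumptions made for the sketch.

```python
import math
from collections import Counter


def conditional_entropy_bits(symbols):
    """Empirical entropy, in bits, of each symbol given the symbol before it."""
    pair_counts = Counter(zip(symbols, symbols[1:]))
    previous_counts = Counter(symbols[:-1])
    total_pairs = len(symbols) - 1
    entropy = 0.0
    for (previous, current), count in pair_counts.items():
        p_joint = count / total_pairs                      # p(previous, current)
        p_conditional = count / previous_counts[previous]  # p(current | previous)
        entropy -= p_joint * math.log2(p_conditional)
    return entropy


# Alternating sequence: the current symbol is fully determined by the previous one.
predictable = conditional_entropy_bits([0, 1, 0, 1, 0, 1, 0, 1])
# Here each previous symbol is followed by 0 or 1 equally often.
uncertain = conditional_entropy_bits([0, 0, 1, 1, 0])
print(predictable, uncertain)
```

In the sketch, a sequence whose current symbol is predictable from the previous one needs no additional bits, while a sequence with an even split needs one bit per symbol.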

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jialong He, whose telephone number is (571) 270-5359.  The examiner can normally be reached on Monday – Friday, 8:00AM – 4:30PM, EST.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Pierre Desir, can be reached on (571) 272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JIALONG HE/ Primary Examiner, Art Unit 2659