DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, or 365(c) is acknowledged.  

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 05/24/2019 and 11/17/2020 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.
	
Claim Objections
Claim 21 is objected to because of the following informalities:
  
Claim 21 is directed to a computer-readable medium. Claim 21 should therefore depend on claim 19, not on the method of claim 18.

Appropriate correction is required.

Specification
The specification is objected to as failing to provide proper antecedent basis for the claimed subject matter.  See 37 CFR 1.75(d)(1) and MPEP § 608.01(o).  Correction of the following is required:
 
Claim 12 recites limitations related to “sending a posterior vector to a word hypothesis decoder”. The examiner could not find any relevant section in the specification that describes the claimed feature. 

Claim 15 recites “the depth processing block comprises a plurality of maxout DNN processing blocks”. The examiner could not find a relevant section in the specification that discloses the claimed feature. Does the term “maxout” refer to “softmax” functions of a neural network illustrated in Fig. 2A? 
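For reference, the two terms as they are commonly used in the neural network literature can be contrasted with a minimal sketch. This is not taken from the specification or from the cited references; the function names `softmax` and `maxout` and the group size `k` are illustrative assumptions. Softmax normalizes a vector of scores into a probability distribution, whereas a maxout unit keeps the largest value within each group of `k` linear activations.

```python
import math


def softmax(scores):
    """Normalize raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def maxout(activations, k):
    """Return the maximum of each consecutive group of k activations."""
    if k <= 0 or len(activations) % k != 0:
        raise ValueError("len(activations) must be a positive multiple of k")
    return [max(activations[i:i + k]) for i in range(0, len(activations), k)]


probs = softmax([1.0, 2.0, 3.0])            # output keeps the input length
pooled = maxout([1.0, 3.0, -2.0, 0.5], 2)   # output length is len / k
```

Under these common definitions the two operations differ in kind: softmax is typically applied at an output layer, while maxout serves as a hidden-layer activation.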

CLAIM INTERPRETATION

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.


The claims 1-6 in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
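The three-prong test and the two rebuttable presumptions described above can be summarized as a small decision function. This is only an illustrative sketch: the `Limitation` record, its field names, and the example limitations are hypothetical, and each boolean stands for a legal judgment that must be made by reading the claim in light of the specification.

```python
from dataclasses import dataclass


@dataclass
class Limitation:
    text: str
    uses_means_or_step: bool        # literal "means" or "step"
    uses_generic_placeholder: bool  # nonce term substituting for "means"
    has_functional_language: bool   # prong (B): modified by functional language
    has_sufficient_structure: bool  # prong (C) fails when this is True


def presumed_112f(lim: Limitation) -> bool:
    """Starting presumption, based only on the word "means" or "step"."""
    return lim.uses_means_or_step


def invokes_112f(lim: Limitation) -> bool:
    """Apply prongs (A), (B), and (C); all three must be met."""
    prong_a = lim.uses_means_or_step or lim.uses_generic_placeholder
    prong_b = lim.has_functional_language
    prong_c = not lim.has_sufficient_structure
    return prong_a and prong_b and prong_c


# Hypothetical limitations, for illustration only.
means_claim = Limitation("means for fastening", True, False, True, False)
module_claim = Limitation("a module configured to fasten", False, True, True, False)
bolt_claim = Limitation("a steel bolt that fastens", False, False, True, True)
```

In this sketch `module_claim` starts with the presumption against 35 U.S.C. 112(f) treatment, yet `invokes_112f` returns `True` for it, which mirrors how the presumption is rebutted when a generic placeholder is coupled with function and no sufficient structure is recited.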

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “an input layer”, “a plurality of time layers”, “a depth processing block” and “an output layer” in claims 1-6.


If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-11 and 13-21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to mental processes without significantly more. Claims 1-11 and 13-21 recite “an input layer”, “a plurality of time layers”, “a depth processing block” and “an output layer” for processing input data and generating output data (the claimed “classified posterior vector”). In light of the specification, the claimed “trained machine learning model” is a set of equations, and the input data and output data are just numbers. The steps defined by the depth processing block are calculations for converting input data into a different set of data (“a classified posterior vector”). The 

Examiner Remarks
After studying the disclosure (specification and drawings), the examiner understands that the instant patent application was drafted based on a published research paper (“Improving layer trajectory LSTM with future context frames”, IEEE, ICASSP 2019). This research paper was filed as a provisional application (62/834,622) on 04/16/2019 and was also submitted in the IDS filed 05/24/2019 (item #34).

Two of the inventors’ previously published papers were submitted in the IDS filed on 05/24/2019 (Li, item #14, and Das, item #2). The examiner conducted an extensive search and believes that a combination of these two references (Li and Das, co-authored by the instant inventors) could meet the broadly recited limitations.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1-11 and 13-21 are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (“Exploring multidimensional LSTMs for large vocabulary ASR”, 2016, applicant submitted IDS, referred as Li) in view of Das et al. (“Advancing connectionist temporal classification with attention modeling”, March, 2018, applicant submitted IDS, referred as Das). 

Regarding claims 1, 7 and 19, Li discloses a system, a method and a computer readable medium (Section 5, a computer implemented neural network-based speech recognition system by using T-LSTM, F-LSTM and TF-LSTM neural network models) comprising:
 receiving an input signal as a series of frames representing at least one of handwriting data, speech data, audio data, and textual data (Section 5, speech recognition using the neural network models); 
Section 5, speech recognition experiments using different types of neural network models) comprising: 
a plurality of time layers, each time layer comprising a uni-directional recurrent neural network processing block (Section 2, section 3, Fig. 2, in light of specification ([0006]), the claimed “a uni-directional recurrent neural network” refers to LSTM neural networks, which is a type of recurrent neural network), 
a depth processing block that scans hidden states of the recurrent neural network processing block of each time layer (Section 3, scan both time and frequency axes jointly, Fig. 2 shows hidden states of time and frequency axes), and 
an output layer that outputs a final classification (Section 3 and section 4); and 
receiving from the trained machine learning model a final classification comprising a classified posterior vector of the input signal (Sections 2 and 3; in light of the specification [0025], [0027], the claimed “classification” refers to the output from the T-LSTM).

Li discloses using time LSTM (T-LSTM), frequency LSTM (F-LSTM) and time-frequency LSTM (TF-LSTM) for speech recognition. Li does not disclose “wherein the depth processing block is associated with a first frame and receives context frame information of a sequence of one or more future frames relative to the first frame”.

Das discloses “wherein the depth processing block is associated with a first frame and receives context frame information of a sequence of one or more future frames relative to the first frame” (Das, Fig. 1, section 2 and section 3, LSTM network, CTC attention, with context vectors Cu-1, Cu, Cu+1).

Both Li and Das are in the area of using neural networks for speech recognition. It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Li’s teaching with Das’s teaching to receive context frame information of a sequence of one or more future frames relative to the first frame. One having ordinary skill in the art would have been motivated to make such a modification to reduce the error rate (Das, Abstract).
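The limitation at issue, a depth processing block that scans the hidden states of the time layers at a first frame while also receiving context from one or more future frames, can be pictured with a minimal sketch. This is not the method of Li, Das, or the applicant: the function names, the averaging over the lookahead window, and the untrained `tanh` update are illustrative assumptions standing in for a trained LSTM or gated cell.

```python
import math


def mean_vectors(vectors):
    """Element-wise mean of equally sized vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def depth_scan_with_lookahead(hidden, t, lookahead):
    """Scan the time layers at frame t, bottom to top.

    hidden[l][f] is the hidden-state vector of time layer l at frame f.
    Each layer contributes the mean of its states over frames
    t .. t + lookahead, clipped at the end of the utterance.
    """
    num_frames = len(hidden[0])
    last = min(t + lookahead, num_frames - 1)
    state = [0.0] * len(hidden[0][0])
    for layer in hidden:  # recurrence across depth, not across time
        context = mean_vectors(layer[t:last + 1])
        # Placeholder update with no trained weights (assumption).
        state = [math.tanh(s + c) for s, c in zip(state, context)]
    return state


# One time layer, two frames, one-dimensional hidden states.
hidden = [[[0.0], [1.0]]]
without_future = depth_scan_with_lookahead(hidden, t=0, lookahead=0)
with_future = depth_scan_with_lookahead(hidden, t=0, lookahead=1)
```

With `lookahead=0` the scan at frame 0 sees only the current frame; with `lookahead=1` the state of the next frame also influences the result, which is the kind of future context the limitation recites.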

Regarding claims 2, 8 and 20, the combined teaching of Li in view of Das further discloses the depth processing block receives the context frame information from an output of a time layer processing block of the future frame (Li, Fig. 2; Das, section 3.3 and Fig. 1, note Cu+1 is a future frame).

Regarding claims 3, 9 and 21, the combined teaching of Li in view of Das further discloses the depth processing block receives the context frame information from another depth processing block of the future frame (Li, Fig. 2; Das, section 3.3, equation 20, and Fig. 1, note Cu+1 is a future frame).

Regarding claims 4 and 10, the combined teaching of Li in view of Das further discloses the context frame information is added to an input of the depth processing block (Li, Fig. 2; Das, section 3.3, equations 12 and 16, and Fig. 1, note Cu+1 is a future frame).

Regarding claim 5, the combined teaching of Li in view of Das further discloses the depth learning block is associated with a deep learning model (Li, Fig. 1 shows recurrent connections and Fig. 2 shows multi-layer LSTM networks, which are deep learning models of a type further listed in claim 6 below).

Regarding claim 6, the combined teaching of Li in view of Das further discloses the deep learning model is associated with at least one of: (i) a deep neural network, (ii) a deep belief network, (iii) a recurrent neural network, and (iv) a convolutional neural network (Li, Fig. 1 and Fig. 2, TF-LSTM and T-LSTM are types of recurrent neural network; the reference only needs to teach ONE alternative recited using “at least one of”).

Regarding claim 11, the combined teaching of Li in view of Das further discloses the input signal is speech data and the output is a senone posterior vector (Das, section 2, posterior probability distribution p(y|x); note that the term “senone” just refers to an output from a neural network).

Li, Time LSTM in Fig. 2, Das, Fig. 1). 

Regarding claim 14, the combined teaching of Li in view of Das further discloses the depth processing block comprises a plurality of gated DNN processing blocks, one corresponding to each time layer (Li, section 2, Fig. 1 shows a cell in LSTM, input gate, output gate and forget gate, Fig. 2 shows many layers which is a deep neural network (DNN)).

Regarding claim 15, the combined teaching of Li in view of Das further discloses the depth processing block comprises a plurality of maxout DNN processing blocks, one corresponding to each time layer (Das, Fig. 1, softmax element at each time layer).

Regarding claim 16, the combined teaching of Li in view of Das further discloses an attention layer between the depth processing block and the output layer (Das, Fig. 1, attention layers between the LSTM and the output layer).

Regarding claim 17, the combined teaching of Li in view of Das further discloses during training of the machine learning model, arranging to make future context frame information available (Das, Section 3 and section 4, training using context information).

Li, section 3, overlapped chunks)

Allowable Subject Matter
Claim 12 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The examiner discovered several relevant prior art references that are related to one or more concepts disclosed by the instant application. These references are included in the attached PTO-892 form for completeness of the record.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jialong He, whose telephone number is (571) 270-5359.  The examiner can normally be reached on Monday – Friday, 8:00AM – 4:30PM, EST.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Pierre Desir, can be reached on (571) 272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


/JIALONG HE/
Primary Examiner, Art Unit 2659