DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Specification
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.


Drawings
The applicant’s submitted drawings appear to be acceptable for examination purposes. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the drawings.


Information Disclosure Statement
The listing of references in the specification is not a proper information disclosure statement.  37 CFR 1.98(b) requires a list of all patents, publications, or other information submitted for consideration by the Office, and MPEP § 609.04(a) states, "the list may not be incorporated into the specification but must be submitted in a separate paper."  Therefore, unless the references have been cited by the examiner on form PTO-892, they have not been considered.

As required by M.P.E.P. 609(c), the applicant's submission of the Information Disclosure Statements, dated 3 April 2020 and 21 July 2021, is acknowledged by the examiner, and the cited references have been considered in the examination of the claims now pending.  As required by M.P.E.P. 609 C(2), copies of the PTOL-1449 forms, initialed and dated by the examiner, are attached to the instant Office action.


Claim Objections
Claim 14 is objected to because of the following informalities:  “adjust to shift” appears as though it should just be “shift” or “adjust by shifting” or similar.  Appropriate correction is required.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 17-22 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  The claims do not fall within at least one of the four categories of patent eligible subject matter.

Specifically, according to the description given in the specification, paragraph 0141, the broadest reasonable interpretation of “computer readable storage medium” covers transitory propagating signals, which are non-statutory. To overcome this rejection, applicant should insert --non-transitory-- before “computer readable storage medium”. Such an amendment is not considered new matter. See the "Subject Matter Eligibility of Computer Readable Media" memo dated January 26, 2010 (OG Cite: 1351 OG 212; OG Date: 23 Feb 2010).


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 7-9 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

The term “approximately” in claim 7 is a relative term which renders the claim indefinite. The term “approximately” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.

Claim 8 recites the limitation "the amount to shift" in line 1.  There is insufficient antecedent basis for this limitation in the claim.

Claim 9 recites the limitation "the amount to shift" in line 1.  There is insufficient antecedent basis for this limitation in the claim.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-22 are rejected under 35 U.S.C. 103 as being unpatentable over Kurata et al. (Improved Knowledge Distillation from Bi-Directional to Uni-Directional LSTM CTC for End-to-End Speech Recognition, Dec 2018, pgs. 411-417) in view of Hannun (US 2016/0171974).

As per claim 1, Kurata teaches a computer-implemented method for training a model, comprising: obtaining a training sample including an input sequence of observations and a target sequence of symbols having length different from the input sequence of observations [training a speech recognition model includes feeding feature sequences to the model and comparing resultant outputs to a target symbol sequence (pgs. 411-412, sections 1-2.2; etc.)]; feeding the input sequence of observations into the model to obtain a sequence of predictions [training a speech recognition model includes feeding feature sequences to the model and comparing resultant outputs to a target symbol sequence (pgs. 412-413, sections 2-3; etc.)]; and updating the model based on a loss using a sequence of predictions and the target sequence of the symbols [training a speech recognition model includes feeding feature sequences to the model and comparing resultant outputs to a target symbol sequence (pgs. 412-413, sections 2-3; etc.) to update the model via back-propagation using a CTC loss function (pg. 414, section 3, etc.)].
While Kurata also teaches that the length of the outputs and inputs may differ by an alignment in time, and searching multiple time frames for the appropriate output for comparison (see, e.g., Kurata: sections 2.1 and 3, and fig. 3), it does not explicitly teach shifting the sequence of predictions by an amount with respect to the input sequence of observations; and updating the model based on a loss using a shifted sequence of predictions and the target sequence of the symbols.
Hannun teaches shifting the sequence of predictions by an amount with respect to the input sequence of observations [time shifts between inputs or outputs may be fixed by shifting the inputs or outputs to align them (paras. 0053-54)]; and updating the model based on a loss using a shifted sequence of predictions and the target sequence of the symbols [time shifts between inputs or outputs may be fixed by shifting the inputs or outputs to align them, where the shifted outputs may be used with a CTC loss function to train and update the model (paras. 0053-54, claim 1, etc.)].
Kurata and Hannun are analogous art, as they are within the same field of endeavor, namely training and applying a ML model for speech recognition.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to use the shifting of the output sequences of the model to a chosen alignment, as taught by Hannun, with (or in place of) the comparing of the output sequences of the uni-directional model to output sequences at multiple preceding time indices of the teacher model in the system taught by Kurata.
Hannun provides motivation, as [the issue of time shifts between models may be addressed by shifting either or both of the inputs and the outputs of the models (paras. 0053-54), which would address the time shifts in the system taught by Kurata].
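For illustration only, and not as a characterization of either reference or of the claimed method, the general arrangement discussed above (shift the per-frame predictions, then compute a CTC loss against the target symbols) may be sketched as follows. The names `shift_predictions` and `ctc_loss`, the blank index of 0, and the choice to shift by discarding leading frames are all assumptions made for this sketch; the loss is the standard CTC forward algorithm in linear space, and the parameter update by back-propagation is not shown.

```python
import math

BLANK = 0  # assumed index of the CTC blank symbol


def shift_predictions(preds, amount):
    """Discard the first `amount` frames (one hypothetical way to shift)."""
    return preds[amount:]


def ctc_loss(preds, target):
    """Negative log-likelihood of `target` given per-frame distributions
    `preds`, computed with the standard CTC forward algorithm."""
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank]
    ext = [BLANK]
    for sym in target:
        ext += [sym, BLANK]

    # Initialise with the first frame: start in the leading blank or first symbol.
    alpha = [0.0] * len(ext)
    alpha[0] = preds[0][ext[0]]
    if len(ext) > 1:
        alpha[1] = preds[0][ext[1]]

    for frame in preds[1:]:
        new = [0.0] * len(ext)
        for i, sym in enumerate(ext):
            total = alpha[i]                      # stay on the same position
            if i >= 1:
                total += alpha[i - 1]             # advance by one position
            if i >= 2 and sym != BLANK and sym != ext[i - 2]:
                total += alpha[i - 2]             # skip over a blank
            new[i] = total * frame[sym]
        alpha = new

    # Valid paths end in the last symbol or the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if len(ext) > 1 else 0.0)
    return math.inf if likelihood == 0.0 else -math.log(likelihood)


# Three frames over the symbols {blank, 1}, each frame uniform.
uniform = [[0.5, 0.5]] * 3
shifted = shift_predictions(uniform, 1)   # two frames remain
loss = ctc_loss(shifted, [1])
```

With two uniform frames, three of the four possible paths collapse to the single target symbol, so the likelihood in this toy case is 0.75.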

As per claim 2, Kurata/Hannun teaches wherein the sequence of predictions is shifted forward with respect to the input sequence of observations to generate the shifted sequence of predictions and wherein the model is unidirectional [time shifts between inputs or outputs may be fixed by shifting the inputs or outputs to align them (Hannun: paras. 0053-54) for training the unidirectional model (Kurata: pgs. 412-413, sections 2-3 and fig. 3; etc.)].

As per claim 3, Kurata/Hannun teaches wherein the model is a recurrent neural network based model [the models are long short-term memory (LSTM) models (i.e., RNN) (Kurata: pg. 411, abstract, etc.); and an RNN model is used (Hannun: abstract, etc.)].

As per claim 4, Kurata/Hannun teaches wherein the loss is CTC (Connectionist Temporal Classification) loss [a CTC loss function is used (Kurata: pgs. 411-414, etc.; Hannun: paras. 0046, 0054, etc.)].

As per claim 5, Kurata/Hannun teaches wherein shifting the sequence of predictions comprises: adjusting so that the lengths of the shifted sequence of predictions and the input sequence of observations are the same [time shifts between inputs or outputs may be fixed by shifting the inputs or outputs to align them (Hannun: paras. 0053-54, etc.)].
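For illustration only, one way a shifted sequence could be adjusted to keep the same length as the input sequence is to pad the vacated end after shifting. The function name `shift_and_pad`, the direction of the shift (toward earlier time indices), and the padding value are assumptions of this sketch and are not drawn from the references or the claims.

```python
def shift_and_pad(preds, amount, pad_frame):
    """Shift `preds` toward earlier time indices by `amount` frames and pad
    the tail with `pad_frame` so the length matches the input sequence."""
    if not 0 <= amount <= len(preds):
        raise ValueError("amount must be between 0 and len(preds)")
    return preds[amount:] + [pad_frame] * amount


# Placeholder labels stand in for per-frame prediction vectors.
frames = ["p0", "p1", "p2", "p3"]
adjusted = shift_and_pad(frames, 1, "pad")
```

Here `adjusted` holds the frames `p1`, `p2`, `p3` followed by one padding frame, so its length equals that of `frames`.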

As per claim 6, Kurata/Hannun teaches wherein shifting the sequence of predictions and updating the model using the shifted sequence of predictions are performed at a predetermined rate [time shifts between inputs or outputs may be fixed by shifting the inputs or outputs to align them (Hannun: paras. 0053-54, etc.) for W-1 time indices (the rate of shift) to find the appropriate match (Kurata: pgs. 412-414, section 3 and fig. 3; etc.); with a specified learning rate (Kurata: pg. 414, section 4; Hannun: para. 0046; etc.)].

As per claim 7, Kurata/Hannun teaches wherein the predetermined rate ranges from approximately 5 to 40% [time shifts between inputs or outputs may be fixed by shifting the inputs or outputs to align them (Hannun: paras. 0053-54, etc.) for W-1 time indices (the rate of shift) to find the appropriate match (Kurata: pgs. 412-414, section 3 and fig. 3; etc.); with a specified learning rate (Kurata: pg. 414, section 4; Hannun: para. 0046; etc.)].
While Kurata/Hannun do not specifically teach the rate of approximately 5 to 40%, it would have been obvious to one of ordinary skill in the art to extend the rate taught by Kurata/Hannun to include this rate, as it has been held that where the general conditions of a claim are disclosed in the prior art, discovering the optimum or working ranges involves only routine skill in the art. In re Aller, 105 USPQ 233.
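As a hypothetical illustration only, a "predetermined rate" could be read as the fraction of training updates in which the shift is applied. That reading, the helper name `should_shift`, and the value 0.2 are assumptions of this sketch; they are not taken from the claims, the specification, or the references.

```python
import random


def should_shift(rng, rate):
    """Return True for roughly a `rate` fraction of calls."""
    return rng.random() < rate


rng = random.Random(0)   # seeded so the sketch is repeatable
rate = 0.2               # hypothetical value between 5% and 40%
decisions = [should_shift(rng, rate) for _ in range(10_000)]
observed = sum(decisions) / len(decisions)
```

Over many updates, `observed` should fall close to the configured `rate`.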

As per claim 8, Kurata/Hannun teaches wherein the amount to shift is fixed [time shifts between inputs or outputs may be fixed by shifting the inputs or outputs to align them (Hannun: paras. 0053-54, etc.) for W-1 time indices (the rate of shift) to find the appropriate match (Kurata: pgs. 412-414, section 3 and fig. 3; etc.)].

As per claim 9, Kurata/Hannun teaches wherein the amount to shift is determined probabilistically within a predetermined range [time shifts between inputs or outputs may be fixed by shifting the inputs or outputs to align them (Hannun: paras. 0053-54, etc.) for W-1 time indices (the rate of shift) to find the appropriate match (Kurata: pgs. 412-414, section 3 and fig. 3; etc.); with a specified learning rate (Kurata: pg. 414, section 4; Hannun: para. 0046; etc.)].
While Kurata/Hannun do not specifically teach determining the amount to shift probabilistically within a predetermined range, it would have been obvious to one of ordinary skill in the art to extend the shifting taught by Kurata/Hannun to include this determination, as it has been held that where the general conditions of a claim are disclosed in the prior art, discovering the optimum or working ranges involves only routine skill in the art. In re Aller, 105 USPQ 233.
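For illustration only, a shift amount determined probabilistically within a predetermined range could be drawn at random for each training sample. The uniform distribution, the helper name `sample_shift`, and the range of 1 to 3 frames are assumptions of this sketch and do not reflect any disclosure of record.

```python
import random


def sample_shift(rng, low, high):
    """Draw an integer shift amount uniformly from [low, high], inclusive."""
    return rng.randint(low, high)


rng = random.Random(0)   # seeded so the sketch is repeatable
amounts = [sample_shift(rng, 1, 3) for _ in range(1000)]
```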

As per claim 10, Kurata/Hannun teaches wherein the model is a neural network based model having a plurality of parameters [the models are long short-term memory (LSTM) models (i.e., RNN) (Kurata: pg. 411, abstract, etc.); and an RNN model is used (Hannun: abstract, etc.)], wherein feeding the input sequence comprises conducting a forward-propagation through the neural network based model [the model is unidirectional (forward) (Kurata: pgs. 412-413, sections 2-3 and fig. 3; etc.)], and wherein updating the model comprises performing back-propagation through the neural network based model to update the plurality of parameters [updating the model via back-propagation using a CTC loss function (Kurata: pgs. 412-414, sections 2-3; Hannun: paras. 0046, 0054, etc.)].

As per claim 11, Kurata/Hannun teaches wherein the model comprises an end-to-end speech recognition model [an end to end automatic speech recognition (ASR) model (Kurata: abstract; Hannun: abstract and paras. 0098-101; etc.)], each observation in the input sequence of the training sample represents an acoustic feature and each symbol in the target sequence of the training sample represents a phone, a context dependent phone, a character, a word-piece, or a word [the inputs include acoustic features (Kurata: pg. 411, abstract-section 1; Hannun: paras. 0007, 0036; etc.) and the outputs include phones, characters, words, or combinations (Kurata: pg. 411, section 1; Hannun: paras. 0036, 0055-57; etc.)].

As per claim 12, see the rejection of claim 1, above, wherein Kurata/Hannun also teaches a computer system for training the model, by executing program instructions, the computer system comprising: a memory storing the program instructions; and processing circuitry in communication with the memory for executing the program instructions configured to perform the method steps [the system may be implemented by a processor executing instructions from a storage medium (Hannun: paras. 0024, 0122-123, etc.)].

As per claim 13, see the rejection of claim 2, above.

As per claim 14, see the rejection of claim 5, above.

As per claim 15, see the rejection of claim 6, above.

As per claim 16, see the rejection of claim 3, above.

As per claim 17, see the rejection of claim 1, above, wherein Kurata/Hannun also teaches a computer program product for training the model, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to perform the method steps [the system may be implemented by a processor executing instructions from a storage medium (Hannun: paras. 0024, 0122-123, etc.)].

As per claim 18, see the rejection of claim 2, above.

As per claim 19, see the rejection of claim 5, above.

As per claim 20, see the rejection of claim 6, above.

As per claim 21, see the rejection of claim 3, above.

As per claim 22, see the rejection of claim 17, above, wherein Kurata/Hannun also teaches feeding an input into the model to obtain an output [training a speech recognition model includes feeding feature sequences to the model and comparing resultant outputs to a target symbol sequence (Kurata: pgs. 411-412, sections 1-2.2; Hannun: paras. 0036-38; etc.)].


Conclusion
The following is a summary of the treatment and status of all claims in the application as recommended by M.P.E.P. 707.07(i): claims 1-22 are rejected.

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Howard (US 2019/0074028) – discloses shifting audio data for alignment and utilizing a CTC loss function.
Abdel-Hamid et al. (Convolutional Neural Networks for Speech Recognition, July 2014, pgs. 1533-1545) – discloses using a hybrid model including an HMM for time shifts in speech recognition.

The examiner requests, in response to this Office action, that support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line number(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.

When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections.  See 37 CFR 1.111(c).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GEORGE GIROUX whose telephone number is (571)272-9769. The examiner can normally be reached M-F 10am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas, can be reached on 571-272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/GEORGE GIROUX/Primary Examiner, Art Unit 2128