DETAILED ACTION
Response to Arguments
Applicant's arguments filed 06/13/22 have been fully considered but they are not persuasive.
Applicant argues that the limitation "recognizing the input speech by recognizing one or more subwords of a portion subsequent to the at least one portion of the input speech based on the at least one second subword," as recited in independent claim 1, is not taught by Kanda (see Remarks, p. 11, final paragraph).
Applicant and examiner agree that the posterior probability P(s|X) is obtained by applying a mapping function as disclosed in [0060]. Examiner interprets the teaching in [0060], where different label sequences are mapped to the same intermediate product, as reading on the limitation. In other words, the symbols A, B, and C can be interpreted as subwords that are combined into a subword AB and subsequently into a second subword ABC.


Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1, 2, 4-16, 18-24 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Kanda et al (US20190139540 – see the PCT priority date).
Kanda teaches:
1, 14. A processor-implemented speech recognition method comprising:
extracting a speech feature from an input speech to be recognized ([0068] and 304 Fig. 5);
estimating a first sequence of first subwords corresponding to at least one portion of the input speech based on the extracted speech feature ([0075]);
converting the first sequence to a second sequence of at least one second subword by combining at least two of the first subwords to be the at least one second subword ([0073] – also see [0060], where subwords A, B, and C are combined to make AB and ABC); and
recognizing the input speech by recognizing one or more subwords of a portion subsequent to the at least one portion of the input speech based on the at least one second subword ([0068], also see the posterior probability of the combined word sequence as in [0076] -- the symbols A, B, and C can be interpreted as subwords which are combined to a subword AB and subsequently a second subword ABC [0060]).
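For illustration only, the combining of subwords A, B, and C into AB and then ABC, as interpreted above, can be sketched as follows. The function name, the dictionary contents, and the left-to-right merge rule are assumptions made for this sketch; they are not taken from Kanda or from the application.

```python
def combine_subwords(first_sequence, subword_dictionary):
    """Merge a subword into its predecessor whenever the concatenation
    is itself an entry in the (hypothetical) subword dictionary."""
    second_sequence = []
    for subword in first_sequence:
        if second_sequence and second_sequence[-1] + subword in subword_dictionary:
            # e.g. "A" + "B" -> "AB", then "AB" + "C" -> "ABC"
            second_sequence[-1] = second_sequence[-1] + subword
        else:
            second_sequence.append(subword)
    return second_sequence


# Hypothetical dictionary containing the symbols discussed above.
dictionary = {"A", "B", "C", "AB", "ABC"}
print(combine_subwords(["A", "B", "C"], dictionary))  # ['ABC']
```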

2, 15. The method of claim 1, wherein the estimating of the first sequence comprises:
estimating each of the first subwords corresponding to the at least one portion of the input speech using an end-to-end encoder-decoder implementing one or more neural networks, wherein an output layer of the end-to-end encoder-decoder includes nodes corresponding to subwords in a subword dictionary ([0036]).


4, 18. The method of claim 1, wherein the converting of the first sequence to the second sequence comprises:
generating, based on the first subwords, a subword corresponding to a word recognizable to an end-to-end encoder-decoder as the at least one second subword (the occurrence probability model as in [0036] will recognize a second subword based on a first).

5. The method of claim 4, wherein the recognizable word is a word used for training the end-to-end encoder-decoder (see Fig. 5).

6. The method of claim 1, wherein the converting of the first sequence to the second sequence comprises:
generating a subword in a subword dictionary as the at least one second subword (best hypothesis [0077]).

7. The method of claim 1, wherein the converting of the first sequence to the second sequence comprises:
in response to a sequence of the first subwords forming a word, generating a subword corresponding to the formed word as the at least one second subword (speech recognition as taught [0077]).

8, 18. The method of claim 1, wherein the converting of the first sequence to the second sequence comprises:
determining whether a formation of a word is completed by a lastly-generated first subword among the first subwords (word pronunciation dictionary interprets phoneme context [0072]);
in response to the formation of the word being completed, identifying, from a subword dictionary, a subword matching at least one combination of the first subwords as the at least one second subword (word posterior probabilities [0076]); and 
converting the first subwords to the identified at least one second subword (best word posterior probabilities hypothesis [0076]).

9. The method of claim 8, wherein the determining of whether the formation of word is completed by the lastly-generated first subword comprises:
determining whether the formation of the word is completed based on information included in the subword dictionary as to whether a spacing is present after the lastly-generated first subword (blank labels included, as in [0075]).
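For illustration only, the word-completion check recited in claims 8 and 9 can be sketched as follows. The trailing "_" marker used to record that a spacing follows a subword, the dictionary contents, and the function names are all assumptions made for this sketch; neither the application nor Kanda is represented as using this encoding.

```python
SPACING_MARK = "_"  # hypothetical marker: a spacing follows this subword


def word_completed(last_subword):
    """A word is treated as complete when the lastly-generated subword
    is recorded as being followed by a spacing."""
    return last_subword.endswith(SPACING_MARK)


def convert_on_word_completion(first_subwords, subword_dictionary):
    """If a word has just been completed and the combination of the first
    subwords matches a dictionary entry, return that single entry."""
    if not first_subwords or not word_completed(first_subwords[-1]):
        return list(first_subwords)
    combined = "".join(first_subwords)
    if combined in subword_dictionary:
        return [combined]
    return list(first_subwords)


# Hypothetical dictionary entries.
dictionary = {"sm", "art_", "smart_"}
print(convert_on_word_completion(["sm", "art_"], dictionary))  # ['smart_']
print(convert_on_word_completion(["sm"], dictionary))          # ['sm']
```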

10, 19. The method of claim 1, wherein the converting of the first sequence to the second sequence comprises:
generating a text from the first sequence using a text subword decoder ([0068]);
generating the second sequence of the at least one second subword by encoding the text using a text subword encoder ([0068]); and
in response to the first sequence and the second sequence differing from each other, converting the first sequence to the second sequence (highest score probability as a recognized speech text [0069]).
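For illustration only, the decode, re-encode, and compare steps recited in claim 10 can be sketched as follows. The greedy longest-match encoder, the dictionary contents, and the function names are assumptions made for this sketch; the claim does not specify how the text subword encoder segments the text.

```python
def decode_subwords(subwords):
    """Text subword decoder: join the subwords into plain text."""
    return "".join(subwords)


def encode_text(text, subword_dictionary):
    """Text subword encoder (assumed greedy longest-match segmentation)."""
    result = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in subword_dictionary:
                result.append(text[i:j])
                i = j
                break
        else:
            raise ValueError(f"no dictionary entry matches text at position {i}")
    return result


def reencode(first_sequence, subword_dictionary):
    """Return the second sequence and whether it differs from the first."""
    text = decode_subwords(first_sequence)
    second_sequence = encode_text(text, subword_dictionary)
    return second_sequence, second_sequence != list(first_sequence)


# Hypothetical dictionary entries.
dictionary = {"sm", "art", "smart"}
print(reencode(["sm", "art"], dictionary))  # (['smart'], True)
print(reencode(["smart"], dictionary))      # (['smart'], False)
```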

11, 20. The method of claim 10, further comprising:
estimating a sequence of subwords corresponding to at least one portion of the input speech at each of a plurality of points in time, wherein the estimating of the first sequence is performed at a current point in time among the plurality of points in time ([0075]); and
updating the current point in time by subtracting, from the current point in time, a value obtained by subtracting a length of the second sequence from a length of the first sequence (see Fig. 3 and [0057]).
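For illustration only, the time-point update recited in claim 11 reduces to the arithmetic below. The function name and the example values are assumptions made for this sketch.

```python
def update_time_point(current_time, first_sequence, second_sequence):
    """New time point = current time - (len(first) - len(second))."""
    return current_time - (len(first_sequence) - len(second_sequence))


# Two first subwords are replaced by one second subword, so the
# current point in time moves back by one step: 2 - (2 - 1) = 1.
print(update_time_point(2, ["sm", "art"], ["smart"]))  # 1
```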

12, 21. The method of claim 1, wherein the estimating of the first sequence comprises:
generating first sequence candidates corresponding to the at least one portion of the input speech, wherein the converting of the first sequence to the second sequence comprises:
generating second sequence candidates corresponding to the first sequence candidates (highest score probability as a recognized speech text [0069]);
generating recognition results corresponding to the second sequence candidates using a language model ([0067, 0072]); and
determining one of the second sequence candidates to be the second sequence based on the generated recognition results (highest score probability as a recognized speech text [0069]).
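For illustration only, the candidate selection recited in claim 12 can be sketched as follows. The score table stands in for a language model and its values are invented for this sketch; the function names are likewise hypothetical.

```python
def select_second_sequence(second_candidates, language_model_score):
    """Pick the candidate whose recognition result scores highest."""
    return max(second_candidates, key=language_model_score)


# Hypothetical log-probability scores standing in for a language model.
toy_scores = {"smart": -1.0, "sm art": -5.0}


def toy_language_model_score(candidate):
    return toy_scores[" ".join(candidate)]


candidates = [["sm", "art"], ["smart"]]
print(select_second_sequence(candidates, toy_language_model_score))  # ['smart']
```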

13. A non-transitory computer-readable storage medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform the method of claim 1 (see citations in claim 1 – computer readable medium taught as in claim 6 of the reference).


22. A processor-implemented speech recognition method comprising:
generating, at an output layer of a recurrent neural network (RNN), a first subword based on a feature extracted from a speech signal ([0075]);
generating, at the output layer of the RNN, a second subword based on the first subword (the occurrence probability model as in [0036] will recognize a second subword based on a first);
generating a third subword by combining the first and the second subwords (the occurrence probability model as in [0036] will recognize a second subword based on a first, and so on);
generating, at the output layer of the RNN, a fourth subword based on the third subword (the occurrence probability model as in [0036] will recognize any number of subwords);
and recognizing the speech signal based on a determined sequence of the third and fourth subwords subsequent to a sequence of the first and second subwords ([0077]).

23. The method of claim 22, wherein the generating of the fourth subword comprises restoring a state of a hidden layer in the RNN to a state before the first and the second subwords were generated, such that the generating of the fourth subword is not based on the generation of the first and the second subwords (see the sequence following the blank label which is independent of the previous sequences [0020]).
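For illustration only, the hidden-state restoration recited in claim 23 can be sketched with a toy recurrent cell. The cell, its update rule, and the numeric inputs are assumptions made for this sketch and do not represent the network of the application or of Kanda.

```python
import math


class ToyRecurrentCell:
    """Minimal stand-in for an RNN hidden layer with a scalar state."""

    def __init__(self):
        self.hidden = 0.0

    def step(self, value):
        self.hidden = math.tanh(0.5 * self.hidden + value)
        return self.hidden


def hidden_with_restore(context, first, second, third):
    """Process first and second subwords, restore the earlier state,
    then process the combined third subword from that restored state."""
    cell = ToyRecurrentCell()
    cell.step(context)
    saved_hidden = cell.hidden      # state before first/second subwords
    cell.step(first)
    cell.step(second)
    cell.hidden = saved_hidden      # restore the earlier state
    return cell.step(third)         # basis for the fourth subword


def hidden_without_first_and_second(context, third):
    """Reference path in which first and second were never generated."""
    cell = ToyRecurrentCell()
    cell.step(context)
    return cell.step(third)


# Both paths give the same state, so the fourth subword would not
# depend on the generation of the first and second subwords.
print(hidden_with_restore(0.3, 1.0, 2.0, 3.0)
      == hidden_without_first_and_second(0.3, 3.0))  # True
```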

24. The method of claim 22, wherein the generating of the third subword comprises combining the first and the second subwords in response to a word being formed by the sequence of the first and the second subwords (the occurrence probability model as in [0036] will recognize any number of subwords).

Allowable Subject Matter
Claims 3 and 17 are allowed.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Matthew H Baker whose telephone number is (571)270-1856. The examiner can normally be reached Monday-Friday 9-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached at (571) 272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MATTHEW H BAKER/Primary Examiner, Art Unit 2655