DETAILED ACTION

Response to Arguments
Applicant argues that Li does not teach the limitations added by amendment. A new ground of rejection, necessitated by the amendment, is presented in further view of Hori. The arguments regarding Claims 2 and 12 are persuasive. The rejection under 35 U.S.C. 112 is withdrawn in view of the amendments.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 2, 3, 12, and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Li et al (US 2020/0066271) in view of Hori (US 2018/0330718).
See the following citations to Li:

Hori teaches:
performing, by the at least one processor, an attention model training based on a new sequences of hidden states generated based on the sequence of hidden states (attention weights computed for trainable network, [0042]); and
decoding, by a decoder implemented by the at least one processor, the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training based on the same input data (decoding network as in [0042]; attention training [0042] and CTC training [0048]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Li’s speech recognition system with Hori’s speech recognition system, which includes a CTC module, to improve recognition accuracy (see Hori [0004]).

3, 13. The Seq2Seq speech recognition training method of claim 1, further comprising: performing the CTC model training based on a CTC loss function (optimize a connectionist temporal classification (CTC) objective function until convergence, [0065]). 
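For illustration only, a CTC loss function of the kind recited in claims 3 and 13 can be sketched with the standard forward recursion over a blank-extended label sequence. The sketch below is not taken from Li, Hori, or the application; the function name `ctc_loss`, the argument layout, and the use of index 0 for the blank symbol are assumptions made for the example.

```python
import math

def ctc_loss(probs, labels, blank=0):
    """Negative log-likelihood of `labels` under CTC.

    probs  : list of T per-frame probability distributions (hypothetical layout)
    labels : target label ids without blanks
    Assumes at least one valid alignment exists (len(probs) >= len(labels)).
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    size = len(ext)

    # alpha[s] = total probability of all partial paths ending at ext[s]
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new_alpha = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            # Skip over a blank only between two different non-blank labels
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[t][ext[s]]
        alpha = new_alpha

    # Valid paths end on the last label or on the trailing blank
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return -math.log(likelihood)
```

With two frames, a uniform distribution over {blank, a}, and the target "a", three alignments are valid (a-blank, blank-a, a-a), so the likelihood is 0.75 and the loss is -ln(0.75).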


Claims 4, 5, 14, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Li et al (US 2020/0066271) in view of Hori (US 2018/0330718) and further in view of Kim (JOINT CTC-ATTENTION BASED END-TO-END SPEECH RECOGNITION USING MULTI-TASK LEARNING, 2017).

See the citations to Kim:
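4, 14. The Seq2Seq speech recognition training method of claim 1, further comprising: performing the attention model training based on a cross entropy loss function (see Kim Eq. 8).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Li’s speech recognition system with Hori’s speech recognition system, which includes a CTC module, to improve recognition accuracy (see Hori [0004]), and with Kim’s joint CTC-attention model to reduce misalignment errors (see Kim, Section 1, final ¶).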
5, 15. The Seq2Seq speech recognition training method of claim 1, wherein the independently performing the CTC model training and the attention model training comprises: performing the CTC model training to minimize CTC loss during a first time period (see Kim, Eq. 3).
A person of ordinary skill in the art would have had good reason to pursue the known option of performing the attention model training to minimize cross entropy loss during a second period different from the first period. It would require no more than "ordinary skill and common sense" to train two separate models at different times to account for processing and memory constraints.
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Li’s speech recognition system with Hori’s speech recognition system, which includes a CTC module, to improve recognition accuracy (see Hori [0004]), and with Kim’s joint CTC-attention model to reduce misalignment errors (see Kim, Section 1, final ¶).
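For illustration only, the arrangement discussed for claims 5 and 15 (one objective minimized during a first period, a cross entropy objective minimized during a different, second period) can be sketched as follows. The sketch is not drawn from Kim, Hori, Li, or the application; the function names `cross_entropy` and `active_objective` and the step-count schedule are hypothetical.

```python
import math

def cross_entropy(step_probs, targets):
    """Cross entropy of the target labels under per-step decoder distributions.

    step_probs : list of probability distributions, one per output step
    targets    : list of target label ids, one per output step
    """
    return -sum(math.log(dist[y]) for dist, y in zip(step_probs, targets))

def active_objective(step, ctc_steps):
    """Hypothetical schedule: the CTC loss is minimized for the first
    `ctc_steps` updates, and the cross entropy loss for every later update."""
    return "ctc" if step < ctc_steps else "cross_entropy"

# Two non-overlapping training periods over four update steps
schedule = [active_objective(step, ctc_steps=2) for step in range(4)]
```

Here `schedule` lists which loss would drive each update; the two periods do not overlap, which is the only property the sketch is meant to show.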
Claims 7, 8, 9, 17, 18, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Li et al (US 2020/0066271) in view of Hori (US 2018/0330718), in view of Kim (JOINT CTC-ATTENTION BASED END-TO-END SPEECH RECOGNITION USING MULTI-TASK LEARNING, 2017), and further in view of Senior et al (US 2017/0011738).

See the citations to Kim:

See the citations to Senior:
generating a context information by calculating a soft alignment over all steps of the additionally transformed sequence of hidden states based on the query (soft alignment [0009]).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Li’s speech recognition system with Hori’s speech recognition system, which includes a CTC module, to improve recognition accuracy (see Hori [0004]), with Kim’s joint CTC-attention model to reduce misalignment errors (see Kim, Section 1, final ¶), and with Senior’s soft alignment to improve the performance of small models.

See the citations to Li:
8, 18. The Seq2Seq speech recognition training method of claim 7, wherein the context information is a summary of speech signals encoded in hidden layers of the encoder (The attention-based biasing 

9, 19. The Seq2Seq speech recognition training method of claim 7, wherein the context information is generated using scalar energy computed based on content similarity between the additionally transformed sequence of hidden states at each time step and the query information (soft alignment [0009]).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Li’s speech recognition system with Hori’s speech recognition system, which includes a CTC module, to improve recognition accuracy (see Hori [0004]), with Kim’s joint CTC-attention model to reduce misalignment errors (see Kim, Section 1, final ¶), and with Senior’s soft alignment to improve the performance of small models.
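
For illustration only, the limitation of claims 9 and 19 (a scalar energy per time step computed from content similarity with a query, followed by a soft alignment over all steps) can be sketched with a generic dot-product attention. This is one common way to compute such a quantity and is not asserted to be the method of Senior, Li, or the application; the function name `soft_alignment_context` and the dot-product energy are assumptions made for the example.

```python
import math

def soft_alignment_context(hidden_states, query):
    """Return (weights, context) for a sequence of hidden-state vectors.

    hidden_states : list of T vectors, each a list of floats
    query         : one vector with the same dimension
    """
    # Scalar energy per time step: content similarity as a dot product
    energies = [sum(h_i * q_i for h_i, q_i in zip(h, query))
                for h in hidden_states]

    # Soft alignment: softmax over all time steps (max-shifted for stability)
    peak = max(energies)
    exps = [math.exp(e - peak) for e in energies]
    norm = sum(exps)
    weights = [x / norm for x in exps]

    # Context information: alignment-weighted sum of the hidden states
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context
```

With hidden states [1, 0] and [0, 1] and an all-zero query, both energies are zero, so the alignment weights are equal and the context vector is the average of the two hidden states.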
Allowable Subject Matter
Claims 2, 6, 10, 12, 16 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.


Conclusion

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached at 571-272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/MATTHEW H BAKER/               Primary Examiner, Art Unit 2659