Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on June 19, 2020 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 45, 47-48, 50-54, and 56 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 7-13, 29, and 31 of U.S. Patent No. 10783880. Although the claims at issue are not identical, they are not patentably distinct from each other because simply removing inherent and/or unnecessary limitations/steps would be within the level of one of ordinary skill in the art. In re Karlson, 136 USPQ 184 (CCPA 1963). Also note Ex parte Rainu, 168 USPQ 375 (Bd. App. 1969). Omission of a reference element or step whose function is not needed would be obvious to one of ordinary skill in the art.


U.S. Publication No. 20200320987
U.S. Patent No. 10783880
45. A speech processing system comprising: an input for receiving an input utterance spoken by a user; a speech recognition system that recognizes the input utterance spoken by the user and that outputs a recognition result comprising a sequence of recognized words and sub-word units corresponding to the input utterance; an acoustic model store that stores acoustic speech models; a word alignment unit configured to receive the sequence of recognized words and sub-word units output by the speech recognition system and to align a sequence of said acoustic speech models corresponding to the received sequence of recognized words and sub-word units with a sequence of acoustic feature vectors representing the input utterance spoken by the user and to output an alignment result identifying a time alignment between the received sequence of recognized words and sub-word units and the sequence of acoustic feature vectors representing the input utterance spoken by the user.
1. A speech processing system comprising: an input for receiving an input utterance spoken by a user in response to a read prompt text; an acoustic model store that stores acoustic speech models; a read prompt data store that stores text data identifying the sequence of words in the read prompt; a data store that stores data defining a first network having a plurality of paths through the first network, each path representing a different possible utterance that a user might say in response to the read prompt text, the different paths allowing for: i) the user to skip part of the read prompt text; ii) the user to repeat part or all of the read prompt text; and iii) the user to insert speech sounds between words in the read prompt text; and a word alignment unit configured to align different sequences of said acoustic speech models with the input utterance spoken by the user, each different sequence of acoustic speech models corresponding to one of the different possible utterances that a user might make in response to the read prompt text as represented by a path through said first network, and to output an alignment result identifying: i) a matching possible utterance from all of the possible utterances represented by the first network that matches with the input utterance spoken by the user; ii) any parts of the read prompt text that the user skipped; iii) any parts of the read prompt text that the user repeated; iv) any speech sounds that the user inserted between words of the read prompt text; and v) a time alignment between words and sub-word units of the matching possible utterance and the input utterance spoken by the user.

47. The speech processing system of claim 45, wherein the word alignment unit is configured to output a sequence of sub-word units corresponding to a dictionary pronunciation of the matching possible utterance.
7. The speech processing system of claim 1, wherein the word alignment unit is configured to output a sequence of sub-word units corresponding to a dictionary pronunciation of the matching possible utterance.
48. A speech processing system according to claim 47, further comprising a sub-word alignment unit configured to receive the sequence of sub-word units corresponding to the dictionary pronunciation and configured to align the sequence of sub-word units corresponding to the dictionary pronunciation received from the word alignment unit with the input utterance spoken by the user whilst allowing for sub-word units to be inserted between words and for sub-word units of a word to be replaced by other sub-word units to determine where the input utterance spoken by the user differs from the dictionary pronunciation and to output a sequence of sub-word units corresponding to an actual pronunciation of the input utterance spoken by the user.
8. A speech processing system according to claim 7, further comprising a sub-word alignment unit configured to receive the sequence of sub-word units corresponding to the dictionary pronunciation and configured to determine where the input utterance spoken by the user differs from the dictionary pronunciation and to output a sequence of sub-word units corresponding to an actual pronunciation of the input utterance spoken by the user
9. A speech processing system according to claim 8, wherein the sub-word alignment unit is configured to align the sequence of sub-word units corresponding to the dictionary pronunciation received from the word alignment unit with the input utterance spoken by the user whilst allowing for sub-word units to be inserted between words and for sub-word units of a word to be replaced by other sub-word units
50. A speech processing system according to claim 49, wherein the sub-word alignment unit is configured to maintain a score representing the closeness of the match between the acoustic speech models for the different paths defined by the second network and input utterance spoken by the user.
11. A speech processing system according to claim 10, wherein the sub-word alignment unit is configured to maintain a score representing the closeness of the match between the acoustic speech models for the different paths defined by the second network and input utterance spoken by the user.
51. A speech processing system according to claim 48, further comprising a speech scoring feature determining unit configured to receive and to determine a measure of similarity between the sequence of sub-word units output by the word alignment unit and the sequence of sub-word units output by the sub-word alignment unit.
	
12. A speech processing system according to claim 8, further comprising a speech scoring feature determining unit configured to receive and to determine a measure of similarity between the sequence of sub-word units output by the word alignment unit and the sequence of sub-word units output by the sub-word alignment unit.  

52. A speech processing system according to claim 45, further comprising a free align unit configured to align acoustic speech models with the input utterance spoken by the user and to output an alignment result including a sequence of sub-word units that matches with the input utterance spoken by the user.
13. A speech processing system according to claim 1, further comprising a free align unit configured to align acoustic speech models with the input utterance spoken by the user and to output an alignment result including a sequence of sub-word units that matches with the input utterance spoken by the user.
53. A speech processing system according to claim 45, comprising a speech scoring feature determining unit configured to receive and to determine a plurality of speech scoring feature values for the input utterance.
14. A speech processing system according to claim 1, comprising a speech scoring feature determining unit configured to receive and to determine a plurality of speech scoring feature values for the input utterance.
54. A speech processing system according to claim 53, further comprising a scoring unit operable to receive the plurality of speech scoring feature values for the input utterance determined by the speech scoring feature determining unit and configured to generate a score representing the language ability of the user.
29. A speech processing system according to claim 14, further comprising a scoring unit operable to receive the plurality of speech scoring feature values for the input utterance determined by the speech scoring feature determining unit and configured to generate a score representing the language ability of the user.
56. A speech processing method comprising: receiving an input utterance spoken by a user; using a speech recognition system to recognize the input utterance spoken by the user and to output a recognition result comprising a sequence of recognized words and sub-word units corresponding to the input utterance; and receiving the sequence of recognized words and sub-word units output by the speech recognition system and aligning a sequence of acoustic speech models corresponding to the received sequence of recognized words and sub-word units with a sequence of acoustic feature vectors representing the input utterance spoken by the user; and outputting an alignment result identifying a time alignment between the received sequence of recognized words and sub-word units and the sequence of acoustic feature vectors representing the input utterance spoken by the user.
31. A speech processing system comprising: an input for receiving a sequence of acoustic feature vectors representative of an utterance spoken by a user in response to a read prompt text; an acoustic model store that stores acoustic models of sub-word units; a read prompt data store that stores text data identifying the sequence of words in the read prompt; a data store that stores a network representing different possible utterances that a user might make in response to the read prompt text, the network including a plurality of paths each representative of a different possible utterance, the different paths allowing for: i) the user to skip part of the read prompt text; ii) the user to repeat part or all of the read prompt text; and iii) the user to insert speech sounds between words in the read prompt text; and a word alignment unit configured to align different sequences of said acoustic models with the input sequence of acoustic feature vectors representative of the utterance spoken by the user, each different sequence of acoustic models corresponding to one of the different possible utterances that a user might make in response to the read prompt text as defined by a path through said network, the word alignment unit identifying a possible utterance that matches with the input utterance, the possible utterance identifying any parts of the read prompt text that the user skipped, identifying any parts of the read prompt text that the user repeated, and identifying any speech sounds that the user inserted between words of the read prompt text.



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 45-56 are rejected under 35 U.S.C. 103 as being unpatentable over Yoon (U.S. Publication No. 20140141392) in view of Hwang (U.S. Publication No. 20050203738).
Regarding claim 45, Yoon discloses a speech processing system comprising:
an input for receiving an input utterance spoken by a user (See e.g., “…input device 674, such as a microphone,…” and how “…speech sample 202 is provided…,” YOON paras. 13-15, 31);
a speech recognition system that recognizes the input utterance spoken by the user and that outputs a recognition result comprising a sequence of recognized words and sub-word units corresponding to the input utterance (See e.g., “…A speech sample 202 is accessed and provided to an automatic speech recognizer 204 that generates word hypotheses for the speech sample 202 and time stamp associations for those word hypotheses that are output 206 to a speech sample scoring engine 208…,” YOON paras. 13-15, Figs. 1, 2, 4, 5);
an acoustic model store that stores acoustic speech models (See e.g., “…an acoustic model trained on native English speakers to generate word hypotheses, time stamp associations, and other acoustic measures 406…”; “…automatic speech recognizer 404 may include an acoustic model trained using non-native speakers…,” YOON paras. 13-15, 22-24, Figs. 1, 2, 4, 5).
However, Yoon does not disclose a word alignment unit configured to receive the sequence of recognized words and sub-word units output by the speech recognition system and to align a sequence of said acoustic speech models corresponding to the received sequence of recognized words and sub-word units with a sequence of acoustic feature vectors representing the input utterance spoken by the user and to output an alignment result identifying a time alignment between the received sequence of recognized words and sub-word units and the sequence of acoustic feature vectors representing the input utterance spoken by the user.
Hwang does teach a word alignment unit configured to receive the sequence of recognized words and sub-word units output by the speech recognition system and to align a sequence of said acoustic speech models corresponding to the received sequence of recognized words and sub-word units with a sequence of acoustic feature vectors representing the input utterance spoken by the user and to output an alignment result identifying a time alignment between the received sequence of recognized words and sub-word units and the sequence of acoustic feature vectors representing the input utterance spoken by the user (See e.g., how in Fig. 4 best phonetic sequence 407 and list of possible phonetic sequences 412 are inputted to alignment module 414, and further outputted to rescoring module 416 in combination with acoustic model 318, and see also how “best phonetic sequence 407 from SLU engine 405 and list of possible phonetic sequences 412 from grammar module 404 are provided to alignment module 414… alignment module 414 aligns phonetic sequences 407 and 412 …,” HWANG paras. 52-59, Fig. 4).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of YOON with an architecture having alignment capabilities, as taught by HWANG, in order to advantageously furnish alignment modules and/or methods for calculating speech recognition error rates due, for example, to substitution errors, deletion errors, and/or insertion errors when assessing the pronunciation of a user’s spoken input (HWANG paras. 52-61, Fig. 4).
Regarding claim 46, Yoon in view of Hwang teaches the speech processing system, wherein the word alignment unit is configured to output a sequence of sub-word units corresponding to a dictionary pronunciation of the recognized input utterance (See e.g., “…SLU engine 405 comprises or accesses SLU dictionary 409 and acoustic model 318 to generate the most likely sequence of SLUs, typically based on a highest probability score. SLU engine 403 then converts the most likely sequence of syllable-like units into a sequence of phonetic units, which is provided to alignment module 414…,” HWANG paras. 52-61, Fig. 4).
Regarding claim 47, Yoon in view of Hwang teaches the speech processing system, wherein the word alignment unit is configured to output a sequence of sub-word units corresponding to a dictionary pronunciation of the matching possible utterance (See e.g., “…SLU engine 405 comprises or accesses SLU dictionary 409 and acoustic model 318 to generate the most likely sequence of SLUs, typically based on a highest probability score. SLU engine 403 then converts the most likely sequence of syllable-like units into a sequence of phonetic units, which is provided to alignment module 414…,” HWANG paras. 52-61, Fig. 4).
Regarding claim 48, Yoon in view of Hwang teaches a speech processing system, further comprising a sub-word alignment unit configured to receive the sequence of sub-word units corresponding to the dictionary pronunciation (See e.g., “…SLU engine 405 comprises or accesses SLU dictionary 409 and acoustic model 318 to generate the most likely sequence of SLUs, typically based on a highest probability score. SLU engine 403 then converts the most likely sequence of syllable-like units into a sequence of phonetic units, which is provided to alignment module 414…,” HWANG paras. 52-61, Fig. 4)
and configured to align the sequence of sub-word units corresponding to the dictionary pronunciation received from the word alignment unit with the input utterance spoken by the user whilst allowing for sub-word units to be inserted between words (See e.g., “…SLU engine 405 comprises or accesses SLU dictionary 409 and acoustic model 318 to generate the most likely sequence of SLUs, typically based on a highest probability score. SLU engine 403 then converts the most likely sequence of syllable-like units into a sequence of phonetic units, which is provided to alignment module 414…,” HWANG paras. 52-61, Fig. 4) and for sub-word units of a word to be replaced by other sub-word units to determine where the input utterance spoken by the user differs from the dictionary pronunciation (See e.g., how “…in some cases the user's pronunciation of a new word can be very different than a typical pronunciation. For instance, a speaker might pronounce an English word by substituting a foreign translation of the English word. This feature, for example, would permit a speech recognition lexicon to store the text or spelling of a word in one language and the acoustic description in a second language different from the first language…,” HWANG paras. 52-61, Fig. 4) and to output a sequence of sub-word units corresponding to an actual pronunciation of the input utterance spoken by the user (See e.g., “…SLU engine 405 comprises or accesses SLU dictionary 409 and acoustic model 318 to generate the most likely sequence of SLUs, typically based on a highest probability score. SLU engine 403 then converts the most likely sequence of syllable-like units into a sequence of phonetic units, which is provided to alignment module 414…,” HWANG paras. 52-61, Fig. 4).
Regarding claim 49, Yoon in view of Hwang teaches a speech processing system, wherein the sub-word alignment unit is configured to use the sequence of sub-word units corresponding to the dictionary pronunciation of the recognized input utterance to generate a network having a plurality of paths allowing for sub-word units to be inserted between recognized words and for sub-word units of a recognized word to be replaced by other sub-word units and wherein the sub-word alignment unit is configured to align acoustic speech models for the different paths defined by the network with the input utterance spoken by the user (See e.g., “…alignment module 414 places the aligned phonetic sequences in a single graph. During this process, identical phonetic units that are aligned with each other are combined onto a single path. Differing phonetic units that are aligned with each other are placed on parallel alternative paths in the graph… The single graph is provided to rescoring module 416…to rescore possible combinations of phonetic units represented by paths through the single graph… rescoring module 416 performs a Viterbi search to identify the best path through the graph using acoustic model scores generated by comparing the feature vectors 403 produced by the user's pronunciation of the word with the model parameters stored in acoustic model 318 for each phonetic unit along a path…,” HWANG paras. 52-61, 78, Fig. 4).
Regarding claim 50, Yoon in view of Hwang teaches a speech processing system, wherein the sub-word alignment unit is configured to maintain a score representing the closeness of the match between the acoustic speech models for the different paths defined by the second network and input utterance spoken by the user (See e.g., “…SLU engine 405 comprises or accesses SLU dictionary 409 and acoustic model 318 to generate the most likely sequence of SLUs, typically based on a highest probability score. SLU engine 403 then converts the most likely sequence of syllable-like units into a sequence of phonetic units, which is provided to alignment module 414…,” HWANG paras. 52-61, Fig. 4).
Regarding claim 51, Yoon in view of Hwang teaches a speech processing system, further comprising a speech scoring feature determining unit configured to receive and to determine a measure of similarity between the sequence of sub-word units output by the word alignment unit and the sequence of sub-word units output by the sub-word alignment unit (See e.g., how in Fig. 4 best phonetic sequence 407 and list of possible phonetic sequences 412 are inputted to alignment module 414, and further outputted to rescoring module 416 in combination with acoustic model 318, and see also how “best phonetic sequence 407 from SLU engine 405 and list of possible phonetic sequences 412 from grammar module 404 are provided to alignment module 414… alignment module 414 aligns phonetic sequences 407 and 412 …,” and please see e.g., “…score select and update module 418 selects the highest scoring phonetic sequence or path though the single graph. The selected sequence is provided to update user lexicon 314 at step 514 and language model 316 at step 516…,” HWANG paras. 52-60, Fig. 4).
Regarding claim 52, Yoon in view of Hwang teaches a speech processing system, further comprising a free align unit configured to align acoustic speech models with the input utterance spoken by the user and to output an alignment result including a sequence of sub-word units that matches with the input utterance spoken by the user (See e.g., “…alignment module 414 places the aligned phonetic sequences in a single graph. During this process, identical phonetic units that are aligned with each other are combined onto a single path. Differing phonetic units that are aligned with each other are placed on parallel alternative paths in the graph… The single graph is provided to rescoring module 416…to rescore possible combinations of phonetic units represented by paths through the single graph… rescoring module 416 performs a Viterbi search to identify the best path through the graph using acoustic model scores generated by comparing the feature vectors 403 produced by the user's pronunciation of the word with the model parameters stored in acoustic model 318 for each phonetic unit along a path…,” HWANG paras. 52-61, 78, Fig. 4).
Regarding claim 53, Yoon in view of Hwang teaches a speech processing system, comprising a speech scoring feature determining unit configured to receive and to determine a plurality of speech scoring feature values for the input utterance (See e.g., “…SLU engine 405 comprises or accesses SLU dictionary 409 and acoustic model 318 to generate the most likely sequence of SLUs, typically based on a highest probability score. SLU engine 403 then converts the most likely sequence of syllable-like units into a sequence of phonetic units, which is provided to alignment module 414…,”; See e.g., how in Fig. 4 best phonetic sequence 407 and list of possible phonetic sequences 412 are inputted to alignment module 414, and further outputted to rescoring module 416 in combination with acoustic model 318, and see also how “best phonetic sequence 407 from SLU engine 405 and list of possible phonetic sequences 412 from grammar module 404 are provided to alignment module 414… alignment module 414 aligns phonetic sequences 407 and 412 …,” and please see e.g., “…score select and update module 418 selects the highest scoring phonetic sequence or path though the single graph. The selected sequence is provided to update user lexicon 314 at step 514 and language model 316 at step 516…,”  HWANG paras. 52-61, Fig. 4).
Regarding claim 54, Yoon discloses a speech processing system, further comprising a scoring unit operable to receive the plurality of speech scoring feature values for the input utterance determined by the speech scoring feature determining unit and configured to generate a score representing the language ability of the user (See e.g., “The speech sample scoring engine 208 generates a plurality of difficulty measures 210, 212 that are provided to a scoring model 214 for generation of a difficulty score 216 that is associated with a speech sample 202 under consideration,” YOON para. 13).
Regarding claim 55, Yoon discloses a speech processing system, wherein the score represents the fluency and/or proficiency of the user's spoken utterance (See e.g., “a pure acoustic characteristic is determined by analyzing a number of pauses in the speech sample 202 to determine fluency difficulty measures such as silences per unit time or silences per word. Such a second difficulty measure 212 is provided to the scoring model 214 for generation of a difficulty score 216 representative of the difficulty of the speech sample,” YOON para. 15).
Regarding claim 56, Yoon discloses a speech processing method comprising:
receiving an input utterance spoken by a user (See e.g., “…input device 674, such as a microphone,…” and how “…speech sample 202 is provided…,” YOON paras. 13-15, 31);
using a speech recognition system to recognize the input utterance spoken by the user and to output a recognition result comprising a sequence of recognized words and sub-word units corresponding to the input utterance (See e.g., “…A speech sample 202 is accessed and provided to an automatic speech recognizer 204 that generates word hypotheses for the speech sample 202 and time stamp associations for those word hypotheses that are output 206 to a speech sample scoring engine 208…,” YOON paras. 13-15, Figs. 1, 2, 4, 5);
However, Yoon does not disclose receiving the sequence of recognized words and sub-word units output by the speech recognition system and aligning a sequence of acoustic speech models corresponding to the received sequence of recognized words and sub-word units with a sequence of acoustic feature vectors representing the input utterance spoken by the user;
and outputting an alignment result identifying a time alignment between the received sequence of recognized words and sub-word units and the sequence of acoustic feature vectors representing the input utterance spoken by the user.
Hwang does teach receiving the sequence of recognized words and sub-word units output by the speech recognition system and aligning a sequence of acoustic speech models corresponding to the received sequence of recognized words and sub-word units with a sequence of acoustic feature vectors representing the input utterance spoken by the user (See e.g., how in Fig. 4 best phonetic sequence 407 and list of possible phonetic sequences 412 are inputted to alignment module 414, and further outputted to rescoring module 416 in combination with acoustic model 318, and see also how “best phonetic sequence 407 from SLU engine 405 and list of possible phonetic sequences 412 from grammar module 404 are provided to alignment module 414… alignment module 414 aligns phonetic sequences 407 and 412 …,” HWANG paras. 52-59, Fig. 4);
 and outputting an alignment result identifying a time alignment between the received sequence of recognized words and sub-word units and the sequence of acoustic feature vectors representing the input utterance spoken by the user (See e.g., how in Fig. 4 best phonetic sequence 407 and list of possible phonetic sequences 412 are inputted to alignment module 414, and further outputted to rescoring module 416 in combination with acoustic model 318, and see also how “best phonetic sequence 407 from SLU engine 405 and list of possible phonetic sequences 412 from grammar module 404 are provided to alignment module 414… alignment module 414 aligns phonetic sequences 407 and 412 …,” HWANG paras. 52-59, Fig. 4).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of YOON with an architecture having alignment capabilities, as taught by HWANG, in order to advantageously furnish alignment modules and/or methods for calculating speech recognition error rates due, for example, to substitution errors, deletion errors, and/or insertion errors when assessing the pronunciation of a user’s spoken input (HWANG paras. 52-61, Fig. 4).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Deng (U.S. Publication No. 20120065976) teaches a deep belief network for large vocabulary continuous speech recognition. Waibel (U.S. Publication No. 20110307241) teaches an enhanced speech-to-speech translation system and methods.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ETHAN DANIEL KIM whose telephone number is (571) 272-1405.  The examiner can normally be reached on Monday - Friday 9:00 - 5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached on (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ETHAN DANIEL KIM/Examiner, Art Unit 2658

/RICHEMOND DORVIL/Supervisory Patent Examiner, Art Unit 2658