Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Acknowledgement  
Acknowledgement is made of applicant’s amendment filed on 12/30/2020. Applicant’s submission has been entered and made of record.
Status of the Claims
Claims 1-2, 7-9, 14-16, and 20-22 are pending. 
Response to Applicant’s Argument
In response to applicant’s arguments on methods 1-5:
Sung teaches, at ¶53, that a transcription of the input audio may be applied to a translation service, which may be programmed to generate an audio and/or textual translation of the input audio into another, different language (e.g., from English to French) for output to the user's computing device. In some examples, the user may be able to specify the accent or dialect of the target language for the output audio. For example, if the input language is North American English, the user may be able to specify, e.g., Quebec or Haitian French. The specification of the accent or dialect may be done in response to an input from the user or it may be performed based on analysis of the input audio. For example, the system may select a version of the target language that is closest, geographically, to the version of the input language. Alternatively, the system may select the most popular (e.g., in terms of numbers of speakers) version of the target language. Other appropriate criteria may be used to select the accent(s) and/or dialect(s) of the target language(s).
For the reasons explained in detail below, Sung teaches at least methods 3-5 for determining the target language type.
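The selection criteria quoted from Sung ¶53 (a user-specified dialect, the geographically closest version, or the most popular version of the target language) can be illustrated with a minimal sketch. Every name below (Variant, VARIANTS, select_target_variant) is hypothetical, the coordinates and speaker counts are placeholder values rather than real data, and the planar distance measure is an assumption made for brevity; none of this code appears in Sung.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Variant:
    """One dialect of the target language (placeholder illustration data)."""
    name: str
    lat: float
    lon: float
    speakers_millions: float  # placeholder figure, not a real statistic


# Hypothetical table of target-language versions.
VARIANTS = [
    Variant("Quebec French", 46.8, -71.2, 7.0),
    Variant("Haitian French", 18.5, -72.3, 4.0),
]


def select_target_variant(variants, user_choice=None, source_location=None):
    """Pick a target-language version using the three criteria in turn."""
    # 1. An explicit user specification wins when it names a known variant.
    if user_choice is not None:
        for variant in variants:
            if variant.name == user_choice:
                return variant
    # 2. Otherwise choose the variant geographically closest to the
    #    version of the input language (crude planar distance, assumed).
    if source_location is not None:
        lat, lon = source_location
        return min(variants, key=lambda v: hypot(v.lat - lat, v.lon - lon))
    # 3. Otherwise fall back to the variant with the most speakers.
    return max(variants, key=lambda v: v.speakers_millions)
```

The ordering of the three criteria is a design choice of this sketch; ¶53 lists them as alternatives and does not prescribe a priority among them.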
Claim Rejections - 35 USC § 103
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 103 that form the basis for the rejections under this section made in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 7-8, 14-15, 20, and 22 are rejected under 35 USC 103(a) as being unpatentable over Franz et al. (US 2002/0198713 A1) in view of Sung et al. (US 2013/0238336 A1).
Regarding Claims 1 and 8, Franz discloses a server (¶66, remote server), comprising: 
a memory, a processor and computer programs stored in the memory and executable by the processor, wherein when the computer programs are executed by the processor, a voice translation device is realized (¶61 and ¶277, speech translation system “STS” comprising a processor, memory, and program instructions), wherein the voice translation device is configured to perform the steps of: 
(¶66, access server function remotely from a PDA or cell phone; ¶67, STS accepts spoken language in a source language and performs speech recognition in the source language); 
recognizing the voice data using a language model corresponding to a determined language type, to acquire first recognition information corresponding to the voice data, the first recognition information comprising voice data to be translated (¶67, STS performs speech to speech translation for use in facilitating communication between individuals who do not speak the same language by translating the recognized expression from the source language to a target language; ¶104, upon receipt of a speech input 1201, acoustic speech recognition component 1202 uses at least one word pronunciation dictionary 1222 and at least one acoustic model 1224 to generate at least one data structure 1204 encoding hypothesized words where data structure information 1204 is used for utterance hypothesis construction 1206; ¶119, utterance hypothesis construction component uses language model (i.e., data structure information 1204) to construct utterance hypothesis); 
determining a target language type and performing a translation process on the first recognition information according to the target language type to acquire a translation result corresponding to the voice data (¶102-103, perform matching and transfer recursively on parts of the shallow syntactic representation of the input to construct one or more hypotheses for speech recognition in a speech translation system),
wherein determining the language type of the voice data acquired from a terminal comprises when the voice data acquired from the terminal only includes the voice data to be translated (¶66, user may dial a translation service from a laptop; ¶67, STS translation system accepts spoken language in an input / source language and performs speech recognition in the source language while allowing the user to confirm the recognized expression), determining a target language type that is determined by a user by triggering a key having a function of selecting the target language type, as the language type of the voice data (Fig. 14 and see ¶110, allowing user to select the preferred source language-target language pair by activating source language expression 1410 with cursor 1412).
Franz does not disclose wherein determining the target language type comprises one or more of the following: 
determining a target language type having a highest frequency among historical translations as the target language type based on historical usage information of the terminal; 
determining a language type used in a latest translation as the target language type;
positioning the terminal to determine present positional information of the terminal, so as to determine a commonly-used target language type at the location of the terminal as the target language type; 
when the voice data acquired from the terminal includes both the voice data to be translated and a target language type of the voice data to be translated, determining the target language type of the voice data to be translated as the target language type; and

Sung teaches a speech recognition system for recognizing audio using multiple language models to identify a candidate language for the audio (Abstract) wherein a transcription of the input audio may be applied to a translation service to generate audio and textual translation of the input audio into a target language type (¶53) by determining the target language type using one or more of the following:
positioning the terminal to determine present positional information of the terminal, so as to determine a commonly-used target language type at the location of the terminal as the target language type (¶53, system may select a version of the target language that is geographically closest to the version of the input language; ¶59, location information / GPS coordinate of user’s mobile device may be used to determine the geographic location and therefore select the target language geographically closest to the input language); 
when the voice data acquired from the terminal includes both the voice data to be translated and a target language type of the voice data to be translated, determining the target language type of the voice data to be translated as the target language type (¶73-74, participants may speak words / phrases from each other’s native language (i.e., target language type of the voice data to be translated) and language recognition components for the collective assortment of users’ languages and applying them to all the users in the conversation such that speech from the multi-lingual discussion may be recognized and transcribed; in view of ¶53, system may select the most popular (e.g., in terms of numbers of speakers) version of the target language); 
when the voice data acquired from the terminal only includes the voice data to be translated, determining a target language type that is determined by a user by triggering a key having a function of selecting the target language type, as the target language type (¶53, when applying a transcription of the input audio to a translation service for output to user’s computing device, user may be able to specify the accent or dialect of the target language for the output audio; ¶55, the user may select, via a touch screen menu item or voice input, their native (or other) language).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify Franz to automatically determine a target language type according to the methods of Sung such that a translation service may generate an audio and textual translation of input audio into another, different language (Sung, ¶53).
Regarding Claims 7 and 14, Franz discloses wherein after the acquiring the translation result corresponding to the voice data, the method further comprises: sending the first recognition information and the translation result to the terminal (¶66, a remote server hosts the STS system and the user may dial the STS translation service from a PDA or cell phone; ¶67, after the STS performs speech recognition in the source language, optionally allowing the user to confirm the recognized expression would require transmitting the recognition information to the PDA or cell phone).
Regarding Claim 15, Franz discloses a non-transitory computer readable storage medium, having computer programs stored thereon, wherein when the computer programs are executed by a processor, a voice translation method of claims 1 and 8 is realized (¶61 and ¶277, speech translation system “STS” comprising a processor, memory, and program instructions).
Regarding Claim 20, Sung discloses wherein the target language type is determined according to present positional information of the terminal (¶59, location information / GPS coordinates of the user’s mobile device may be used to select recognition candidates; for example, giving more weight to Korean recognition candidates when the user is in Seoul or to Japanese recognition candidates when the user is in Tokyo) or according to historical usage information of the terminal (¶63, language(s) spoken by the user may be determined automatically based on the user’s past history (e.g., prior utterances)). 
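The two mechanisms cited from Sung ¶59 and ¶63 (weighting recognition candidates by the device's location, and inferring a language from the user's past utterances) can be sketched as follows. This is an illustrative sketch only: the function names, the CITY_LANGUAGE table, the multiplicative boost, and the example scores are all assumptions and are not taken from Sung.

```python
from collections import Counter

# Hypothetical mapping from a resolved device location to a favored language.
CITY_LANGUAGE = {"Seoul": "Korean", "Tokyo": "Japanese"}


def reweight_by_location(scores, city, boost=2.0):
    """Give more weight to candidates in the language favored at `city`.

    `scores` maps a language name to a recognition score; the
    multiplicative `boost` is an assumed weighting scheme.
    """
    favored = CITY_LANGUAGE.get(city)
    return {
        lang: score * boost if lang == favored else score
        for lang, score in scores.items()
    }


def top_candidate(scores):
    """Return the language with the highest (possibly reweighted) score."""
    return max(scores, key=scores.get)


def language_from_history(prior_languages):
    """Return the language used most often in prior utterances, if any."""
    if not prior_languages:
        return None
    return Counter(prior_languages).most_common(1)[0][0]
```

For example, with assumed scores of 0.4 for Korean and 0.5 for Japanese, the Korean candidate overtakes the Japanese one once the device is located in Seoul, while an unknown location leaves the ranking unchanged.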
Regarding Claim 22, Franz discloses recognizing an intention of the first recognition information to determine a translation intention corresponding to the first recognition information (¶71, the STS analyzes the input, determines the meaning of the input, and renders that meaning in the appropriate way in a target language), wherein different translation intentions correspond to different translation models (¶73-74, combining syntactic analysis with analogical or statistical transfer to produce high quality translation in different domains; see for example, ¶82, parse the input “I want to make a reservation for three people for tomorrow evening at seven o’clock” to identify syntactic constituents / parse tree; ¶85, the domain independent syntactic analysis is combined with domain dependent translation example database described in ¶73), and wherein translation results corresponding to the same recognition information are (¶96 and ¶100-101, perform an initial fast match to quickly check the compatibility of the input parse tree with a domain specific example database to rule out unlikely examples where the fast match is performed based on syntactic head of the constituents to be matched while constrained to equality or to a thesaurus based measure of close semantic similarity); 
determining a translation model corresponding to the determined translation intention according to the determined translation intention corresponding to the first identification information (¶97-99, after initial fast match, perform best match to find the best match from the example database given an input); and 
performing the translation process on the first recognition information to obtain according to the determined translation model to acquire the translation result corresponding to the voice data (in view of ¶19, match the input to source expressions of example pairs in the example database, find the most appropriate examples, take the target expression from best matching examples and construct an expression in the target language).
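The two-stage retrieval attributed to Franz above (an initial fast match on the syntactic head, constrained to equality or thesaurus similarity, followed by a best match against the example database) can be illustrated with a simplified sketch. Franz operates on parse trees; this sketch substitutes flat token sets and Jaccard overlap as the similarity measure, which is an assumption of the sketch and not Franz's method. The Example type, the THESAURUS table, and the placeholder target strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """One source/target pair from a hypothetical example database."""
    head: str      # syntactic head of the source expression
    source: tuple  # source-language tokens
    target: str    # paired target-language expression (placeholder text)


# Hypothetical thesaurus of semantically close head words.
THESAURUS = {"reserve": {"book"}, "book": {"reserve"}}


def fast_match(head, examples):
    """Rule out examples whose head is neither equal nor thesaurus-close."""
    allowed = {head} | THESAURUS.get(head, set())
    return [ex for ex in examples if ex.head in allowed]


def best_match(tokens, candidates):
    """Pick the surviving example with the highest Jaccard token overlap."""
    query = set(tokens)

    def overlap(ex):
        source = set(ex.source)
        return len(source & query) / len(source | query)

    return max(candidates, key=overlap, default=None)


def translate(head, tokens, examples):
    """Return the target expression of the best example, or None."""
    match = best_match(tokens, fast_match(head, examples))
    return match.target if match else None
```

The fast match is deliberately cheap so that the more expensive similarity scoring runs only on the examples that survive it.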
Claims 2, 9, and 16 are rejected under 35 USC 103(a) as being unpatentable over Franz et al. (US 2002/0198713 A1) and Sung et al. (US 2013/0238336 A1) as applied to claims 1, 8, and 15, in view of Chun (US 2011/0218804 A1).
Regarding Claims 2, 9, and 16, Franz does not disclose wherein determining the language type of the voice data acquired from the terminal comprises: determining a feature vector of the voice data acquired from the terminal; and determining the language type of the voice data based on a match degree between the feature vector and a preset language type model.  
Chun discloses a server (¶66 and Fig. 1, device 1 receiving audio data from a remote location over a network) determining a language type of voice data acquired from a terminal (¶66, receiving audio data from a remote location; ¶78, ¶81 and ¶141, determining a likelihood of a sequence of observations / vectors representing audio occurs in a given language) wherein determining the language type of the voice data acquired from the terminal comprises: 
determining a feature vector of the voice data acquired from the terminal (¶78 and ¶141, speech signals are converted into an input vector in n-dimensional acoustic space); and 
determining the language type of the voice data based on a match degree between the feature vector and a preset language type model (¶81 and ¶141, determining the likelihood of the sequence of observations occurring in a given language is evaluated using the language model). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify Franz to determine the target language type by matching a feature vector of the voice data with a preset language type model in order to output the sequence of words into a translation system where it is translated into a second language (Chun, ¶141).
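The limitation discussed for Claims 2, 9, and 16 (determining a language type from a match degree between a feature vector and a preset language type model) can be illustrated with a minimal sketch. It assumes each preset model is a diagonal Gaussian over the acoustic feature space and uses the log-likelihood as the match degree; that modeling choice, the function names, and the example parameters are assumptions of this sketch and are not drawn from Chun or from the claims.

```python
import math


def log_likelihood(vector, mean, variance):
    """Log-density of `vector` under a diagonal Gaussian (assumed model)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        for x, mu, var in zip(vector, mean, variance)
    )


def identify_language(vector, models):
    """Return the language whose preset model best matches `vector`.

    `models` maps a language name to a (mean, variance) pair; the
    log-likelihood serves as the match degree.
    """
    scores = {
        lang: log_likelihood(vector, mean, variance)
        for lang, (mean, variance) in models.items()
    }
    return max(scores, key=scores.get), scores
```

A production system would score a sequence of feature vectors rather than a single one; summing the per-frame log-likelihoods would be one straightforward extension of this sketch.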
Claim 21 is rejected under 35 USC 103(a) as being unpatentable over Franz et al. (US 2002/0198713 A1) in view of Sung et al. (US 2013/0238336 A1) as applied to Claim 1, in further view of Choi (US 2005/0182628 A1).
Regarding Claim 21, Franz discloses wherein before the performing the translation process on the first recognition information, the method further comprises: 
(¶83 and ¶96, process incomplete or imperfectly grammatical natural human speech by performing morphological analysis to re-arrange syntactic constituents to generate a final feature structure like Fig. 7, “I want to make a reservation for three people for tomorrow morning” by rearranging syntactic features by inserting, deleting, or joining parts of the syntactic representation) and performing the translation process on the first recognition information comprises performing the translation process on the second recognition information (¶94-97, since natural human speech is not perfectly complete and grammatical, perform optimization procedure to insert, delete or join parts of the syntactic representation and perform matching with the appropriate domain specific example database). 
Franz does not disclose wherein before performing the translation process on the first recognition information, performing a post-process on the first recognition information to generate second recognition information, wherein the post-process comprises correction based on hot words.
Choi discloses a domain based speech recognition apparatus performing a first speech recognition on speech input to generate a first recognition information (Abstract and ¶46-48, using first acoustic model and first language model to recognize Korean language based speech input to generate first recognition result in Korean equivalent of “what time is the temperature now?”) and perform a post-process on the first recognition information to generate second recognition information by correcting the first recognition information via a correction based on hot words (¶49-51, determine a domain keyword “temperature” to select a proper candidate domain; ¶53, apply second acoustic model and second language model to generate second recognition sentence “what is the temperature now?”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify Franz to perform a post-process on the first recognition information comprising a correction based on hot words as taught by Choi in order to minimize misrecognition of a word in a final recognition result (Choi, Abstract).
Conclusion
Applicant's amendment necessitated the new grounds of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to examiner Richard Z. Zhu whose telephone number is 571-
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/RICHARD Z ZHU/
Primary Examiner, Art Unit 2675
02/26/2021