DETAILED ACTION
Introduction
This office action is in response to Applicant's submission filed on 1/31/2022. Claims 1, 4-9 and 13-17 are pending in the application and have been examined.
Notice of Pre-AIA or AIA Status
The present application is being examined under the pre-AIA first to invent provisions.
Response to Amendment
The response filed on 1/31/2022 has been accepted and considered in this Office Action. Claims 1, 4-9 and 13-17 have been examined. Claims 2-3 and 10-12 have been cancelled. Applicant’s amendment to the title has been noted and overcomes the objection to the specification. Applicant’s amendments to claims 14 and 15 overcome the claim objections.
Response to Arguments
Applicant’s arguments with respect to claims 1 and 7 state that
“The Office Action supports this assertion by citing p. 4816, column 2, lines 47-55 of Rao, which do not mention a recursive neural network. While this passage as cited does mention an acoustic model, Applicant respectfully submits that the disclosure of an acoustic model is not an inherent recitation of a neural network.
The Office Action supports this assertion by citing p. 358, lines 25-35 of Rentzepopoulos, which do not support the substantially exclusive use of entries associated in pairs. The cited portions appear to teach an unknown algorithm for converting a phonemic form into a graphemic form, which Applicant respectfully submits do not read on the substantially exclusive use of entries associated in pairs as recited in Applicant's independent claims 1 and 7.”

The examiner respectfully disagrees. Rao teaches “We train a grapheme recognizer (grapheme-based acoustic model) with 5 layers of bidirectional LSTM with a CTC loss function” (Rao, pg. 4816, col 2, lines 47-55), and LSTM is described in Rao, pg. 4815, col 2, sect 2.1 as “Recurrent neural networks, specifically Long Short-Term Memory (LSTM) networks have proven to be the state-of-the-art in numerous sequence modeling tasks”. Also, in Applicant's specification at [0022], the phoneme-to-grapheme module is described in the context of a recurrent neural network, i.e., a recursive neural network.
Rentzepopoulos teaches “The meaning of this algorithm is the following: If a pair of phonemes is written as either a single grapheme or a pair of graphemes, then this pair is considered a single state. The same holds for the reverse procedure when a pair of graphemes is pronounced as either a single phoneme or a pair of phonemes” (Rentzepopoulos, pg. 355). Rentzepopoulos teaches using a pair of phonemes in the segmentation rule to guarantee that there will be no case where a sequence of graphemes produces a sequence of phonemes of a different length, and gives an example in which a pair of phonemes is treated as a single phoneme symbol (monosyllabic phoneme) with an associated grapheme term. Therefore, the rejections of claims 1 and 7 under 35 U.S.C. 103 are sustained and updated accordingly.
With respect to the rejections of the remaining dependent claims under 35 U.S.C. 103, to the extent those claims are argued on the same rationale presented in the Remarks filed 01/31/2022 for independent claims 1 and 7, Examiner respectfully directs Applicant to the reasons provided above in the response directed to claims 1 and 7. For at least those reasons, Examiner likewise respectfully disagrees. Applicant's arguments have been fully considered but they are not persuasive.


Claim Objections
Claim 17 is objected to because of the following informalities:
Claim 17 depends on canceled claim 12. Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

6.	The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
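(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.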
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
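7.	This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are:
phoneme-to-grapheme module in claim 1;
phoneme generation module in claim 1.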
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 1 is rejected under 35 U.S.C. §103 as being unpatentable over Ruhl (U.S. Patent Application Publication 2002/0016669) in view of Konig et al. (U.S. Patent Application Publication 2009/0112593), further in view of Schalkwyk et al. (U.S. Patent Application Publication 2015/0340034), further in view of Rentzepopoulos (Rentzepopoulos, Panagiotis A. and Kokkinakis, George K., "Efficient Multilingual Phoneme-to-Grapheme Conversion Based on HMM", Computational Linguistics Volume 22 Issue 3 September 1996 pp 351–376), where Schalkwyk has been cited in the IDS submitted on 03/04/2020.
Regarding claim 1, Ruhl teaches a transportation vehicle comprising: a navigation system (Ruhl, [0030] and Fig. 1 show an embodiment of an on-board navigation system according to the present invention); a user input control connected to the navigation system via a bus system for data interchange purposes, a microphone (Ruhl, [0030], [0032] teach a user interface with the navigation system and a microphone). However, Ruhl fails to teach a phoneme generation module that includes a statistical language model for generating phonemes from a voice signal and/or the output signal of the microphone, wherein the phonemes are part of a prescribed selection of exclusively monosyllabic phonemes; and a phoneme-to-grapheme module that includes a recursive neural network for generating inputs for controlling the transportation vehicle based on a succession of monosyllabic phonemes generated by the phoneme generation module, wherein the phoneme-to-grapheme module is trained substantially exclusively using entries associated in pairs, which entries comprise a monosyllabic phoneme that has a respective associated monosyllabic term.
	However, Konig et al. teaches a phoneme generation module that includes a statistical language model for generating phonemes from a voice signal and/or the output signal of the microphone, wherein the phonemes are part of a prescribed selection of exclusively monosyllabic phonemes (Konig et al. [0047] teaches that the speech recognition unit 106 may extract feature vectors and may then use hidden Markov models (HMMs) for phonemes of different languages to transcribe the speech input into a phonetic sequence. Using the vocabulary and the specialized vocabulary, words are then identified in the sequence of phonemes. For several phoneme segments of the phoneme sequence, there may be a number of possible words. The number of possible words can generally be reduced by using a speech model 206. The speech model 206 may, for example, be a rule base or a statistical model that contains information about the sequences in which words occur; the recognition of the phoneme or phoneme sequences from the speech input is interpreted as part of a prescribed selection of monosyllabic phonemes; an HMM is a statistical language model).
	Ruhl and Konig are considered to be analogous to the claimed invention because both relate generally to speech recognition methods for searching a database. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Ruhl, which compares features with phonemes stored in a list, with the phoneme generation techniques taught by Konig to provide an improved method and system for searching a database by speech input (see Konig, [0009]).
However, Ruhl and Konig fail to teach a phoneme-to-grapheme module that includes a recursive neural network for generating inputs for controlling the transportation vehicle based on a succession of monosyllabic phonemes generated by the phoneme generation module, wherein the phoneme-to-grapheme module is trained substantially exclusively using entries associated in pairs, which entries comprise a monosyllabic phoneme that has a respective associated monosyllabic term. However, Schalkwyk teaches a phoneme-to-grapheme module that includes a recursive neural network for generating inputs for controlling the transportation vehicle based on a succession of monosyllabic phonemes generated by the phoneme generation module (Schalkwyk [0016], [0028] teach the conversion of phone labels, which include sequences of other phonetic units, e.g., di-phones, syllables, triphone-states, monophone-states, to generate the grapheme labels; Schalkwyk [0042] describes that the adaptive system may be a different kind of recurrent neural network (RNN) or other kind of neural network based system that includes a memory mechanism).
Ruhl, Konig and Schalkwyk are considered to be analogous to the claimed invention because they relate generally to speech recognition methods. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Ruhl and Konig, which generate phonemes based on phonemes stored in a list, with the phoneme-to-grapheme generation techniques taught by Schalkwyk to provide improved speech recognition methods that can be the input to software applications that control systems or devices when the use of other input methods by a user of the system is constrained by physical limitations (see Schalkwyk, [0003]).
However, Schalkwyk fails to teach wherein the phoneme-to-grapheme module is trained substantially exclusively using entries associated in pairs, which entries comprise a monosyllabic phoneme that has a respective associated monosyllabic term. However, Rentzepopoulos teaches wherein the phoneme-to-grapheme module is trained substantially exclusively using entries associated in pairs, which entries comprise a monosyllabic phoneme that has a respective associated monosyllabic term (Rentzepopoulos pgs. 354-355, “The rules for the segmentation of a phoneme string to a sequence of symbols conforming to the above condition are manually defined off-line according to the procedure presented below in an informal algorithmic language (Figure 1). The meaning of this algorithm is the following: If a pair of phonemes is written as either a single grapheme or a pair of graphemes, then this pair is considered a single state. The same holds for the reverse procedure when a pair of graphemes is pronounced as either a single phoneme or a pair of phonemes. This algorithm is the only language-specific part of the PTGC system and its formulation requires only familiarity with the spelling of the language and not sophisticated linguistic knowledge.
The rules are incorporated in the PTGC system using an automated procedure as a separate input function that parses the input strings into states”; Rentzepopoulos teaches using a pair of phonemes in the segmentation rule to guarantee that there will be no case where a sequence of graphemes produces a sequence of phonemes of a different length, and gives an example in which a pair of phonemes is treated as a single phoneme symbol (monosyllabic phoneme) with an associated grapheme term).
Ruhl, Konig, Schalkwyk and Rentzepopoulos are considered to be analogous to the claimed invention because they relate generally to speech recognition methods. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Ruhl, Konig and Schalkwyk, which rely on phonemes stored in a list, to generate phoneme-to-grapheme models trained with a pair of phonemes treated as a single grapheme, as taught by Rentzepopoulos, to improve the accuracy of grapheme generation (see Rentzepopoulos, pgs. 351-352).
Claims 4, 5, 6 and 16 are rejected under 35 U.S.C. §103 as being unpatentable over Ruhl (U.S. Patent Application Publication 2002/0016669) in view of Konig et al. (U.S. Patent Application Publication 2009/0112593), further in view of Schalkwyk et al. (U.S. Patent Application Publication 2015/0340034), further in view of Rentzepopoulos (Rentzepopoulos, Panagiotis A. and Kokkinakis, George K., "Efficient Multilingual Phoneme-to-Grapheme Conversion Based on HMM", Computational Linguistics Volume 22 Issue 3 September 1996 pp 351–376), further in view of Rao (K. Rao and H. Sak, "Multi-accent speech recognition with hierarchical grapheme based models," 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 4815-4819), where Schalkwyk and Rao have been cited in the IDS submitted on 03/04/2020.
Regarding claim 4, Ruhl, Konig, Schalkwyk and Rentzepopoulos teach the transportation vehicle of claim 1, but fail to teach wherein the recursive neural network . However, Rao teaches the recursive neural network (see Rao, pg. 4816, col 1, sect 2.4 and col 2, lines 47-55, “We create a hierarchical-CTC grapheme recognizer with a phoneme CTC loss in an intermediate layer, see Figure 1. We train a grapheme recognizer (grapheme-based acoustic model) with 5 layers of bidirectional LSTM with a CTC loss function. For the hierarchical models we use 8 layers of bidirectional LSTM with the primary CTC loss on the 5th layer and the secondary CTC loss on the final 8th layer. We experimented with architectures of 3 to 10 layers deep with the primary loss at the 3rd to 10th layer and found no further improvements after 8 layers and the primary loss at the 5th layer”; LSTM is described in Rao, pg. 4815, col 2, sect 2.1 as “Recurrent neural networks, specifically Long Short-Term Memory (LSTM) networks have proven to be the state-of-the-art in numerous sequence modeling tasks”; Fig. 1 depicts the BLSTM intermediate layers. LSTM can be further extended to bidirectional networks, resulting in Bidirectional Long Short-Term Memory Recurrent Neural Networks (BLSTM-RNN), in which the hidden layers are used to process the input data forwards and backwards with the recall layers. A person skilled in the art would appreciate that the depth of the intermediate layers of the LSTM is chosen based on the design of the input to the network).
Ruhl, Konig, Schalkwyk, Rentzepopoulos and Rao are considered to be analogous to the claimed invention because they relate generally to speech recognition methods. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Ruhl, Konig, Schalkwyk, Rentzepopoulos to generate graphemes from recognized phonemes stored in list with the phoneme to grapheme generation using hierarchical grapheme based models as taught by Rao to improve recognition accuracy (see Rao, pg. 4815, col 2 lines 22-26).
Regarding claim 5, Ruhl, Konig, Schalkwyk and Rentzepopoulos teach the transportation vehicle of claim 1, but fail to teach wherein the dimension of the input layer is equal to the number of phonetic symbols from which syllables are formed. However, Rao teaches the dimension of the input layer is equal to the number of phonetic symbols from which syllables are formed (see Rao, pg. 4816, col 1, lines 25-34, “We use the CTC objective function to train a single network to directly predict graphemes given the acoustic input. We choose the lower cased English alphabet (a-z) as the grapheme target labels. A special label indicates boundary between words making a total of 27 grapheme labels. For the training labels we convert transcriptions to the spoken domain using a verbalizer [26]. This verbalizer is constructed manually based on language specific rules and may generate several alternative spoken transcriptions for a given written transcription”; Rao teaches constructing the verbalizer using language-specific rules, which is interpreted as the phonetic symbols from which syllables are formed).
Ruhl, Konig, Schalkwyk, Rentzepopoulos and Rao are considered to be analogous to the claimed invention because they relate generally to speech recognition methods. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Ruhl, Konig, Schalkwyk, Rentzepopoulos to generate graphemes from recognized phonemes stored in list with the phoneme to grapheme generation using hierarchical grapheme based models as taught by Rao to improve recognition accuracy (see Rao, pg. 4815, col 2 lines 22-26).
Regarding claim 6, Ruhl, Konig, Schalkwyk and Rentzepopoulos teach the transportation vehicle of claim 1, but fail to teach wherein the dimension of the output layer is equal to the number of characters in the target language. However, Rao teaches the dimension of the output layer is equal to the number of characters in the target language (see Rao, pg. 4816, col 1, lines 25-34, “We use the CTC objective function to train a single network to directly predict graphemes given the acoustic input. We choose the lower cased English alphabet (a-z) as the grapheme target labels.”).
Ruhl, Konig, Schalkwyk, Rentzepopoulos and Rao are considered to be analogous to the claimed invention because they relate generally to speech recognition methods. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Ruhl, Konig, Schalkwyk, Rentzepopoulos to generate graphemes from recognized phonemes stored in list with the phoneme to grapheme generation using hierarchical grapheme based models as taught by Rao to improve recognition accuracy (see Rao, pg. 4815, col 2 lines 22-26).
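For illustration only, and not as part of the cited reference, the label count stated in the Rao passage quoted above (lower-cased a-z plus one word-boundary label) can be reproduced with a minimal Python sketch. The token name "<space>" is an assumption introduced here; Rao is quoted only as describing "a special label".

```python
import string

# Grapheme targets per the quoted Rao passage: the lower-cased English alphabet.
letters = list(string.ascii_lowercase)

# Assumption: "<space>" is a hypothetical name for the word-boundary label.
WORD_BOUNDARY = "<space>"

# Full grapheme label inventory: 26 letters plus the boundary label.
grapheme_labels = letters + [WORD_BOUNDARY]

print(len(grapheme_labels))  # 27
```

Under this sketch, a network output layer sized to the label inventory would have one unit per grapheme label.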
Regarding claim 16, Ruhl, Konig, Schalkwyk and Rentzepopoulos teach the transportation vehicle of claim 1, but fail to teach wherein the recursive neural network comprises a recall layer having a dimension of between 10 and 30. However, Rao teaches the recursive neural network comprises a recall layer having a dimension of between 10 and 30 (see Rao, pg. 4816, sect 2.1, col 1 and sect 3, col 2, “In this work, we exclusively train networks as stacked bidirectional LSTM layers where at each depth two LSTM layers (one forward and one backward) are fully connected to the two LSTM layers at the next adjacent depth”; “We train a grapheme recognizer (grapheme-based acoustic model) with 5 layers of bidirectional LSTM with a CTC loss function. For the hierarchical models we use 8 layers of bidirectional LSTM with the primary CTC loss on the 5th layer and the secondary CTC loss on the final 8th layer. We experimented with architectures of 3 to 10 layers deep with the primary loss at the 3rd to 10th layer and found no further improvements after 8 layers and the primary loss at the 5th layer”; Fig. 1 depicts the BLSTM intermediate layers. LSTM can be further extended to bidirectional networks, resulting in Bidirectional Long Short-Term Memory Recurrent Neural Networks (BLSTM-RNN), in which the hidden layers are used to process the input data forwards and backwards with the recall layers. A person skilled in the art would appreciate that the depth of the recall layers of the LSTM is chosen based on the design of the input and outputs of the network).
Ruhl, Konig, Schalkwyk, Rentzepopoulos and Rao are considered to be analogous to the claimed invention because they relate generally to speech recognition methods. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Ruhl, Konig, Schalkwyk, Rentzepopoulos to generate graphemes from recognized phonemes stored in list with the phoneme to grapheme generation using hierarchical grapheme based models as taught by Rao to improve recognition accuracy (see Rao, pg. 4815, col 2 lines 22-26).
Claims 7, 8 and 9 are rejected under 35 U.S.C. §103 as being unpatentable over Rentzepopoulos (Rentzepopoulos, Panagiotis A. and Kokkinakis, George K., "Efficient Multilingual Phoneme-to-Grapheme Conversion Based on HMM", Computational Linguistics Volume 22 Issue 3 September 1996 pp 351–376) in view of Schalkwyk et al. (U.S. Patent Application Publication 2015/0340034), further in view of Homma et al. (U.S. Patent Application Publication 2012/0173574), where Schalkwyk has been cited in the IDS submitted on 03/04/2020.
Regarding claim 7, Rentzepopoulos teaches a method for manufacturing a transportation vehicle that includes a navigation system, a user input control connected to the navigation system via a bus system for data interchange purposes, a microphone, a phoneme generation module that includes a statistical language model for generating phonemes from a voice signal and/or the output signal of the microphone, wherein the phonemes are part of a prescribed selection of exclusively monosyllabic phonemes, and a phoneme-to-grapheme module that includes a recursive neural network for generating inputs for controlling the transportation vehicle based on a succession of monosyllabic phonemes generated by the phoneme generation module, wherein the method comprises: providing a first database of inputs or commands for controlling functions of the transportation vehicle; generating a second database that comprises exclusively monosyllabic phonemes (See Rentzepopoulos pgs. 354-355, “The rules for the segmentation of a phoneme string to a sequence of symbols conforming to the above condition are manually defined off-line according to the procedure presented below in an informal algorithmic language (Figure 1). The meaning of this algorithm is the following: If a pair of phonemes is written as either a single grapheme or a pair of graphemes, then this pair is considered a single state. The same holds for the reverse procedure when a pair of graphemes is pronounced as either a single phoneme or a pair of phonemes. This algorithm is the only language-specific part of the PTGC system and its formulation requires only familiarity with the spelling of the language and not sophisticated linguistic knowledge.
The rules are incorporated in the PTGC system using an automated procedure as a separate input function that parses the input strings into states”; Rentzepopoulos teaches using pair of phonemes in segmentation rule to guarantee that there will be no case where a sequence of graphemes produces a sequence of phonemes of a different length, this is interpreted as the algorithm used to generate a second database).  However, Rentzepopoulos fails to teach training the phoneme generation module by using the first database; training the phoneme-to-grapheme module by using the second database, wherein the phoneme-to-grapheme module is trained substantially exclusively using entries associated in pairs, which entries comprise a monosyllabic phoneme that has a respective associated monosyllabic term; connecting the output of the phoneme generation module to the input of the phoneme-to-grapheme module for data interchange purposes; and implementing the phoneme generation module and the phoneme-to-grapheme module in a transportation vehicle.  However, Schalkwyk teaches training the phoneme generation module by using the first database (Schalkwyk [0038] teaches training the acoustic model on the audio training data to adjust the values of the parameters from the initial values to the pre-trained values. 
Generally, the audio training data contains audio inputs that are each associated with a sequence of phonemes that represents the audio input, i.e., audio inputs for which the sequence of phonemes that should be predicted by the acoustic model is known); training the phoneme-to-grapheme module by using the second database (Schalkwyk [0036] teaches training an inverse pronunciation model on text training data to obtain pre-trained values of the parameters of the inverse pronunciation model for the grapheme generation); connecting the output of the phoneme generation module to the input of the phoneme-to-grapheme module for data interchange purposes (Schalkwyk [0019] teaches the inverse pronunciation model is a neural network-based model that receives a set of phoneme label scores generated by the acoustic model and generates a respective score for each of a set of grapheme label sequences).  
Rentzepopoulos and Schalkwyk are considered to be analogous to the claimed invention because they relate generally to phoneme-to-grapheme based speech recognition methods. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Rentzepopoulos, which treat a pair of phonemes as a single grapheme, to train the phoneme-to-grapheme generation models as taught by Schalkwyk to provide improved speech recognition methods that can be the input to software applications that control systems or devices when the use of other input methods by a user of the system is constrained by physical limitations (see Schalkwyk, [0003]).
Rentzepopoulos further teaches wherein the phoneme-to-grapheme module is trained substantially exclusively using entries associated in pairs, which entries comprise a monosyllabic phoneme that has a respective associated monosyllabic term (Rentzepopoulos pgs. 354-355, “The rules for the segmentation of a phoneme string to a sequence of symbols conforming to the above condition are manually defined off-line according to the procedure presented below in an informal algorithmic language (Figure 1). The meaning of this algorithm is the following: If a pair of phonemes is written as either a single grapheme or a pair of graphemes, then this pair is considered a single state. The same holds for the reverse procedure when a pair of graphemes is pronounced as either a single phoneme or a pair of phonemes. This algorithm is the only language-specific part of the PTGC system and its formulation requires only familiarity with the spelling of the language and not sophisticated linguistic knowledge.
The rules are incorporated in the PTGC system using an automated procedure as a separate input function that parses the input strings into states”; Rentzepopoulos teaches using a pair of phonemes in the segmentation rule to guarantee that there will be no case where a sequence of graphemes produces a sequence of phonemes of a different length, and gives an example in which a pair of phonemes is treated as a single phoneme symbol (monosyllabic phoneme) with an associated grapheme term). However, Rentzepopoulos and Schalkwyk fail to teach implementing the phoneme generation module and the phoneme-to-grapheme module in a transportation vehicle. However, Homma teaches implementing the phoneme generation module and the phoneme-to-grapheme module in a transportation vehicle (Homma [0043] and Fig. 1 teach implementing phoneme generation and text string generation in a car navigation system).
Rentzepopoulos, Schalkwyk and Homma are considered to be analogous to the claimed invention because they relate generally to speech recognition methods. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Rentzepopoulos and Schalkwyk to implement the phoneme-to-grapheme generation models in the voice recognizing function of a car navigation system as taught by Homma to decrease the inconvenience of the user with respect to use of a voice recognizing function, and to improve usage convenience of an information retrieving apparatus (see Homma, [0013]).
Regarding claim 8, Rentzepopoulos teaches wherein the second database comprises phonemes that are exclusively monosyllabic (See Rentzepopoulos pg. 354, lines 21-24, “Every hidden state should produce one observation symbol. To achieve this, all the possible graphemic transcriptions of phonemes were coded as separate graphemic symbols”; Rentzepopoulos teaches multiple grapheme transcription for same phoneme). 
Regarding claim 9, Rentzepopoulos teaches wherein the second database comprises entries associated in pairs, wherein a monosyllabic phoneme has a respective associated monosyllabic term (See Rentzepopoulos, pgs. 354-355, “To overcome this problem, the hidden-state alphabet and the observation-symbol alphabet should contain not only single characters (single graphemes or phonemes respectively) but also clusters. This way, it is guaranteed that there will be no case where a sequence of graphemes produces a sequence of phonemes of a different length. The rules for the segmentation of a phoneme string to a sequence of symbols conforming to the above condition are manually defined off-line according to the procedure presented below in an informal algorithmic language (Figure 1). The meaning of this algorithm is the following: If a pair of phonemes is written as either a single grapheme or a pair of graphemes, then this pair is considered a single state. In this example the pair of phonemes /ks/ is considered a single phonemic symbol. Accordingly, the pair "~cr" is also considered a single graphemic state since it is pronounced as /ks/. As can be seen, in order to disambiguate the case of ~cr the phonemic symbol /ks/ and the graphemic state ~cr must be introduced.”; Rentzepopoulos teaches using a pair of phonemes in the segmentation rule to guarantee that there will be no case where a sequence of graphemes produces a sequence of phonemes of a different length. The example given treats a pair of phonemes as a single phoneme symbol (i.e., a monosyllabic phoneme) with an associated grapheme term).
Claims 13, 14, 15 and 17 are rejected under 35 U.S.C. §103 as being unpatentable over Rentzepopoulos (Rentzepopoulos, Panagiotis A. and Kokkinakis, George K., "Efficient Multilingual Phoneme-to-Grapheme Conversion Based on HMM", Computational Linguistics, Volume 22, Issue 3, September 1996, pp. 351–376) in view of Schalkwyk et al. (U.S. Patent Application Publication 2015/0340034), further in view of Homma et al. (U.S. Patent Application Publication 2012/0173574), further in view of Rao (K. Rao and H. Sak, "Multi-accent speech recognition with hierarchical grapheme based models," 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 4815-4819, doi: 10.1109/ICASSP.2017.7953071), where Schalkwyk and Rao have been cited in the IDS submitted on 03/04/202.
Regarding claim 13, Rentzepopoulos, Schalkwyk and Homma teach the transportation vehicle of claim 7, but fail to teach wherein the recursive neural network comprises between 2 and 4 intermediate layers. However, Rao teaches wherein the recursive neural network comprises between 2 and 4 intermediate layers (See Rao, pg. 4816, col 1, sect 2.4 and col 2, lines 47-55, “We create a hierarchical-CTC grapheme recognizer with a phoneme CTC loss in an intermediate layer, see Figure 1. We train a grapheme recognizer (grapheme-based acoustic model) with 5 layers of bidirectional LSTM with a CTC loss function. For the hierarchical models we use 8 layers of bidirectional LSTM with the primary CTC loss on the 5th layer and the secondary CTC loss on the final 8th layer. We experimented with architectures of 3 to 10 layers deep with the primary loss at the 3rd to 10th layer and found no further improvements after 8 layers and the primary loss at the 5th layer”; the LSTM is described in Rao, pg. 4815, col 2, sect 2.1 as “Recurrent neural networks, specifically Long Short-Term Memory (LSTM) networks have proven to be the state-of-the-art in numerous sequence modeling tasks”; Fig. 1 depicts the BLSTM intermediate layers. LSTM can be further extended to bidirectional networks, resulting in Bidirectional Long Short-Term Memory Recurrent Neural Networks (BLSTM-RNN), in which the hidden layers are used to process the input data forwards and backwards with the recall layers. A person skilled in the art would recognize that the depth of the intermediate layers of the LSTM can be chosen based on the design of the input to the network).
Rentzepopoulos, Schalkwyk, Homma and Rao are considered to be analogous to the claimed invention because they relate generally to speech recognition methods. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Rentzepopoulos, Schalkwyk and Homma to generate graphemes from recognized phonemes stored in a list, with the phoneme-to-grapheme generation using hierarchical grapheme based models as taught by Rao, to improve recognition accuracy (see Rao, pg. 4815, col 2, lines 22-26).
Regarding claim 14, Rentzepopoulos, Schalkwyk and Homma teach the transportation vehicle of claim 7, but fail to teach wherein the dimension of the input layer is equal to the number of phonetic symbols from which syllables are formed. However, Rao teaches the dimension of the input layer is equal to the number of phonetic symbols from which syllables are formed (See Rao, pg. 4816, col 1, lines 25-34, “We use the CTC objective function to train a single network to directly predict graphemes given the acoustic input. We choose the lower cased English alphabet (a-z) as the grapheme target labels. A special label indicates boundary between words making a total of 27 grapheme labels. For the training labels we convert transcriptions to the spoken domain using a verbalizer [26]. This verbalizer is constructed manually based on language specific rules and may generate several alternative spoken transcriptions for a given written transcription”; Rao teaches constructing the verbalizer using the language specific rules, which would be interpreted as the phonetic symbols from which syllables are formed).
Rentzepopoulos, Schalkwyk, Homma and Rao are considered to be analogous to the claimed invention because they relate generally to speech recognition methods. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Rentzepopoulos, Schalkwyk and Homma to generate graphemes from recognized phonemes stored in a list, with the phoneme-to-grapheme generation using hierarchical grapheme based models as taught by Rao, to improve recognition accuracy (see Rao, pg. 4815, col 2, lines 22-26).
Regarding claim 15, Rentzepopoulos, Schalkwyk and Homma teach the transportation vehicle of claim 7, but fail to teach wherein the dimension of the output layer is equal to the number of characters in the target language. However, Rao teaches the dimension of the output layer is equal to the number of characters in the target language (See Rao, pg. 4816, col 1, lines 25-34, “We use the CTC objective function to train a single network to directly predict graphemes given the acoustic input. We choose the lower cased English alphabet (a-z) as the grapheme target labels”).
Rentzepopoulos, Schalkwyk, Homma and Rao are considered to be analogous to the claimed invention because they relate generally to speech recognition methods. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Rentzepopoulos, Schalkwyk and Homma to generate graphemes from recognized phonemes stored in a list, with the phoneme-to-grapheme generation using hierarchical grapheme based models as taught by Rao, to improve recognition accuracy (see Rao, pg. 4815, col 2, lines 22-26).
Regarding claim 17, Rentzepopoulos, Schalkwyk and Homma teach the transportation vehicle of claim 7, but fail to teach wherein the recursive neural network comprises a recall layer having a dimension of between 10 and 30. However, Rao teaches the recursive neural network comprises a recall layer having a dimension of between 10 and 30 (see Rao, pg. 4816, sect 2.1, col 1 and sect 3, col 2, “In this work, we exclusively train networks as stacked bidirectional LSTM layers where at each depth two LSTM layers (one forward and one backward) are fully connected to the two LSTM layers at the next adjacent depth. We train a grapheme recognizer (grapheme-based acoustic model) with 5 layers of bidirectional LSTM with a CTC loss function. For the hierarchical models we use 8 layers of bidirectional LSTM with the primary CTC loss on the 5th layer and the secondary CTC loss on the final 8th layer. We experimented with architectures of 3 to 10 layers deep with the primary loss at the 3rd to 10th layer and found no further improvements after 8 layers and the primary loss at the 5th layer”; Fig. 1 depicts the BLSTM intermediate layers. LSTM can be further extended to bidirectional networks, resulting in Bidirectional Long Short-Term Memory Recurrent Neural Networks (BLSTM-RNN), in which the hidden layers are used to process the input data forwards and backwards with the recall layers. A person skilled in the art would recognize that the dimension of the recall layers of the LSTM can be chosen based on the design of the input and outputs of the network).
Rentzepopoulos, Schalkwyk, Homma and Rao are considered to be analogous to the claimed invention because they relate generally to speech recognition methods. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Rentzepopoulos, Schalkwyk and Homma to generate graphemes from recognized phonemes stored in a list, with the phoneme-to-grapheme generation using hierarchical grapheme based models as taught by Rao, to improve recognition accuracy (see Rao, pg. 4815, col 2, lines 22-26).




Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
B. Decadt, J. Duchateau, W. Daelemans and P. Wambacq, "Transcription of out-of-vocabulary words in large vocabulary speech recognition based on phoneme-to-grapheme conversion," 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2002, pp. I-861-I-864 teaches a phoneme-to-grapheme (p2G) conversion, using a memory-based machine learner where the input for the converter is the phoneme and its three-phoneme context. The output is the corresponding grapheme.
Nakano et al. (U.S. Patent Application Publication Number 2011/0184737) discloses recognition of a phoneme or a sequence of phonemes based on a known phoneme method (Nakano, [0036]).
Please see additional references in form PTO-892 for more details.
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NANDINI SUBRAMANI whose telephone number is (571)272-3916. The examiner can normally be reached Monday - Friday, 2:00 pm - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh M. Mehta, can be reached at (571)272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about 





/NANDINI SUBRAMANI/Examiner, Art Unit 2656
/EDGAR X GUERRA-ERAZO/Primary Examiner, Art Unit 2656