DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 06/16/2022 has been entered.

Response to Amendment
This communication is responsive to the applicant's amendment and RCE both filed on 06/16/2022.  The applicant(s) amended claims 1 and 11, and added new claims 31-32 (see the amendment: pages 2-5).
The examiner withdraws the previous rejection under 35 USC 112(a), because the applicant amended the claim(s).  However, it is noted that newly added claims 31-32 have the same/similar new matter problem and are therefore rejected (see details below).
Response to Arguments
Applicant's arguments filed on 06/16/2022 with respect to the claim rejection under 35 USC 103 have been fully considered but are moot in view of the new ground(s) of rejection, since the amended/added claims introduce new issues/matter that change the scope of the claims.  Accordingly, the response to the applicant’s arguments (see Remarks: page 6, paragraph 3 to page 7, paragraph 4), which are based on the newly amended/added claims, is provided in the new claim rejection with the necessitated new ground(s) (see below).
It is also noted that the previously cited references are still applicable to the amended claims for the prior art rejection with the necessitated new ground(s) (which may include newly combined prior art teachings and/or claim interpretations) (see the detailed rejection below).

Claim Rejections - 35 USC § 112
Claims 31-32 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement.  The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for pre-AIA the inventor(s), at the time the application was filed, had possession of the claimed invention.
Regarding claims 31-32, the newly added limitation of “determining (or to determine) that the second voice input is not directed to correcting the text string previously generated based on the first voice input when at least the portion of the first plurality of sound properties does not match at least the portion of the second plurality of sound properties” fails to comply with the written description requirement, because it is not specifically defined or described by the original specification and therefore introduces new matter.  It is also noted that the referenced contents of the specification (i.e. Fig. 4 and paragraphs 6 and 38-39) provided by the applicant (see Remarks: page 6, paragraph 1) do not fully support the amended/argued limitation.

Claim Rejections - 35 USC § 103
Claims 1, 8, 10-11, 18, 20 and 31-32 are rejected under 35 U.S.C. 103 as being unpatentable over KISS et al. (US 2006/0293889) hereinafter referenced as KISS.
As per claim 1, KISS discloses ‘error correction for speech recognition systems’ (title), comprising:
detecting (or ‘checks’) a second voice input (read on ‘(again) inputting a spoken representation’, ‘a new speech input sequence’, ‘further speech input’) (p(paragraph)13, p39, p44, p73); 
determining (or ‘checks’, or ‘recognize’), based at least in part on a sound property (read on one or more of ‘utterance’ matched with ‘acoustic models’, ‘a sound or phoneme’ represented by ‘acoustic model’ or ‘phonetic spelling’, ‘acoustic score’, ‘spoken representation’, ‘acoustical differences’, ‘phonemes’ themselves for ‘phoneme-to-text mapping’, or a combination thereof, in a broad sense; Note: the claimed “sound property” is not necessarily interpreted in a narrow scope, such as “intensity” and “pitch” as argued by applicant (see Remarks: paragraph 3), because (i) the claims do not recite the argued “intensity” and “pitch”, and (ii) the specification does not specifically provide a user-defined term of “sound property” either) of the second voice input, whether the second voice input is directed to correcting (correction of) a text string (read on ‘recognition results’ as ‘sequence of the words’ including ‘misrecognized words’ presented and/or displayed to ‘a user’) previously generated (‘recognized’) based on a first voice input (read on ‘an (said) input speech sequence’ or ‘initial input speech sequence’) (p7, p9-p11, p22-p23, p39, p44, p66, p72-p73), by:
identifying a first plurality of sound properties (read on one of above-mentioned combination of ‘sound’, ‘acoustic’, ‘phonetic’, ‘spoken’ representation/utterance/speech and related score/models/phonemes, such as ‘phoneme(s)’ or related ‘confidence value’ or ‘acoustic score’) corresponding to the first voice input (same above); 
identifying a second plurality of sound properties (similar to the first plurality of sound properties, but may have acoustical differences) corresponding to the second voice input (same above) (p7, p10, p39, p66); and
determining that the second voice input (same above) is directed to correcting (or replacing) the text string previously generated (‘misrecognized’) (p13-p15).
It is noted that KISS does not expressly disclose the determining regarding the correcting (or replacement) “based on the first voice input when at least a portion of the first plurality of sound properties matches at least a portion of the second plurality of sound properties”.  However, it is noted that KISS further discloses an instance that ‘"Johnny" can be misrecognized as "John"’ in ‘word level recognition’ (p91), and determining ‘input speech sequence’/‘spoken representation’ match/mismatch to desired ‘sequence of words’ using ‘an acoustic score’ corresponding to ‘recognition confidence’/‘recognition confidence value’ with a ‘threshold’ to determine/select ‘emphasized word’ (‘likely’ being a recognition ‘error’) (p10, p16), and repeating ‘speech recognition’ for emphasized/selected word(s) based on ‘new input speech sequence’/‘new spoken representation’ (p44, p72) including exemplary spoken words, such as ‘free’ vs. ‘three’, ‘solely’ vs. ‘only’, and ‘Johnny’ vs. ‘John’ (p66, p90-p91).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to recognize that both words Johnny and John (also the spoken words ‘free’ and ‘three’, ‘solely’ and ‘only’, or a spoken name and the name with its spelling sounds) would have a matched portion of sound properties (such as matched/similar phonemes and/or related acoustic score/recognition confidence value), and to combine the above teachings of KISS together by providing a mechanism of determining a correction (or replacement) based on the fact that a spoken word/name previously generated (such as misrecognized) in an initial input speech sequence (such as a spoken word “Johnny”, “free”, or a name) has a matched portion of sound properties (such as phonemes and associated acoustic score(s)/recognition confidence value(s)/probability) compared with that of the same/similar spoken word/name recognized in the ‘new input speech sequence’ (such as ‘free’ vs. ‘three’, ‘solely’ vs. ‘only’, ‘Johnny’ vs. ‘John’, and/or a spoken name vs. a later spoken name with its spelling sounds), as claimed, for the purpose (motivation) of improving error correction in speech recognition systems (KISS: p8).  In addition, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to recognize that the implementation of combining the above-mentioned teachings of KISS would be within the scope of capability of the skilled person in the art and the result would be predictable.
KISS further discloses:
in response to the determining that the second voice input is directed to correcting the text string (same above), modifying (such as replacing) the text string based on the second voice input (p9-p11, p22-p23, p44-p45, p66, p72).
As per claim 8 (depending on claim 1), the rejection is based on the same reason described for claim 1, because it also reads on the limitation(s) of claim 8.
As per claim 10 (depending on claim 1), KISS further discloses “the sound property corresponds to a correction expression” (read on saying ‘Oops’, expressed ‘recognition confidence’ that ‘said the word candidate is a correct speech recognition result’, ‘likelihood’ being ‘expressed in the form of a language model score’, identified ‘phonemes’ being ‘associated with a recognition confidence value (or a probability)’ expressed as ‘recognition result’ being ‘correct’, or ‘a spoken representation of a word and its spelled representation’ in a form of ‘Memphis, MEMPHIS’, in a broad sense) (p7, p24-p25, p66, p99).
As per claim 31 (depending on claim 1), as best understood in view of the claim rejection under 35 USC 112(a), see above, KISS further discloses “determining that the second voice input is not directed to correcting the text string previously generated based on the first voice input when at least the portion of the first plurality of sound properties does not match at least the portion of the second plurality of sound properties” (wherein, based on the prior art teachings as stated for claim 1, see above, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to recognize that, when a spoken word/name previously generated (such as recognized) in an initial input speech sequence (such as a spoken word/name “Johnny”, “free”, or ‘Memphis’) has at least a different (i.e. non-matching) portion of sound properties (such as phonemes and associated acoustic score(s)/recognition confidence value(s)/probability) compared with that of a different spoken word/name recognized in a second (or a new) input speech sequence (such as including a spoken word/name ‘John’, ‘three’, or ‘Memphis, MEMPHIS’, i.e. a spoken name with its spelling sounds, respectively), the second input could be determined as not directed to a correction, as claimed, for which the corresponding implementation based on the above combined teachings of KISS would be within the scope of capability of the skilled person in the art and the result would be predictable).
As per claims 11, 18, 20 and 32, they recite a system (apparatus). The rejection is based on the same reasons described for claims 1, 8, 10 and 31 respectively, because the apparatus and method claims are related as apparatus and method of using the same, with each claimed element's function corresponding to the claimed method step.
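For illustration only, the match/no-match determination discussed above for claims 1 and 31 can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation disclosed by KISS and not the applicant's claimed method: the phoneme transcriptions, the function names `shared_portion` and `is_correction_attempt`, and the 0.5 threshold are all hypothetical and chosen only to make the example concrete.

```python
from difflib import SequenceMatcher

# Hypothetical phoneme transcriptions (assumed for illustration only).
JOHNNY = ["JH", "AA", "N", "IY"]
JOHN = ["JH", "AA", "N"]
MEMPHIS = ["M", "EH", "M", "F", "IH", "S"]


def shared_portion(first_props, second_props):
    """Return a similarity ratio in [0.0, 1.0] between two sequences of
    sound properties (here, phoneme labels)."""
    return SequenceMatcher(None, first_props, second_props).ratio()


def is_correction_attempt(first_props, second_props, threshold=0.5):
    """Treat the second input as a correction when enough of its sound
    properties match those of the first input. The threshold value is an
    assumption, not taken from any cited reference."""
    return shared_portion(first_props, second_props) >= threshold


if __name__ == "__main__":
    print(is_correction_attempt(JOHNNY, JOHN))     # matching portion exists
    print(is_correction_attempt(MEMPHIS, JOHN))    # no matching portion
```

In this sketch a partially matching pair such as "Johnny" and "John" is treated as a correction attempt, while a pair with no shared phonemes is not, which mirrors the two branches recited in claims 1 and 31.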

Claims 4, 9, 14 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over KISS in view of HOLMES (US 6,292,775). 
As per claim 4 (depending on claim 1), KISS does not expressly disclose the above determined matching sound properties comprising “acoustic envelope(s)” as claimed.  However, the same/similar concept/feature is well known in the art as evidenced by HOLMES who discloses ‘speech processing system using format analysis’ (title), comprising ‘determining the phonetic properties (read on sound properties) of speech signals’ including ‘formant frequencies’, ‘shape of the short-term spectrum (read on acoustic envelope) of input sound’ (col. 1, lines 58-65, col. 3, line 56 to col. 4, line 5), ‘determining formant amplitudes for input speech signal spectral cross-sections’ with ‘speech recognition’ ‘responsive to formant frequencies and formant amplitudes (reflecting acoustic envelope)’ such as for ‘word matching’ using ‘comparison with reference information’ and ‘indicating degree of confidence’ (col. 5, lines 23-42, col. 8, lines 12-33), and ‘performing a crude comparison of power spectra’ with ‘overall shape of the power spectrum (also read on acoustic envelope)’ or ‘formant spectral envelope (also read on acoustic envelope)’ (col. 11, lines 11-30, col. 19, lines 13-22). Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine teachings of KISS and HOLMES together by providing a mechanism of determining matching sound properties by comparing/matching two input sounds/voices by using shape of the short-term spectrum/formant frequencies against formant amplitudes/power spectra/formant spectral envelope (i.e. acoustic envelope), for the purpose (motivation) of offering better speech indications to provide word recognition and/or being advantageous as regards word recognition accuracy (HOLMES: abstract, col. 9, lines 19-43).
As per claim 9 (depending on claim 1), the rejection is based on the same reason described for claim 4, because it also reads on the limitations of claim 9.
As per claims 14 and 19 (depending on claim 11), the rejection is based on the same reason described for claims 4 and 9, because the claims recite/include the same/similar limitation(s) as claims 4 and 9 respectively.

Claims 5-6 and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over KISS in view of SEKINE (US 2021/0225361). 
As per claim 5 (depending on claim 1), even though KISS further discloses “wherein the text string previously generated based on the first voice input is a first text string, and wherein determining whether the second voice input is directed to correcting the first text string comprises: converting the second voice input to a second text string (same above, as stated for claim 1)”, KISS does not expressly disclose “determining, based on a database of misrecognized words, that a first word in the first text string and a second word in the second text string are misrecognized for one another.”  However, the same/similar concept/feature is well known in the art as evidenced by SEKINE who discloses ‘the erroneous conversion dictionary creation system’ (title), comprising ‘voice (speech) recognition system’ including ‘incorrect conversion dictionary (read on claimed “database of misrecognized words”)’, ‘determines whether the analyzed term (word) matches any of incorrectly converted terms (misrecognized words) stored in the incorrect conversion dictionary’, so as to convert a related ‘term’ under ‘voice analysis’ to ‘its related correct term’ (p18, p32).  Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine teachings of KISS and SEKINE together by providing a mechanism of determining one of the words converted from the first or second voice inputs as being misrecognized based on the incorrect conversion dictionary (or “database of misrecognized words”), as claimed, for the purpose (motivation) of improving the accuracy of the voice/speech recognition (SEKINE: p23).
As per claim 6 (depending on claim 5), the rejection is based on the same reason described for claim 1, because it also reads on the limitations of claim 6.
As per claims 15-16 (depending on claim 11), the rejection is based on the same reason described for claims 5-6, because the claims recite/include the same/similar limitation(s) as claims 5-6 respectively.
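For illustration only, the claim 5 limitation of checking two text strings against a database of misrecognized words can be sketched as below. This is a hedged sketch under stated assumptions: the pair-based layout of `MISRECOGNIZED_PAIRS` and the function name `misrecognized_for_one_another` are hypothetical and are not taken from SEKINE's incorrect conversion dictionary or from the application; the example word pairs are the ones cited from KISS above.

```python
# Hypothetical database of word pairs known to be misrecognized for one
# another (structure assumed for illustration only).
MISRECOGNIZED_PAIRS = {
    frozenset({"free", "three"}),
    frozenset({"john", "johnny"}),
    frozenset({"solely", "only"}),
}


def misrecognized_for_one_another(first_text, second_text,
                                  pairs=MISRECOGNIZED_PAIRS):
    """Return the first (word1, word2) pair, one word from each text string,
    that appears in the database; return None when no such pair exists."""
    for w1 in first_text.lower().split():
        for w2 in second_text.lower().split():
            if w1 != w2 and frozenset({w1, w2}) in pairs:
                return (w1, w2)
    return None


if __name__ == "__main__":
    print(misrecognized_for_one_another("call John now", "call Johnny now"))
    print(misrecognized_for_one_another("call John", "play music"))
```

Storing each pair as an unordered `frozenset` is a design choice of this sketch so that the lookup works regardless of which word was recognized first.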

Claims 2, 7, 12 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over KISS in view of SAKOE et al. (US 4,158,750) hereinafter referenced as SAKOE. 
As per claim 7 (depending on claim 1), KISS does not expressly disclose “the first voice input is received at a first time and the second voice input is received at a second time, and wherein determining whether the second voice input is directed to correcting the text string comprises determining that less than a threshold amount of time has elapsed between the second time and the first time.”  However, the same/similar concept/feature is well known in the art as evidenced by SAKOE who discloses ‘speech recognition system with delayed output’ (title), comprising: ‘a speech recognition system’ for ‘recognizing the voice input to produce a recognition output’, including that ‘a start signal (at the first time) is produced whenever a voice input exceeds a threshold level (such as ‘amplitude levels ki’ above ‘a first predetermined threshold’) and a pause interval detection signal is produced whenever a voice input falls below a threshold level’, ‘an output timing signal is produced when the detection signal lasts (elapsed) a preselected interval of time that may be either about 250 milliseconds (read on “threshold amount of time”) or about 250 milliseconds plus a delay (or ‘L’ also as being ‘predetermined period of time’)’, ‘the delay may be given either by a predetermined duration or an interval between those instants at which the above-mentioned 250 milliseconds have just elapsed after production of the detection signal and after production of another pause interval detection signal for a next following voice input’ (at the second time) (abstract, col. 4, line 46 to col. 5, line 35, col. 6, lines 38-67, col. 8, lines 8-53; also Figs. 4 and 6).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine teachings of KISS and SAKOE together by providing a mechanism of determining that a time period elapsed/lasted between the beginning/starting time of the first voice input and that of the second voice input is less than a predetermined duration/interval (threshold amount) of time, as claimed, for the purpose (motivation) of offering a speech recognition system with a sufficient interval of time for confirmation and correction (SAKOE: col. 2, lines 40-60).
As per claim 2 (depending on claim 1), the rejection is based on the same reason described for claim 7, because it also reads on the limitations of claim 2.
As per claims 17 and 12 (depending on claim 11), the rejection is based on the same reason described for claims 7 and 2, because the claims recite/include the same/similar limitation(s) as claims 7 and 2 respectively.
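For illustration only, the elapsed-time limitation quoted above for claim 7 can be sketched as a simple comparison of two timestamps. This is a minimal sketch under stated assumptions: the function name `within_correction_window` and the 5-second default threshold are hypothetical; they are not taken from SAKOE (whose 250 millisecond interval concerns pause detection) or from the application.

```python
def within_correction_window(first_time_s, second_time_s, threshold_s=5.0):
    """Return True when the second voice input arrives after the first and
    less than threshold_s seconds have elapsed between the two inputs.
    The default threshold is an assumed value for illustration."""
    elapsed = second_time_s - first_time_s
    return 0 <= elapsed < threshold_s


if __name__ == "__main__":
    print(within_correction_window(10.0, 12.5))   # 2.5 s apart
    print(within_correction_window(10.0, 20.0))   # 10 s apart
```

Rejecting negative elapsed times is a choice of this sketch so that an out-of-order pair of inputs is never treated as a correction.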

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to QI HAN whose telephone number is (571)272-7604.  The examiner can normally be reached on 9-19:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached on 571-272-7799.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
 

QH/qh
July 9, 2022
/QI HAN/Primary Examiner, Art Unit 2659