DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This Office Action is in response to correspondence filed 27 November 2019 in reference to application 16/697,366.  Claims 1-20 are pending and have been examined.

Claim Objections
Claims 16 and 17 are objected to because of the following informalities:  The acoustic score and the acoustic distance are labeled as “A12” but are not labeled as such in the antecedent claims.  Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 8 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.  The term "similar" in claim 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3, 10-12, 15, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (US PAP 2020/0082808) in view of Ganapathiraju et al. (US PAP 2013/0289987).

Consider claim 3, Li teaches a method for improving the performance of an automatic speech recognition (ASR) system that uses a language model based on a corpus data set which includes words from a generic corpus data set, using a confusion index indicative of the amount of confusion between words from the generic corpus data set and the domain-specific data set (abstract, figure 6), comprising: 
determining the confusion index using a method (0060-65) comprising: 

calculating an acoustic score indicative of the acoustic distance between the first word and the second word using a lexicon having a phonetic breakdown of the first word and the second word (0064, phonetic distance portion of equation); 
calculating a weighted language score indicative of the likelihood of the first word and the second word occurring in the corpus data set (0064, word frequency of co-occurring word, and all words in data set); and 
calculating the confusion index (CI) using the acoustic score and the weighted language score (0065, calculating similarity score using phonetic distance and word frequency).
	Li does not specifically teach a corpus data set which includes words from a generic corpus data set and words from a domain-specific data set;
receiving a first word from the domain-specific data set and a second word from the generic corpus data set.
In the same field of predicting confusability, Ganapathiraju teaches a corpus data set which includes words from a generic corpus data set (0027-29, conversations stored in database) and words from a domain-specific data set (0035, domain lexicon);
receiving a first word from the domain-specific data set and a second word from the generic corpus data set (0036-37, keywords from conversations compared with domain specific words).
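Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to compare generic words with domain specific words as taught by Ganapathiraju in the system of Li in order to reduce recognition errors in domain specific recognition situations (Ganapathiraju 0002, 0035).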

Consider claim 10, Li teaches the method of claim 3, wherein the CI value when the first word and the second word sound alike and have a high likelihood of occurring in the corpus is higher than when the first word and the second word sound alike and do not have a high likelihood of occurring in the corpus (0065, in equation, phonetic similarity and frequency of occurrence both increase similarity score).

Consider claim 11, Li teaches the method of claim 3, further comprising updating at least one of the corpus data set and the lexicon based on the value of CI (0065, updating confusion set based on threshold).

Consider claim 12, Li and Ganapathiraju teach the method of claim 3, further comprising boosting domain-specific words in the corpus (Ganapathiraju, 0033, domain words, 0013-14 examples of false positives may be flagged based on similarity and importance to domain) when CI is greater than a predetermined confusion threshold (Li 0065, updating confusion set based on threshold).

Consider claim 15, Li teaches the method of claim 3, wherein the calculating the weighted language score comprises determining a weighting factor (W) (0064, determining a weighting factor γ).

Consider claim 17, Li teaches the method of claim 3, wherein the acoustic distance A12 between the first word and the second word is determined using a string edit distance measurement tool (0064, edit distance used to determine similarity).
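
For illustration only, a string edit distance over phonetic breakdowns could be sketched as follows. The lexicon entries and the names `LEXICON`, `phoneme_edit_distance`, and `acoustic_distance` are hypothetical and are not taken from Li or from the application; the sketch assumes a standard Levenshtein distance computed over phoneme sequences rather than characters.

```python
from typing import Dict, List, Sequence

# Hypothetical lexicon: each word maps to an illustrative phonetic breakdown.
LEXICON: Dict[str, List[str]] = {
    "cat": ["K", "AE", "T"],
    "cut": ["K", "AH", "T"],
    "cart": ["K", "AA", "R", "T"],
}


def phoneme_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences (unit costs)."""
    prev = list(range(len(b) + 1))  # distance from empty prefix of a
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def acoustic_distance(w1: str, w2: str,
                      lexicon: Dict[str, List[str]] = LEXICON) -> int:
    """Assumed stand-in for A12: edit distance between lexicon entries."""
    return phoneme_edit_distance(lexicon[w1], lexicon[w2])
```

Under these assumptions, words whose phonetic breakdowns differ by a single phoneme receive a distance of 1, and identical breakdowns receive 0.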

Consider claim 18, Li teaches a method for improving the performance of an automatic speech recognition (ASR) system that uses a language model based on a corpus data set which includes words, using an amount of confusion between words (abstract, figure 6), comprising: 
calculating a confusion index (CI), comprising: 
receiving a first word from the first data set and a second word from the second data set (0061, between any 2 words); 
calculating an acoustic score indicative of the acoustic distance between the first word and the second word using a lexicon having a phonetic breakdown of the first word and the second word (0064, phonetic distance portion of equation); 
calculating a weighted language score indicative of the likelihood of the first word and the second word occurring in the corpus data set (0064, word frequency of co-occurring word, and all words in data set); and 
calculating the confusion index (CI) using the acoustic score and the weighted language score (0065, calculating similarity score using phonetic distance and word frequency).
Li does not specifically teach a corpus data set which includes words from a corpus first data set and a domain-specific second data set.

Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to compare generic words with domain specific words as taught by Ganapathiraju in the system of Li in order to reduce recognition errors in domain specific recognition situations (Ganapathiraju 0002, 0035).

Claims 6 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Li and Ganapathiraju as applied to claim 3 above, and further in view of Gonong et al. (US PAP 2014/0012575).

Consider claim 6, Li and Ganapathiraju teach the method of claim 3, but do not specifically teach wherein the domain-specific data set comprises an uncommon word list.
In the same field of determining word confusion, Gonong teaches wherein the domain-specific data set comprises an uncommon word list (0033, keywords that are uncommon to domain).
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to choose uncommon words for a domain for further consideration as 

Consider claim 7, Li and Gonong teach the method of claim 6, wherein the calculating CI is performed for each first word in the uncommon word list against each second word in the generic corpus data set (Gonong 0033, keywords that are uncommon to domain flagged for further consideration, Li 0065, calculating similarity of word against each other word in corpus).
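
For illustration only, performing a calculation for each first word in an uncommon word list against each second word in a generic corpus data set could be sketched as below. The function name `score_all_pairs`, the word lists, and the placeholder scoring function are hypothetical; the scoring function is passed in as a parameter and does not represent the CI calculation of the claims or of any cited reference.

```python
from itertools import product
from typing import Callable, Dict, Iterable, Tuple


def score_all_pairs(
    uncommon_words: Iterable[str],
    generic_words: Iterable[str],
    score: Callable[[str, str], float],
) -> Dict[Tuple[str, str], float]:
    """Apply `score` to every (uncommon word, generic word) pair."""
    return {
        (w1, w2): score(w1, w2)
        for w1, w2 in product(uncommon_words, generic_words)
    }


# Placeholder score (an assumption): 1.0 if the first letters match, else 0.0.
def toy_score(w1: str, w2: str) -> float:
    return float(w1[0] == w2[0])


pairs = score_all_pairs(["stent", "graft"], ["stint", "craft", "tent"], toy_score)
```

The result holds one entry per pair, so two uncommon words scored against three generic words yield six scores.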


Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Li and Ganapathiraju as applied to claim 3 above, and further in view of Chengalvarayan et al. (US PAP 2011/0288867).

Consider claim 13, Li and Ganapathiraju teach the method of claim 3, but do not specifically teach further comprising adding context to domain-specific words in the corpus when CI is less than a predetermined confusion threshold.
In the same field of determining confusability, Chengalvarayan teaches adding context to domain-specific words in the corpus when CI is less than a predetermined confusion threshold (0068-70, if words determined not to be confusable, they may be stored with context associated with words).
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to add words to domain based on confusability as taught by .

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Li and Ganapathiraju as applied to claim 3 above, and further in view of Daredia et al. (US PAP 2020/0403818).

Consider claim 14, Li and Ganapathiraju teach the method of claim 3, but do not specifically teach further comprising removing an unimportant corpus word from the lexicon when CI is less than a predetermined confusion threshold.
In the same field of lexicon adaptation, Daredia teaches removing an unimportant corpus word from the lexicon when CI is less than a predetermined confusion threshold (0095, unimportant words may be removed from corpus, in combination with Li, if word was above threshold it would be added to confusable word list instead).
	Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to remove unimportant words as taught by Daredia in the system of Li and Ganapathiraju in order to allow the vocabulary to be built based on words that carry significance to the meaning of a sentence.

Allowable Subject Matter
Claims 1 and 2 are allowed.  The following is an examiner’s statement of reasons for allowance: 

Consider claim 1, Li teaches a method for improving the performance of an automatic speech recognition (ASR) system that uses a language model based on a corpus data set which includes words from a generic corpus data set, using a confusion index indicative of the amount of confusion between words from the generic corpus data set and the domain-specific data set (abstract, figure 6), comprising: 
determining the confusion index using a method (0060-65) comprising: 
receiving a first word from the data set and a second word from data set (0061, between any 2 words); 
calculating an acoustic score indicative of the acoustic distance between the first word and the second word using a lexicon having a phonetic breakdown of the first word and the second word (0064, phonetic distance portion of equation); 
calculating a weighted language score indicative of the likelihood of the first word and the second word occurring in the corpus data set (0064, word frequency of co-occurring word, and all words in data set); and 
calculating the confusion index (CI) using the acoustic score and the weighted language score (0065, calculating similarity score using phonetic distance and word frequency);
adjusting the corpus data set based on the value of CI (0065, updating confusion set based on threshold).
	Li does not specifically teach a corpus data set which includes words from a generic corpus data set and words from a domain-specific data set;
receiving a first word from the domain-specific data set and a second word from the generic corpus data set.

receiving a first word from the domain-specific data set and a second word from the generic corpus data set (0036-37, keywords from conversations compared with domain specific words).
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to compare generic words with domain specific words as taught by Ganapathiraju in the system of Li in order to reduce recognition errors in domain specific recognition situations (Ganapathiraju 0002, 0035).
However the prior art of record does not teach or fairly suggest the limitations of “calculating a weighted language score indicative of the likelihood of the first word and the second word occurring in the corpus data set, comprising performing an equation: W(U1+U2), where U1 and U2 are unigram values of the first word and the second word, respectively, and W is a weighting factor in the weighted language score;  calculating the confusion index (CI) using the acoustic score and the weighted language score, comprising performing an equation: CI = 1/(e^{A12} - e^{-iW(U1+U2)})” when combined with each and every other limitation of the claim.  Therefore claim 1 is allowable.

Claim 2 depends on and further limits claim 1 and therefore is allowable as well.

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably 

Claims 4, 5, 9, 16, 19, and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.  The following is a statement of reasons for the indication of allowable subject matter:  

Consider claim 4, Li and Ganapathiraju teach the method of claim 3.  However the prior art of record does not teach or fairly suggest the limitations of “wherein the weighted language score comprises an equation: W(U1+U2), where U1 and U2 are unigram values of the first word and the second word, respectively; and W is a weighting factor” when combined with each and every other limitation of the claim and the base claim.  Therefore claim 4 contains allowable subject matter. 
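
For illustration only, the quoted expression W(U1+U2) could be evaluated as sketched below. The Office Action does not define how a unigram value is obtained; the sketch assumes, purely as a hypothetical, that a unigram value is a word's relative frequency in a tokenized corpus. The function names and the sample tokens are likewise assumptions.

```python
from collections import Counter
from typing import Dict, Iterable


def unigram_values(tokens: Iterable[str]) -> Dict[str, float]:
    """Assumed definition: relative frequency of each word in the corpus."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def weighted_language_score(u1: float, u2: float, w: float) -> float:
    """Evaluate W(U1+U2) for unigram values u1, u2 and weighting factor w."""
    return w * (u1 + u2)


# Hypothetical four-token corpus: "a" has value 0.5, "b" has value 0.25.
values = unigram_values(["a", "b", "a", "c"])
score = weighted_language_score(values["a"], values["b"], w=2.0)
```

With these sample values the score is 2.0 × (0.5 + 0.25) = 1.5.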

Consider claim 5, Li and Ganapathiraju teach the method of claim 3.  However the prior art of record does not teach or fairly suggest the limitations of “wherein the calculating the confusion index comprises performing an equation: CI = 1/(e^{A12} - e^{-iW(U1+U2)}), where A12 is the acoustic distance between the first word and the second word; W(U1+U2) is the weighted language score; U1 and U2 are the unigram values of the first word and the second word, respectively; and W is a weighting factor” when combined with each and every other limitation of the claim and the base claim.  Therefore claim 5 contains allowable subject matter. 
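
For illustration only, the equation quoted for claim 5 could be evaluated numerically as sketched below. The quoted limitation defines A12, U1, U2, and W but does not define the symbol i in the exponent; the sketch treats i as a real-valued parameter defaulting to 1.0, which is an assumption and not a reading taken from the application. The function name and the sample inputs are hypothetical.

```python
import math


def confusion_index(a12: float, u1: float, u2: float, w: float,
                    i: float = 1.0) -> float:
    """Evaluate CI = 1 / (e^{A12} - e^{-i W (U1 + U2)}).

    `i` is not defined in the quoted claim language; treating it as a
    real scalar with default 1.0 is an assumption made for this sketch.
    """
    denominator = math.exp(a12) - math.exp(-i * w * (u1 + u2))
    if denominator == 0.0:
        # Occurs, for example, when A12 = 0 and W(U1+U2) = 0.
        raise ValueError("confusion index is undefined: zero denominator")
    return 1.0 / denominator


# Hypothetical inputs: a smaller acoustic distance gives a larger CI.
ci_close = confusion_index(a12=0.5, u1=0.3, u2=0.2, w=1.0)
ci_far = confusion_index(a12=2.0, u1=0.3, u2=0.2, w=1.0)
```

Holding the language score fixed, the sketch returns a larger CI for acoustically closer words, since the e^{A12} term in the denominator shrinks as A12 decreases.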

Consider claim 9, Li and Ganapathiraju teach the method of claim 3. However the prior art of record does not teach or fairly suggest the limitations of “wherein at least one of the first word and the second word comprises a plurality of words and wherein the weighted language score comprises an equation: W(U1+U2), where U1 and U2 are n-gram values of the first word and the second word, respectively, and; and W is a weighting factor” when combined with each and every other limitation of the claim and the base claim.  Therefore claim 9 contains allowable subject matter.  

Consider claim 16, Li and Ganapathiraju teach the method of claim 15. However the prior art of record does not teach or fairly suggest the limitations of “wherein the weighting factor (W) is based on the value of at least one of the acoustic score A12 and a language score (U1 + U2), where U1 and U2 are n-gram values of the first word and the second word, respectively” when combined with each and every other limitation of the claim and the base claim.  Therefore claim 16 contains allowable subject matter.  

Claim 19 contains similar subject matter as claim 4 and is therefore allowable for the same reasons.

Claim 20 depends on and further limits claim 19 and therefore contains allowable subject matter as well.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DOUGLAS C GODBOLD whose telephone number is (571)270-1451.  The examiner can normally be reached on 7:30-12 Monday and Friday, 7:30-6 Tuesday-Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached on (571) 272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


DOUGLAS GODBOLD
Examiner
Art Unit 2658



/DOUGLAS GODBOLD/           Primary Examiner, Art Unit 2658