Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.



Claims 1-3, 7, 8, 9-11, 15, 16 and 17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Du (CN-109036420A).
 
With respect to claims 1 and 9, Du teaches a speech recognition method, comprising:
obtaining a first text by recognizing an acquired speech signal (¶[0091] S302: and carrying out recognition processing on the received voice data to obtain a recognition result.);
performing a database search by using a first pinyin sequence corresponding to the first text (¶ [0129] Specifically, the pinyin information of all characters in the recognition result [first text] is used as first pinyin information to be matched, after the matching process of the pinyin information to be matched and each piece of pinyin information is controlled according to the matching result, whether at least one piece of pinyin information and the matching result of the first pinyin information to be matched meet a first preset condition exists in the pinyin information set, when the first preset condition refers to that the first pinyin information to be matched and each piece of pinyin information in the pinyin information set are matched, whether pinyin information with the matching degree reaching a first preset matching degree threshold exists in the pinyin information set, if yes, subsequent steps are not needed, at the moment, the pinyin information with the matching degree reaching the preset matching degree threshold with the first pinyin information to be matched is directly used as the voice recognition result ); 
performing a fuzzy search according to the first pinyin sequence in response to the first pinyin sequence not being found in the database (¶ [0129] … and if not, the pinyin information of all characters in the recognition result is fuzzified and then used as the pinyin information to be matched, or the keyword is extracted from the recognition result, the pinyin information of the keyword is fuzzified and then used as the pinyin information to be matched, and then the letters in the pinyin information to be matched and the letters in the pinyin information set are sequentially matched to obtain the result of the voice recognition), wherein the fuzzy search is used to look up a second pinyin sequence having at least one pinyin in the first pinyin sequence and a second text corresponding to the second pinyin sequence (¶[0089]  fuzzification processing is performed on the contact person "ni" into "ni ni ni ni" [second pinyin], the fuzzification processing is performed on the contact person "to obtain " ni li " li", "li ni", and the like [second text]); and 
selecting at least one second text obtained by the fuzzy search as a speech recognition result of the speech signal (¶[0129] … and if not, the pinyin information of all characters in the recognition result is fuzzified and then used as the pinyin information to be matched, or the keyword is extracted from the recognition result, the pinyin information of the keyword is fuzzified and then used as the pinyin information to be matched, and then the letters in the pinyin information to be matched and the letters in the pinyin information set are sequentially matched to obtain the result of the voice recognition). 
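The exact-then-fuzzy lookup recited in the limitations above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Du's implementation or the applicant's: the database contents, the placeholder texts, and the names PINYIN_DB, exact_search, fuzzy_search and recognize are hypothetical, and the fuzzy criterion shown simply follows the claim language (a second pinyin sequence sharing at least one pinyin with the first pinyin sequence).

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical database: pinyin sequence (tuple of syllables) -> stored text.
PINYIN_DB: Dict[Tuple[str, ...], str] = {
    ("ni", "ling"): "contact A",
    ("li", "jiang"): "contact B",
    ("zhang", "min"): "contact C",
}


def exact_search(first_pinyin: Sequence[str]) -> Optional[str]:
    """Return the stored text whose pinyin sequence equals first_pinyin, if any."""
    return PINYIN_DB.get(tuple(first_pinyin))


def fuzzy_search(first_pinyin: Sequence[str]) -> List[Tuple[Tuple[str, ...], str]]:
    """Return (second pinyin sequence, second text) pairs that share
    at least one pinyin syllable with first_pinyin."""
    wanted = set(first_pinyin)
    return [(seq, text) for seq, text in PINYIN_DB.items() if wanted & set(seq)]


def recognize(first_pinyin: Sequence[str]) -> List[str]:
    """Exact search first; fall back to fuzzy search only on a miss."""
    hit = exact_search(first_pinyin)
    if hit is not None:
        return [hit]
    return [text for _, text in fuzzy_search(first_pinyin)]
```

For example, recognize(["li", "ning"]) misses the exact search and returns only the entry that shares the syllable "li" with the query.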

With respect to claims 2 and 10, Du teaches wherein the selecting at least one second text obtained by the fuzzy search as a speech recognition result of the speech signal comprises: 
determining a similarity between the second pinyin sequence corresponding to each second text of the at least one second text and the first pinyin sequence to obtain multiple similarities ([Note: Fig. 4 describes the process of matching pinyin for maximum similarity in the flowchart that is applicable to fuzzification]. ¶[0101] Still taking the above example as an example, assuming that the recognition result is "zhang", at this time, the "zhang dong" is fuzzified to obtain "zhan don", "zang dong", and "zhan dong", at this time, the "zhan don", "zang dong", and "zhan dong" are used as the first pinyin information to be matched [from this point onwards paragraph [0114] describes the similarity matching] ¶[0114] For better understanding, the description is made by way of example, for example, assuming that the first pinyin information to be matched is "li ning", and meanwhile, assuming that the pinyin information existing in the pinyin information set is "zhang min", "xiao hong", "li jiang", "zhu tan yu", "ni ni ni ni", "ni ling", [second text] respectively, and selecting "ni ling" from the pinyin information set for matching, in the process of matching "li ning" with "ni ling", the correct matching rate of "li ning" and "ni ling" is calculated in real time, specifically, when the first letter "l" of "li ning" is matched with the first letter "n" of "ni ling", the current correct matching rate is 0%, when the second letter "i" of "li ning" is matched with the second letter "i" of "ni ling", the current correct matching rate is 16.7%, when the third letter "n" of "li ning" is matched with the third letter "l" of "ni ling", the current correct matching rate is 16.7%, when the fourth letter "i" of "li ning" is matched with the fourth letter "i" of "ni ling", the current correct matching rate is 33.4%, when the fifth letter "n" of "li ning" is matched with the fifth letter "n" of "ni ling", the current correct matching rate is 50.1%, and when the sixth letter "g" of "li ning" is matched with the sixth letter "g" of "ni ling", the current correct matching rate is 66.8%.); and 
determining the speech recognition result of the speech signal from the at least one second text according to a maximum similarity among the multiple similarities (¶[0115] S402: judging whether the correct matching rate of the first pinyin information to be matched and the first pinyin information is greater than a preset correct rate threshold value, wherein the first pinyin information is any one of pinyin information in a pinyin information set;).
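The letter-by-letter correct matching rate quoted from Du ¶[0114] can be sketched in Python as follows. This is an illustrative reading of the quoted passage, not Du's code: the function names are hypothetical, and dividing by the longer of the two letter strings is an assumption, since the quoted example only compares strings of equal length. The quoted 66.8% appears to come from adding rounded 16.7% increments; the unrounded value for four matching letters out of six is 4/6, about 66.7%.

```python
from typing import Iterable, List, Tuple


def running_rates(candidate: str, target: str) -> List[float]:
    """Correct matching rate after each compared letter (spaces ignored)."""
    a = candidate.replace(" ", "")
    b = target.replace(" ", "")
    total = max(len(a), len(b))  # assumption: normalise by the longer string
    rates: List[float] = []
    matched = 0
    for x, y in zip(a, b):
        if x == y:
            matched += 1
        rates.append(matched / total)
    return rates


def correct_matching_rate(candidate: str, target: str) -> float:
    """Final correct matching rate; 0.0 when there is nothing to compare."""
    rates = running_rates(candidate, target)
    return rates[-1] if rates else 0.0


def best_match(first_pinyin: str, pinyin_set: Iterable[str]) -> Tuple[str, float]:
    """Entry of a non-empty pinyin_set with the maximum similarity."""
    scored = ((p, correct_matching_rate(first_pinyin, p)) for p in pinyin_set)
    return max(scored, key=lambda pair: pair[1])
```

Applied to the pinyin information set quoted above, best_match("li ning", [...]) selects "ni ling", the entry with the maximum similarity to "li ning".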

With respect to claims 3 and 11, Du teaches wherein the determining the speech recognition result of the speech signal from the at least one second text according to a maximum similarity among the multiple similarities comprises: 
determining that the second text corresponding to the maximum similarity is the speech recognition result of the speech signal, in a case that the maximum similarity is greater than or equal to a preset threshold (¶[0115]  S402: judging whether the correct matching rate of the first pinyin information to be matched and the first pinyin information is greater than a preset correct rate threshold value, wherein the first pinyin information is any one of pinyin information in a pinyin information set;); or, 
determining that the first text is the speech recognition result of the speech signal, in a case that the maximum similarity is less than the preset threshold. 
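The two alternatives recited above (accept the best second text at or above the threshold, otherwise keep the first text) can be sketched in Python as follows. The function name select_result and the example similarity values are hypothetical; the comparison follows the claim language ("greater than or equal to"), and returning the first text when no candidates exist is an assumption.

```python
from typing import List, Tuple


def select_result(first_text: str,
                  scored_second_texts: List[Tuple[str, float]],
                  threshold: float) -> str:
    """Pick the second text with the maximum similarity if it reaches the
    preset threshold; otherwise fall back to the originally recognized text."""
    if not scored_second_texts:
        return first_text  # assumption: no fuzzy candidates means no change
    best_text, best_similarity = max(scored_second_texts, key=lambda p: p[1])
    if best_similarity >= threshold:
        return best_text
    return first_text
```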

With respect to claims 7 and 15, Du teaches after the performing a database search by using a first pinyin sequence corresponding to the first text, further comprising: determining that the speech recognition result of the speech signal is a third text corresponding to the first pinyin sequence, in a case that the first pinyin sequence is found in the database (¶[0118] S403: and interrupting the matching process of the first pinyin information [first pinyin sequence corresponding to the first text] to be matched and the first pinyin information, and taking the first pinyin information as the voice recognition result. ¶[0119] It can be understood that, when the correct matching rate of the first pinyin information to be matched and the first pinyin information is greater than the preset correct rate threshold, the matching process of the first pinyin information to be matched and the first pinyin information is interrupted, the above example is carried out, further, the process of matching the sixth letter "g" of the "li ning" with the sixth letter "g" of the "ni ling" is interrupted, and at this time, the "ni ling" is taken as the result of the speech recognition [third text].)

With respect to claims 8 and 16, Du teaches before the performing a fuzzy search according to the first pinyin sequence, in response to the first pinyin sequence not being found in the database, further comprising: determining that the first text does not exist in the database (¶[0101] Still taking the above example as an example, assuming that the recognition result is "zhang", at this time, the "zhang dong" is fuzzified [first pinyin sequence is not in the database, therefore there is a need to fuzzify] to obtain "zhan don", "zang dong", and "zhan dong", at this time, the "zhan don", "zang dong", and "zhan dong" are used as the first pinyin information to be matched; and ¶[0101] … and if not, the pinyin information of all characters in the recognition result is fuzzified and then used as the pinyin information to be matched, or the keyword is extracted from the recognition result, the pinyin information of the keyword is fuzzified and then used as the pinyin information to be matched, and then the letters in the pinyin information to be matched and the letters in the pinyin information set are sequentially matched to obtain the result of the voice recognition).

With respect to claim 17 Du teaches A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to enable a computer to execute the method according to claim 1 (¶[0044]  The invention provides a voice recognition control method, a terminal and a computer readable storage medium. ¶[0158]  In this embodiment, the communication bus 703 is used to implement connection communication between the processor 701 and the memory 702, and the processor 701 is used to execute one or more first programs stored in the memory 702, so as to implement the following steps:)

With respect to claim 18, Du teaches recognizing a speech signal to obtain a first text; 
performing a fuzzy matching according to a first pinyin sequence corresponding to the first text to obtain multiple second pinyin sequences and second texts corresponding to the second pinyin sequences (¶[0129] … and if not, the pinyin information of all characters in the recognition result [first text] is fuzzified and then used as the pinyin information to be matched, or the keyword is extracted from the recognition result, the pinyin information of the keyword is fuzzified and then used as the pinyin information [second pinyin] to be matched, and then the letters in the pinyin information to be matched and the letters in the pinyin information set are sequentially matched to obtain the result of the voice recognition), wherein each of the second pinyin sequence has at least one pinyin in the first pinyin sequence (¶[0089] fuzzification processing is performed on the contact person "ni" into "ni ni ni ni", the fuzzification processing is performed on the contact person "to obtain" ni li "li", "li ni", and the like); and 
determining a speech recognition result of the speech signal from the multiple second texts (¶[0129] … and if not, the pinyin information of all characters in the recognition result is fuzzified and then used as the pinyin information to be matched, or the keyword is extracted from the recognition result, the pinyin information of the keyword is fuzzified and then used as the pinyin information to be matched, and then the letters in the pinyin information to be matched and the letters in the pinyin information set are sequentially matched to obtain the result of the voice recognition). 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 4, 5, 6, 12, 13, 14 are rejected under 35 U.S.C. 103 as being unpatentable over Du as applied to claims 1 and 9 respectively, in further view of McCraw (US 10978069 B1).

With respect to claims 4 and 12 Du does not explicitly disclose but McCraw teaches after the determining that the second text corresponding to the maximum similarity is the speech recognition result of the speech signal, in a case that the maximum similarity is greater than or equal to a preset threshold, further comprising: 
adding the first text to a generalization table of the second text corresponding to the maximum similarity, wherein the generalization table is used to store a generalized text of the second text corresponding to the maximum similarity, and the generalized text has a same intent as the second text (Col 15 ll 31-44: In an example, the user input manager 410 may not perform at least steps 406 and 426. For example, the user input manager 410 may determine an alternate word(s) in a user input [first text], determine previous instances when the alternate word was included in previous user inputs, determine the previous number of instances satisfy a condition (e.g., a threshold frequency), user word embedding to determine a default word(s) that is associated with the alternate word and that is included in system(s) 120 outputs, and store [add] an association between the user identifier, the alternate word(s), and the default word(s) [generalization of the second text; the default words by definition indicate maximum similarity]. As such, one skilled in the art will appreciate that a vocabulary mapping, in the vocabulary map storage 420, need not be tied to a particular intent indicator.). 
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify Du in view of McCraw, to add the first text to a generalization table of the second text corresponding to the maximum similarity, wherein the generalization table is used to store a generalized text of the second text corresponding to the maximum similarity, and the generalized text has a same intent as the second text in order to enable speech-based user control of a computing device to perform tasks based on the user's spoken commands (Col 1 ll 13-14, McCraw).
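The generalization table recited in claims 4 and 12 can be pictured with the following Python sketch. It is purely illustrative of the claim language and is not taken from Du or from McCraw's vocabulary map storage 420: the names record_generalization and resolve, the use of a dictionary of sets, and the example texts are all hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Set

# Hypothetical table: second text -> generalized texts with the same intent.
GeneralizationTable = DefaultDict[str, Set[str]]


def record_generalization(table: GeneralizationTable, first_text: str,
                          second_text: str, max_similarity: float,
                          threshold: float) -> bool:
    """Add first_text as a generalized text of second_text, but only when the
    maximum similarity reached the preset threshold."""
    if max_similarity >= threshold and first_text != second_text:
        table[second_text].add(first_text)
        return True
    return False


def resolve(table: GeneralizationTable, text: str) -> str:
    """Map a previously recorded generalized text back to its second text."""
    for second_text, variants in table.items():
        if text in variants:
            return second_text
    return text


table: GeneralizationTable = defaultdict(set)
record_generalization(table, "li ning", "ni ling", 0.67, 0.6)
```

Once the pair is recorded, resolve(table, "li ning") returns "ni ling" without repeating the fuzzy search.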

With respect to claims 5 and 13, Du further teaches [[sending a first data stream to a server, wherein the first data stream carries a corresponding relationship among the second text corresponding to the maximum similarity,]] the first text and the first pinyin sequence (¶[0129] Specifically, the pinyin information of all characters in the recognition result [first text] is used as first pinyin information to be matched, after the matching process of the pinyin information to be matched …).
Du does not explicitly disclose but McCraw teaches sending a first data stream to a server (Col 10 ll 46-48: In some instances, aspects of the system(s) 120 may be configured at a computing device (e.g., a local server). Col 11 ll 1-5: The system(s) 120 may also include a user input manager 410 and a vocabulary map storage 420. The user input manager 410 may generate vocabulary mappings, which are stored in the vocabulary map storage 420 (as illustrated in FIGS. 4A through 4C).), wherein the first data stream carries a corresponding relationship among the second text corresponding to the maximum similarity, [[the first text and the first pinyin sequence]] (Col 15 ll 31-44: In an example, the user input manager 410 may not perform at least steps 406 and 426. For example, the user input manager 410 may determine an alternate word(s) in a user input [first text], determine previous instances when the alternate word was included in previous user inputs, determine the previous number of instances satisfy a condition (e.g., a threshold frequency), user word embedding to determine a default word(s) that is associated with the alternate word and that is included in system(s) 120 outputs, and store [add] an association between the user identifier, the alternate word(s), and the default word(s) [generalization of the second text; the default words by definition indicate maximum similarity]. As such, one skilled in the art will appreciate that a vocabulary mapping, in the vocabulary map storage 420, need not be tied to a particular intent indicator.). 
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify Du in view of McCraw, so the first data stream carries a corresponding relationship among the second text corresponding to the maximum similarity in order to enable speech-based user control of a computing device to perform tasks based on the user's spoken commands (Col 1 ll 13-14, McCraw).

With respect to claims 6 and 14 Du further teaches [[recognizing a second data stream input by a user, wherein the second data stream is used to indicate a correspondence relationship between]] the second text and the second pinyin sequence, [[and a generalized text of the second text]] (¶[0089]  fuzzification processing is performed on the contact person "ni" into "ni ni ni ni" [second pinyin], the fuzzification processing is performed on the contact person "to obtain " ni li " li", "li ni", and the like [second text]); and 
[[sending the second data stream to a server]]. 
Du does not explicitly disclose but McCraw teaches recognizing a second data stream input by a user, wherein the second data stream is used to indicate a correspondence relationship between [[the second text and the second pinyin sequence]], and a generalized text of the second text (Col 15 ll 31-44: In an example, the user input manager 410 may not perform at least steps 406 and 426. For example, the user input manager 410 may determine an alternate word(s) in a user input [second text], determine previous instances when the alternate word was included in previous user inputs, determine the previous number of instances satisfy a condition (e.g., a threshold frequency), user word embedding to determine a default word(s) that is associated with the alternate word and that is included in system(s) 120 outputs, and store [add] an association between the user identifier, the alternate word(s), and the default word(s) [generalization of the second text; the default words by definition indicate maximum similarity]. As such, one skilled in the art will appreciate that a vocabulary mapping, in the vocabulary map storage 420, need not be tied to a particular intent indicator);
sending the second data stream to a server (Col 10 ll 46-48: In some instances, aspects of the system(s) 120 may be configured at a computing device (e.g., a local server).  Col 11 ll 1-5: The system(s) 120 may also include a user input manager 410 and a vocabulary map storage 420. The user input manager 410 may generate vocabulary mappings, which are stored in the vocabulary map storage 420 (as illustrated in FIGS. 4A through 4C).).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify Du in view of McCraw, to recognize a second data stream input by a user, wherein the second data stream is used to indicate a correspondence relationship between [[the second text and the second pinyin sequence]], and a generalized text of the second text in order to enable speech-based user control of a computing device to perform tasks based on the user's spoken commands (Col 1 ll 13-14, McCraw).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ATHAR N PASHA whose telephone number is (408)918-7675. The examiner can normally be reached on Monday-Thursday and alternate Fridays, 7:30-4:30 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached on (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-10
Examiner, Art Unit 2657

/DANIEL C WASHBURN/
Supervisory Patent Examiner, Art Unit 2657