Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant's arguments filed 3/14/2022 have been fully considered but they are not persuasive. 
The applicant contends:
Claim Rejections - 35 U.S.C. § 102 
Claims 1-4, 7-9, 12-17,19, and 20 
Claims 1-4, 7-9, 12-17,19, and 20 stand rejected as being anticipated by Hoffmeister. In order for a reference to anticipate a claim, the reference must disclose each and every feature of the claim. Here, features of the claims are missing from Hoffmeister. 
In particular, independent claim 1 has been amended to recite: 
"A method comprising: receiving, at a device, a portion of a voice query; determining a characteristic of the portion of the voice query; comparing the characteristic of the portion of the voice query to one or more characteristics of portions of one or more other voice queries; determining, based on the comparing the characteristic of the portion of the voice query to the one or more characteristics of the portions of one or more other voice queries, that the portion of the voice query is not capable of being processed at a cache associated with the device; and 
sending, based on the determining that the portion of the voice query is not capable of being processed at the cache associated with the device, the portion of the voice query for processing." (emphasis added). 
Independent claim 8 has been similarly amended. Hoffmeister does not teach at least these features of independent claims 1 and 8. 
In rejecting claims 1 and 8, the Office Action alleges that Figure 11 and Column 4, line 34-44 of Hoffmeister teaches "comparing the portion of the voice signal to one or more guidance queries," and "sending the portion of the voice signal for processing." But that is not what Applicant claims. Applicant has amended claims 1 and 8 to clarify that what is to be compared is "the characteristic of the portion of the voice query to one or more characteristics of portions of one or more other voice queries." Similarly, the Applicant has amended claim 1 to clarify that sending the portion of the voice query for processing is "based on determining that the portion of the voice query is not capable of being processed at the cache associated with the device." Because these features are missing from Hoffmeister, Hoffmeister cannot be found to anticipate claims 1 and 8. 

	The examiner disagrees. Claims 1 and 8 have been amended to include new limitations as indicated in the applicant’s remarks. The office action below addresses the amendments. The amendments recite “the characteristic of the portion of the voice query …”. Such a limitation broadly recites some characteristic, quality, or feature of some portion of the received voice query. Due to the breadth of the term “characteristic” and the lack of language indicating the boundaries of such terminology, the term is interpreted as a feature, quality, or characteristic of some portion of the received voice query. Hoffmeister discloses in Col. 4, lines 53-62, “a recognition score may represent a probability that a portion of audio data corresponds to a particular phoneme, word or phrase …”. Col. 4, lines 28-30 discloses “Audio data including spoken utterances may be processed in real time or may be saved and processed at a later time.” The office action clearly addresses the portion of Hoffmeister that discloses such recited limitation as well as all other amendments. Please see below.
Additionally, independent claim 16 has been amended to recite: 
"A method comprising: accessing, by a device, a voice query; determining, based on the voice query, a first portion of the voice query and a second portion of the voice query that includes the first portion of the voice query and a subsequent portion of the voice query; 
determining a first characteristic of the first portion of the voice query and a second characteristic of the second portion of the voice query; and storing the first characteristic and the second characteristic." (emphasis added). 
Hoffmeister does not teach at least these features of independent claim 16. 
In rejecting claim 16, the Office Action alleges that Figure 10 and Column 12, line 50- Column 13, line 25 of Hoffmeister teaches "determining, based on the voice query, a plurality of guidance queries, wherein each of the plurality of guidance queries correspond to a portion of the voice query." But that is not what Applicant claims. Applicant has amended claim 16 to clarify that what is to be determined is "a first portion of the voice query and a second portion of the voice query that includes the first portion of the voice query and a subsequent portion of the voice query." Additionally, Applicant has further amended claim 16 to state "determining a first characteristic of the first portion of the voice query and a second characteristic of the second portion of the voice query." Because these features are missing from Hoffmeister, Hoffmeister cannot be found to anticipate claim 16. 

As indicated in the applicant’s remarks, claim 16 has been amended and scope of the claim has changed. As a result, the office action below clearly correlates Hoffmeister and the recited claimed language as amended. Please see the office action below.
DOCKET NO.: 102005.011580 PATENT Application No.: 16/659,262 Office Action Dated: September 14, 2021
Because these features are missing from Hoffmeister, claims 1, 8 and 16 are not anticipated by Hoffmeister and are therefore patentable. Because claims 2-4, 7, 9, 12-15, and 17 depend, either directly or indirectly, from one of the independent claims 1, 8 or 16, they too are patentable for the same reasons, as well as for the additional features they recite. Accordingly, Applicant respectfully requests reconsideration and withdrawal of the rejection of claims 1-4, 7-9, 12-17, 19, and 20 as being anticipated by Hoffmeister. 

	The examiner disagrees. Such claims are dependent on respective independent claims. Please see the office action below and rebuttal above.
Claim Rejections - 35 U.S.C. § 103
Claim 18 has been rejected as being unpatentable over Hoffmeister in view of U.S. Patent No. 9,536,151 (hereinafter "Postelnicu"). Claim 18 is canceled without prejudice or disclaimer rendering the rejection moot. Reconsideration of the 35 U.S.C. § 103 rejection is respectfully requested.

Such claim has been cancelled. Please see the office action below for update to the status of claim 18.
New Claims
New claims 21-23 respectively depend from independent claims 1, 8 and 16. Applicant submits that new claims 21-23 are patentable both by virtue of their dependency on one of claims 1, 8, or 16 and by the virtue of the additional features they recite. 

Such claims are newly added and hence, considered below. Please see the office action below.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claim(s) 1-4, 6-9, 12-17, 19-20 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Hoffmeister et al (Patent No.: 9070367).
Claim 1, Hoffmeister et al discloses
receiving, at a device, a portion of a voice query (Fig. 11, label 1102, Col. 4, lines 53-62 discloses “a recognition score may represent a probability that a portion of audio data corresponds to a particular phoneme, word or phrase …”. Col. 4, lines 28-30 discloses “Audio data including spoken utterances may be processed in real time or may be saved and processed at a later time.”);
determining a characteristic of the portion of the voice query (Col. 4, lines 34-44 discloses “the ASR module 214 may compare the input audio data with models for sounds (e.g., speech units or phonemes) …”. The speech units or phonemes are characteristic of the portion of the voice query or the sounds spoken in the utterance of the audio data.);
comparing the characteristic of the portion of the voice query to one or more characteristics of portions of one or more other voice queries (Fig. 11, label 1104. Col. 4, lines 34-44 discloses “the ASR module 214 may compare the input audio data with models for sounds (e.g., speech units or phonemes) and sequences of sounds” (portion of one or more other voice queries, sounds or speech units or phonemes indicates characteristics of such portion) “to identify words and phrases that match the sequence of sounds spoken in the utterance of the audio data” (portion of the voice query, sounds or speech units or phonemes indicates characteristics of portion of the voice query). Col. 5, lines 15-18 discloses “compares the speech recognition data with acoustic, language and other data models and information stored in the speech storage 220 for recognizing the speech containing the original audio data.” Col. 6, lines 25-30 discloses the speech storage 220 contains individual user speech input.);
determining, based on the comparing the characteristic of the portion of the voice query to the one or more characteristics of the portions of one or more other voice queries, that the portion of the voice query is not capable of being processed at a cache associated with the device (Fig. 11, label 1110 shows the audio signal is transmitted to the server for speech recognition processing as a result of the determination at label 1106.); and
sending, based on the determining that the portion of the voice query is not capable of being processed at the cache associated with the device, the portion of the voice query for processing (Fig. 11, label 1110 is based on the result of the determination at label 1106).
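For illustration only, the flow recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Hoffmeister or from the application: the names `characteristic`, `handle_portion`, `cache`, and `send` are hypothetical, and normalized text merely stands in for whatever characteristic of the audio an actual system would compute.

```python
from typing import Callable, Dict


def characteristic(portion: str) -> str:
    """Hypothetical stand-in for a characteristic of a query portion.

    Assumption: the portion is already text; case and whitespace are normalized.
    """
    return " ".join(portion.lower().split())


def handle_portion(
    portion: str,
    cache: Dict[str, str],
    send: Callable[[str], str],
) -> str:
    """Answer from the local cache if possible, otherwise send for processing."""
    key = characteristic(portion)
    # Compare against characteristics of portions of previously seen queries.
    if key in cache:
        return cache[key]
    # Not capable of being processed at the cache: send the portion onward.
    return send(portion)


# Usage: one cached query; anything else goes to the (hypothetical) remote side.
cache = {"play music": "local:play"}
remote = lambda p: "remote:" + p
local_result = handle_portion("Play  Music", cache, remote)
remote_result = handle_portion("call bob", cache, remote)
```

The sketch makes the dependency explicit: the sending step occurs only as a consequence of the cache comparison failing, which is the relationship mapped above to labels 1106 and 1110 of Fig. 11.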
Claim 2, Hoffmeister et al discloses the voice query comprises a plurality of utterances (Col. 4, lines 28-30 discloses “Audio data including spoken utterances may be processed in real time or may be saved and processed at a later time.”), and wherein the portion of the voice query corresponds to a subset of the plurality of utterances (Col. 4, lines 28-30 discloses “Audio data including spoken utterances may be processed in real time or may be saved and processed at a later time.” Any portion of the audio data or audio signal received includes a subset of the spoken utterances.).
Claim 3, Hoffmeister et al discloses wherein determining that the portion of the voice query is not capable of being processed at the cache associated with the device (Fig. 11, label 1110,1106) comprises determining, prior to receiving another portion of the voice query, that the portion of the voice query is not capable of being processed at the cache associated with the device. (Col. 11, lines 44-50 discloses the local device processes the portion of the audio signal containing a frequent phrase or word and transmits all or only a remainder (such as additional speech) of the audio over the network to the remote device for ASR processing. Such can occur before or after reception of at least one other portion of the audio file since speech recognition is conducted according to speech models of frequently spoken utterances as disclosed in Col. 11, lines 57-62.)
Claim 4, Hoffmeister et al discloses sending the portion of the voice query for processing comprises sending, after receiving the another portion of the voice query, the portion of the voice query for processing (Col. 11, lines 44-50 discloses the local device processes the portion of the audio signal containing a frequent phrase or word and transmits all or only a remainder (such as additional speech) of the audio over the network to the remote device for ASR processing. Such can occur before or after reception of another portion of the audio file since speech recognition is conducted according to speech models of frequently spoken utterances as disclosed in Col. 11, lines 57-62.).
Claim 6, Hoffmeister et al discloses wherein determining that the portion of the voice query is not capable of being processed at the cache associated with the device (Fig. 11, label 1110 shows the audio signal is transmitted to the server for speech recognition processing as a result of the determination at label 1106.) comprises
determining that the characteristic of the portion of the voice query does not correspond to a characteristic of the one or more characteristics of the portions of the one or more other voice queries (Fig. 11, label 1106 determines whether the audio signal matches when compared to speech models of frequent phrases. Col. 4, lines 34-44 discloses “the ASR module 214 may compare the input audio data with models for sounds (e.g., speech units or phonemes) and sequences of sounds” (portion of one or more other voice queries, sounds or speech units or phonemes indicates characteristics of such portion) “to identify words and phrases that match the sequence of sounds spoken in the utterance of the audio data” (portion of the voice query, sounds or speech units or phonemes indicates characteristics of portion of the voice query). Label 1110 shows the audio signal is transmitted to the server for speech recognition processing as a result of the determination at label 1106, which indicates there is no match.).
Claim 7, Hoffmeister et al discloses the portion of the voice query is during a particular time interval of the voice query (Col. 4, lines 54-62 discloses “a probability that a portion of audio data corresponds to a particular phoneme, word or phrase, the recognition score may also incorporate other information which indicates ASR processing quality of the scored audio data relative to the ASR processing of the other audio data.” This indicates the portion of the voice query is determined based on a particular phoneme, word or phrase, which includes a time interval or time period.) 
Claim 8, Hoffmeister et al discloses
receiving, at a device (Fig. 2), a portion of a voice query (Fig. 11, label 1102, Col. 4, lines 53-62 discloses “a recognition score may represent a probability that a portion of audio data corresponds to a particular phoneme, word or phrase …”. Col. 4, lines 28-30 discloses “Audio data including spoken utterances may be processed in real time or may be saved and processed at a later time.”);
determining a characteristic of the portion of the voice query (Col. 4, lines 34-44 discloses “the ASR module 214 may compare the input audio data with models for sounds (e.g., speech units or phonemes) …”. The speech units or phonemes are characteristic of the portion of the voice query or the sounds spoken in the utterance of the audio data.);
comparing the characteristic of the portion of the voice query to one or more characteristics of portions of one or more other voice queries (Fig. 11, label 1104. Col. 4, lines 34-44 discloses “the ASR module 214 may compare the input audio data with models for sounds (e.g., speech units or phonemes) and sequences of sounds” (portion of one or more other voice queries, sounds or speech units or phonemes indicates characteristics of such portion) “to identify words and phrases that match the sequence of sounds spoken in the utterance of the audio data” (portion of the voice query, sounds or speech units or phonemes indicates characteristics of portion of the voice query). Col. 5, lines 15-18 discloses “compares the speech recognition data with acoustic, language and other data models and information stored in the speech storage 220 for recognizing the speech containing the original audio data.” Col. 6, lines 25-30 discloses the speech storage 220 contains individual user speech input.); and
determining, based on the comparing the characteristic of the portion of the voice query to the one or more characteristics of the portions of one or more other voice queries, to monitor for another portion of the voice query (Fig. 11, label 1110 shows the audio signal is transmitted to the server for speech recognition processing as a result of the determination at label 1106. Col. 4, lines 34-44 discloses comparing the input audio data with models for sounds and sequences of sounds. Lines 54-67 discloses a portion of audio data corresponding to a particular phoneme in order to determine whether a particular set of words matches those spoken in the utterance (lines 36-54). As the speech recognition module determines whether the spoken utterance matches models for sounds and sequences of sounds to identify words and phrases, monitoring for at least one other portion of the voice query as well as the portion of the voice query is performed.).
Claim 9, Hoffmeister et al discloses determining to monitor for the another portion of the voice query comprises determining that the characteristic of the portion of the voice query corresponds to at least one of the one or more characteristics of the portions of one or more voice queries (Col. 4, lines 34-44 discloses comparing the input audio data or voice data with models for sounds or phonemes or speech units (characteristics) and sequences of sounds or phonemes or speech units (characteristics). Lines 54-67 discloses portion of audio data or voice data corresponding to a particular phoneme in order to determine whether a particular set of words matches those spoken in the utterance (lines 36-54). Such indicates determination of whether the portion of the audio file or voice file corresponds to models for sounds and sequences of sounds. As the speech recognition module determines whether the spoken utterance matches models for sounds and sequences of sounds to identify words and phrases, monitoring for at least one other portion of the voice query as well as the portion of the voice query is performed.).
Claim 12, Hoffmeister et al discloses the voice query comprises a plurality of utterances (Col. 4, lines 28-30 discloses “Audio data including spoken utterances may be processed in real time or may be saved and processed at a later time.”), and 
wherein the portion of the voice query corresponds to a subset of the plurality of utterances (Col. 4, lines 28-30 discloses “Audio data including spoken utterances may be processed in real time or may be saved and processed at a later time.” Any portion of the audio data or audio signal received includes a subset of the spoken utterances.).
Claim 13, Hoffmeister et al discloses the portion of the voice query is during a particular time interval of the voice query (Col. 4, lines 30-50 discloses comparing the input audio data with models for sounds (e.g., speech units or phonemes) and sequences of sounds to identify words and phrases of the spoken utterances. Such indicates particular time interval such as time interval of speech units or phonemes.) 
Claim 14, Hoffmeister et al discloses 
receiving, at the device (Fig. 2), the another portion of the voice query (Col. 4, lines 28-30 discloses “Audio data including spoken utterances may be processed in real time or may be saved and processed at a later time.” Col. 4, lines 28-30 discloses the audio data includes spoken utterances, which indicates any portion of the audio data in the audio file includes at least one other portion of the voice query or spoken utterances.);
determining a characteristic of the another portion of the voice query (Fig. 11, label 1104. Col. 4, lines 34-44 discloses “the ASR module 214 may compare the input audio data with models for sounds (e.g., speech units or phonemes) and sequences of sounds (e.g., speech units or phonemes) to identify words and phrases that match the sequence of sounds spoken in the utterance of the audio data.”); 
determining that the characteristic of the portion of the voice query and the another portion of the voice query correspond to the one or more characteristics of the portions of the one or more other voice queries (Col. 4, lines 34-38 and 53-62 disclose that matching the stored voice query or models of sounds and sequences of sounds to the spoken utterance of the audio file occurs via a portion of audio data corresponding to a particular phoneme, word or phrase. Col. 5, lines 15-18 discloses “compares the speech recognition data with acoustic, language and other data models and information stored in the speech storage 220 for recognizing the speech containing the original audio data.” Col. 6, lines 25-30 discloses the speech storage 220 contains individual user speech input.); and
processing, at a cache of the device (Fig. 11, label 1108), based on the determining that the characteristic of the another portion of the voice query corresponds to the one or more characteristics of the portions of one or more other voice queries, the portion of the voice query and the another portion of the voice query (Fig. 11, label 1106,1108, Col. 4, lines 34-38 and 53-62 disclose that processing the audio file at the local device is performed based on the comparison of the audio file to the stored voice query or models of sounds and sequences of sounds.).
Claim 15, Hoffmeister et al discloses
receiving, at the device (Fig. 2), the another portion of the voice query (Fig. 11, label 1102, Col. 4, lines 53-62 discloses “a recognition score may represent a probability that a portion of audio data corresponds to a particular phoneme, word or phrase …”. Col. 4, lines 28-30 discloses “Audio data including spoken utterances may be processed in real time or may be saved and processed at a later time.”);
determining a characteristic of the another portion of the voice query (Fig. 11, label 1104. Col. 4, lines 34-44 discloses “the ASR module 214 may compare the input audio data with models for sounds (e.g., speech units or phonemes) and sequences of sounds (e.g., speech units or phonemes) to identify words and phrases that match the sequence of sounds spoken in the utterance of the audio data.”); 
determining that the characteristic of the another portion of the voice query does not correspond to the one or more characteristics of the portions of one or more other voice queries (Fig. 11, label 1104. Col. 4, lines 34-44 discloses “the ASR module 214 may compare the input audio data with models for sounds (e.g., speech units or phonemes) and sequences of sounds (e.g., speech units or phonemes) to identify words and phrases that match the sequence of sounds spoken in the utterance of the audio data.” Col. 5, lines 15-18 discloses “compares the speech recognition data with acoustic, language and other data models and information stored in the speech storage 220 for recognizing the speech containing the original audio data.” Col. 6, lines 25-30 discloses the speech storage 220 contains individual user speech input. Fig. 11, label 1106,1110 indicates the audio file does not match the models for sounds or sequences of sounds used to identify words and phrases, such as models and information stored in speech storage 220.); and
sending, based on the determining that the characteristic of the another portion of the voice query does not correspond to the one or more characteristics of the portions of one or more other voice queries, the portion of the voice query and the another portion of the voice query for processing (Fig. 11, label 1110, 1106. When 1106 determines a match between the models of sounds and sequence of sounds to utterances in the audio file cannot be found, the portion of the audio file is sent to the server.).
Claim 16, Hoffmeister et al discloses
accessing, by a device (Fig. 2), a voice query (Fig. 10, label 1002);
determining, based on the voice query, a first portion of the voice query and a second portion of the voice query that includes the first portion of the voice query and a subsequent portion of the voice query (Fig. 11, label 1104. Col. 4, lines 34-44 discloses “the ASR module 214 may compare the input audio data with models for sounds (e.g., speech units or phonemes) and sequences of sounds (e.g., speech units or phonemes) to identify words and phrases that match the sequence of sounds spoken in the utterance of the audio data.” Such discloses the input audio as a sequence of sounds with multiple speech units and phonemes. This indicates the first portion of the voice query as a sound in the input audio, and the second portion of the voice query as a sound in the input audio subsequent to the first portion.);
determining a first characteristic of the first portion of the voice query and a second characteristic of the second portion of the voice query (Fig. 11, label 1104. Col. 4, lines 34-44 discloses “the ASR module 214 may compare the input audio data with models for sounds (e.g., speech units or phonemes) and sequences of sounds (e.g., speech units or phonemes) to identify words and phrases that match the sequence of sounds spoken in the utterance of the audio data.” Such discloses the input audio as a sequence of sounds with multiple speech units and phonemes. This indicates the first portion of the voice query as a sound in the input audio, the first characteristic of the first portion of the voice query as a speech unit or phoneme of the sound of the input audio, the second portion of the voice query as a sound in the input audio subsequent to the first portion, and the second characteristic of the second portion of the voice query as a speech unit or phoneme associated with the second portion.); and
storing the first characteristic and the second characteristic (Fig. 10, label 1014, Col. 12, line 50 - Col. 13, line 25 discloses storing the speech recognition models such as models for sounds as disclosed in Col. 4, lines 34-44.).
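For illustration only, the steps recited in claim 16 can be sketched as follows. This is a minimal sketch under stated assumptions and does not represent Hoffmeister's disclosure or the applicant's implementation: the function names are hypothetical, utterances are assumed to be already transcribed words, and a SHA-256 digest merely stands in for a characteristic of each portion.

```python
import hashlib
from typing import Dict, List


def cumulative_portions(utterances: List[str]) -> List[str]:
    """Return portions where each later portion includes every earlier one."""
    return [" ".join(utterances[:i]) for i in range(1, len(utterances) + 1)]


def characteristic(portion: str) -> str:
    """Hypothetical characteristic: a digest of the portion's text."""
    return hashlib.sha256(portion.encode("utf-8")).hexdigest()


def store_characteristics(utterances: List[str], store: Dict[str, str]) -> Dict[str, str]:
    """Determine and store one characteristic per cumulative portion."""
    for portion in cumulative_portions(utterances):
        store[characteristic(portion)] = portion
    return store


# Usage: the second portion ("show me") includes the first portion ("show").
portions = cumulative_portions(["show", "me", "sports"])
store = store_characteristics(["show", "me", "sports"], {})
```

In this sketch the second portion contains the first portion plus a subsequent portion, which is the relationship the claim language recites between the two portions.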
Claim 17, Hoffmeister et al discloses the voice query comprises a plurality of utterances (Fig. 10, label 1002, Col. 12, lines 50-53 discloses audio signal is received. Col. 11, lines 44-50 discloses the received audio signal includes frequent phrase or word with additional speech.),
the first portion of the voice query comprises a first subset of the plurality of utterances (Col. 11, lines 44-50 discloses the received audio signal includes a frequent phrase or word with additional speech, wherein the frequent phrase or word can be one portion and the additional speech can be another portion.) and
the second portion of the voice query comprises a second subset of the plurality of utterances that includes the first subset of the plurality of utterances (Col. 11, lines 44-50 discloses the received audio signal includes a frequent phrase or word with additional speech, wherein the frequent phrase or word can be one portion and the additional speech can be another portion.).
Claim 20, Hoffmeister et al discloses wherein the first portion of the voice query corresponds to a first time interval of the voice query and the second portion of the voice query corresponds to the first time interval of the voice query and a second time interval of the voice query. (Col. 4, lines 28-30 discloses “Audio data including spoken utterances may be processed in real time or may be saved and processed at a later time.” This indicates the audio data includes spoken utterances, such that a first portion of the voice query or audio data corresponds to a first time interval and a subsequent second portion of the voice query or audio data corresponds to the end of the first time interval and a second time interval of the voice query or audio data.)

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 21-23 is/are rejected under 35 U.S.C. 103 as being unpatentable over Hoffmeister et al (US Patent No.: 9070367) in view of Jiang (US Patent Application Publication No.: 20210193167).
Claim 21, Hoffmeister et al discloses determining the characteristic of the portion of the voice query based on the portion of the voice query (Fig. 11, label 1104. Col. 4, lines 34-44 discloses “the ASR module 214 may compare the input audio data with models for sounds (e.g., speech units or phonemes) and sequences of sounds (e.g., speech units or phonemes) to identify words and phrases that match the sequence of sounds spoken in the utterance of the audio data.” Such discloses the input audio as a sequence of sounds with multiple speech units and phonemes. This indicates the first portion of the voice query as a sound in the input audio, and the second portion of the voice query as a sound in the input audio subsequent to the first portion.),
comparing the characteristic of the portion of the voice query to the one or more characteristics of the portions of the one or more other voice queries (Fig. 11, label 1104. Col. 4, lines 34-44 discloses “the ASR module 214 may compare the input audio data with models for sounds (e.g., speech units or phonemes) and sequences of sounds (e.g., speech units or phonemes) to identify words and phrases that match the sequence of sounds spoken in the utterance of the audio data.” Col. 5, lines 15-18 discloses “compares the speech recognition data with acoustic, language and other data models and information stored in the speech storage 220 for recognizing the speech containing the original audio data.” Col. 6, lines 25-30 discloses the speech storage 220 contains individual user speech input. Fig. 11, label 1106,1110 indicates the audio file does not match the models or sounds, sequences of sounds to id words and phrases, such as models and information stored in speech storage 220.), but fails to disclose the determining comprises generating an audio fingerprint and the comparison comprises comparing the audio fingerprint based on the portion of the voice query to one or more audio fingerprints that are based on the portions of the one or more stored voice queries.
Jiang discloses determining comprises generating an audio fingerprint (Fig. 1, label s100-s300 receives an audio file, extract audio feature information or audio fingerprint.) and the comparison comprises comparing the audio fingerprint based on the portion of the voice query to one or more audio fingerprints that are based on the portions of the one or more stored voice queries (Fig. 1, label s100-s300 receives an audio file, extract audio feature information or audio fingerprint and matches the audio finger print to audio feature information or audio finger print set stored in a fingerprint index database. Paragraph 95 discloses “The fingerprint index database stores the corresponding audio fingerprints and the audio attribute information. Therefore, after extracting the audio feature information of the audio file to be recognized, the audio attribute information matched with the audio feature information can be searched in the fingerprint index.” This indicates the fingerprint index databases includes one or more audio fingerprints based on the portions or acoustic features of the one or more stored voice queries or audio feature information of the audio file. The comparison is based on the one or more audio fingerprints stored in the database. Label s400 outputs audio attribute information such as recognized language type or contents of a language to a user (paragraph 97).).
It would have been obvious to one skilled in the art before the effective filing date of the application to simply substitute one well-known element, a database of guidance queries as disclosed by Hoffmeister et al., with another well-known element, a database or storage of audio fingerprints of audio tracks as disclosed by Jiang, to yield the predictable result of comparing an audio fingerprint or biometric data of the input voice with stored audio fingerprints or biometric data in order to output audio attributes for audio recognition.
Claim 22, Hoffmeister et al. discloses determining the characteristic of the portion of the voice query based on the portion of the voice query (Fig. 11, label 1104. Col. 4, lines 34-44 discloses “the ASR module 214 may compare the input audio data with models for sounds (e.g., speech units or phonemes) and sequences of sounds (e.g., speech units or phonemes) to identify words and phrases that match the sequence of sounds spoken in the utterance of the audio data.” This passage discloses the input audio as a sequence of sounds with multiple speech units and phonemes. This indicates that the first portion of the voice query is a sound in the input audio, and that the second portion of the voice query is a sound in the input audio subsequent to the first portion.),
comparing the characteristic of the portion of the voice query to the one or more characteristics of the portions of the one or more other voice queries (Fig. 11, label 1104. Col. 4, lines 34-44 discloses “the ASR module 214 may compare the input audio data with models for sounds (e.g., speech units or phonemes) and sequences of sounds (e.g., speech units or phonemes) to identify words and phrases that match the sequence of sounds spoken in the utterance of the audio data.” Col. 5, lines 15-18 discloses “compares the speech recognition data with acoustic, language and other data models and information stored in the speech storage 220 for recognizing the speech containing the original audio data.” Col. 6, lines 25-30 discloses that the speech storage 220 contains individual user speech input. Fig. 11, labels 1106 and 1110 indicate that the audio file does not match the models for sounds or sequences of sounds used to identify words and phrases, such as the models and information stored in speech storage 220.), but fails to disclose that the determining comprises generating an audio fingerprint and that the comparison comprises comparing the audio fingerprint based on the portion of the voice query to one or more audio fingerprints that are based on the portions of the one or more stored voice queries.
Jiang discloses that the determining comprises generating an audio fingerprint (Fig. 1, labels s100-s300 receive an audio file and extract audio feature information or an audio fingerprint.) and that the comparison comprises comparing the audio fingerprint based on the portion of the voice query to one or more audio fingerprints that are based on the portions of the one or more stored voice queries (Fig. 1, labels s100-s300 receive an audio file, extract audio feature information or an audio fingerprint, and match the audio fingerprint to the audio feature information or audio fingerprint set stored in a fingerprint index database. Paragraph 95 discloses “The fingerprint index database stores the corresponding audio fingerprints and the audio attribute information. Therefore, after extracting the audio feature information of the audio file to be recognized, the audio attribute information matched with the audio feature information can be searched in the fingerprint index.” This indicates that the fingerprint index database includes one or more audio fingerprints based on the portions or acoustic features of the one or more stored voice queries or audio feature information of the audio file. The comparison is based on the one or more audio fingerprints stored in the database. Label s400 outputs audio attribute information, such as a recognized language type or the contents of a language, to a user (paragraph 97).).
It would have been obvious to one skilled in the art before the effective filing date of the application to simply substitute one well-known element, a database of guidance queries as disclosed by Hoffmeister et al., with another well-known element, a database or storage of audio fingerprints of audio tracks as disclosed by Jiang, to yield the predictable result of comparing an audio fingerprint or biometric data of the input voice with stored audio fingerprints or biometric data in order to output audio attributes for audio recognition.
Claim 23, Hoffmeister et al. discloses determining the first characteristic of the first portion of the voice query and the second characteristic of the second portion of the voice query (Fig. 11, label 1104. Col. 4, lines 34-44 discloses “the ASR module 214 may compare the input audio data with models for sounds (e.g., speech units or phonemes) and sequences of sounds (e.g., speech units or phonemes) to identify words and phrases that match the sequence of sounds spoken in the utterance of the audio data.” This passage discloses the input audio as a sequence of sounds with multiple speech units and phonemes. This indicates that the first portion of the voice query is a sound in the input audio, that the first characteristic of the first portion of the voice query is a speech unit or phoneme of that sound, that the second portion of the voice query is a sound in the input audio subsequent to the first portion, and that the second characteristic of the second portion of the voice query is a speech unit or phoneme associated with the second portion.),
storing the first characteristic and the second characteristic (Fig. 10, label 1014 and Col. 12, line 50 - Col. 13, line 25 disclose storing the speech recognition models, such as the models for sounds disclosed in Col. 4, lines 34-44.), but fails to disclose that the determining comprises generating an audio fingerprint and that the storing comprises storing the first audio fingerprint and the second audio fingerprint.
Jiang discloses that the determining comprises generating an audio fingerprint (Fig. 1, labels s100-s300 receive an audio file and extract audio feature information or an audio fingerprint.) and that the storing comprises storing the first audio fingerprint and the second audio fingerprint (Paragraph 95 discloses “The fingerprint index database can store the correspondence between the audio fingerprints and the audio attribute information.” Paragraph 88 discloses “extracting audio feature information of the audio file to be recognized, where the audio feature information includes audio fingerprints.” Paragraph 89 discloses “The audio fingerprints of the audio file can reflect identification information of important acoustic features of the audio file.” These paragraphs indicate storage of audio fingerprints (a first audio fingerprint and a second audio fingerprint) of acoustic features of the audio file.).
It would have been obvious to one skilled in the art before the effective filing date of the application to simply substitute one well-known element, a database of guidance queries as disclosed by Hoffmeister et al., with another well-known element, a database or storage of audio fingerprints of audio tracks as disclosed by Jiang, to yield the predictable result of comparing an audio fingerprint or biometric data of the input voice with stored audio fingerprints or biometric data in order to output audio attributes for audio recognition.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LINDA WONG whose telephone number is (571)272-6044. The examiner can normally be reached 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached at (571) 272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LINDA WONG/Primary Examiner, Art Unit 2655