Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 7, and 11 are rejected under 35 U.S.C. 102(a)(1) and (a)(2) as being anticipated by Haiut (United States Patent Application Publication US 2014/0365225), hereinafter Haiut.

Regarding claim 1, Haiut teaches an information processing method, wherein the method is applied to an electronic device and comprises: collecting audio information received by the electronic device before waking up the electronic device; (Fig. 4 404 “Transition electronic device to power-saving or low-power state (e.g., ‘sleep’ mode)”…408 “Monitor for triggering voice/phrase, using the ultra-low-power voice trigger” As shown in Fig. 4, after transitioning to the power-saving or low-power state (e.g., ‘sleep’ mode) and before transitioning to active mode, which is interpreted as before waking up the electronic device, the electronic device monitors for a triggering voice/phrase, which is interpreted as collecting audio information received by the electronic device.) storing the audio information; ([0025] “the VT processor 130 may be limited to only processing audio (to determine a match with pre-configured voice triggering commands and/or match with authorized users) and/or to store the small database needed for VT operations.” The VT processor stores the audio in the small database for VT operations, which is interpreted as storing the audio information.) and in response to a processing power status of the electronic device indicating an idle time of a wake mode of the electronic device, generating, based on the stored audio information, alternative wake words to wake up the electronic device, ([0025] “the VT speech recognition scheme may be implemented by use of only limited components in low-power modes.” [0027] “with VT speech recognition scheme in accordance with the present disclosure, two-dimensional HMM state-machines are used, and configured such that they may comprise different states, which may be produced from representatives of feature extraction vectors that are taken from the input phrase in real time-i.e., with multiple states corresponding to the same phrase (or portions thereof). 
Further, the states may be arranged in lines (i.e., different sequences may correspond to the same phrase). The phrases may not be necessarily synchronized with the syllables.” [0028] “the VT speech recognition scheme (and processing performed during VT operations), using such two-dimensional HMM state machines, may be optimized since it is based on combination of an initial fixed database coupled with a learning algorithm. The fixed database is the set of one of more pre-determined VT phrases that are pre-stored (e.g., into the VT processor 130). The fixed database may enable the generation of feedback to the learning process, so that the user does not have to initiate the device with a training sequence.” As shown in Fig. 4 and paragraph [0025], the electronic device starts the VT speech recognition after transitioning to the power-saving or low-power state. Furthermore, the VT speech recognition uses the stored audio information and the triggered audio information, together with the learning process, to determine if the captured phrase is recognized as one of the preset triggering phrases and to add or replace adaption incantations, which is interpreted as generating alternative wake words as shown in Fig. 5. Furthermore, as shown in Fig. 4, based on the trigger phrase being received and verified, the electronic device transitions back to active mode.) wherein the wake words facilitate the ([0029] “in addition to using triggering phrases to simply turning on or activating (waking up) the device, additional triggering phrases may be used to trigger particular actions once the device is turned on and/or is activated.” As discussed above, Fig. 4 shows transitioning the electronic device to active mode from sleep mode based on the wake words or the triggering phrase.)

Regarding claim 7, claim 7 is the apparatus claim corresponding to method claim 1. Claim 7 does not further define any limitation beyond those recited in the rejected claim above. Therefore, Haiut teaches all the limitations of claim 7.

Regarding claim 11, the rejection of claim 1 above addresses all the limitations of claim 11 except a server including a memory configured to store application programs and data generated by executing the application programs, and to store audio information, which Haiut also teaches. ([0018] “Examples of electronic devices may comprise…computers (e.g., servers)...” Furthermore, as discussed above regarding claim 1, the electronic device includes a memory to store software and/or firmware and the VT speech recognition using learning device or statistics of database of the phrases.) Claim 11 does not further define any limitation beyond those addressed above. Therefore, Haiut teaches all the limitations of claim 11.

	

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 8, and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Haiut (United States Patent Application Publication US 2014/0365225), hereinafter Haiut, in view of Prasad et al. (United States Patent Application Publication US 2015/0245154), hereinafter Prasad.

Regarding claim 2, Haiut teaches wherein after collecting the audio information received by the electronic device before waking up the electronic device, the method ([0031] “the VT processor 130 may be configured to process possible triggering phrases that may be captured via the microphone 120, by using two-dimensional HMM state machine 200 to determine if the captured phrase is recognized as one of preset triggering phrases.” [0046] “the received triggering voice/commands may be verified. The verification may comprise verifying that the captured command matches the preset triggering command. Also, the verification may comprise determining that the voice matches that of an authorized user...the process proceeds to step 412, the electronic device is transitioned from the power saving or low-power state, such as back to fully active state (thus reactivating or powering on the resources that where shut off or deactivated when the electronic device transitioned to the power-saving or low-power state).” The voice or the audio information is verified to determine whether it matches the preset triggering command and an authorized user in order to wake up or transition to the active state. The condition is a match with the preset triggering command and the authorized user. Based on the determination of the condition, the electronic device wakes up to the active state.)
Haiut does not teach in response to the semantic information represented by the audio information not satisfying the condition, storing the audio information.
Prasad teaches in response to the semantic information represented by the audio information not satisfying the condition, storing the audio information. ([0040] “the server-side wake word detector may determine at [5] that the wake word was not uttered. This determination may come before or after the audio stream at [4] is initiated… the speech processing system 210 may maintain wake word audio and/or features extracted from the wake word audio in order to train or otherwise improve the wake word detection model, as described in greater detail below…Acoustic features extracted from the audio may nevertheless be stored in order to improve the wake word detection model and reduce future false detections when similar acoustic features are extracted from an audio stream.” When the wake word is determined not to have been uttered, which is interpreted as in response to the semantic information represented by the audio information not satisfying the condition, the speech processing system maintains the wake word audio to train the wake word detection model, which is interpreted as storing the audio information.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Haiut by incorporating the teaching of Prasad of storing the audio information in response to the semantic information represented by the audio information not satisfying the condition. Prasad improves upon Haiut by storing the audio information even when it is determined not to correspond to the wake word. As recognized by Prasad, even if the collected audio does not correspond to the wake word, the acoustic features extracted from the audio may be used to reduce future false detections, which may exhibit similar acoustic features. ([0040]) Therefore, it would be advantageous to incorporate the teaching of Prasad of storing the audio information in response to the semantic information represented by the audio information not satisfying the condition in order to reduce future false detections.
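
For illustration only, the limitation discussed above (storing the audio information when it does not satisfy the wake condition) can be sketched as follows. This is a minimal sketch under stated assumptions and does not represent the implementation of Haiut, Prasad, or the present application: the class name WakeMonitor, the use of a text transcript in place of raw audio, and the exact-match condition are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WakeMonitor:
    """Hypothetical monitor: wakes on a matching phrase, stores everything else."""
    wake_word: str
    stored: list = field(default_factory=list)  # non-matching utterances kept for later use
    awake: bool = False

    def on_audio(self, transcript: str) -> bool:
        # Assumed condition: the utterance equals the preset wake word (case-insensitive).
        if transcript.strip().lower() == self.wake_word.lower():
            self.awake = True
            return True
        # Condition not satisfied: store the audio information instead of discarding it.
        self.stored.append(transcript)
        return False
```

In this sketch, the stored list plays the role of the retained audio information that could later be used to improve detection or to derive candidate wake words.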

Regarding claim 8, claim 8 is the apparatus claim corresponding to method claim 2. Claim 8 does not further define any limitation beyond those recited in the rejected claims above. Therefore, Haiut in view of Prasad teaches all the limitations of claim 8.

Regarding claim 12, claim 12 is the apparatus claim corresponding to method claim 2. Claim 12 does not further define any limitation beyond those recited in the rejected claims above. Therefore, Haiut in view of Prasad teaches all the limitations of claim 12.

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Haiut in view of KO et al. (United States Patent Application Publication US 2018/0102125), hereinafter KO.
Regarding claim 4, Haiut teaches all the limitations of the method according to claim 1, as discussed above.
However, Haiut does not teach dividing the audio information into at least one unit of to-be-processed information; calculating a similarity between each unit of the to-be-processed information and target wake words; determining whether the similarity between the unit of the to-be-processed information and the target wake words is greater than a threshold; and in response to determining the similarity between the unit of the to-be-processed information and the target wake words being greater than the threshold, determining the unit of the to-be-processed information as the alternative wake words.
KO teaches wherein generating the alternative wake words based on the stored audio information further includes: 
dividing the audio information into at least one unit of to-be-processed information; ([0050] “The user's speech voice may be converted into the digital signal by the processor 220, and the digital signal may be converted into character data by the voice recognition application.” [0056] “The characteristic of the wakeup word is identified on the basis of a phoneme and a syllable of the wakeup word, and a length of the speech voice when a human speaks the wakeup word.” [0058] “the voice recognition application extracts a keyword of the voice included in the digital signal.” Identifying the basis of a phoneme and a syllable of the wakeup word is a unit of organization for a sequence of speech sound. Identifying the basis of a syllable is to identify the units of a sequence of speech sound, which is interpreted as dividing the audio information into at least one unit of to-be-processed information. The identified basis of a syllable of the wakeup word as the characteristic of the wakeup word is then used for the subsequent process to extract a key word of the voice by the voice recognition application, which is interpreted as at least one unit of to-be-processed information.)
calculating a similarity between each unit of the to-be-processed information and target wake words; determining whether the similarity between the unit of the to-be-processed information and the target wake words is greater than a threshold; ([0059] “The voice recognition application may identify whether the keyword coincides with the pre-stored wakeup word. Further, if it is identified that the similarity between the keyword and the pre-stored wakeup word is equal to or higher than the predetermined similarity, the voice recognition application may identify that the keyword coincides with the pre-stored wakeup word. If it is identified that the keyword coincides with the wakeup word, the voice recognition application may activate the voice command recognition mode.” [0070] “The voice recognition application may automatically create and store a similar word of the wakeup word on the basis of the user's speech history.” The keyword identified on the basis of a syllable of the wakeup word, which is interpreted as each unit of the to-be-processed information, is compared to the pre-stored wakeup word, which is interpreted as the target wake words, in order to identify the similarity between the keyword and the pre-stored wakeup word, which is interpreted as calculating a similarity between each unit of the to-be-processed information and target wake words. If the similarity obtained by the comparison is greater than the predetermined similarity, the keyword is identified as coinciding with the pre-stored wakeup word, which is interpreted as determining whether the similarity between the unit of the to-be-processed information and the target wake words is greater than a threshold.) and 
in response to determining the similarity between the unit of the to-be-processed information and the target wake words being greater than the threshold, determining the unit of the to-be-processed information as the alternative wake words. ([0062] “the voice recognition application may store a word having the same phoneme and syllable as those of the wakeup word as the similar word.” [0070] “The voice recognition application may automatically create and store a similar word of the wakeup word on the basis of the user's speech history.” As discussed above, based on the similarity being greater than the threshold, the keyword or the unit of the to-be-processed information is determined to coincide with the target wake words. Thus, based on the comparison, a similar word of the wakeup word is determined, created, and stored, which is interpreted as determining the unit of the to-be-processed information as the alternative wake words.) 
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Haiut by incorporating the teaching of KO to determine a similarity between the unit of the to-be-processed information and the target wake words and to determine the unit of the to-be-processed information as the alternative wake words. They are all directed toward voice recognition in electronic devices. KO further improves upon Haiut by determining a similarity between the unit of the to-be-processed information and the target wake words and determining the unit of the to-be-processed information as the alternative wake words. As recognized by KO, a conventional voice recognition technique uses a specific wakeup word registered in the device during the manufacturing process, which increases the development cost of a separate digital signal processor chip and makes it difficult to change the wakeup word. ([0006]) By determining the similarity between the user’s voice command and the target word, the alternative wake words can be updated without updating software or hardware. Therefore, it would be advantageous to incorporate the teaching of KO to determine a similarity between the unit of the to-be-processed information and the target wake words and to determine the unit of the to-be-processed information as the alternative wake words in order to reduce development cost of the DSP and increase .
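
For illustration only, the steps recited in claim 4 (dividing the audio information into units, calculating a similarity against the target wake words, comparing the similarity with a threshold, and keeping the units that exceed it) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of KO, Haiut, or the present application: the function name alternative_wake_words, the use of a text transcript split on whitespace as the "units", the difflib.SequenceMatcher ratio as the similarity measure, and the 0.7 threshold are all hypothetical choices.

```python
from difflib import SequenceMatcher


def alternative_wake_words(transcript: str, target: str, threshold: float = 0.7) -> list:
    """Return the units of the transcript whose similarity to the target exceeds the threshold."""
    # Step 1: divide the (transcribed) audio information into units; here, one unit per word.
    units = transcript.lower().split()
    alternatives = []
    for unit in units:
        # Step 2: calculate a similarity in [0, 1] between the unit and the target wake word.
        similarity = SequenceMatcher(None, unit, target.lower()).ratio()
        # Steps 3 and 4: keep the unit as an alternative wake word only above the threshold.
        if similarity > threshold:
            alternatives.append(unit)
    return alternatives
```

A character-level ratio is used here only because it is available in the standard library; an actual system would more plausibly compare acoustic or phonetic features.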

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Haiut in view of KO as applied to claim 4 above, and further in view of Clark et al. (United States Patent Application Publication US 2019/0122656), hereinafter Clark.

Regarding claim 5, Haiut in view of KO teaches all the limitations of the method according to claim 4, as discussed above.
However, Haiut in view of KO does not teach the method further includes: determining whether the number of words of the unit of the to-be-processed information is within a number range and in response to determining that the number of words of the unit of the to-be-processed information is within the number range, determining the unit of the to-be-processed information as the alternative wake words.
Clark teaches the method further includes: determining whether the number of words of the unit of the to-be-processed information is within a number range; ([0019] “To estimate the length of the spoken trigger phrase, the device may count the number of frames having voice activity in the audio signal.” [0020] “The device measures a number of segments in the audio signal having voice activity, and compares the measured number of segments to a threshold value.” Fig. 5, 506 “LENGTH < 70 FRAMES?”, Fig. 7 706 “NUMBER OF SEGMENTS > 3?” The number of segments in the audio signal is interpreted as the number of words of the unit of the to-be-processed information. As shown in the Fig. 5 506 and Fig. 7 706, the number of segments is compared to the lower and upper limits of the number of frames, which is interpreted as determining whether the number of words of the unit of the to-be-processed information is within a number range.) and 
in response to determining that the number of words of the unit of the to-be-processed information is within the number range, determining the unit of the to-be-processed information as the alternative wake words. (KO in the paragraph [0056] discloses that “The characteristic of the wakeup word is identified on the basis of a phoneme and a syllable of the wakeup word, and a length of the speech voice when a human speaks the wakeup word…it may be identified that the user's speech voice has similarity that is equal to or higher than the predetermined value with respect to the wakeup word.” [0057] “if the length of the sound signal received by the sensor 210 is included in a predetermined range of the length of the speech voice when the user speaks the wakeup word, the voice recognition application may identify that the characteristic value of the digital signal is equal to or higher than the threshold level.” [0062] “the voice recognition application may store a word having the predetermined similarity of the wakeup word as the similar word.” KO teaches that based on the length of the sound signal to be within predetermined range, the similar word is determined as a wakeup word. The length of the sound signal in a predetermined range in Ko does not explicitly teach determining the number of words of the unit of the to-be-processed information is within the number range. However, as discussed above, Clark teaches determining that the number of words of the unit of the to-be-processed information is within the number range. Clark teaches that, based on the determination, the next process to determine the spoken trigger phrase to be acceptable for trigger phrase model training is processed. 
Therefore, the combination of the teaching of Clark to determine that the number of words of the unit of the to-be-processed information is within the number range and the teaching of KO to determine the unit of the to-be-processed information as the alternative wake words in response to determining that the length of the sound signal is within the predetermined range teaches: in response to determining that the number of words of the unit of the to-be-processed information is within the number range, determining the unit of the to-be-processed information as the alternative wake words.) 
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Haiut in view of KO by incorporating the teaching of Clark to determine whether the number of words of the unit of the to-be-processed information is within a number range for determining the alternative wake words. They are all directed toward voice recognition in electronic devices. Clark further improves upon Haiut in view of KO. As described by Clark in connection with improving the trigger phrase recognizer accuracy and employing speaker recognition to help reject the trigger phrase, if the phrase is too long or too short, the phrase becomes susceptible to noise. ([0006]) Requiring the length or number of words to be within a certain range can improve the accuracy. Therefore, it would be advantageous to incorporate the teaching of Clark to determine whether the number of words of the unit of the to-be-processed information is within a number range for determining the alternative wake words in order to improve recognition accuracy.
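
For illustration only, the number-range check recited in claim 5 can be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of Clark, KO, or the present application: the function names, the whitespace-based word count, and the default range of 1 to 4 words are hypothetical.

```python
def within_word_range(unit: str, min_words: int = 1, max_words: int = 4) -> bool:
    """Return True if the unit's word count lies within the inclusive range."""
    return min_words <= len(unit.split()) <= max_words


def filter_candidates(units: list, min_words: int = 1, max_words: int = 4) -> list:
    """Keep only the units whose word count is within the range, preserving order."""
    return [u for u in units if within_word_range(u, min_words, max_words)]
```

Units that pass this filter would then remain eligible to be determined as alternative wake words; units that are empty or too long are dropped.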

Claims 6, 10, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Haiut in view of CHOI et al. (United States Patent Application Publication US 2016/0063995), hereinafter CHOI.

Regarding claim 6, Haiut teaches all the limitations of the method according to claim 1, as discussed above.
However, Haiut does not teach collecting the audio information received by the electronic device by the user before waking up the electronic device further includes: performing a voiceprint recognition on the audio information; and based on voiceprint recognition features, grouping the audio information having a same voiceprint recognition feature into a same group; and generating the alternative wake words for each group of the audio information.
CHOI teaches collecting the audio information received by the electronic device by the user before waking up the electronic device further includes:   
performing a voiceprint recognition on the audio information; ([0098] “in response to a user's uttered voice being received through a remote controller (not shown) or a microphone of the display apparatus 100, the display apparatus 100 extracts voice information from the received uttered voice and measures the reliability with respect to a plurality of pre-registered words based on the extracted voice information (S710, S720).” The extraction of voice information from a user’s uttered voice and the reliability measurement are interpreted as performing a voiceprint recognition on the audio information.) and 
based on voiceprint recognition features, grouping the audio information having a same voiceprint recognition feature into a same group; ([0101] “when the plurality of similar words which are similar to the user's uttered voice are extracted, it is preferred to group the plurality of extracted similar words into a similar word group.” Based on the measured reliability, which is interpreted as based on voiceprint recognition features, the plurality of similar words are grouped into a similar word group, which is interpreted as grouping the audio information having a same voiceprint recognition feature into a same group.) and
generating the alternative wake words for each group of the audio information. ([0102] “In response to the plurality of similar words being extracted, the display apparatus 100 sets a similar word which satisfies a predetermined condition from among the plurality of extracted similar words as a target word (S740).” For the group, a target word is set, which is interpreted as generating a word for each group of the audio information. However, CHOI does not explicitly teach the alternative wake words. Prasad teaches “generating customized detection models for individual users, or for groups of users that exhibit similar wake word detection system usage patterns (e.g., similar contextual information, similar environmental information, similar acoustic information, some combination thereof, etc.)” in paragraph [0044], which is interpreted as generating the alternative wake words for different groups. The combination of the teaching of CHOI to generate a target word for each group of the audio information and the teaching of Prasad to generate the alternative wake words for each group teaches generating the alternative wake words for each group of the audio information.) 
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Haiut by incorporating the teaching of CHOI to group the audio information having a same voiceprint recognition feature into a same group and generate a word for each group. CHOI improves upon Haiut. As recognized by CHOI, a conventional method for recognizing a plurality of similar words which are similar to a user’s uttered voice provides a list of similar words for the user to select, and thus lacks practical use in terms of the convenience of controlling an operation through a user’s uttered voice. ([0008]) By grouping the similar words into a same group and providing a target word for the group, the user experience can be improved by eliminating the additional step of selecting the similar word from the list. Therefore, it would be advantageous to incorporate the teaching of CHOI to group the audio information having a same voiceprint recognition feature into a same group and generate a word for each group to improve the user experience.
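
For illustration only, the grouping limitation of claim 6 (grouping audio information that shares a voiceprint recognition feature and producing a word for each group) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of CHOI, Prasad, or the present application: the function names, the representation of a voiceprint feature as a simple label, and the choice of the most frequent utterance as each group's word are hypothetical.

```python
from collections import Counter, defaultdict


def group_by_voiceprint(utterances: list) -> dict:
    """Group (voiceprint_label, text) pairs so that each label maps to its utterances."""
    groups = defaultdict(list)
    for voiceprint_label, text in utterances:
        groups[voiceprint_label].append(text)
    return dict(groups)


def candidate_per_group(groups: dict) -> dict:
    """Pick one candidate word per group; here, the most frequently stored utterance."""
    return {label: Counter(texts).most_common(1)[0][0] for label, texts in groups.items()}
```

Under these assumptions each speaker group ends up with its own candidate, which corresponds to generating a word per group rather than a single word for all stored audio.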

Regarding claims 10 and 14, claims 10 and 14 are the apparatus claims corresponding to method claim 6. Claims 10 and 14 do not further define any limitation beyond those recited in the rejected claims above. Therefore, Haiut in view of CHOI teaches all the limitations of claims 10 and 14.

Response to Arguments
Applicant’s arguments, see Remarks, filed 1/6/2021, with respect to Claim rejection under 112(b) have been fully considered and are persuasive.  The rejection under 35 U.S.C. 112(b) of claim 5 has been withdrawn. 

Applicant’s arguments, see Remarks, filed 1/6/2021, with respect to Claim rejection under 103 have been fully considered and are persuasive.  The rejection under 35 U.S.C. 103 of claims 1-14 has been withdrawn. However, upon further consideration, a new ground of rejection is made in view of Haiut to teach ultra-low-power adaptive voice triggering in electronic devices in power-saving or low-power state using the learning device.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jaweed Abbaszadeh, can be reached at (571) 270-1640.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the 






/H.K./Examiner, Art Unit 2187                    

/JAWEED A ABBASZADEH/Supervisory Patent Examiner, Art Unit 2187