DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on June 29, 2020 and May 6, 2021 are being considered by the examiner.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-4, 9-14, and 19-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Choi (U.S. Pat. App. Pub. No. 2018/0173494, hereinafter Choi).

Regarding claim 1, Choi discloses A method for processing speech in a speech recognition system, the method comprising (The method as performed by speech recognition apparatus 100; Choi, ¶ [0172]):
detecting a user event indicative of a user intention to interact with a speech recognition device (With reference to the examples provided in FIGS. 7A and 7B, “the user 10 may utter ‘I do not know what the weather will be like tomorrow’ expressing an intention to wonder about tomorrow’s weather during a dialog with another speaker.” where the speech recognition apparatus 100 detects the word “weather” in the context of the dialog, where the context of the dialog is one of a plurality of “situations” {a user event indicative of a user intention}, as an indication that the speech recognition apparatus 100 should process commands in recognized speech {to interact with a speech recognition device}; Choi, ¶¶ [0173], [0065]);
in response to detecting the user event, enabling an active mode of the speech recognition device to record speech data (“The speech recognition apparatus 100 may perform speech recognition on ‘I do not know what the weather will be like tomorrow,’ {enabling an active mode of the speech recognition device to record speech data} which is a whole speech command including the speech signal for uttering the activation word ‘weather’.”; Choi, ¶ [0174]) based on an audio signal captured at the speech recognition device irrespective of whether the speech data comprises a signature word (The speech recognition is based on the detected dialog at the speech recognition apparatus 100 including context, and is not related to {irrespective of whether the speech data comprises} the speech command “say robot” {the signature word}; Choi, ¶¶ [0175], [0181]); and
while the active mode is enabled:
generating a recording of the speech data (“speech recognition apparatus 100 may activate a speech recognition function when the speech signal for uttering the activation word ‘weather’ is received {while the active mode is enabled}” and activating a speech recognition function includes collecting the speech data, where “the structure of the data used in the above-described embodiments may be recorded on the computer-readable medium through various means {generating a recording of}”; Choi, ¶¶ [0173], [0240]);
detecting the signature word in a portion of the speech data other than a beginning portion of the speech data (the speech recognition apparatus 100 may wait “for the response and may issue a confirmation command to request the response to the speech command” where the user 10 may respond by speaking “Say Robot,” where “Say Robot” is the “previously confirmed confirmation command.” and the confirmation command is detected by the speech recognition apparatus 100 and the signature word is delivered at the end of the speech data {other than a beginning portion of the speech data}; Choi, ¶ [0179]); and
in response to detecting the signature word, processing the recording of the speech data to recognize a user-uttered phrase (“When the confirmation command is received from the user 10 {in response to detecting the signature word}, the speech recognition apparatus 100 may output speech ‘It will be sunny tomorrow’ as an operation to respond to the speech command {processing the recording of the speech data to recognize a user-uttered phrase}”; Choi, ¶ [0179]).

Regarding claim 2, Choi discloses wherein generating the recording is performed at the speech recognition device (“The speech recognition apparatus 100-1 may receive an audio signal including a speech signal uttered by a user 10 and perform speech recognition on the speech signal,” thus, the recording is performed at the speech recognition apparatus 100 {the speech recognition device}; Choi, ¶ [0038]).

Regarding claim 3, Choi discloses wherein processing the recording of the speech data is performed at a server remote from the speech recognition device (“The server 120 may perform speech recognition based on the signal received from the speech recognition apparatus 100”; Choi, ¶ [0044]).

Regarding claim 4, Choi discloses wherein detecting the signature word is performed based on an acoustic model (“The speech recognition apparatus 100 may extract a frequency […]”; Choi, ¶ [0161]).

Regarding claim 9, Choi discloses wherein detecting the user event comprises detecting a user activity suggestive of a user movement in closer proximity to the speech recognition device (“The speech recognition apparatus 100 may include one or more sensors and may sense various information for determining the situation in which the speech recognition apparatus 100 operates. For example, the sensor included in the speech recognition apparatus 100 may sense a location of the speech recognition apparatus 100, information related to a movement of the speech recognition apparatus 100, information capable of identifying a user who is using the speech recognition apparatus 100, and surrounding environment information of the speech recognition apparatus 100 {the user event comprising detecting user activity}” where “the speech recognition apparatus 100 may determine at least one activation word based on a movement of the user of the speech recognition apparatus 100... [such as] depending on whether the user of the speech recognition apparatus 100 stops moving, is walking, or is running.”; Choi, ¶¶ [0140], [0124]).

Regarding claim 10, Choi discloses wherein detecting the user activity comprises sensing the user movement with a device selected from one or more of a motion detector device, an infrared recognition device, an ultraviolet-based detection device, and an image capturing device (“speech recognition apparatus 100 may include one or more sensors and may sense various information for determining the situation in which the speech recognition apparatus 100 operates” including “at least one of an illuminance sensor, a biosensor, a tilt sensor, a position sensor {detects motion of the device}, a proximity sensor {detects motion of objects around the device}, a geomagnetic sensor, a gyroscope sensor, a temperature/humidity sensor, an infrared ray sensor {infrared recognition device}, and a speed/acceleration sensor, or a combination […]”; Choi, ¶¶ [0140]-[0141], [0038]).

Regarding claim 11, Choi discloses A system for processing speech in a speech recognition system, the system comprising (The speech recognition apparatus 100; Choi, ¶ [0172]):
a sensor configured to (“speech recognition apparatus 100 may include one or more sensors and may sense various information for determining the situation in which the speech recognition apparatus 100 operates”; Choi, ¶ [0140]) detect a user event indicative of a user intention to interact with a speech recognition device (With reference to the examples provided in FIGS. 7A and 7B, “the user 10 may utter ‘I do not know what the weather will be like tomorrow’ expressing an intention to wonder about tomorrow’s weather during a dialog with another speaker.” where the speech recognition apparatus 100 detects the word “weather” in the context of the dialog, where the context of the dialog is one of a plurality of “situations” {a user event indicative of a user intention}, as an indication that the speech recognition apparatus 100 should process commands in recognized speech {to interact with a speech recognition device}; Choi, ¶¶ [0173], [0065]);
a memory (“speech recognition apparatus 100 may further include at least one of a memory 1140”; Choi, ¶ [0208]); and
control circuitry communicatively coupled to the memory and the sensor and configured to (“The processor 1120 according to an embodiment may be implemented with hardware and/or software components that perform particular functions” and “functions performed by the processor 1120 may be implemented by at least one microprocessor, or by circuit components for related functions”; Choi, ¶¶ [0220]-[0221]):
in response to detecting the user event, enable an active mode of the speech recognition device to record speech data (“The speech recognition apparatus 100 may perform speech recognition on ‘I do not know what the weather will be like tomorrow,’ {enabling an active mode of the speech recognition device to record speech data} which is a whole speech command including the speech signal for uttering the activation word ‘weather’.”; Choi, ¶ [0174]) based on an audio signal captured at the speech recognition device irrespective of whether the speech data comprises a signature word (The speech recognition is based on the detected dialog at the speech recognition apparatus 100 including context, and is not related to {irrespective of whether the speech data comprises} the speech command “say robot” {the signature word}; Choi, ¶¶ [0175], [0181]); and
while the active mode is enabled:
generate a recording of the speech data (“speech recognition apparatus 100 may activate a speech recognition function when the speech signal for uttering the activation word ‘weather’ is received {while the active mode is enabled}” and activating a speech recognition function includes collecting the speech data, where “the structure of the data used in the above-described embodiments may be recorded on the computer-readable medium through various means {generating a recording of}”; Choi, ¶¶ [0173], [0240]);
detect the signature word in a portion of the speech data other than a beginning portion of the speech data (the speech recognition apparatus 100 may wait “for the response and may issue a confirmation command to request the response to the speech command” where the user 10 may respond by speaking “Say Robot,” where “Say Robot” is the “previously confirmed confirmation command.” and the confirmation command is detected by the speech recognition apparatus 100 and the signature word is delivered at the end of the speech data {other than a beginning portion of the speech data}; Choi, ¶ [0179]); and
in response to detecting the signature word, process the recording of the speech data to recognize a user-uttered phrase (“When the confirmation command is received from the user 10 {in response to detecting the signature word}, the speech recognition apparatus 100 may output speech ‘It will be sunny tomorrow’ as an operation to respond to the speech command {processing the recording of the speech data to recognize a user-uttered phrase}”; Choi, ¶ [0179]).

Regarding claim 12, the rejection of claim 11 is incorporated. Claim 12 is substantially the same as claim 2 and is therefore rejected under the same rationale as above.

Regarding claim 13, the rejection of claim 11 is incorporated. Claim 13 is substantially the same as claim 3 and is therefore rejected under the same rationale as above.

Regarding claim 14, the rejection of claim 11 is incorporated. Claim 14 is substantially the same as claim 4 and is therefore rejected under the same rationale as above.

Regarding claim 19, the rejection of claim 11 is incorporated. Claim 19 is substantially the same as claim 9 and is therefore rejected under the same rationale as above.

Regarding claim 20, the rejection of claim 19 is incorporated. Claim 20 is substantially the same as claim 10 and is therefore rejected under the same rationale as above.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
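The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.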

Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Choi as applied to claims 4 and 14 above, and further in view of Yavagal (U.S. Pat. App. Pub. No. 2020/0184966, hereinafter Yavagal) and Chen (U.S. Pat. App. Pub. No. 2019/0221206, hereinafter Chen).

Regarding claim 5, the rejection of claim 4 is incorporated. Choi discloses all of the elements of the current invention as stated above. However, Choi fails to expressly disclose wherein the acoustic model is selected from one of a hidden Markov model (HMM), a long short-term memory (LSTM) model, and a bidirectional LSTM.
Yavagal teaches systems and methods for wakeword detection. (Yavagal, ¶ [0015]). Regarding claim 5, Yavagal teaches wherein the acoustic model is selected from one of a hidden Markov model (HMM) (“Hidden Markov Model (HMM) or Gaussian Mixture Model (GMM) techniques may be applied to compare the audio input to one or more acoustic models”; Yavagal, ¶ [0023]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the systems and methods for AI speech recognition of Choi to incorporate the teachings of Yavagal to include wherein the acoustic model is selected from one of a hidden Markov model (HMM). The systems and methods of Yavagal can improve computing devices with relation to speech recognition by reducing power consumption. (Yavagal, ¶ [0016]). However, Choi and Yavagal fail to expressly recite wherein the acoustic model is selected from …a long short-term memory (LSTM) model, and a bidirectional LSTM.
Chen teaches systems and methods for “a spoken keyword detection based utterance-level wake on intent system.” (Chen, ¶ [0017]). Regarding claim 5, Chen teaches wherein the acoustic model is selected from … a long short-term memory (LSTM) model, and a bidirectional LSTM (“The KALDI-based GENTLE tool was used with its own DNN acoustic model {the acoustic model is selected from...} to perform forced alignment... including: 1) Forced aligned MFCC LSTM; 2) Forced aligned phone sequence LSTM; and 3) manual keyword transcript based GLOVE LSTM {each of which is an LSTM model}”; Chen, ¶ [0050]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the systems and methods for AI speech recognition of Choi, as modified by the systems and methods for wakeword detection of Yavagal, to incorporate the teachings of Chen to include wherein the acoustic model is selected from one of a hidden Markov model (HMM), a long short-term memory (LSTM) model, and a bidirectional LSTM. “The fusion of the LSTM systems (phone+MFCC+manual GLOVE) further improves the intent detection performance with the bagging decision mechanism as compared with the individual feature input systems,” as recognized by Chen. (Chen, ¶ [0050]).

Regarding claim 15, the rejection of claim 14 is incorporated. Claim 15 is substantially the same as claim 5 and is therefore rejected under the same rationale as above.

Claims 6 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Choi as applied to claims 1 and 11 above, and further in view of Peden (U.S. Pat. No. 8,380,504, hereinafter Peden).

Regarding claim 6, the rejection of claim 1 is incorporated. Choi discloses all of the elements of the current invention as stated above. However, Choi fails to expressly recite wherein detecting the signature word is based on heuristics of audio signatures of a demographic region.
Peden teaches “systems, methods, and computer-readable media for generating voice profiles for users based on information communicated during calls.” (Peden, Col. 3, lines 33-36). Regarding claim 6, Peden teaches wherein detecting the signature word is based on heuristics of audio signatures of a demographic region (“The sound components {of the audio signal} that are selected to be monitored may be based on sound components that are associated with particular voice characteristics, such as dialects {heuristics of audio signatures}... For example, the keyword “eh” may be monitored by the sound component monitoring device {heuristics of audio signatures}. Further, the detection of the keyword “eh” may help to indicate a caller is associated with Vancouver, Canada {of a demographic region}”; Peden, Col. 4, lines 15-24).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the systems and methods for AI speech recognition of Choi to incorporate the teachings of Peden to include wherein detecting the signature word is based on heuristics of audio signatures of a demographic region. The systems and methods described in Peden can “prioritize words that are more common” based on dialect, thus “increasing the accuracy of voice recognition.” (Peden, Col. 5, lines 17-27).

Regarding claim 16, the rejection of claim 11 is incorporated. Claim 16 is substantially the same as claim 6 and is therefore rejected under the same rationale as above.

Claims 7 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Choi as applied to claims 1 and 11 above, and further in view of Lesso (U.S. Pat. App. Pub. No. 2019/0228779, hereinafter Lesso).

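Regarding claim 7, the rejection of claim 1 is incorporated. Choi discloses all of the elements of the current invention as stated above. However, Choi fails to expressly recite further comprising determining whether the speech data corresponds to human speech based on a spectral characteristic analysis of the audio signal captured at the speech recognition device.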
Lesso teaches “methods and devices for analysing speech signals.” (Lesso, ¶ [0001]). Regarding claim 7, Lesso teaches further comprising determining whether the speech data corresponds to human speech (Lesso discloses a “speaker identification system comprises: a voice activity detector for attempting to detect human speech in the received audio signal” where “the first biometric process 82 may act as a voice activity detector.”; Lesso, ¶¶ [0067], [0068], [0175]) based on a spectral characteristic analysis of the audio signal captured at the speech recognition device (the system “determine[s] that the received signal” contains “[human] speech...[by] analysing a... spectrum of the speech… to determine that the spectrum is that of human speech rather than of a noise source, a mechanical sound, or the like.”; Lesso, ¶ [0177]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the systems and methods for AI speech recognition of Choi to incorporate the teachings of Lesso to include further comprising determining whether the speech data corresponds to human speech based on a spectral characteristic analysis of the audio signal captured at the speech recognition device. The determination of a human user can allow a system to avoid unauthorized user or replay attacks, as recognized by Lesso. (Lesso, ¶ [0170]).

Regarding claim 17, the rejection of claim 11 is incorporated. Claim 17 is substantially the same as claim 7 and is therefore rejected under the same rationale as above.

Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Choi and Lesso as applied to claims 7 and 17 above, and further in view of Williams (U.S. Pat. App. Pub. No. 2018/0190296, hereinafter Williams).

Regarding claim 8, the rejection of claim 7 is incorporated. Choi and Lesso disclose all of the elements of the current invention as stated above. However, Choi and Lesso fail to expressly recite further comprising determining whether the speech data corresponds to human speech based on a comparison of the audio signal captured at the speech recognition device and a list of black-listed audio signals.
Williams teaches “systems and methods for analyzing digital recordings of the human voice in order to find characteristics unique to an individual.” (Williams, ¶ [0003]). Regarding claim 8, Williams teaches further comprising determining whether the speech data corresponds to human speech (The system can further include “attributes, voiceprint models and groups of validation or blacklist people to be used by biometric detection” which are used in identifying the speech data as coming from {determining whether the speech data corresponds to} a human speaker {human speech}; Williams, ¶ [0036]) based on a comparison of the audio signal captured at the speech recognition device and a list of black-listed audio signals (the customer voice is “passed through the biometrics engine” to create a voiceprint where “the voiceprint created from the customer voice is compared {comparison of the audio signal} against the previously stored fraudster voiceprints {the blacklisted audio signals}”; Williams, ¶¶ [0143]-[0145]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the systems and methods for AI speech recognition of Choi as modified by the methods and devices for analysing speech signals of Lesso to incorporate the teachings of Williams to include further comprising determining whether the speech data corresponds to human speech based on a comparison of the audio signal captured at the speech recognition device and a list of black-listed audio signals. The black-listed audio signals of Williams can help prevent fraud, as recognized by Williams. (Williams, ¶ [0142]).

Regarding claim 18, the rejection of claim 17 is incorporated. Claim 18 is substantially the same as claim 8 and is therefore rejected under the same rationale as above.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Kim et al. (U.S. Pat. App. Pub. No. 2015/0302855) discloses systems and methods for target device application including activation keywords and speech command buffering.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Sean E. Serraguard, whose telephone number is (313) 446-6627. The examiner can normally be reached 07:00-17:00 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel C. Washburn, can be reached at (571) 272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 

/Sean E Serraguard/ Patent Examiner, Art Unit 2657

/DANIEL C WASHBURN/ Supervisory Patent Examiner, Art Unit 2657