DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Sreedhara (US 20200143806 A1) in view of Kennewick (US 20210082412 A1).

Regarding Claim 1, Sreedhara discloses a system, comprising: 
a microphone input device (Fig. 5; [0146]: User input interface 510 may be any suitable user interface, such as a remote control, mouse, trackball, keypad, keyboard, touch screen, touchpad, stylus input, joystick, voice recognition interface, or other user input interfaces); 
a memory configured to store computer-executable instructions (Fig. 5; [0147]: instructions of the application are stored locally (e.g., in storage 508)); and
a processor configured to access the memory and execute the computer-executable instructions (Fig. 5; [0142]: Control circuitry 504 may be based on any suitable processing circuitry such as processing circuitry 506) to at least:
receive first voice input data via the microphone input device (Fig. 7, [0175]: Process 700 begins at 702, where control circuitry 504 receives, via a user input device (e.g., user input interface 510, wireless communications device 606), first speech (e.g., first speech 106));
generate a first search query for searching an item database, the first search query comprising search terms derived from the first voice input data (Fig. 1, speech 106; Fig. 7; [0176]: Process 700 continues to 704, where control circuitry 504 determines, using automatic speech recognition (ASR), a first input (e.g., first input 108), based on the first speech. For example, control circuitry 504 may determine the first input by converting the first speech to text using known automatic speech recognition techniques);
generate first search results responsive to the first search query ([0178]: Process 700 continues to 706, where control circuitry 504 retrieves, from a database (e.g., from media content source 616 or media guidance data source 618 through communications network 614, or from storage 508) search results (e.g., search results 112) based on the first input);
receive second voice input data via the microphone input device (Fig. 7; [0181]: Process 700 continues to 712, where control circuitry 504 receives, via the user input device, subsequent to receiving the first speech, second speech (e.g., second speech 116));
generate a second search query for searching the item database, the second search query comprising second search terms derived from the second voice input data (Fig. 7; [0182]: Process 700 continues to 714, where control circuitry 504 determines, using automatic speech recognition (ASR), a second input (e.g., second input 120) based on the second speech. For example, control circuitry 504 may determine the second input by converting the second speech to text using known automatic speech recognition techniques);
However, Sreedhara does not explicitly teach “determine, using a machine learning algorithm having inputs of the first search query and the second search query, a score indicative of a probability that the second search query is a refinement of the first search query; and generate, in response to the score exceeding a threshold, second search results based on the first search query and the second search query.”
On the other hand, in the same field of endeavor, Kennewick teaches 
determine, using a machine learning algorithm having inputs of the first search query and the second search query, a score indicative of a probability that the second search query is a refinement of the first search query (Fig. 2, Fig. 3; [0066]: The dialog system may compute the scores using a machine learning model such as a logistic regression model or a neural network... the output of the model is a score indicating a probability that a system-supported intent corresponds to the intent of the speech input received at 202); and 
generate, in response to the score exceeding a threshold, second search results based on the first search query and the second search query (Fig. 2, step 216; Fig. 3; [0068]-[0070]: The dialog system may compare the score for each intent, as computed at 302, to the threshold value identified at 304, until finding that the score for a particular intent, of the plurality of potential intents, exceeds the threshold value… Causing display of the visual indication of the initial discerned intent is triggered by the determination that the score for a particular intent exceeds the threshold value at 306. See also para [0009]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Sreedhara to incorporate the teachings of Kennewick to include “determine, using a machine learning algorithm having inputs of the first search query and the second search query, a score indicative of a probability that the second search query is a refinement of the first search query; and generate, in response to the score exceeding a threshold, second search results based on the first search query and the second search query.”
The motivation for doing so would be to determine the initial intent of the user, as recognized by Kennewick ([0009] of Kennewick: In some aspects, determining the initial discerned intent comprises computing, by the dialog system, a plurality of scores for a respective plurality of potential intents and the first portion of the speech input and determining, by the dialog system, that the score for a particular intent, of the plurality of potential intents, exceeds a threshold value).
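For illustration only, the limitation at issue (a score computed from both queries, compared against a threshold, gating whether the second search uses both queries) can be sketched as below. The sketch is not taken from Sreedhara, Kennewick, or the application. The names `toy_refinement_score` and `build_second_search`, the two hand-made features, their weights, and the threshold of 0.5 are all hypothetical; the scoring function is a fixed logistic stand-in for whatever trained model would actually be used.

```python
import math
from typing import Callable


def toy_refinement_score(first_query: str, second_query: str) -> float:
    """Hypothetical stand-in for a trained model.

    Returns a value in (0, 1) read as the probability that the second
    query refines the first. Features and weights are invented for
    illustration; a real system would learn them from data.
    """
    first_terms = set(first_query.lower().split())
    second_terms = set(second_query.lower().split())
    # Feature 1: a short follow-up ("red", "size ten") looks like a refinement.
    is_short = 1.0 if len(second_terms) <= 3 else 0.0
    # Feature 2: fraction of second-query terms not present in the first query.
    new_fraction = len(second_terms - first_terms) / max(len(second_terms), 1)
    z = -1.5 + 2.0 * is_short + 1.0 * new_fraction
    return 1.0 / (1.0 + math.exp(-z))  # logistic function


def build_second_search(
    first_query: str,
    second_query: str,
    score_fn: Callable[[str, str], float] = toy_refinement_score,
    threshold: float = 0.5,
) -> str:
    """Return the query text used to generate the second search results."""
    score = score_fn(first_query, second_query)
    if score > threshold:
        # Score exceeds the threshold: treat as a refinement, use both queries.
        return f"{first_query} {second_query}"
    # Otherwise treat the second query as a new, standalone search.
    return second_query


if __name__ == "__main__":
    print(build_second_search("running shoes", "red"))
    print(build_second_search("running shoes",
                              "show me the weather forecast for tomorrow"))
```

Passing the scorer as a parameter keeps the threshold comparison separate from the model, so the same gating logic applies whether the score comes from a logistic regression model, a neural network, or another classifier.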

Regarding Claim 2, the combined teachings of Sreedhara and Kennewick disclose the system of claim 1.
Sreedhara further teaches wherein the computer-executable instructions further cause the processor to: prior to receiving the second voice input data, receive contextual information related to one or more user actions with respect to the item database ([0016]: In some embodiments, the media guidance application captures, via the user input device, between the first time and the second time, an image of the face of a user. See also para [0019]-[0024]).
Additionally, Kennewick teaches wherein determining the score further comprises using the machine learning algorithm having the contextual information as an additional input ([0026]: In some embodiments, the dialog system continuously presents its understanding of context as the user speaks one or more sentences; [0066]: The dialog system may compute the scores using a machine learning model… Such a model may have previously been trained on datasets pairing sample text inputs to corresponding intents. In some embodiments, the output of the model is a score indicating a probability that a system-supported intent corresponds to the intent of the speech input received at 202).

Regarding Claim 3, the combined teachings of Sreedhara and Kennewick disclose the system of claim 2. 
Kennewick further teaches wherein the contextual information comprises at least one of: historical search queries for searching the item database; screen context of a view of the item database displaying the first search results; or a time of the first voice input data and a time of the second voice input data (Fig. 6; [0100]: For example, cloud infrastructure system 602 uses historical context to influence dialog tasks).

Regarding Claim 4, the combined teachings of Sreedhara and Kennewick disclose the system of claim 1.  
Kennewick further teaches wherein the machine learning algorithm comprises a Bidirectional Encoder Representations from Transformers (BERT) algorithm ([0066]: The dialog system may compute the scores using a machine learning model such as a logistic regression model or a neural network. Specific examples of suitable models include… Bidirectional Encoder Representations from Transformers (BERT)).

Regarding Claim 5, Sreedhara discloses a computer-implemented method, comprising:
receiving first input data associated with a first search query (Fig. 7, [0175]: Process 700 begins at 702, where control circuitry 504 receives, via a user input device (e.g., user input interface 510, wireless communications device 606), first speech (e.g., first speech 106));
generating first search results responsive to the first search query ([0178]: Process 700 continues to 706, where control circuitry 504 retrieves, from a database (e.g., from media content source 616 or media guidance data source 618 through communications network 614, or from storage 508) search results (e.g., search results 112) based on the first input);
receiving second input data associated with a voice request (Fig. 7; [0181]: Process 700 continues to 712, where control circuitry 504 receives, via the user input device, subsequent to receiving the first speech, second speech (e.g., second speech 116));
generating a second search query by processing the voice request with a natural language processing algorithm (Fig. 7; [0182]: Process 700 continues to 714, where control circuitry 504 determines, using automatic speech recognition (ASR), a second input (e.g., second input 120) based on the second speech. For example, control circuitry 504 may determine the second input by converting the second speech to text using known automatic speech recognition techniques);
However, Sreedhara does not explicitly teach “determining, using a machine learning algorithm having inputs of the first search query and the second search query, a score indicative of a probability that the second search query is a refinement of the first search query; and generating, in response to the score exceeding a threshold, second search results based on the first search query and the second search query.”
On the other hand, in the same field of endeavor, Kennewick teaches
determining, using a machine learning algorithm having inputs of the first search query and the second search query, a score indicative of a probability that the second search query is a refinement of the first search query (Fig. 2, Fig. 3; [0066]: The dialog system may compute the scores using a machine learning model such as a logistic regression model or a neural network... the output of the model is a score indicating a probability that a system-supported intent corresponds to the intent of the speech input received at 202); and 
generating, in response to the score exceeding a threshold, second search results based on the first search query and the second search query (Fig. 2, step 216; Fig. 3; [0068]-[0070]: The dialog system may compare the score for each intent, as computed at 302, to the threshold value identified at 304, until finding that the score for a particular intent, of the plurality of potential intents, exceeds the threshold value… Causing display of the visual indication of the initial discerned intent is triggered by the determination that the score for a particular intent exceeds the threshold value at 306. See also para [0009]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Sreedhara to incorporate the teachings of Kennewick to include “determining, using a machine learning algorithm having inputs of the first search query and the second search query, a score indicative of a probability that the second search query is a refinement of the first search query; and generating, in response to the score exceeding a threshold, second search results based on the first search query and the second search query.”
The motivation for doing so would be to determine the initial intent of the user, as recognized by Kennewick ([0009] of Kennewick: In some aspects, determining the initial discerned intent comprises computing, by the dialog system, a plurality of scores for a respective plurality of potential intents and the first portion of the speech input and determining, by the dialog system, that the score for a particular intent, of the plurality of potential intents, exceeds a threshold value).

Regarding Claim 6, the combined teachings of Sreedhara and Kennewick disclose the computer-implemented method of claim 5.
Sreedhara further teaches wherein generating the second search results comprises filtering a subset of the first search results based on the second search query ([0078]: Second speech 116 may lack an explicit indication whether the user intends to correct an error in first input 108 with second speech 116, or whether the user intends, for example, to begin a new search or filter the previously presented search results 112 with second speech 116).

Regarding Claim 7, the combined teachings of Sreedhara and Kennewick disclose the computer-implemented method of claim 5.
Kennewick further teaches wherein generating the second search results comprises performing a new search using both the first search query and the second search query (Fig. 2; [0062]-[0063]: At 214, if feedback does not suggest modifying the initial discerned intent at 208, then the dialog system processes a second portion of the speech input in connection with the initial discerned intent… At 216, the dialog system provides a response).
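For illustration only, the two alternatives recited in claims 6 and 7 (filtering the first search results with the second query, versus running a new search with both queries) can be contrasted as below. This is a minimal sketch over a hypothetical in-memory catalog of item titles; the helper names `matches`, `search`, `filter_first_results`, and `new_combined_search` and the exact-term matching rule are assumptions made for the example and are not drawn from either reference.

```python
from typing import List


def matches(item: str, query: str) -> bool:
    """True if every whitespace-separated query term appears as a term in the item."""
    item_terms = set(item.lower().split())
    return all(term in item_terms for term in query.lower().split())


def search(catalog: List[str], query: str) -> List[str]:
    """Search the whole (hypothetical) catalog for items matching the query."""
    return [item for item in catalog if matches(item, query)]


def filter_first_results(first_results: List[str], second_query: str) -> List[str]:
    """Claim 6 style: narrow the already-retrieved first results."""
    return [item for item in first_results if matches(item, second_query)]


def new_combined_search(catalog: List[str], first_query: str,
                        second_query: str) -> List[str]:
    """Claim 7 style: search the catalog again using both queries together."""
    return search(catalog, f"{first_query} {second_query}")


if __name__ == "__main__":
    catalog = ["red running shoes", "blue running shoes", "red rain jacket"]
    first_results = search(catalog, "running shoes")
    print(filter_first_results(first_results, "red"))
    print(new_combined_search(catalog, "running shoes", "red"))
```

With exact-term matching over a small catalog the two approaches return the same items. They can diverge when the first results were truncated or ranked before display, since filtering only sees what was already retrieved while a new search consults the full catalog.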

Regarding Claim 8, the combined teachings of Sreedhara and Kennewick disclose the computer-implemented method of claim 5.
Sreedhara further teaches wherein the first input data is at least one of: a voice input; or a typed input ([0074]: For example, the media guidance application may determine first input 108 by converting first speech 106 to text using known automatic speech recognition techniques; [0146]: A user may send instructions to control circuitry 504 using user input interface 510. User input interface 510 may be any suitable user interface, such as a… voice recognition interface, or other user input interfaces).

Regarding Claim 9, the combined teachings of Sreedhara and Kennewick disclose the computer-implemented method of claim 5.
Sreedhara further teaches the method further comprising: prior to receiving the second voice input data, receiving contextual information related to one or more user actions with respect to the item database ([0016]: In some embodiments, the media guidance application captures, via the user input device, between the first time and the second time, an image of the face of a user. See also para [0019]-[0024]).
Additionally, Kennewick teaches wherein determining the score further comprises using the machine learning algorithm having the contextual information as an additional input ([0026]: In some embodiments, the dialog system continuously presents its understanding of context as the user speaks one or more sentences; [0066]: The dialog system may compute the scores using a machine learning model… Such a model may have previously been trained on datasets pairing sample text inputs to corresponding intents. In some embodiments, the output of the model is a score indicating a probability that a system-supported intent corresponds to the intent of the speech input received at 202).

Regarding Claim 10, the combined teachings of Sreedhara and Kennewick disclose the computer-implemented method of claim 9. 
Kennewick further teaches wherein the contextual information comprises: historical search queries for searching the item database; screen context of a view of the item database displaying the first search results; browsing history of the user device; or a time of the first input data and a time of the second input data (Fig. 6; [0100]: For example, cloud infrastructure system 602 uses historical context to influence dialog tasks).

Regarding Claim 11, the combined teachings of Sreedhara and Kennewick disclose the computer-implemented method of claim 5.
Sreedhara further teaches wherein: the first search query comprises an item class ([0095]: For example, if first input 108 is “Show me Sox games,” the media guidance application may display search results 112 for both the Boston Red Sox and the Chicago White Sox (both sports teams)); and generating the second search results comprises determining that the second search query comprises an item property of a subset of the item class ([0095]: The media guidance application may also present to the user a disambiguating question, such as “Did you mean the Boston Red Sox or the Chicago White Sox?” The user may respond to this disambiguating question with second speech 116, such as “Boston Red Sox.”).

Regarding Claim 12, the combined teachings of Sreedhara and Kennewick disclose the computer-implemented method of claim 5.
Sreedhara further teaches wherein the first input data comprises an initial voice request ([0074]: For example, the media guidance application may determine first input 108 by converting first speech 106 to text using known automatic speech recognition techniques), the method further comprising:
identifying, by processing the initial voice request with the natural language processing algorithm, an item identifier associated with an item class ([0095]: For example, if first input 108 is “Show me Sox games,” the media guidance application may display search results 112 for both the Boston Red Sox and the Chicago White Sox (both sports teams)); and
determining that a search term of the second search query is associated with a filter category related to the item class ([0095]: the media guidance application may automatically consider, regardless of time difference 118 between first time 110 and second time 114, that second speech 116 should not be used to correct first input 108, but rather should be used to disambiguate it, or filter search results 112), and 
Additionally, Kennewick teaches wherein determining the score comprises having the item class and the filter category as inputs of the machine learning algorithm ([0026]: In some embodiments, the dialog system continuously presents its understanding of context as the user speaks one or more sentences; [0066]: The dialog system may compute the scores using a machine learning model… Such a model may have previously been trained on datasets pairing sample text inputs to corresponding intents. In some embodiments, the output of the model is a score indicating a probability that a system-supported intent corresponds to the intent of the speech input received at 202).

Regarding Claim 13, the combined teachings of Sreedhara and Kennewick disclose the computer-implemented method of claim 5.
Sreedhara further teaches wherein the first search query comprises first search terms derived from the first input data ([0074]: For example, the media guidance application may determine first input 108 by converting first speech 106 to text using known automatic speech recognition techniques. See also para [0075]), and the second search query comprises second search terms derived from the second input data ([0079]: For example, the media guidance application may determine second input 120 by converting second speech 116 to text using known automatic speech recognition techniques).

Regarding Claim 14, the combined teachings of Sreedhara and Kennewick disclose the computer-implemented method of claim 5.
Kennewick further teaches the method further comprising: receiving third input data associated with an additional voice request; generating a third search query by processing the additional voice request with the natural language processing algorithm (Fig. 2; [0054]: At 208… The feedback may be part of the continuous stream of speech input received (e.g., a second portion of the speech input, a third portion of the speech input, etc.));
determining, using a machine learning algorithm having inputs of the first search query, the second search query, and the third search query, a second score indicative of a second probability that the third search query is a refinement of the first search query and the second search query (Fig. 3; [0066]: At 302, the dialog system computes a plurality of scores for a respective plurality of potential intents and the first portion of the speech input. The dialog system may compute the scores using a machine learning model); and 
generating, in response to the second score exceeding the threshold, third search results based on the first search query, the second search query, and the third search query (Fig. 2, Fig. 3; [0070]: After 308, the dialog system may cause display of the visual indication as described above at 206. Causing display of the visual indication of the initial discerned intent is triggered by the determination that the score for a particular intent exceeds the threshold value at 306).

Regarding Claim 15, Sreedhara discloses a system, comprising: 
a memory configured to store computer-executable instructions (Fig. 5; [0147]: instructions of the application are stored locally (e.g., in storage 508)); and
a processor configured to access the memory and execute the computer-executable instructions (Fig. 5; [0142]: Control circuitry 504 may be based on any suitable processing circuitry such as processing circuitry 506) to at least:
receive first input data associated with a first search query (Fig. 7, [0175]: Process 700 begins at 702, where control circuitry 504 receives, via a user input device (e.g., user input interface 510, wireless communications device 606), first speech (e.g., first speech 106));
generate first search results responsive to the first search query ([0178]: Process 700 continues to 706, where control circuitry 504 retrieves, from a database (e.g., from media content source 616 or media guidance data source 618 through communications network 614, or from storage 508) search results (e.g., search results 112) based on the first input);
receive second input data associated with a voice request (Fig. 7; [0181]: Process 700 continues to 712, where control circuitry 504 receives, via the user input device, subsequent to receiving the first speech, second speech (e.g., second speech 116));
generate a second search query by processing the voice request with a natural language processing algorithm (Fig. 7; [0182]: Process 700 continues to 714, where control circuitry 504 determines, using automatic speech recognition (ASR), a second input (e.g., second input 120) based on the second speech. For example, control circuitry 504 may determine the second input by converting the second speech to text using known automatic speech recognition techniques);
However, Sreedhara does not explicitly teach “determine, using a machine learning algorithm having inputs of the first search query and the second search query, a score indicative of a probability that the second search query is a refinement of the first search query; and generate, in response to the score exceeding a threshold, second search results based on the second search query.”
On the other hand, in the same field of endeavor, Kennewick teaches
determine, using a machine learning algorithm having inputs of the first search query and the second search query, a score indicative of a probability that the second search query is a refinement of the first search query (Fig. 2, Fig. 3; [0066]: The dialog system may compute the scores using a machine learning model such as a logistic regression model or a neural network... the output of the model is a score indicating a probability that a system-supported intent corresponds to the intent of the speech input received at 202); and 
generate, in response to the score exceeding a threshold, second search results based on the second search query (Fig. 2, step 216; Fig. 3; [0068]-[0070]: The dialog system may compare the score for each intent, as computed at 302, to the threshold value identified at 304, until finding that the score for a particular intent, of the plurality of potential intents, exceeds the threshold value… Causing display of the visual indication of the initial discerned intent is triggered by the determination that the score for a particular intent exceeds the threshold value at 306. See also para [0009]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Sreedhara to incorporate the teachings of Kennewick to include “determine, using a machine learning algorithm having inputs of the first search query and the second search query, a score indicative of a probability that the second search query is a refinement of the first search query; and generate, in response to the score exceeding a threshold, second search results based on the second search query.”
The motivation for doing so would be to determine the initial intent of the user, as recognized by Kennewick ([0009] of Kennewick: In some aspects, determining the initial discerned intent comprises computing, by the dialog system, a plurality of scores for a respective plurality of potential intents and the first portion of the speech input and determining, by the dialog system, that the score for a particular intent, of the plurality of potential intents, exceeds a threshold value).

Regarding Claim 16, the combined teachings of Sreedhara and Kennewick disclose the system of claim 15.
Sreedhara further teaches wherein the computer-executable instructions to generate the second search results comprise further instructions that, when executed, cause the processor to filter a subset of the first search results based on the second search query ([0078]: Second speech 116 may lack an explicit indication whether the user intends to correct an error in first input 108 with second speech 116, or whether the user intends, for example, to begin a new search or filter the previously presented search results 112 with second speech 116).

Regarding Claim 17, the combined teachings of Sreedhara and Kennewick disclose the system of claim 15.
Kennewick further teaches wherein the computer-executable instructions to generate the second search results comprise further instructions that, when executed, cause the processor to perform a new search using both the first search query and the second search query (Fig. 2; [0062]-[0063]: At 214, if feedback does not suggest modifying the initial discerned intent at 208, then the dialog system processes a second portion of the speech input in connection with the initial discerned intent… At 216, the dialog system provides a response or performs an action based upon the initial discerned intent).

Regarding Claim 18, the combined teachings of Sreedhara and Kennewick disclose the system of claim 15.
Sreedhara further teaches wherein the computer-executable instructions to generate the second search results cause the processor to refine the first search results to only include items related to both the first search query and the second search query ([0095]: In a case where the media guidance application presented a disambiguating question, the media guidance application may automatically consider, regardless of time difference 118 between first time 110 and second time 114, that second speech 116 should not be used to correct first input 108, but rather should be used to disambiguate it, or filter search results 112).

Regarding Claim 19, the combined teachings of Sreedhara and Kennewick disclose the system of claim 15.
Sreedhara further teaches wherein the computer-executable instructions to generate the second search results comprise further instructions that, when executed, cause the processor to: determine a subset of the first search results associated with the second search query; and present the subset of the first search results as the second search results (Fig. 1; [0078]: Second speech 116 may lack an explicit indication whether the user intends to correct an error in first input 108 with second speech 116, or whether the user intends, for example, to begin a new search or filter the previously presented search results 112 with second speech 116).

Regarding Claim 20, the combined teachings of Sreedhara and Kennewick disclose the system of claim 15.
Kennewick further teaches wherein the machine learning algorithm comprises a transformer-based machine learning algorithm ([0066]: The dialog system may compute the scores using a machine learning model such as a logistic regression model or a neural network. Specific examples of suitable models include… Bidirectional Encoder Representations from Transformers (BERT)).




Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHIRLEY D. HICKS whose telephone number is (571) 272-3304. The examiner can normally be reached on Mon - Fri 7:30 - 4:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fred Ehichioya, can be reached at (571) 272-4034. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/S.D.H./Examiner, Art Unit 2168


/MICHELLE N OWYANG/Primary Examiner, Art Unit 2168