EXPLANATION OF REJECTIONS/OBJECTIONS

Allowable Subject Matter
Claim 17 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-12 and 15 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
In particular, claims 1 and 15 recite the phrase “and/or,” which renders the scope of the associated limitation indefinite. For purposes of examination, the phrase “and/or” will be interpreted as “or.” 
Dependent claims 2-12 incorporate the subject matter of claim 1 and are rejected for the same reason.


Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 13-15 are rejected under 35 U.S.C. 102(a)(1) and (a)(2) as being anticipated by Agarwal et al. (US Pub. 20170308535).
Referring to claim 13, Agarwal discloses a system comprising: 
one or more processors [fig. 1; par. 38; a computer device comprises one or more processing units]; 
a processing module executable by the one or more processors [fig. 1; par. 41; the computer device comprises modules executable by the one or more processing units] to receive a user message from a device and to parse the user message, the user message comprising at least one of: a search query, or a response from a user during an interaction of the user with the system, the interaction is to generate results of the search query [par. 17; a query is received from a user]; 
a search engine to receive the parsed user message, and to generate search results based on the parsed user message [par. 17; results are provided in response to the query]; and 
an artificial intelligence module executable by the one or more processors [fig. 1; par. 27; the computer device is used to implement a computational model system (e.g., a deep neural network and/or multi-model training system)] to receive the parsed user message and the search results [par. 17; note the query and the results], generate a vector representation of a plurality of values corresponding to a plurality of possible actions that the system can take, based at least in part on the parsed user message and the search results [pars. 17-20 and 99-103; computational models are used by the system to respond to the query; the computational models comprise actions associated with different states; note that the computational models are represented using matrices (i.e., a vector representation)], each of the possible actions associated with a corresponding one of the values that indicates a rank of that action relative to the other possible actions [pars. 17-20; each action is associated with a result value (e.g., a score) based on a policy function], and select a first action of the possible actions, based at least in part on a value corresponding to the selected action [pars. 17-21; the system selects actions based on associated result values].
Referring to claim 14, Agarwal discloses wherein the processing module is a natural language processing (NLP) module [par. 17; the query is a natural language query, which requires natural language processing], and wherein processing module is to: receive an indication of the selection of the first action of the possible actions; generate a system message, based on the indication of the selection of the first action; and cause the system message to be transmitted to the device, for displaying on the device [par. 18; example actions include transmitting specific information (e.g., text or hyperlinks) to the user].
Referring to claim 15, Agarwal discloses wherein the system message includes at least one of the search results, a request for more information about the search query, a request to refine the search query, and/or a request to select one of a plurality of categories of results of the search query [par. 18; example actions include transmitting specific information (e.g., text or hyperlinks) to the user].

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-7, 10-12, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Agarwal in view of Cuayahuitl et al. (NPL “Spatially-Aware Dialogue Control Using Hierarchical Reinforcement Learning”).
Referring to claim 1, Agarwal discloses a method for providing an interactive search session, the method comprising: 
receiving, at a search engine, a search query from a device, the search query provided by a user, the search engine configured with a Reinforcement Learning (RL)-based agent programmed to interact with the user, to help the user in refining the search query by providing the user with contextual assistance [pars. 17-21; computational models, which are improved based on result values, are used by a system to respond to a user query by reducing the number of system interactions required to achieve the user’s goals]; 
generating, by the RL-based agent and based at least in part on the search query, a vector representation of a plurality of values corresponding to a plurality of possible actions that the search engine can take in response to the search query [pars. 17-20 and 99-103; the computational models comprise actions associated with different states; note that the computational models are represented using matrices (i.e., a vector representation)], each of the possible actions associated with a corresponding one of the values that indicates a rank of that action relative to the other possible actions [pars. 17-20; each action is associated with a result value (e.g., a score) based on a policy function], wherein a given value encodes a sequential aggregation of one or both agent and user actions in last k cycles of the search session to capture both a local context and a global context, wherein one cycle of the search session includes a first action by the user and a second action by the search engine [pars. 17-21; each computational model corresponds to at least one session during which the system responded to the user with selected actions, where the computational model is determined using data collected from previous interactions between the system and the user], and the local context includes a current cycle and/or a just previous cycle, and the global context includes one or more relatively older historical cycles not reflected in the local context [pars. 17-21; note computational models are determined based on result values from previous computational models, and each computational model is associated with at least one interaction between the system and the user; this means that one or more previous interactions are used to determine the computational models]; 
selecting, by the RL-based agent, an action from the possible actions, based at least in part on the value corresponding to the selected action [pars. 17-21; the system selects actions based on associated result values]; 
transmitting, by the search engine, a message to the device, for displaying on the device, the message based at least in part on the selected action, wherein the message is different from results of the search query and solicits further action from the user [par. 18; example actions include transmitting specific information (e.g., text) to the user or other types of actions (e.g., ordering tickets, connecting to a ticket-purchasing site); each session can include one or more exchanges (i.e., interactions) with the user].
Agarwal does not appear to explicitly disclose refining the search query based at least in part on action by the user that is responsive to the message.
However, Cuayahuitl discloses refining the search query based at least in part on action by the user that is responsive to the message [pg. 5:5; Table I; the second sample dialog shows how a system helps a user refine a query via clarification questions].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the search system taught by Agarwal so that the actions include dialog for helping the user to refine the query as taught by Cuayahuitl. The motivation for doing so would have been to more effectively respond to the user’s query [Agarwal, par. 17].
Referring to claim 2, Agarwal discloses wherein the selected action is a first selected action, the method further comprising: receiving, by the search engine, a user response to the message from the device; generating, by the RL agent and based at least in part on the user response, another vector representation of a plurality of values corresponding to a plurality of possible actions that the search engine can take in response to the user response, each of the actions associated with a corresponding one of said values that reflects one or both agent and user actions in the most recent cycle of the search session; selecting, by the RL agent, a second action from the possible actions, based at least in part on the value corresponding to the second selected action; and transmitting, by the search engine, another message to the device, based at least in part on the second selected action [pars. 17-21; each computational model corresponds to at least one session during which the system responded to a user with selected actions, where the computational model is determined using data collected from previous interactions between the system and the user].
Referring to claim 3, Agarwal discloses iteratively repeating receiving a user response, generating another vector, selecting a corresponding action from the possible actions, and transmitting a corresponding message to the device, until an end of the search session is identified [pars. 17-21; each computational model corresponds to at least one session, and each session includes one or more interactions between the system and the user].
Referring to claim 4, Cuayahuitl discloses wherein the selected action and the message are to request additional information about the search query, or to request to refine the search query [pg. 5:5; Table I; note the second sample dialog].
Referring to claim 5, Agarwal discloses retrieving, by the search engine and from one or more databases, search results responsive to the search query; and providing, by the search engine, the search results to the device to cause display of the search results on the device simultaneously with a display of the message on the device [par. 18; example actions can include transmitting specific information to the user such as text and hyperlinks].
Referring to claim 6, Agarwal discloses wherein the RL-based agent comprises an artificial intelligence model that is trained artificially through a virtual user [par. 78; a test driver (e.g., a code module) can operate a target (i.e., the system) in the absence of an entity (i.e., the user) during a test phase].
Referring to claim 7, Agarwal discloses wherein the RL-based agent operates on an Asynchronous Advantage Actor-Critic (A3C) algorithm that generates an actor output and a critic output, the actor output including the possible actions and the critic output including a state of the search session, the state encoding the agent actions and user actions in the last k cycles of the search session [pars. 17-21; note the actions, states, and result values (i.e., rewards)].
Referring to claim 10, Agarwal discloses iteratively repeating generating a vector and selecting a corresponding action; and defining, at each cycle of the search session, a corresponding state of the RL-based agent, wherein the state at a specific cycle includes one or more of at least a partial history of actions selected so far in the search session, at least a partial history of responses received from the device so far in the search session, and/or a length of the search session so far [pars. 17-21 and 81; the computational models comprise actions associated with different states; each computational model corresponds to at least one session during which the system responded to a user with selected actions, where the computational model is determined using data collected from previous interactions between the system and the user].
Referring to claim 11, Agarwal discloses wherein at least the partial history of actions selected so far in the search session comprises a plurality of action vectors, wherein each action vector is indicative of a corresponding action undertaken during a corresponding cycle of the search session query [pars. 17-20 and 99-103; the computational models comprise actions associated with different states; note that the computational models are represented using matrices (i.e., a vector representation)].
Referring to claim 12, Agarwal discloses wherein at least the partial history of responses received from the device so far in the search session comprises a plurality of user vectors, wherein each user vector is indicative of a corresponding response received during a corresponding cycle of the search session [pars. 17-21 and 99-103; the computational models are updated based on the result values indicating value to the user; note that the computational models are represented using matrices (i.e., a vector representation)].
Referring to claim 18, Agarwal discloses a computer program product including one or more non-transitory machine-readable mediums encoded with instructions that when executed by one or more processors cause a process to be carried out [fig. 1; par. 41; a computer device comprises computer-readable media storing instructions executable by one or more processing units] for causing an interactive search session with a user [fig. 1; pars. 18 and 41; the instructions may comprise running a session between a system and a user in which the system and the user have one or more interactions], the process comprising: 
receiving a search query from a device, the search query provided by a user [par. 17; a query is received from a user]; 
generating, based at least in part on the search query, a plurality of values corresponding to a plurality of possible actions that can be taken in response to the search query, each of the possible actions associated with a corresponding one of the values that indicates a rank of that action relative to the other possible actions [pars. 17-20; computational models are used by the system to respond to the query; the computational models comprise actions associated with different states; each action is associated with a result value (e.g., a score) based on a policy function]; 
selecting an action from the possible actions, based at least in part on the value corresponding to the selected action being a maximum among the plurality of values [pars. 17-21 and 86; the system selects actions based on associated result values (e.g., rank)]; 
transmitting a message to the device, for displaying on the device, the message based at least in part on the selected action [par. 18; example actions include transmitting specific information (e.g., text or hyperlinks) to the user].
Agarwal does not appear to explicitly disclose refining the search query based at least in part on a user response to the message.
However, Cuayahuitl discloses refining the search query based at least in part on a user response to the message [pg. 5:5; Table I; the second sample dialog shows how a system helps a user refine a query via clarification questions].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the search system taught by Agarwal so that the actions include dialog for helping the user to refine the query as taught by Cuayahuitl. The motivation for doing so would have been to more effectively respond to the user’s query [Agarwal, par. 17].
Referring to claim 19, Agarwal discloses receiving the user response to the message; selecting another action of the possible actions, based on the user response; and transmitting another message to the device, for displaying on the device, the another message based at least in part on the selected another action [pars. 17-21; each computational model corresponds to at least one session, and each session includes one or more interactions between the system and the user].
Referring to claim 20, Cuayahuitl discloses iteratively repeating receiving a user response, selecting a corresponding action of the possible actions, and transmitting a corresponding message, to engage in an interactive chat session with the user, the interactive chat session is to at least in part receive contextual cues about the search query [pg. 5:5; Table I; note the second sample dialog].

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Agarwal and Cuayahuitl in view of Sarikaya et al. (US Pub. 20170337478).
Referring to claim 8, Agarwal discloses training the RL-based agent by causing the RL-based agent to interact with a virtual user [par. 78; a test driver (e.g., a code module) can operate a target (i.e., the system) in the absence of an entity (i.e., the user) during a test phase].
Agarwal and Cuayahuitl do not appear to explicitly disclose wherein the virtual user is modeled using conversation history of the RL-based agent with one or more actual users.
However, Sarikaya discloses wherein the virtual user is modeled using conversation history of the RL-based agent with one or more actual users [par. 38; a simulated user (SU) component simulates real user interaction with a personal digital assistant (PDA); the SU component may be trained based on training data which reflects interactions between real users and PDA components].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the search system taught by the combination of Agarwal and Cuayahuitl so that the test driver is trained using real user data as taught by Sarikaya. The motivation for doing so would have been to have an initial set of training data [Sarikaya, par. 136] to build the computational models.

Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Agarwal in view of Sarikaya.
Referring to claim 16, Agarwal discloses wherein the artificial intelligence module implements a Reinforcement Learning (RL) model that operates on an Asynchronous Advantage Actor-Critic (A3C) algorithm [pars. 17-21; note the actions, states, and result values (i.e., rewards)]; and the artificial intelligence module is to train the RL model by causing the RL model to interact with a virtual user [par. 78; a test driver (e.g., a code module) can operate a target (i.e., the system) in the absence of an entity (i.e., the user) during a test phase].
Agarwal does not appear to explicitly disclose wherein the virtual user is modeled using conversation history of the RL model with one or more real users.
However, Sarikaya discloses wherein the virtual user is modeled using conversation history of the RL model with one or more real users [par. 38; a simulated user (SU) component simulates real user interaction with a personal digital assistant (PDA); the SU component may be trained based on training data which reflects interactions between real users and PDA components].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the search system taught by Agarwal so that the test driver is trained using real user data as taught by Sarikaya. The motivation for doing so would have been to have an initial set of training data [Sarikaya, par. 136] to build the computational models.

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GRACE PARK whose telephone number is (571) 270-7727.  The examiner can normally be reached on M-F 8AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JAMES TRUJILLO, can be reached at (571) 272-3677. The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at (866) 217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call (800) 786-9199 (IN USA OR CANADA) or (571) 272-1000.

/Grace Park/
Primary Examiner, Art Unit 2157