Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is responsive to the Application filed in the U.S. on 11/26/2019. Claims 1-28 are pending in the case. Claims 1, 9, 17, and 25 are written in independent form.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 4, 8, 9, 12, 16, 17, 20, 24, and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Sordoni et al. (U.S. Pre-Grant Publication No. 2017/0351663, hereinafter referred to as Sordoni), and further in view of Baughman et al. (U.S. Pre-Grant Publication No. 2016/0240095, hereinafter referred to as Baughman).

Regarding Claim 1:
Sordoni teaches a method comprising:
receiving a user query at a multi-step question answering system from a user;
Sordoni teaches “a question relating to the text in a document is received from a client-computing device” (Para. [0035] & Fig. 3 Element 300).
performing, using at least one processor of the multi-step question answering system, multiple rounds of an answer generation process,
Sordoni teaches a multi-step question answering system with multiple rounds of an answer generation process by teaching processing multiple time-steps (Figure 4B Elements 430 and 440) for the query and text glimpses produced in steps 405 and 410 of Figure 4A, respectively (Para. [0040]).
wherein each round of the answer generation process comprises:
selecting one of multiple functions to be performed based on an input state,
Sordoni teaches “the query glimpse 520 and the text glimpse 610 produced at the time-step t are processed…to produce a result 705….[which] may be a set of candidate answers or information that is not useful or relevant to predicting an answer to the question” (Para. [0050]). Sordoni further teaches multiple functions to be performed based on the input query glimpse and text glimpse and the associated time-step by teaching “a determination is made as to whether the results produced by the processing of the query and text glimpses results in a set of candidate answers” which determines whether the result is discarded at step 435 or candidate answers are stored at step 425 (Para. [0041] & Fig. 4).
the input state for each round comprising an embedding of the user query in a feature space, the input state for at least one round further comprising an embedding of information to be used to identify an answer to the user query in the feature space; and
Sordoni teaches “the question 500 is a sequence of words…drawn from a vocabulary V” where “each word can be represented by a continuous word embedding…that is stored in a word embedding matrix” (Para. [0046]). Sordoni further teaches “processing, at a first time-step, a question relating to a document to produce a query glimpse, and processing, at the first time-step, one or more passages of the text in the document to produce a text glimpse” (Para. [0006]) thereby teaching an input state comprising an embedding of document information used to identify an answer to the question.
performing the selected function, wherein the multiple functions include (i) an answer generation function that produces the answer to the user query; and
Sordoni teaches a first of multiple functions by teaching determining and storing candidate answers for the user query (Para. [0041] and Fig. 4A Steps 420 & 425). 
providing the answer to the user.
Sordoni teaches “the predicted answer is provided to the client-computing device” (Para. [0037]).

Sordoni explicitly teaches all of the elements of the claimed invention as stated above except:
performing the selected function, wherein the multiple functions include (ii) at least one additional function that updates the input state for the current round of the answer generation process for use during a subsequent round of the answer generation process;

However, in the related field of endeavor of generating answers to a question by a user, Baughman teaches:
performing the selected function, wherein the multiple functions include (ii) at least one additional function that updates the input state for the current round of the answer generation process for use during a subsequent round of the answer generation process;
Baughman teaches a first of multiple functions by teaching “the mechanism determines whether to generate deepening questions from the local evidence” where “if the mechanism determines to generate deepening questions in block 909, the mechanism generates a function call for each generated deepening question including the question and non-local context evidence” and performing another round of the answer generation process (Para. [0114] & Fig. 9).
Baughman teaches a second of the multiple functions by teaching identifying “candidate answers that surface as being significantly stronger than others and thus, generates a final answer, or ranked set of answers, for the input question” (Para. [0036]).

Thus it would have been obvious to a person having ordinary skill in the art, having the teachings of Baughman and Sordoni at the time that the claimed invention was effectively filed, to have combined the updating of the status of the input state at each iteration, as taught by Baughman, with the system and method for iteratively performing a search to predict an answer to a question, as taught by Sordoni.
One would have been motivated to make such combination because Baughman teaches iteratively updating the input state of an original question by adding context evidence to a question at each iteration to deepen the question and provide further results to each question that has been deepened at each iteration (Paras. [0080]-[0081]) and it would have been obvious to a person having ordinary skill in the art that further deepening a question using context evidence to provide additional results would improve the ability of answering the user’s question.
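For illustration only, and not as a characterization of the disclosure of Sordoni, Baughman, or the present application, the claimed arrangement of rounds, in which each round selects either an answer generation function or a function that updates the input state for a later round, can be sketched as follows. All names (InputState, select_function, update_state, generate_answer, run_rounds) and the keyword-overlap rule are hypothetical assumptions chosen to keep the sketch small; plain strings stand in for the claimed embeddings.

```python
from dataclasses import dataclass, field


@dataclass
class InputState:
    """Hypothetical input state: the query plus any evidence gathered so far."""
    query: str
    evidence: list = field(default_factory=list)


def select_function(state, rounds_left):
    """Hypothetical selector: answer once evidence exists or no rounds remain."""
    return "answer" if state.evidence or rounds_left == 0 else "update"


def update_state(state, corpus):
    """Hypothetical state-updating function: attach passages sharing a query word."""
    terms = set(state.query.lower().split())
    hits = [p for p in corpus if terms & set(p.lower().split())]
    return InputState(state.query, state.evidence + hits)


def generate_answer(state):
    """Hypothetical answer function: return the first piece of evidence, if any."""
    return state.evidence[0] if state.evidence else None


def run_rounds(query, corpus, max_rounds=3):
    """Run up to max_rounds rounds, selecting one function per round."""
    state = InputState(query)
    for round_index in range(max_rounds):
        rounds_left = max_rounds - round_index - 1
        if select_function(state, rounds_left) == "answer":
            return generate_answer(state)
        # The updated state is carried into the subsequent round.
        state = update_state(state, corpus)
    return generate_answer(state)
```

In this sketch the first round updates the state and the second round produces the answer; with a single permitted round the selector is forced to answer from an empty state and returns nothing.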

Regarding Claim 4:
Sordoni and Baughman further teach:
the embedding of the information to be used to identify the answer defines a document context associated with a document containing the answer; and
Sordoni further teaches “processing, at a first time-step, a question relating to a document to produce a query glimpse, and processing, at the first time-step, one or more passages of the text in the document to produce a text glimpse” (Para. [0006]) thereby teaching an input state comprising an embedding of document information used to identify an answer to the question.
the at least one additional function comprises:
a function that selects specific sentences from the document context to produce an updated document context; and
Sordoni teaches selecting specific sentences from the document context to produce an updated document context by teaching “processing one or more additional passages of text in the document to produce an additional text glimpse, the additional text glimpse including one or more different entities in a different portion of the text that are relevant to answering the question” (Para. [0006]).
a function that excludes an incorrect answer from the document context to produce the updated document context.
Sordoni teaches at block 420 “if a set of candidate answers is not produced at block 415 (e.g., result is information or data that is not relevant or useful to predicting an answer), the result is discarded at block 435 and the process passes to block 430” (Para. [0041]).

Regarding Claim 8:
Baughman and Sordoni further teach wherein the answer generation function comprises:
a context-query attention layer configured to generate context-query attention information identifying how words or tokens in a document embedding relate to words or tokens in the embedding of the user query, the document embedding associated with the information to be used to identify the answer;
Sordoni teaches “the question 500 is a sequence of words…drawn from a vocabulary V” where “each word can be represented by a continuous word embedding…that is stored in a word embedding matrix” (Para. [0046]). Sordoni further teaches “processing, at a first time-step, a question relating to a document to produce a query glimpse, and processing, at the first time-step, one or more passages of the text in the document to produce a text glimpse” (Para. [0006]) thereby teaching an input state comprising an embedding of document information associated with and used to identify an answer to the question. Sordoni teaches identifying how words or tokens in the document embedding relate to words or tokens in the question embedding by teaching “the third processing circuitry 215 processes the query and the text glimpses to determine if the text glimpse provides a set of candidate answers to the question” (Para. [0031]) where the third processing circuitry 215 is a neural network (Fig. 7 Element 700).
multiple model encoder layers configured to generate representations of the words or tokens in the document embedding based on knowledge of the user query; and
Sordoni teaches an encoding step “where a set of one or more vector representations are computed” as “vector representation(s)…of the content of the question and the text” (Para. [0027]). Sordoni further teaches “the alternating attention mechanism permits the NLSC 200 to reason about different query glimpses in a sequential way based on the text glimpses that were gathered previously from the text” (Para. [0033]) and it would have been obvious to have tried the alternative of reasoning about different text glimpses in a sequential way based on the query glimpses that were gathered previously from the query.
multiple output layers configured to generate probabilities of each word or token in the document embedding representing starting and ending positions of the answer in the information;
Sordoni teaches “each set of candidate answers can include one or more candidates answers and a probability associated with each candidate answer that the candidate answer is the predicted answer” and “the predicted answer is determined based on the probability associated with each candidate answer (e.g., the candidate answer with the highest probability is selected as the predicted answer)” (Para. [0043]). Sordoni further teaches “each word can be represented by a continuous word embedding…that is stored in a word embedding matrix” (Para. [0048]) and “in non-limiting examples, the output device is a display that displays the predicted answer and/or a speaker that ‘speaks’ the predicted answer (e.g., using a text-to-speech device (TTS) 120)” (Para. [0024]) thereby teaching having starting and ending positions of the predicted answer comprising candidate answers with probabilities high enough to be determined as the predicted answer.
wherein the answer is based on the word or token having a highest starting probability and the word or token having the highest ending probability.
Sordoni teaches “each word can be represented by a continuous word embedding…that is stored in a word embedding matrix” (Para. [0048]) where the probability of the answer is predicted given the text and the question (Para. [0052]) thereby teaching that the predicted answer has a highest probability associated with the starting and ending of the predicted answer.
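For illustration only, the final limitation of Claim 8 (an answer based on the token having the highest starting probability and the token having the highest ending probability) can be sketched as below. The function name extract_span, the toy probabilities, and the choice to return an empty result when the ending token precedes the starting token are assumptions made for this sketch; they are not taken from Sordoni or from the present application.

```python
def extract_span(tokens, start_probs, end_probs):
    """Return the tokens between the most probable start and end positions."""
    # Index of the token with the highest starting probability.
    start = max(range(len(tokens)), key=lambda i: start_probs[i])
    # Index of the token with the highest ending probability.
    end = max(range(len(tokens)), key=lambda i: end_probs[i])
    if end < start:
        # Assumption: an inconsistent pair of positions yields no answer.
        return []
    return tokens[start:end + 1]


tokens = ["the", "capital", "is", "Paris", "France"]
start_probs = [0.05, 0.05, 0.10, 0.70, 0.10]
end_probs = [0.05, 0.05, 0.05, 0.25, 0.60]
answer = extract_span(tokens, start_probs, end_probs)
```

Here the starting position is the fourth token and the ending position is the fifth, so the extracted answer spans those two tokens.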

Regarding Claim 9:
Some of the limitations herein are similar to some or all of the limitations of Claim 1.

Sordoni and Baughman further teach an apparatus comprising:
at least one memory (Sordoni – Para. [0056]); and
at least one processor operatively coupled to the at least one memory and configured to perform steps (Sordoni – Para. [0056]).

Regarding Claim 12:
All of the limitations herein are similar to some or all of the limitations of Claim 4.

Regarding Claim 16:
All of the limitations herein are similar to some or all of the limitations of Claim 8.

Regarding Claim 17:
Some of the limitations herein are similar to some or all of the limitations of Claim 1.

Sordoni and Baughman further teach:
a non-transitory machine-readable medium containing instructions that when executed cause at least one processor to perform steps (Sordoni - Paras. [0056] & [0061]).

Regarding Claim 20:
All of the limitations herein are similar to some or all of the limitations of Claim 4.

Regarding Claim 24:
All of the limitations herein are similar to some or all of the limitations of Claim 8.

Regarding Claim 25:
All of the limitations herein are similar to some or all of the limitations of Claim 1.

Sordoni and Baughman further teach:
training an action selection function of a multi-step question answering system
Sordoni teaches training and evaluating the natural language comprehension system (NLCS) with “a sufficient amount of training and text data…obtained without human intervention” (Para. [0053]).



Claims 2, 3, 10, 11, 18, 19, and 26-28 are rejected under 35 U.S.C. 103 as being unpatentable over Sordoni and Baughman, and further in view of Matsuoka et al. (U.S. Pre-Grant Publication No. 2020/0167834, hereinafter referred to as Matsuoka).

Regarding Claim 2:
Sordoni and Baughman explicitly teach all of the elements of the claimed invention as stated above except:
wherein selecting one of the multiple functions in each round of the answer generation process comprises using a deep reinforcement learning-based model to select one of the functions.

However, in the related field of endeavor of answering users’ questions, Matsuoka teaches:
wherein selecting one of the multiple functions in each round of the answer generation process comprises using a deep reinforcement learning-based model to select one of the functions.
Matsuoka teaches using a deep reinforcement learning-based model as a machine-learned model by teaching “the machine-learned model can perform or be subject to one or more reinforcement learning techniques, such as…deep Q-networks…[and] actor-critics” (Para. [0119]).
Sordoni teaches “the query glimpse 520 and the text glimpse 610 produced at the time-step t are processed…to produce a result 705….[which] may be a set of candidate answers or information that is not useful or relevant to predicting an answer to the question” (Para. [0050]). Sordoni further teaches multiple functions to be performed based on the input query glimpse and text glimpse and the associated time-step by teaching “a determination is made as to whether the results produced by the processing of the query and text glimpses results in a set of candidate answers” which determines whether the result is discarded at step 435 or candidate answers are stored at step 425 (Para. [0041] & Fig. 4).
Therefore, Matsuoka in combination with Sordoni teaches selecting one of multiple functions in each round of the answer generation process (Sordoni) using a deep reinforcement learning-based model (Matsuoka).



Thus it would have been obvious to a person having ordinary skill in the art, having the teachings of Matsuoka, Baughman and Sordoni at the time that the claimed invention was effectively filed, to have combined the deep reinforcement learning techniques, as taught by Matsuoka, with the updating of the status of the input state at each iteration, as taught by Baughman, and the system and method for iteratively performing a search to predict an answer to a question, as taught by Sordoni.
One would have been motivated to make such combination because Matsuoka teaches “the reinforcement learning algorithm may use the compliance metric to learn and provide better future recommendations to other users” (Para. [0098]) where the “recommendations” are answers to questions from users (Abstract) and it would have been obvious to a person having ordinary skill in the art that a user of the systems and methods taught by Sordoni and Baughman would benefit by a question-answering system that continued to learn and provide improved answers to questions.



Regarding Claim 3:
Matsuoka, Baughman, and Sordoni further teach:
the deep reinforcement learning-based model is trained using actor-critic reinforcement learning to select, in each round of the answer generation process, the function that maximized an expected reward, different ones of the functions associated with different rewards; and
Matsuoka teaches “the machine-learned model can perform or be subject to one or more reinforcement learning techniques, such as…deep Q-networks…[and] actor-critics” (Para. [0119]) where “in reinforcement learning, an agent (e.g., model) can take actions in an environment and learn to maximize rewards and/or minimize penalties that result from such actions” (Para. [0152]).
the actor-critic reinforcement learning involves an actor neural network and a critic neural network that are trained using different loss functions.
Matsuoka teaches “the machine-learned model can perform or be subject to one or more reinforcement learning techniques, such as…deep Q-networks…[and] actor-critics” (Para. [0119]) where “in reinforcement learning, an agent (e.g., model) can take actions in an environment and learn to maximize rewards and/or minimize penalties that result from such actions” (Para. [0152]).
Matsuoka further teaches “the machine-learned model can be trained by optimizing an object function…[which] can include a loss function that compares (e.g., determines a difference between) output data generated by the model from the training data and labels (e.g., ground-truth labels) associated with the training data” (Para. [0149]).
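For illustration only, the Claim 3 limitation of an actor and a critic trained using different loss functions can be sketched for a single transition as below. The formulation shown (a policy-gradient loss for the actor weighted by a one-step temporal-difference advantage, and a squared-error loss for the critic) is a common textbook arrangement assumed for this sketch; it is not asserted to be the formulation disclosed by Matsuoka or by the present application, and the name actor_critic_losses and the discount factor are hypothetical.

```python
import math


def softmax(logits):
    """Convert action scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_losses(logits, action, reward, value, next_value, gamma=0.99):
    """Return (actor_loss, critic_loss) for one observed transition."""
    # One-step temporal-difference target and advantage.
    target = reward + gamma * next_value
    advantage = target - value
    # Actor loss: negative log-probability of the taken action, scaled by advantage.
    probs = softmax(logits)
    actor_loss = -math.log(probs[action]) * advantage
    # Critic loss: squared error between the value estimate and the target.
    critic_loss = advantage ** 2
    return actor_loss, critic_loss


actor_loss, critic_loss = actor_critic_losses(
    logits=[0.0, 0.0], action=0, reward=1.0, value=0.5, next_value=0.0
)
```

The two returned quantities are computed from different expressions, which is the sense in which the two networks would be trained using different loss functions in this sketch.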

Regarding Claim 10:
All of the limitations herein are similar to some or all of the limitations of Claim 2.

Regarding Claim 11:
All of the limitations herein are similar to some or all of the limitations of Claim 3.

Regarding Claim 18:
All of the limitations herein are similar to some or all of the limitations of Claim 2.

Regarding Claim 19:
All of the limitations herein are similar to some or all of the limitations of Claim 3.

Regarding Claim 26:
All of the limitations herein are similar to some or all of the limitations of Claim 2.

Regarding Claim 27:
All of the limitations herein are similar to some or all of the limitations of Claim 3.

Regarding Claim 28:
All of the limitations herein are similar to some or all of the limitations of Claim 3.


Claims 5, 6, 13, 14, 21, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Sordoni and Baughman, and further in view of Costa et al. (U.S. Pre-Grant Publication No. 2006/0288001, hereinafter referred to as Costa).

Regarding Claim 5:
Sordoni and Baughman explicitly teach all of the elements of the claimed invention as stated above except:
wherein the embedding of the information to be used to identify the answer comprises at least one of:
an embedding identifying one or more websites identified based on the user query; and
an embedding identifying one or more domain-specific databases identified based on the user query.

However, in the related field of endeavor of search engines, Costa teaches:
wherein the embedding of the information to be used to identify the answer comprises at least one of:
an embedding identifying one or more websites identified based on the user query; and
Costa teaches “given a user’s particular query, the system (1) consults the representative index; (2) measures relevancies; and (3) determines which are the most adequate search engines and databases to that query” (Para. [0016]) thereby teaching identifying, as a feature of the query, one or more websites and/or databases to be used to identify the answer.
an embedding identifying one or more domain-specific databases identified based on the user query.
Costa teaches “given a user’s particular query, the system (1) consults the representative index; (2) measures relevancies; and (3) determines which are the most adequate search engines and databases to that query” (Para. [0016]) thereby teaching identifying, as a feature of the query, one or more websites and/or databases to be used to identify the answer.

Thus it would have been obvious to a person having ordinary skill in the art, having the teachings of Costa, Baughman and Sordoni at the time that the claimed invention was effectively filed, to have combined the determination of most adequate search engines and databases for a given particular query, as taught by Costa, with the updating of the status of the input state at each iteration, as taught by Baughman, and the system and method for iteratively performing a search to predict an answer to a question, as taught by Sordoni.
One would have been motivated to make such combination because Costa teaches considering multiple search engines that are most adequate for performing a search for a given query (Para. [0016]) because “a complete search requires consideration of the highest number of engines as possible” (Para. [0008]) and therefore a person having ordinary skill in the art would be motivated to modify Baughman and Sordoni to consider more than a single search engine in order to perform a more complete search.


Regarding Claim 6:
Costa, Baughman, and Sordoni further teach:
wherein performing at least one of the rounds of the answer generation process further comprises:
providing an identification of possible sources of information for answering the user query to the user; and
Baughman teaches providing candidate answer(s) for answering the user query to the user in each iteration (Fig. 9 Elements 905-907) regardless of whether or not a deepening question is performed at step 909 (Paras. [0113]-[0114]).
receiving, from the user, feedback indicating whether the user has a preference for using a specific one of the possible sources of information.
Costa teaches using a “user’s implicit feedback” that indicates a preferred search engine and searchable database (Claim 9) where “given a list of results, once the user chooses and clicks on a specific result (search engine), the Search Assistant displays its page of results to the given query dynamically” (Para. [0016]).

Regarding Claim 13:
All of the limitations herein are similar to some or all of the limitations of Claim 5.

Regarding Claim 14:
All of the limitations herein are similar to some or all of the limitations of Claim 6.

Regarding Claim 21:
All of the limitations herein are similar to some or all of the limitations of Claim 5.

Regarding Claim 22:
All of the limitations herein are similar to some or all of the limitations of Claim 6.


Claims 7, 15, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Sordoni and Baughman, and further in view of Chen et al. (U.S. Pre-Grant Publication No. 2017/0124432, hereinafter referred to as Chen).

Regarding Claim 7:
Sordoni and Baughman further teach:
generating each embedding in the input states by:
obtaining word-level embeddings and character-level embeddings that are combined to produce word vectors; and
Sordoni teaches “the question 500 is a sequence of words…drawn from a vocabulary V” where “each word can be represented by a continuous word embedding…that is stored in a word embedding matrix” (Para. [0046]).

Sordoni and Baughman explicitly teach all of the elements of the claimed invention as stated above except:
processing the word vectors using a convolutional neural network layer, a self-attention layer, and a feedforward layer to produce the embeddings in the input states.

However, in the related field of endeavor of visual question and answering, Chen teaches:
processing the word vectors using a convolutional neural network layer, a self-attention layer, and a feedforward layer to produce the embeddings in the input states.
Chen teaches processing word vectors using a convolutional neural network layer and a self-attention layer (Para. [0027] & Fig. 2). Chen further teaches a feedforward layer because a convolutional neural network is a type of feedforward network; by teaching a convolutional neural network layer, Chen thereby teaches a feedforward layer as well.

Thus it would have been obvious to a person having ordinary skill in the art, having the teachings of Chen, Baughman and Sordoni at the time that the claimed invention was effectively filed, to have combined the visual aspect of a question answering system, as taught by Chen, with the updating of the status of the input state at each iteration, as taught by Baughman, and the system and method for iteratively performing a search to predict an answer to a question, as taught by Sordoni.
One would have been motivated to make such combination because Chen teaches a visual question answering system that answers questions about images as opposed to text using neural networks (Para. [0026]), and it would have been obvious to a person having ordinary skill in the art that expanding the question-answering system taught by Sordoni and Baughman to handle questions about images, and not just text, would create a more dynamic question and answering system.

Regarding Claim 15:
All of the limitations herein are similar to some or all of the limitations of Claim 7.

Regarding Claim 23:
All of the limitations herein are similar to some or all of the limitations of Claim 7.




Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Cohn et al. (U.S. Pre-Grant Publication No. 2018/0052915) teaches an iterative natural language search based on a confidence level of the candidate answers as to whether additional context needs to be added to the search or if the candidate answers are sufficient for returning a response to the user.
Gupta et al. (U.S. Pre-Grant Publication No. 2007/0203863) teaches a multipart artificial neural network for classifying a received question according to one or a plurality of defined categories.
Non-Patent Literature Choi et al., "Coarse-to-Fine Question Answering for Long Documents", July 2017, Association for Computational Linguistics, pages 209-220 teaches a question answering framework for scaling to longer documents by combining a coarse, fast model for selecting relevant sentences and a more expensive RNN for producing the answer from those sentences.
Non-Patent Literature Stroh et al. "Question Answering Using Deep Learning", 2016, The Stanford Natural Language Processing Group CS224d: Deep Learning for Natural Language Processing: Reports for 2016 teaches applying several deep learning approaches to question answering, with a focus on the bAbI dataset.



Any inquiry concerning this communication or earlier communications from the examiner should be directed to ROBERT F MAY whose telephone number is (571)272-3195. The examiner can normally be reached Monday-Friday 9:30am to 6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hosain Alam can be reached on 571-272-3978. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ROBERT F MAY/Examiner, Art Unit 2154    6/4/2022

/SYED H HASAN/Primary Examiner, Art Unit 2154