DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are presented for examination.

Priority
It is acknowledged that the pending application claims priority to provisional application 63/032,474, filed 29 May 2020. A priority date of 29 May 2020 is accorded.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 19 October 2020 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Claim 1 recites “A computer-implemented method comprising:
determining an index using a machine learning model trained on datasets; and 
determining a score between first data and second data from the index using a sparse vector representation.”
Claim 1 (Step 1): The claim recites “A computer-implemented method comprising…” As drafted, the claimed method is a process, which is a statutory category of invention.
(Step 2A-Prong One) The limitations “determining an index using a machine learning model trained on datasets” and “determining a score between first data and second data from the index using a sparse vector representation,” as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components. That is, other than reciting “computer” and “machine learning model,” nothing in the claim elements precludes the steps from practically being performed in the mind. For example, but for the “computer” and “machine learning model” language, each “determining” step in the context of this claim encompasses a user manually “determining an index using a machine learning model trained on datasets” and “determining a score between first data and second data from the index using a sparse vector representation” in the mind.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.
(Step 2A-Prong Two) This judicial exception is not integrated into a practical application. 
The claim recites additional elements: using a “computer” and a “machine learning model” to perform both “determining” steps. The “computer” and “machine learning model” in these steps are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
(Step 2B) The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. 
As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of using a “computer” and a “machine learning model” to perform both “determining” steps amount to no more than mere instructions to apply the exception using generic computer components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. The claim is not patent eligible.
Thus, these limitations do not amount to significantly more. Even when considered in combination, these additional elements represent mere instructions to apply an exception and insignificant extra-solution activity, which do not provide an inventive concept. The claim is not patent eligible.
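For illustration only, the recited step of “determining a score between first data and second data ... using a sparse vector representation” can be sketched as a dot product over the dimensions the two vectors share. This is a minimal sketch under stated assumptions, not the applicant's disclosed implementation: the dictionary layout and the names sparse_dot, query and candidate are hypothetical and do not appear in the application.

```python
# Hypothetical sketch: each sparse vector is a {dimension: weight} dictionary
# that stores only its nonzero components.
def sparse_dot(first, second):
    """Score two sparse vectors by summing products over shared dimensions."""
    # Iterate over the shorter vector so the work scales with the smaller input.
    if len(first) > len(second):
        first, second = second, first
    return sum(weight * second[dim] for dim, weight in first.items() if dim in second)


# Assumed example data; dimensions 3 and 17 are the only shared ones.
query = {3: 0.5, 17: 2.0}
candidate = {3: 4.0, 8: 1.0, 17: 0.25}
score = sparse_dot(query, candidate)  # 0.5 * 4.0 + 2.0 * 0.25
```

Only shared nonzero dimensions contribute, which is why a sparse representation keeps the arithmetic small.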
For claim 2, which recites “The method of claim 1, wherein the datasets include pairs of data.” which is merely data (e.g. contents) and does not meet any of the categories (MPEP: 2106.03, “Thus, the Federal Circuit has held that a product claim to an intangible collection of information, even if created by human effort, does not fall within any statutory category. Digitech, 758 F.3d at 1350, 111 USPQ2d at 1720 (claimed "device profile" comprising two sets of data did not meet any of the categories because it was neither a process nor a tangible product).”).
For the above reason, the limitation does not change the result of the analysis from the independent claim 1. Therefore, claim 2 is also rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
For claim 3, which recites “The method of claim 2, wherein the pairs of data include questions and answers” which is merely data (e.g. contents) and does not meet any of the categories (MPEP: 2106.03, “Thus, the Federal Circuit has held that a product claim to an intangible collection of information, even if created by human effort, does not fall within any statutory category. Digitech, 758 F.3d at 1350, 111 USPQ2d at 1720 (claimed "device profile" comprising two sets of data did not meet any of the categories because it was neither a process nor a tangible product).”).
For the above reason, the limitation does not change the result of the analysis from the independent claim 1 and dependent claim 2. Therefore, claim 3 is also rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
For claim 4, which recites “The method of claim 1, wherein the first data includes queries and the second data includes answer candidates” which is merely data (e.g. contents) and does not meet any of the categories (MPEP: 2106.03, “Thus, the Federal Circuit has held that a product claim to an intangible collection of information, even if created by human effort, does not fall within any statutory category. Digitech, 758 F.3d at 1350, 111 USPQ2d at 1720 (claimed "device profile" comprising two sets of data did not meet any of the categories because it was neither a process nor a tangible product).”).
For the above reason, the limitation does not change the result of the analysis from the independent claim 1. Therefore, claim 4 is also rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
For claim 5, which recites “The method of claim 1, wherein there is a relationship between the first data and the second data, and the score represents relevance between the first data and the second data.” which is merely data (e.g. contents) and does not meet any of the categories (MPEP: 2106.03, “Thus, the Federal Circuit has held that a product claim to an intangible collection of information, even if created by human effort, does not fall within any statutory category. Digitech, 758 F.3d at 1350, 111 USPQ2d at 1720 (claimed "device profile" comprising two sets of data did not meet any of the categories because it was neither a process nor a tangible product).”).
For the above reason, the limitation does not change the result of the analysis from the independent claim 1. Therefore, claim 5 is also rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
For claim 6, which recites “The method of claim 1, further comprising:
matching the second data to the first data; and 
retrieving the second data in response to input of the first data.”
(Step 2A-Prong One) The limitation “matching the second data to the first data,” as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, “matching” in the context of this claim encompasses a user manually “matching the second data to the first data” in the mind.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.
(Step 2A-Prong Two) This judicial exception is not integrated into a practical application. 
The additional element, “…retrieving the second data in response to input of the first data,” is mere data gathering in the form of insignificant extra-solution activity (MPEP: 2106.05(g), “iv. Obtaining information about transactions using the Internet to verify credit card transactions, CyberSource v. Retail Decisions, Inc.”).
(Step 2B) The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
The limitation is not sufficient to amount to significantly more than the judicial exception because “retrieving” only adds well-understood, routine and conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception. For example, MPEP 2106.05(d)(II), “i. Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec,” and “iv. Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc.”
Thus, the limitation does not amount to significantly more. Even when considered in combination, this additional element represents mere instructions to apply an exception and insignificant extra-solution activity, which does not provide an inventive concept. The claim is not patent eligible.
For claim 7, which recites “The method of claim 1, further comprising: 
determining vector components of the sparse vector representation using an activation function and a bias term.”
(Step 2A-Prong One) The limitation “determining vector components of the sparse vector representation using an activation function and a bias term,” as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, “determining” in the context of this claim encompasses a user manually “determining vector components of the sparse vector representation using an activation function and a bias term” in the mind.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.
(Step 2A-Prong Two) and (Step 2B)
No additional elements are provided in the claim; therefore, there is still no practical application and the claim does not provide significantly more, as per the claim 1 analysis.
For claim 8, which recites “The method of claim 2, wherein the activation function comprises a rectified linear unit.” which is merely data (e.g. contents) and does not meet any of the categories (MPEP: 2106.03, “Thus, the Federal Circuit has held that a product claim to an intangible collection of information, even if created by human effort, does not fall within any statutory category. Digitech, 758 F.3d at 1350, 111 USPQ2d at 1720 (claimed "device profile" comprising two sets of data did not meet any of the categories because it was neither a process nor a tangible product).”).
For the above reason, the limitation does not change the result of the analysis from the independent claim 1 and dependent claim 2. Therefore, claim 8 is also rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
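For illustration only, the operation recited in claims 7 and 8 (vector components determined with an activation function and a bias term, where the activation function is a rectified linear unit) can be sketched as below. This is a minimal sketch, assuming the bias is a single scalar added before the activation; the names relu and sparse_components and the input values are hypothetical and are not taken from the application.

```python
def relu(x):
    """Rectified linear unit: pass positive values through, clamp the rest to zero."""
    return x if x > 0.0 else 0.0


def sparse_components(pre_activations, bias):
    """Apply relu(value + bias) and keep only the nonzero components.

    Assumption: one scalar bias shared by all dimensions. A negative bias
    pushes more components to zero, which makes the result sparser.
    """
    components = {}
    for index, value in enumerate(pre_activations):
        activated = relu(value + bias)
        if activated > 0.0:
            components[index] = activated
    return components


# Assumed example input; only values strictly greater than 1.0 survive a bias of -1.0.
vector = sparse_components([0.5, 3.0, -1.0, 1.0], bias=-1.0)
```

The result stores one entry (dimension 1) out of four inputs, which is the sense in which the activation and bias together control sparsity.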
For claim 9, which recites “The method of claim 1, further comprising:
using a lookup function on text and context information of the second data.”
(Step 2A-Prong One) The limitation “using a lookup function on text and context information of the second data,” as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, “using” in the context of this claim encompasses a user manually “using a lookup function on text and context information of the second data” in the mind.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.
(Step 2A-Prong Two) and (Step 2B)
No additional elements are provided in the claim; therefore, there is still no practical application and the claim does not provide significantly more, as per the claim 1 analysis.
For claim 10, which recites “The method of claim 1, wherein the second data includes a tuple of answer text and context information.” which is merely data (e.g. contents) and does not meet any of the categories (MPEP: 2106.03, “Thus, the Federal Circuit has held that a product claim to an intangible collection of information, even if created by human effort, does not fall within any statutory category. Digitech, 758 F.3d at 1350, 111 USPQ2d at 1720 (claimed "device profile" comprising two sets of data did not meet any of the categories because it was neither a process nor a tangible product).”).
For the above reason, the limitation does not change the result of the analysis from the independent claim 1. Therefore, claim 10 is also rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
For claim 11, which recites “The method of claim 1, further comprising: tokenizing the first data.” which is merely data (e.g. contents) and does not meet any of the categories (MPEP: 2106.03, “Thus, the Federal Circuit has held that a product claim to an intangible collection of information, even if created by human effort, does not fall within any statutory category. Digitech, 758 F.3d at 1350, 111 USPQ2d at 1720 (claimed "device profile" comprising two sets of data did not meet any of the categories because it was neither a process nor a tangible product).”).
For the above reason, the limitation does not change the result of the analysis from the independent claim 1. Therefore, claim 11 is also rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
For claim 12, which recites “The method of claim 11, further comprising:
encoding the first data using non-contextualized embedding.”
(Step 2A-Prong Two) This judicial exception is not integrated into a practical application.
The additional element, “…encoding the first data using non-contextualized embedding,” amounts to selecting a particular data source or type of data to be manipulated and is a form of insignificant extra-solution activity (MPEP: 2106.05(g), “iii. Selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display”).
(Step 2B) The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
The limitation is not sufficient to amount to significantly more than the judicial exception because “encoding” only adds well-understood, routine and conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception.
For example, Janakiraman (U.S. Pub. No.: US 20210021621), paragraph [0027], “…the sequence of the entities accessed can build non-contextual individual entity embeddings…”
Heinzerling, “Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation,” 2019, page, “Pretrained contextual and non-contextual subword embeddings have become available in over 250 languages…”
Horev, “BERT Explained: State of the art language model for NLP,” 10 November 2018, page 7, “…the non-contextual embedding…”
Arora et al., “Contextual Embeddings: When Are They Worth It?,” 18 May 2020, page 1, “…we find in experiments across a range of tasks that the performance of the non-contextual embeddings (GloVe, random) improves rapidly as we increase the amount of training data, often attaining within 5 to 10% accuracy of BERT embeddings when the full training set is used…”
Thus, the limitation does not amount to significantly more. Even when considered in combination, this additional element represents mere instructions to apply an exception and insignificant extra-solution activity, which does not provide an inventive concept. The claim is not patent eligible.
For claim 13, which recites “The method of claim 1, further comprising: tokenizing the second data.”
(Step 2A-Prong One) The limitation “tokenizing the second data,” as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, “tokenizing” in the context of this claim encompasses a user manually “tokenizing the second data” in the mind.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.
(Step 2A-Prong Two) and (Step 2B)
No additional elements are provided in the claim; therefore, there is still no practical application and the claim does not provide significantly more, as per the claim 1 analysis.
For claim 14, which recites “The method of claim 13, further comprising: using a lookup function on the second data.”
(Step 2A-Prong One) The limitation “using a lookup function on the second data,” as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, “using” in the context of this claim encompasses a user manually “using a lookup function on the second data” in the mind.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.
(Step 2A-Prong Two) and (Step 2B)
No additional elements are provided in the claim; therefore, there is still no practical application and the claim does not provide significantly more, as per the claim 1 analysis.
For claim 15, which recites “The method of claim 1, wherein the machine learning model includes a contextualized transformer model.” which is merely data (e.g. contents) and does not meet any of the categories (MPEP: 2106.03, “Thus, the Federal Circuit has held that a product claim to an intangible collection of information, even if created by human effort, does not fall within any statutory category. Digitech, 758 F.3d at 1350, 111 USPQ2d at 1720 (claimed "device profile" comprising two sets of data did not meet any of the categories because it was neither a process nor a tangible product).”).
For the above reason, the limitation does not change the result of the analysis from the independent claim 1. Therefore, claim 15 is also rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Alternatively, (Step 2A-Prong One) The limitation “a contextualized transformer model,” as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, a user manually performs the operations of the contextualized transformer model in the mind.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.
(Step 2A-Prong Two) and (Step 2B)
No additional elements are provided in the claim; therefore, there is still no practical application and the claim does not provide significantly more, as per the claim 1 analysis.
For claim 16, which recites “The method of claim 1, further comprising determining a final score between the first data and the second data by summing individual scores between each query token representing the first data and the second data.”
(Step 2A-Prong One) The limitation “determining a final score between the first data and the second data by summing individual scores between each query token representing the first data and the second data,” as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, “determining” in the context of this claim encompasses a user manually “determining a final score between the first data and the second data by summing individual scores between each query token representing the first data and the second data” in the mind.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.
(Step 2A-Prong Two) and (Step 2B)
No additional elements are provided in the claim; therefore, there is still no practical application and the claim does not provide significantly more, as per the claim 1 analysis.
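For illustration only, the summation recited in claim 16 can be sketched as adding one individual score per query token. This is a minimal sketch under a stated assumption: the individual score of a token is taken to be that token's stored weight in the second data's sparse vector, or zero when the token is absent. The names final_score and answer_vector and the weights shown are hypothetical and are not taken from the application.

```python
def final_score(query_tokens, answer_vector):
    """Sum the individual score of each query token against the answer.

    Assumption: an individual score is a lookup of the token's weight in the
    answer's sparse vector, with 0.0 for tokens the answer does not contain.
    """
    return sum(answer_vector.get(token, 0.0) for token in query_tokens)


# Assumed example data: "want" is absent from the answer and contributes nothing.
answer_vector = {"check": 1.5, "bill": 2.0, "payment": 0.5}
total = final_score(["want", "check", "bill"], answer_vector)  # 0.0 + 1.5 + 2.0
```

Because each query token contributes independently, the final score is a plain sum of per-token terms.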
For claim 17, which recites “The method of claim 1, further comprising: providing a real-time inference.”
(Step 2A-Prong Two) This judicial exception is not integrated into a practical application. 
The additional element, “…providing a real-time inference,” amounts to selecting a particular data source or type of data to be manipulated and is a form of insignificant extra-solution activity (MPEP: 2106.05(g), “iii. Selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display”).
(Step 2B) The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
The limitation is not sufficient to amount to significantly more than the judicial exception because “providing” only adds well-understood, routine and conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception. For example, MPEP 2106.05(d)(II), “iv. Presenting offers and gathering statistics, OIP Techs.”
Further example, Graeber et al. (U.S. Pub. No.: US 20110282704), paragraph [0067], “…presented in the user interface in real-time…”
Iwamura et al., (U.S. Pub. No.: US 20120230592), paragraph [0212], “…a real-time operated interface…”
Tavernier, (U.S. Patent No.: US 10706450), column 13, lines 29-35, “…a user interface provided in real time can be generated and then displayed to a user…”
Thus, the limitation does not amount to significantly more. Even when considered in combination, this additional element represents mere instructions to apply an exception and insignificant extra-solution activity, which does not provide an inventive concept. The claim is not patent eligible.
For claim 18, which recites “The method of claim 1, wherein a size of the sparse vector representation is limited to a predetermined size.”
(Step 2A-Prong One) The limitation “wherein a size of the sparse vector representation is limited to a predetermined size,” as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, “limited” in the context of this claim encompasses a user manually limiting the size of the sparse vector representation to a predetermined size in the mind.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.
(Step 2A-Prong Two) and (Step 2B)
No additional elements are provided in the claim; therefore, there is still no practical application and the claim does not provide significantly more, as per the claim 1 analysis.
For claim 19, which recites “A method for training a machine learning model for open domain question answering, comprising: 
determining a relevance between a query and answer token pair using dot product and max pooling.”
Claim 19 (Step 1): The claim recites “A method…” As drafted, the claimed method is a process, which is a statutory category of invention.
(Step 2A-Prong One) The limitation “determining a relevance between a query and answer token pair using dot product and max pooling,” as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than reciting “machine learning model,” nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the “machine learning model” language, “determining” in the context of this claim encompasses a user manually “determining a relevance between a query and answer token pair using dot product and max pooling” in the mind.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.
(Step 2A-Prong Two) This judicial exception is not integrated into a practical application. 
The claim recites an additional element: using a “machine learning model” to perform the “determining” step. The “machine learning model” in this step is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using generic computer components. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
(Step 2B) The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. 
As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a “machine learning model” to perform the “determining” step amounts to no more than mere instructions to apply the exception using generic computer components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. The claim is not patent eligible.
Thus, these limitations do not amount to significantly more. Even when considered in combination, these additional elements represent mere instructions to apply an exception and insignificant extra-solution activity, which do not provide an inventive concept. The claim is not patent eligible.
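For illustration only, the claim 19 operation of determining relevance “using dot product and max pooling” can be sketched as taking the dot product of one query token embedding with each answer token embedding and keeping the maximum. This is a minimal sketch under stated assumptions: embeddings are plain lists of floats of equal length, and the names dot, token_relevance and the example values are hypothetical and are not taken from the application.

```python
def dot(u, v):
    """Dot product of two equal-length dense vectors."""
    return sum(a * b for a, b in zip(u, v))


def token_relevance(query_embedding, answer_token_embeddings):
    """Max-pool the dot products between one query token and every answer token."""
    return max(dot(query_embedding, emb) for emb in answer_token_embeddings)


# Assumed example embeddings for one query token and three answer tokens.
query_embedding = [1.0, 0.0, 2.0]
answer_token_embeddings = [
    [0.5, 1.0, 0.0],  # dot product 0.5
    [0.0, 3.0, 1.5],  # dot product 3.0
    [1.0, 1.0, 1.0],  # dot product 3.0
]
relevance = token_relevance(query_embedding, answer_token_embeddings)
```

Max pooling keeps only the best-matching answer token for the query token, so weaker matches do not dilute the relevance value.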
For claim 20, which recites “A controller comprising:
a processor; and 
a storage communicatively coupled to the processor, wherein the processor is configured to execute programmed instructions stored in the storage to: 
determining a score between queries and answer candidates using a sparse vector representation; and 
output ranked answer candidates based on the score.”
Claim 20 (Step 1): The claim recites “A controller comprising …” As drafted, the claimed controller is a device, which is a statutory category of invention.
(Step 2A-Prong One) The limitation “determining a score between queries and answer candidates using a sparse vector representation,” as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than reciting “controller,” “processor” and “storage,” nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the “controller,” “processor” and “storage” language, “determining” in the context of this claim encompasses a user manually “determining a score between queries and answer candidates using a sparse vector representation” in the mind.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.
(Step 2A-Prong Two) This judicial exception is not integrated into a practical application. 
The claim recites additional elements: using a “controller,” “processor” and “storage” to perform the “determining” and “outputting” steps. The “controller,” “processor” and “storage” in these steps are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The additional element, “output ranked answer candidates based on the score,” merely indicates a “Field of Use and Technological Environment” (MPEP: 2106.05(h), “Examples of limitations that the courts have described as merely indicating a field of use or technological environment in which to apply a judicial exception include: … vi. Limiting the abstract idea of collecting information, analyzing it, and displaying certain results of the collection and analysis to data related to the electric power grid, because limiting application of the abstract idea to power-grid monitoring is simply an attempt to limit the use of the abstract idea to a particular technological environment, Electric Power Group, LLC v. Alstom S.A.”).
(Step 2B) The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. 
As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of using a “controller,” “processor” and “storage” to perform the “determining” and “outputting” steps amount to no more than mere instructions to apply the exception using generic computer components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. The claim is not patent eligible.
The claim recites the additional element “output ranked answer candidates based on the score,” which merely indicates a field of use and technological environment in conjunction with the abstract idea (MPEP: 2106.05(h), “vi. Limiting the abstract idea of collecting information, analyzing it, and displaying certain results of the collection and analysis to data related to the electric power grid, because limiting application of the abstract idea to power-grid monitoring is simply an attempt to limit the use of the abstract idea to a particular technological environment, Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354, 119 USPQ2d 1739, 1742 (Fed. Cir. 2016)”). Employing generic computer functions to execute an abstract idea, even when limiting the use of the idea to one particular environment, does not add significantly more. Thus, limitations that amount to merely indicating a field of use or technological environment in which to apply a judicial exception do not amount to significantly more than the exception itself, and cannot integrate a judicial exception into a practical application.
Thus, these limitations do not amount to significantly more. Even when considered in combination, these additional elements represent mere instructions to apply an exception and insignificant extra-solution activity, which do not provide an inventive concept. The claim is not patent eligible.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 4, 5, 6, 7, 9 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Haffner (U.S. Patent No.: US 7664713) in view of Bayardo et al. (“Scaling Up All Pairs Similarity Search,” 2007, hereinafter Bayardo).
For claim 1, Haffner discloses a computer-implemented method comprising:
determining an index using a machine learning model trained on datasets (Haffner: column 2, lines 54-66, “This indicates that, in this case, the most time consuming part of the learning process is to retrieve the small proportion of training vectors which have features in common with the added support vector, and, as in information retrieval, the concept of inverted index can be useful. For instance, suppose that our new support vector corresponds to the sentence "I want to check my bill", and that we reduced it to a vector with 3 active features ("want", "check", "bill"), ignoring function words such as "I", "to" and "my". The inverted index approach would retrieve the list of vectors including these three words and merge them…” column 3, line 59-column 4, line 2, “In one embodiment, the present invention enables a method based on transposition to speed up SVM learning computations on sparse data…” column 5, line 65-column 6, line 11, “…the initial sparse vector is simply the encoded vector of feature column 1 in M representing feature F1; therefore, the initial sparse vector representing feature F1 is encoded in (index, value) pair as shown in list 610. The fourth column of matrix M representing feature F4 is encoded using (index, value) pairs as shown in list 615”
WHERE “index” is broadly interpreted as “inverted index”
WHERE “machine learning model” is broadly interpreted as “learning process” or “SVM learning”
WHERE “datasets” is broadly interpreted as “sentence "I want to check my bill"”).
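The (index, value) pair encoding quoted from Haffner above can be pictured with a minimal sketch. The function name `encode_sparse` and the sample column are hypothetical, chosen only for illustration; they do not come from Haffner.

```python
def encode_sparse(column):
    """Return (index, value) pairs for the non-zero entries of a dense column."""
    return [(i, v) for i, v in enumerate(column) if v != 0]

# Hypothetical feature column: one entry per training vector.
feature_column = [0, 2, 0, 0, 1, 0]
pairs = encode_sparse(feature_column)
print(pairs)
```

Only the positions holding non-zero values are stored, which is what makes retrieval over sparse data inexpensive.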
However, Haffner does not explicitly disclose determining a score between first data and second data from the index using a sparse vector representation.
Bayardo discloses determining a score between first data and second data from the index using a sparse vector representation (Bayardo: page 2, “Our work is also related to work in information retrieval (IR) optimization [5, 14, 15, 16, 17, 22, 23]. In IR, each document can be represented by a sparse vector of weighted terms, indexed through inverted lists. The user query, itself a list of terms, can be represented by a sparse vector of those terms with certain (usually equal) weights. Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity…”
WHERE “first data” is broadly interpreted as “user query” and WHERE “second data” is broadly interpreted as “each document” or vice versa,
WHERE “index” is broadly interpreted as “indexed through inverted lists”
WHERE “score” is broadly interpreted as “similarity”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing “Scaling Up All Pairs Similarity Search” as taught by Bayardo, because it would provide Haffner’s method with the enhanced capability of “Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity..” (Bayardo: page 2)
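The combination relied upon above (an inverted index over sparse term vectors, with documents ranked by similarity to a query vector) may be sketched as follows. This is an illustrative assumption only: the names `build_inverted_index` and `rank`, the dot-product similarity, and the sample documents are hypothetical and are not taken from Haffner or Bayardo.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to a list of (doc_id, weight) postings."""
    index = defaultdict(list)
    for doc_id, vec in docs.items():
        for term, weight in vec.items():
            index[term].append((doc_id, weight))
    return index

def rank(query, index, k=None):
    """Score documents by sparse dot product with the query, highest first."""
    scores = defaultdict(float)
    for term, q_weight in query.items():
        # Only postings for terms present in the query are visited.
        for doc_id, d_weight in index.get(term, []):
            scores[doc_id] += q_weight * d_weight
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked if k is None else ranked[:k]

# Hypothetical sparse document vectors (term -> weight).
docs = {
    "d1": {"check": 1.0, "bill": 2.0},
    "d2": {"want": 1.0, "refund": 1.0},
    "d3": {"weather": 1.0},
}
index = build_inverted_index(docs)
query = {"want": 1.0, "check": 1.0, "bill": 1.0}
results = rank(query, index)
print(results)
```

Documents sharing no term with the query (here `d3`) are never scored, consistent with returning only documents with non-zero similarity.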
For claim 2, Haffner and Bayardo disclose the method of claim 1, wherein the datasets include pairs of data (Bayardo: page 1, “Given a large collection of sparse vector data in a high dimensional space, we investigate the problem of finding all pairs of vectors whose similarity score (as determined by a function such as cosine distance) is above a given threshold…find all pairs of similar queries based on the similarity of the search results for those queries… we wish to compute the set of all pairs (x, y) and their similarity values sim(x, y)…”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing “Scaling Up All Pairs Similarity Search” as taught by Bayardo, because it would provide Haffner’s method with the enhanced capability of “Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity..” (Bayardo: page 2)
For claim 4, Haffner and Bayardo disclose the method of claim 1, wherein the first data includes queries and the second data includes answer candidates (Bayardo: page 2, “Our work is also related to work in information retrieval (IR) optimization [5, 14, 15, 16, 17, 22, 23]. In IR, each document can be represented by a sparse vector of weighted terms, indexed through inverted lists. The user query, itself a list of terms, can be represented by a sparse vector of those terms with certain (usually equal) weights. Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity…” 
WHERE “the first data includes queries” is broadly interpreted as “user query” 
WHERE “second data includes answer candidates” is broadly interpreted as “Answering the query…finding all, or the top k, document”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing “Scaling Up All Pairs Similarity Search” as taught by Bayardo, because it would provide Haffner’s method with the enhanced capability of “Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity..” (Bayardo: page 2)
For claim 5, Haffner and Bayardo disclose the method of claim 1, wherein there is a relationship between the first data and the second data, and the score represents relevance between the first data and the second data (Bayardo: page 2, “Our work is also related to work in information retrieval (IR) optimization [5, 14, 15, 16, 17, 22, 23]. In IR, each document can be represented by a sparse vector of weighted terms, indexed through inverted lists. The user query, itself a list of terms, can be represented by a sparse vector of those terms with certain (usually equal) weights. Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity…” 
WHERE “relationship” is broadly interpreted as “Answering the query” 
WHERE “the score represents relevance” is broadly interpreted as “similarity”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing “Scaling Up All Pairs Similarity Search” as taught by Bayardo, because it would provide Haffner’s method with the enhanced capability of “Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity..” (Bayardo: page 2).
For claim 6, Haffner and Bayardo disclose the method of claim 1, further comprising: matching the second data to the first data; and retrieving the second data in response to input of the first data (Bayardo: page 2, “Our work is also related to work in information retrieval (IR) optimization [5, 14, 15, 16, 17, 22, 23]. In IR, each document can be represented by a sparse vector of weighted terms, indexed through inverted lists. The user query, itself a list of terms, can be represented by a sparse vector of those terms with certain (usually equal) weights. Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity…” 
WHERE “matching” is broadly interpreted as “Answering the query… similarity…” 
WHERE “retrieving the second data” is broadly interpreted as “information retrieval” and “Answering the query then amounts to finding all, or the top k, document vectors…ranked in order of their similarity”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing “Scaling Up All Pairs Similarity Search” as taught by Bayardo, because it would provide Haffner’s method with the enhanced capability of “Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity…” (Bayardo: page 2).
For claim 7, Haffner and Bayardo disclose the method of claim 1, further comprising: determining vector components of the sparse vector representation using an activation function and a bias term (Bayardo: page 2, “…applying a sketching function based on min-wise independent permutations in order to compress document vectors whose dimensions correspond to distinct n-grams…in information retrieval (IR) optimization [5, 14, 15, 16, 17, 22, 23]. In IR, each document can be represented by a sparse vector of weighted terms, indexed through inverted lists. The user query, itself a list of terms, can be represented by a sparse vector of those terms with certain (usually equal) weights. Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity…”
WHERE “activation function and a bias term” is broadly interpreted as “vector of weighted terms…a sparse vector of those terms with certain (usually equal) weights…”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing “Scaling Up All Pairs Similarity Search” as taught by Bayardo, because it would provide Haffner’s method with the enhanced capability of “Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity…” (Bayardo: page 2).
For claim 9, Haffner and Bayardo disclose the method of claim 1, further comprising: using a lookup function on text and context information of the second data (Bayardo: page 2, “…applying a sketching function based on min-wise independent permutations in order to compress document vectors whose dimensions correspond to distinct n-grams…in information retrieval (IR) optimization [5, 14, 15, 16, 17, 22, 23]. In IR, each document can be represented by a sparse vector of weighted terms, indexed through inverted lists. The user query, itself a list of terms, can be represented by a sparse vector of those terms with certain (usually equal) weights. Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity…”
WHERE “using a lookup function on text and context information” is broadly interpreted as “user query, itself a list of terms”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing “Scaling Up All Pairs Similarity Search” as taught by Bayardo, because it would provide Haffner’s method with the enhanced capability of “Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity…” (Bayardo: page 2).
For claim 10, Haffner and Bayardo disclose the method of claim 1, wherein the second data includes a tuple of answer text and context information (Bayardo: page 2, “…applying a sketching function based on min-wise independent permutations in order to compress document vectors whose dimensions correspond to distinct n-grams…in information retrieval (IR) optimization [5, 14, 15, 16, 17, 22, 23]. In IR, each document can be represented by a sparse vector of weighted terms, indexed through inverted lists. The user query, itself a list of terms, can be represented by a sparse vector of those terms with certain (usually equal) weights. Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity…”
WHERE “second data includes a tuple of answer text and context information” is broadly interpreted as “information retrieval” and “Answering the query then amounts to finding all, or the top k, document vectors”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing “Scaling Up All Pairs Similarity Search” as taught by Bayardo, because it would provide Haffner’s method with the enhanced capability of “Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity…” (Bayardo: page 2).

Claims 3 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Haffner (U.S. Patent No.: US 7664713) in view of Bayardo et al. (“Scaling Up All Pairs Similarity Search,” 2007, hereinafter Bayardo), and further in view of Karandish et al. (U.S. Pub. No.: US 20210056150, hereinafter Karandish).
For claim 3, Haffner and Bayardo disclose the method of claim 2.
However, Haffner and Bayardo do not explicitly disclose, wherein the pairs of data include questions and answers.
Karandish discloses wherein the pairs of data include questions and answers (Karandish: paragraph [0110], “…the question-answer pair can be stored in the index…”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing “INGESTION AND RETRIEVAL OF DYNAMIC SOURCE DOCUMENTS IN AN AUTOMATED QUESTION ANSWERING SYSTEM” as taught by Karandish, because it would provide Haffner’s method with the enhanced capability of “…greatly improve information access by automatically answering questions through an automated chat agent with fresh information from dynamic source documents…” (Karandish: paragraph [0126])
For claim 17, Haffner and Bayardo disclose the method of claim 1.
However, Haffner and Bayardo do not explicitly disclose, further comprising: providing a real-time inference.
Karandish discloses further comprising: providing a real-time inference (Karandish: paragraph [0079], “…the answer (e.g., 551) can be sent in real-time after receiving the question (e.g., 511 (FIG. 5A)) in block 510 (FIG. 5A), such that method 500 is processed in real-time…”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing “INGESTION AND RETRIEVAL OF DYNAMIC SOURCE DOCUMENTS IN AN AUTOMATED QUESTION ANSWERING SYSTEM” as taught by Karandish, because it would provide Haffner’s method with the enhanced capability of “…greatly improve information access by automatically answering questions through an automated chat agent with fresh information from dynamic source documents…” (Karandish: paragraph [0126])

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Haffner (U.S. Patent No.: US 7664713) in view of Bayardo et al. (“Scaling Up All Pairs Similarity Search,” 2007, hereinafter Bayardo), and further in view of Henry (U.S. Pub. No.: US 20180174037, hereinafter Henry).
For claim 8, Haffner and Bayardo disclose the method of claim 2.
However, Haffner and Bayardo do not explicitly disclose, wherein the activation function comprises a rectified linear unit.
Henry discloses, wherein the activation function comprises a rectified linear unit (Henry: paragraph [0058], “…a message embedding may be a fixed-length vector that represents a meaning of the message in an N-dimensional space. Any appropriate techniques may be used to generate the message embedding from the message features…message embedding component may receive TFIDF (term frequency, inverse document frequency) features and use a multi-layer perceptron with rectified linear units”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing “SUGGESTING RESOURCES USING CONTEXT HASHING” as taught by Henry, because it would provide Haffner’s method with the enhanced capability of “…to improve the performance of a system for suggesting resources…” (Henry: paragraph [0075]).
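For context on how a rectified linear unit with a bias term can yield a sparse vector, a minimal sketch follows. It is an assumption for illustration, not the method of Henry or of the application; the name `relu_sparse`, the bias value, and the input scores are hypothetical.

```python
def relu_sparse(dense_scores, bias):
    """Apply ReLU(x + bias) and keep only non-zero components as {index: value}."""
    sparse = {}
    for i, x in enumerate(dense_scores):
        activated = max(0.0, x + bias)
        if activated > 0.0:
            sparse[i] = activated
    return sparse

# Hypothetical dense scores; a negative bias pushes weak components to zero.
dense = [0.5, 2.0, -1.0, 3.5]
sparse = relu_sparse(dense, bias=-1.0)
print(sparse)
```

Components whose biased value is not positive are dropped, so the stored representation holds only the surviving indices.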

Claims 11, 13 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Haffner (U.S. Patent No.: US 7664713) in view of Bayardo et al. (“Scaling Up All Pairs Similarity Search,” 2007, hereinafter Bayardo), and further in view of Gao (U.S. Patent No.: US 9009148, hereinafter Gao).
For claim 11, Haffner and Bayardo disclose the method of claim 1.
However, Haffner and Bayardo do not explicitly disclose, further comprising: tokenizing the first data.
Gao discloses, further comprising: tokenizing the first data (Gao: column 6, line 64-column 7, line 3, “…A paired query and title are expected to not only share the same prior distribution over topics, but also contain similar fractions of words assigned to each topic. Since MAP estimation of the shared topic vector is concerned with explaining the union of tokens in the query…”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing the teachings of Gao, because it would provide Haffner’s method with the enhanced capability of “…identifying a number of query-document pairs based on clickthrough data for a number of documents. The method also includes building a latent semantic model based on the query-document pairs and ranking the documents for a search based on the latent semantic model …” (Gao: column 1, lines 33-36).
For claim 13, Haffner and Bayardo disclose the method of claim 1.
However, Haffner and Bayardo do not explicitly disclose, further comprising: tokenizing the second data.
Gao discloses, further comprising: tokenizing the second data (Gao: column 6, line 64-column 7, line 3, “…A paired query and title are expected to not only share the same prior distribution over topics, but also contain similar fractions of words assigned to each topic. Since MAP estimation of the shared topic vector is concerned with explaining the union of tokens in…document…”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing the teachings of Gao, because it would provide Haffner’s method with the enhanced capability of “…identifying a number of query-document pairs based on clickthrough data for a number of documents. The method also includes building a latent semantic model based on the query-document pairs and ranking the documents for a search based on the latent semantic model …” (Gao: column 1, lines 33-36).
For claim 14, Haffner, Bayardo and Gao disclose the method of claim 13, further comprising: using a lookup function on the second data (Bayardo: page 2, “…applying a sketching function based on min-wise independent permutations in order to compress document vectors whose dimensions correspond to distinct n-grams…in information retrieval (IR) optimization [5, 14, 15, 16, 17, 22, 23]. In IR, each document can be represented by a sparse vector of weighted terms, indexed through inverted lists. The user query, itself a list of terms, can be represented by a sparse vector of those terms with certain (usually equal) weights. Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity…”
WHERE “using a lookup function” is broadly interpreted as “user query, itself a list of terms”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing “Scaling Up All Pairs Similarity Search” as taught by Bayardo, because it would provide Haffner’s method with the enhanced capability of “Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity…” (Bayardo: page 2).

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Haffner (U.S. Patent No.: US 7664713) in view of Bayardo et al. (“Scaling Up All Pairs Similarity Search,” 2007, hereinafter Bayardo), further in view of Gao (U.S. Patent No.: US 9009148, hereinafter Gao), and further in view of Arora et al. (“Contextual Embeddings: When Are They Worth It?” 18 May 2020, hereinafter Arora).
For claim 12, Haffner, Bayardo and Gao disclose the method of claim 11.
However, Haffner, Bayardo and Gao do not explicitly disclose, further comprising: encoding the first data using non-contextualized embedding.
Arora discloses, further comprising: encoding the first data using non-contextualized embedding (Arora: “Contextual Embeddings: When Are They Worth It?,” 18 May 2020, page 1, “…we find in experiments across a range of tasks that the performance of the non-contextual embeddings (GloVe, random) improves rapidly as we increase the amount of training data, often attaining within 5 to 10% accuracy of BERT embeddings when the full training set is used.…”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing “Contextual Embeddings: When Are They Worth It?” as taught by Arora, because it would provide Haffner’s method with the enhanced capability of “…improves rapidly as we increase the amount of training data, often attaining within 5 to 10% accuracy of BERT embeddings when the full training set is used…” (Arora: page 1).

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Haffner (U.S. Patent No.: US 7664713) in view of Bayardo et al. (“Scaling Up All Pairs Similarity Search,” 2007, hereinafter Bayardo), and further in view of Liu et al. (U.S. Pub. No.: US 20210326751, hereinafter Liu).
For claim 15, Haffner and Bayardo disclose the method of claim 1.
However, Haffner and Bayardo do not explicitly disclose, wherein the machine learning model includes a contextualized transformer model.
Liu discloses wherein the machine learning model includes a contextualized transformer model (Liu: paragraph [0023], “…Natural language processing model 100 is an example of a machine learning model…” paragraph [0026], “Transformer encoder 104(2) can obtain contextual information for each word or token, e.g., via self-attention, and generate second embeddings 108, e.g., a sequence of context embedding vectors.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing “ADVERSARIAL PRETRAINING OF MACHINE LEARNING MODELS” as taught by Liu, because it would provide Haffner’s method with the enhanced capability of “…providing a machine learning model having one or more mapping layers, including at least a first mapping layer configured to map components of pretraining examples into first representations in a space…” (Liu: paragraph [0003]).

Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Haffner (U.S. Patent No.: US 7664713) in view of Bayardo et al. (“Scaling Up All Pairs Similarity Search,” 2007, hereinafter Bayardo), and further in view of Bell et al. (U.S. Pub. No.: US 20190392082, hereinafter Bell).
For claim 16, Haffner and Bayardo disclose the method of claim 1.
However, Haffner and Bayardo do not explicitly disclose, further comprising determining a final score between the first data and the second data by summing individual scores between each query token representing the first data and the second data.
Bell discloses further comprising determining a final score between the first data and the second data by summing individual scores between each query token representing the first data and the second data (Bell: paragraph [0073], “…the scoring of the search result candidate titles is based on matching the terms with the tokens and then adding the individual token scores up to receive a final ranked result. In some embodiments, one or more search result candidates are alternatively or additionally scored based at least on a context similarity and/or distance between the one or more terms/tokens of the query and the one or more vectors or tokens in vector space (e.g., as determined in block 704).”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing “COMPREHENSIVE SEARCH ENGINE SCORING AND MODELING OF USER RELEVANCE” as taught by Bell, because it would provide Haffner’s method with the enhanced capability of “…New functionalities that also improve existing search engine software technologies further include running a query through vector space (e.g., in a word embedding vector model) where each term or token in vector space is contextually similar such that when query results are returned they are scored based on contextual similarity score…” (Bell: paragraph [0022]).
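The token-level summation described in the Bell passage quoted above (adding individual token scores to obtain a final score) can be sketched as follows. The name `final_score` and the score table are hypothetical, introduced only for illustration and not drawn from Bell.

```python
def final_score(query_tokens, token_scores):
    """Sum per-token scores; tokens with no match contribute zero."""
    return sum(token_scores.get(tok, 0.0) for tok in query_tokens)

# Hypothetical per-token scores of one candidate against the query tokens.
token_scores = {"check": 1.5, "bill": 2.0}
score = final_score(["want", "check", "bill"], token_scores)
print(score)
```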

Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Haffner (U.S. Patent No.: US 7664713) in view of Bayardo et al. (“Scaling Up All Pairs Similarity Search,” 2007, hereinafter Bayardo), and further in view of Hawkins et al. (U.S. Pub. No.: US 20110225108, hereinafter Hawkins).
For claim 18, Haffner and Bayardo disclose the method of claim 1.
However, Haffner and Bayardo do not explicitly disclose, wherein a size of the sparse vector representation is limited to a predetermined size.
Hawkins discloses wherein a size of the sparse vector representation is limited to a predetermined size (Hawkins: paragraph [0076], “…If the size of a vector in sparse distributed representation is too small, only a small number of spatial patterns can be represented by the vector…The upper limit of the vector size depends on the number of spatial patterns and the application…”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Method And Apparatus For Providing Fast Kernel Learning On Sparse Data” as taught by Haffner by implementing “TEMPORAL MEMORY USING SPARSE DISTRIBUTED REPRESENTATION” as taught by Hawkins, because it would provide Haffner’s method with the enhanced capability of “…provide improved immunity to various types of noise. The improved immunity to noise is attributable partly to the use of sparse distributed representation…” (Hawkins: paragraph [0208]).

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Bell et al. (U.S. Pub. No.: US 20190392082, hereinafter Bell) in view of Katabi et al. (U.S. Pub. No.: US 20200341115, hereinafter Katabi).
For claim 19, Bell discloses a method for training a machine learning model for open domain question answering, comprising: determining a relevance between a query and answer token pair (Bell: Paragraph [0033], “…the learning module 124, the training module 126, the scoring module 130…” paragraph [0073], “…the scoring of the search result candidate titles is based on matching the terms with the tokens and then adding the individual token scores up to receive a final ranked result. In some embodiments, one or more search result candidates are alternatively or additionally scored based at least on a context similarity and/or distance between the one or more terms/tokens of the query and the one or more vectors or tokens in vector space (e.g., as determined in block 704).”
WHERE “relevance” is broadly interpreted as “distance”).
However, Bell does not explicitly disclose “using dot product and max pooling” as in “determining a relevance between a query and answer token pair using dot product and max pooling.”
Katabi discloses “using dot product and max pooling” as in “determining a relevance between a query and answer token pair using dot product and max pooling” (Katabi: paragraph [0097], “Based on the concurrent tracklet and acceleration data, the first and second branches 82, 84 each provide a feature vector to a multiplier 86. These feature vectors correspond to the two data types. The multiplier 86 evaluates the dot product of the two feature vectors as a basis for evaluating their similarity…Following the dot product, the process continues with max pooling in the temporal dimension and the use of a fully-connected layer 88 to produce a similarity score…”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “COMPREHENSIVE SEARCH ENGINE SCORING AND MODELING OF USER RELEVANCE” as taught by Bell by implementing “SUBJECT IDENTIFICATION IN BEHAVIORAL SENSING SYSTEMS” as taught by Katabi, because it would provide Bell’s method with the enhanced capability of “…offer advantages over conventional behavioral sensing systems by enabling deployment in new homes without requiring subjects in those homes to annotate behavioral data produced by the system with subject identification information…” (Katabi: paragraph [0013]).
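One plausible reading of “dot product and max pooling” applied to token vectors is sketched below. It is an assumption for illustration only and is neither the architecture of Katabi nor the claimed training method; the names `dot` and `max_pooled_relevance` and the vectors are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def max_pooled_relevance(query_vec, answer_vecs):
    """Take the largest dot product between one query token and any answer token."""
    return max(dot(query_vec, a) for a in answer_vecs)

# Hypothetical embeddings: one query token and three answer tokens.
query_vec = [1.0, 0.0, 2.0]
answer_vecs = [[0.5, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 3.0, 0.5]]
relevance = max_pooled_relevance(query_vec, answer_vecs)
print(relevance)
```

Max pooling keeps only the strongest match, so a single closely aligned answer token determines the relevance for that query token.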

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Bayardo et al. (“Scaling Up All Pairs Similarity Search,” 2007, hereinafter Bayardo) in view of Chen et al. (U.S. Patent No.: US 8521662, hereinafter Chen).
For claim 20, Bayardo discloses a controller comprising: 
determining a score between queries and answer candidates using a sparse vector representation; and output ranked answer candidates based on the score (Bayardo: page 2, “Our work is also related to work in information retrieval (IR) optimization [5, 14, 15, 16, 17, 22, 23]. In IR, each document can be represented by a sparse vector of weighted terms, indexed through inverted lists. The user query, itself a list of terms, can be represented by a sparse vector of those terms with certain (usually equal) weights. Answering the query then amounts to finding all, or the top k, document vectors with non-zero similarity to the query vector, ranked in order of their similarity…”)
However, Bayardo does not explicitly disclose 
a processor; and a storage communicatively coupled to the processor, wherein the processor is configured to execute programmed instructions stored in the storage to.
Chen discloses a processor; and a storage communicatively coupled to the processor, wherein the processor is configured to execute programmed instructions stored in the storage to (Chen: column 15, lines 13-41, “The invention may be implemented in hardware, firmware or software, or a combination of the three. Preferably the invention is implemented in a computer program executed on a programmable computer having a processor, a data storage system, volatile and non-volatile memory and/or storage elements, at least one input device and at least one output device…”).
Chen also discloses output ranked answer candidates based on the score (Chen: column 2, lines 58-60, “…applying the model to a set of documents and displaying documents matching a query.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “Scaling Up All Pairs Similarity Search” as taught by Bayardo by implementing the processor and storage as taught by Chen, because it would provide Bayardo’s controller with the enhanced capability of being “…implemented in hardware, firmware or software, or a combination of the three …” (Chen: column 15, lines 13-41).
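For illustration only, the sparse vector scoring and ranking described in the quoted passage of Bayardo (documents and queries as sparse vectors of weighted terms, with documents of non-zero similarity ranked by similarity) can be sketched as follows. The sketch assumes a plain dictionary from term to weight as the sparse representation and an unnormalized dot product as the similarity; the names sparse_dot and rank_candidates are hypothetical and are not taken from Bayardo or Chen.

```python
from typing import Dict, List, Optional, Tuple

SparseVector = Dict[str, float]  # term -> weight; absent terms are zero


def sparse_dot(u: SparseVector, v: SparseVector) -> float:
    """Dot product over the terms the two sparse vectors share."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the smaller vector
    return sum(w * v[t] for t, w in u.items() if t in v)


def rank_candidates(
    query: SparseVector,
    candidates: Dict[str, SparseVector],
    k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Score every candidate against the query, drop candidates with
    zero similarity, and return (name, score) pairs sorted by
    descending score (ties broken by name). If k is given, only the
    top k pairs are returned."""
    scored = [(name, sparse_dot(query, vec)) for name, vec in candidates.items()]
    scored = [(name, s) for name, s in scored if s != 0]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored if k is None else scored[:k]


if __name__ == "__main__":
    query = {"sparse": 1.0, "vector": 1.0}
    candidates = {
        "a": {"sparse": 2.0, "index": 1.0},
        "b": {"sparse": 1.0, "vector": 3.0},
        "c": {"dense": 5.0},
    }
    print(rank_candidates(query, candidates))
```

The sketch scans every candidate directly; the inverted lists mentioned in the quoted passage are an indexing optimization that is not shown here.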

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YU ZHAO whose telephone number is (571)270-3427. The examiner can normally be reached Monday-Friday 9AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached at (571)272-4046. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

YU ZHAO
Primary Examiner
Art Unit 2169



/YU ZHAO/          Examiner, Art Unit 2169