DETAILED ACTION

Introduction
This office action is in response to Applicant’s submission filed on 3/30/2021. Claims
1-20 are pending in the application. As such, claims 1-20 have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statements (IDS) were submitted on 7/5/2022, 11/23/2021, 7/9/2021, 4/9/2021 and 3/30/2021. The submissions are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Drawings
The drawings filed on 3/30/2021 are accepted and considered by the Examiner.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-4, 8-11 and 15-18 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Independent claims 1, 8 and 15 recite “calculate a set of gradients representing a first text-based paragraph describing a seed item and a second text-based paragraph describing a recommended item predicted to be similar to the seed item, the set of gradients calculated with respect to a cosine similarity function applied on a set of feature vectors, the set of feature vectors comprising a first feature vector representing the first text-based paragraph and a second feature vector representing the second text-based paragraph; generate contextualized embeddings based on the set of gradients and a similarity score measuring an affinity between the first text-based paragraph and the second text-based paragraph; identify a set of word-pairs based on the contextualized embeddings, the set of word-pairs comprising a first word selected from the first text-based paragraph matched to a second word selected from the second text-based paragraph, wherein the first word and the second word have a similar semantic meaning; and select a word-pair from the set of word-pairs having a word-pair score based on a threshold value, the word-pair score indicating a degree of influence exerted by the word-pair on selection of the recommended item from a plurality of candidate items.”
The limitations of “calculate…”, “generate…”, “identify…”, and “select…”, as drafted, cover a mental process that “can be performed in the human mind or by a human using a pen and paper.” More specifically, the limitations read on a person viewing and comparing two paragraphs/descriptions, generating vectors based on the text (which are just a computer-readable representation of the text), applying a cosine similarity function (which is just a math function), generating a computer representation of the text based on context and on how closely the two paragraphs are related, then identifying keywords that are similar in meaning from both paragraphs, and selecting a pair of keywords having a similarity score above a pre-determined threshold value.
This judicial exception is not integrated into a practical application. In particular, independent claims 1 and 15 recite additional elements of “processor”, and/or “memory and/or computer storage device”, and “pre-trained language model”. For example, [0048] of the as-filed specification describes using a general purpose computer. As such, a general purpose computer would contain a processor, memory and computer-readable storage media. In [00147] of the as-filed specification, a conventional memory and tangible storage device are also described. Also, [0027] of the as-filed specification describes an application using a general language model. Independent claim 1 also recites the additional element of an “interpreting text-based similarity (ITBS) model”; in [0035] of the as-filed specification, the ITBS model is mentioned as being any natural language processing (NLP) context-based model. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Thus, the claims are directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of using a computer is noted as a general purpose computer as described. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Further, the additional limitations in the claims noted above are directed towards insignificant extra-solution activity. Thus, the claims are not patent eligible.
With respect to claims 2, 9 and 16, the claims relate to: wherein the instructions are further operative to: analyze the first text-based paragraph by a token saliency component, wherein the token saliency component generates a first set of gradients associated with the first text-based paragraph; and analyze the second text-based paragraph by the token saliency component, wherein the token saliency component generates a second set of gradients associated with the second text-based paragraph. This reads on a person reviewing and analyzing the two text paragraphs by using a token saliency component, which is being interpreted as a generic language embedding model that generates a set of vectors relating to the tokens. There are no additional limitations that would make these claims eligible.
With respect to claims 3, 10 and 17, the claims relate to: wherein the instructions are further operative to: identify a word-pair from the set of word-pairs having a word-pair score exceeding a threshold value. This reads on a person identifying a related pair of keywords from among the multiple sets of word pairs which has a similarity score or value that exceeds a pre-determined threshold. There are no additional limitations that would make these claims eligible.
Regarding claims 4, 11 and 18, the claims relate to: wherein the instructions are further operative to: identify a word-pair from the set of word-pairs having a highest weight for selection. This reads on a person reviewing, identifying and selecting a related pair of keywords from the multiple sets of word pairs by their importance or weight ranking or scale. There are no additional limitations that would make these claims eligible.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 6, 8-11, 13, and 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over Behtash et al. (US Patent Application Publication No.: US 20210109958 A1) hereinafter as Behtash, in view of applicant-supplied reference Liu (US Patent Application Publication No: US 20170235824 A1) hereinafter as Liu.
Regarding claim 1, Behtash discloses: A system for interpreting similarities between unlabeled paragraph pairs inferred by a pre-trained language model, the system comprising: at least one processor; and at least one memory comprising computer-readable instructions, the at least one memory and the computer-readable instructions configured to, with the at least one processor, to cause the at least one processor to: (See fig. 46 regarding the computer system.  Also see [0029] and [0040] and [0309])
implement an interpreting text-based similarity (ITBS) model, to cause the at least one processor to: ([0182] A non-limiting example of such a model is ELMo (“Embeddings from Language Models”))
calculate a set of gradients representing a first text-based paragraph describing a seed item and a second text-based paragraph describing a recommended item predicted to be similar to the seed item, the set of gradients calculated with respect to a cosine similarity function applied on a set of feature vectors, the set of feature vectors comprising a first feature vector representing the first text-based paragraph and a second feature vector representing the second text-based paragraph; ([0037] In another aspect, a trained model is deployed as a research tool. In this tool, the user explains the issue at hand using a summary of facts or a series of keywords, and the trained model returns the relevant laws based on the factual, contextual, and conceptual information and patterns in the user's query. In this scenario, there will be no searching over all court opinions to compare and match them with the query. Rather, the user can include different aspects of the issue as a summary of facts or a long list of keywords. Then, the trained model of the research system with its context-analyzing capabilities considers the entirety of the issue and automatically picks out the important legal aspects and patterns and neglects the irrelevant details. Since the research system understands semantics of the words, the exact words used to express the facts is not as important as it is in alternative tools. Also see [0039], where comparison can be done using vectors. See [0178] for cosine similarities. [the first feature vector can be read as the inquiry or one of the articles contained in the corpus, and the second feature vector can be read as representing any one of the articles in the corpus that is being compared against])
generate contextualized embeddings based on the set of gradients and a similarity score measuring an affinity between the first text-based paragraph and the second text-based paragraph; ([0039] In still another aspect, laws can be transformed into dense, continuous-valued vectors in a low dimensional space called state space based on the contextual similarity of the laws. A context for each law is determined by looking at the locations of the citation in the court case texts. Since the transformation does preserve the contextual similarity of the legal citations by placing correlated laws close to one another in the state space, contextually similar laws end up being mapped to close-by vectors in the state space. In some embodiments, the laws may be mapped to the same state space that the words are mapped to.)
identify a set of word-pairs based on the contextualized embeddings, the set of word-pairs comprising a first word selected from the first text-based paragraph matched to a second word selected from the second text-based paragraph, wherein the first word and the second word have a similar semantic meaning; ([0033] In supervised ML, the goal is to learn a function that maps an input to an output based on example input-output pairs. Putting this into the context of a research system, the goal is to learn a function that receives the query from the user, map the query into the correct outputs and bring the outputs back as search results.  [0037] In another aspect, a trained model is deployed as a research tool. In this tool, the user explains the issue at hand using a summary of facts or a series of keywords, and the trained model returns the relevant laws based on the factual, contextual, and conceptual information and patterns in the user's query. In this scenario, there will be no searching over all court opinions to compare and match them with the query. Rather, the user can include different aspects of the issue as a summary of facts or a long list of keywords. Then, the trained model of the research system with its context-analyzing capabilities considers the entirety of the issue and automatically picks out the important legal aspects and patterns and neglects the irrelevant details. Also see [0289] on comparing the word vector of the chosen word from the query against the word vectors of all the words in the context. [the word pair is being read as important words or keywords from the inquiry and the text from the articles that are in the corpus.])
and select a word-pair from the set of word-pairs having a word-pair score based on a threshold value, the word-pair score indicating a degree of influence exerted by the word-pair on selection of the recommended item from a plurality of candidate items.  ([0262] If the ML model considers laws as continuous-valued dense vectors (FIG. 21), the output is going to be a vector in an M dimensional state space. In this case, step 1203 receives this vector and finds all the laws close to this output vector. The found laws are sorted based on their proximity to the output vector. In some embodiments, proximity measures such as cosine similarity may be used as a quantifier to measure the distance between the output vector and the word vectors of different laws. Step 1203 uses a predefined threshold value as a cut-off for proximity measure and selects the laws that their proximity to the output is above this threshold value.  [Also see [0149], [0261], [0284], [0290], and [0296] regarding selecting of the word pair, which could read on selecting relevant law or from summary and keywords to the relevant law])
Behtash does not explicitly disclose, but Liu discloses, a similarity score ([0049], where a similarity score is disclosed).
Behtash and Liu are considered analogous art because they are both in the related art of text similarity comparison. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Behtash to combine the teaching of Liu, to incorporate the above-mentioned claim limitations, because similarity measurement can be used to measure how close the source is to the target in classification (Behtash, [0029]).
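
For illustration only, the threshold-based selection for which Behtash [0262] is cited (sorting candidates by cosine proximity to an output vector and keeping those above a predefined cut-off) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Behtash, Liu, or Applicant's specification: the function names, the toy two-dimensional vectors, and the 0.7 cut-off are all hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector is treated as dissimilar to everything
    return dot / (norm_u * norm_v)

def select_above_threshold(output_vec, candidates, threshold):
    """Score every candidate, keep those above the cut-off, sort by proximity."""
    scored = [(name, cosine_similarity(output_vec, vec))
              for name, vec in candidates.items()]
    kept = [(name, score) for name, score in scored if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Hypothetical candidate vectors (stand-ins for the "laws" of Behtash [0262]).
candidates = {"law_a": [1.0, 0.0], "law_b": [1.0, 1.0], "law_c": [0.0, 1.0]}
selected = select_above_threshold([1.0, 0.1], candidates, threshold=0.7)
```

With these toy values, law_a and law_b clear the cut-off and law_c does not.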

Regarding claims 8 and 15, although different in scope from claim 1 and each other, they recite elements of the system of claim 1 as a method and computer storage device.  Thus, the analysis in rejecting claim 1 is equally applicable to claims 8 and 15.



Regarding claim 2, Behtash in view of Liu discloses: The system of claim 1,
Behtash further discloses: wherein the instructions are further operative to: analyze the first text-based paragraph by a token saliency component, wherein the token saliency component generates a first set of gradients associated with the first text-based paragraph; ([0149] The cleaned, tokenized documents produced by step 201 of FIG. 2 contain unstructured data and they are not directly usable as a training dataset for ML applications. Step 202 in FIG. 2 extracts a suitable training dataset from the unstructured text data of these documents. The type and the nature of the training dataset depends on the nature of the data and what the user 101 expects from the research system 104. For example, assume the dataset is comprised of court orders and decisions, the user's query 102 is a summary of facts in hand or a sequence of keywords, and the user expects the relevant laws as the outputs from the research system 104. Therefore, the input-output pairs of the training dataset should be a summary of the issue or keywords and the relevant laws. Also see [0296].)
and analyze the second text-based paragraph by the token saliency component, wherein the token saliency component generates a second set of gradients associated with the second text-based paragraph. ([0149] The cleaned, tokenized documents produced by step 201 of FIG. 2 contain unstructured data and they are not directly usable as a training dataset for ML applications. Step 202 in FIG. 2 extracts a suitable training dataset from the unstructured text data of these documents. The type and the nature of the training dataset depends on the nature of the data and what the user 101 expects from the research system 104. For example, assume the dataset is comprised of court orders and decisions, the user's query 102 is a summary of facts in hand or a sequence of keywords, and the user expects the relevant laws as the outputs from the research system 104. Therefore, the input-output pairs of the training dataset should be a summary of the issue or keywords and the relevant laws. Also see [0296].)

Regarding claims 9 and 16, although different in scope from claim 2 and each other, they recite elements of the system of claim 2 as a method and computer storage device.  Thus, the analysis in rejecting claim 2 is equally applicable to claims 9 and 16.

Regarding claim 3, Behtash in view of Liu discloses: The system of claim 1, Behtash further discloses: wherein the instructions are further operative to: identify a word-pair from the set of word-pairs having a word-pair score exceeding a threshold value. ([0262] If the ML model considers laws as continuous-valued dense vectors (FIG. 21), the output is going to be a vector in an M dimensional state space. In this case, step 1203 receives this vector and finds all the laws close to this output vector. The found laws are sorted based on their proximity to the output vector. In some embodiments, proximity measures such as cosine similarity may be used as a quantifier to measure the distance between the output vector and the word vectors of different laws. Step 1203 uses a predefined threshold value as a cut-off for proximity measure and selects the laws that their proximity to the output is above this threshold value.  [the word pair is being read as important words or keywords from the inquiry and the text from the articles that are in the corpus.  The word similarity score was discussed earlier with respect to the Liu disclosure.  Here the focus is on the discussion of keywords and the threshold value.])

Regarding claims 10 and 17, although different in scope from claim 3 and each other, they recite elements of the system of claim 3 as a method and computer storage device.  Thus, the analysis in rejecting claim 3 is equally applicable to claims 10 and 17.

Regarding claim 4, Behtash in view of Liu discloses: The system of claim 1, Behtash further discloses: wherein the instructions are further operative to: identify a word-pair from the set of word-pairs having a highest weight for selection. ([0297] where $n_d$ is the number of contexts in the training dataset that contain the word d, and N is the total number of contexts available in the training dataset. The common words appearing in all the contexts will get a weight of 0. The rare discriminative words will get weights close to 1.)
Regarding claims 11 and 18, although different in scope from claim 4 and each other, they recite elements of the system of claim 4 as a method and computer storage device.  Thus, the analysis in rejecting claim 4 is equally applicable to claims 11 and 18.
Regarding claim 6, Behtash in view of Liu discloses: The system of claim 1, Behtash further discloses: wherein the instructions are further operative to: maximize the similarity score between an aggregated latent representation of a matched word associated with a description of the recommended item and a word associated with a description of the seed item. ([0289] Step 3905 compares the word vector of the chosen word from the query against the word vectors of all words in the context $x_i$, finds the most contextually similar word from the context, and calculates their similarity. Mathematically, step 3905 can be implemented as follows: $\theta_k = \max_j C(Vq^k, Vx_i^j)$  [0290] where $Vq^k$ is the word vector of the word chosen from the query in step 3904, $Vx_i^j$ is the word vector for the jth word in the context $x_i$, and j goes from 0 to l with l being the length of the context. The function C can be any contextual similarity measure defined for the word vectors, including, but not limited to cosine similarity. $\theta_k$ is indeed the contextual similarity measure of the closest word in the context $x_i$ to the word selected from the query in step 3904. In some instances, instead of finding the similarity measure between the most similar word in the context with the word from the query, the collective contextual similarity measures between all the words in the context with the word from the query may be calculated.)
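
For illustration only, the maximum-similarity step quoted from Behtash [0289]-[0290], which takes the largest value of a similarity measure C between a chosen query word vector and every word vector of a context, can be sketched as follows. This is a minimal sketch under stated assumptions, not Behtash's implementation: cosine similarity is used for C because [0290] names it as one permitted choice, and the function names and toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity, one of the measures Behtash [0290] permits for C."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector matches nothing
    return dot / (norm_u * norm_v)

def best_match(query_vec, context_vecs, similarity=cosine):
    """Return (theta_k, j): the highest similarity and the index of that context word."""
    if not context_vecs:
        raise ValueError("context must contain at least one word vector")
    scores = [similarity(query_vec, x) for x in context_vecs]
    j = max(range(len(scores)), key=scores.__getitem__)
    return scores[j], j

# Hypothetical word vectors for one query word and a three-word context.
context = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.1]]
theta_k, j = best_match([1.0, 0.0], context)
```

Here the third context word (index 2) is the closest to the query word, so its similarity is returned as theta_k.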

Regarding claim 13, although different in scope from claim 6, it recites elements of the system of claim 6 as a method.  Thus, the analysis in rejecting claim 6 is equally applicable to claim 13.

Claims 5, 12 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Behtash, in view of Liu, and further in view of Begum et al. (Begum, S., Ahmed, M. U., Funk, P., Xiong, N., & Schéele, B. (2007). Similarity of medical cases in health care using cosine similarity and ontology. In ICCBR (pp. 263-272).) hereinafter as Begum.
Regarding claim 5, Behtash in view of Liu discloses: The system of claim 1, Behtash in view of Liu does not explicitly, but Begum discloses: wherein the instructions are further operative to: scale at least one gradient map by a multiplication with corresponding activation maps and summed across one or more feature vectors to produce one or more saliency score(s) for every token associated with a selected paragraph. (See fig. 2, [sect 4] The weighting terms (Wi,j) method calculates the weight of each term or word from the stored cases and the inputted user’s query to perform further matching. [also see sect 4.1, 4.2 and 4.3])
Behtash, Liu and Begum are considered analogous art because they are all in the related art of text similarity comparison. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Behtash, in view of Liu, to combine the teaching of Begum, to incorporate the above-mentioned claim limitations, because it would improve retrieval effectiveness given the huge number of words (Begum, sect 4).
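
For illustration only, the operation recited in claim 5 (scaling a gradient map by multiplication with the corresponding activation map and summing across the feature dimension to obtain one saliency score per token) can be sketched as follows. This is one possible reading of the claim language offered under stated assumptions; it is not drawn from Applicant's specification or from Begum, and the function name, tokens, and numeric values are hypothetical.

```python
def token_saliency(gradients, activations):
    """Per-token saliency: sum over features of gradient * activation.

    gradients and activations are lists with one row per token, each row
    holding one value per feature dimension (assumed layout).
    """
    if len(gradients) != len(activations):
        raise ValueError("gradients and activations must cover the same tokens")
    scores = []
    for grad_row, act_row in zip(gradients, activations):
        if len(grad_row) != len(act_row):
            raise ValueError("feature dimensions must match for every token")
        scores.append(sum(g * a for g, a in zip(grad_row, act_row)))
    return scores

# Hypothetical two-token paragraph with two feature dimensions per token.
tokens = ["red", "wine"]
gradients = [[0.5, -1.0], [0.25, 0.5]]
activations = [[2.0, 1.0], [1.0, 3.0]]
saliency = dict(zip(tokens, token_saliency(gradients, activations)))
```

In this toy example the contributions for "red" cancel, while "wine" receives the larger score.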

Regarding claims 12 and 19, although different in scope from claim 5 and each other, they recite elements of the system of claim 5 as a method and computer storage device.  Thus, the analysis in rejecting claim 5 is equally applicable to claims 12 and 19.

Claims 7, 14 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Behtash, in view of Liu, and further in view of Gupta et al. (US Patent Application Publication No: US 20190251707 A1) hereinafter as Gupta.
Regarding claim 7, Behtash in view of Liu discloses: The system of claim 1, Behtash in view of Liu does not explicitly, but Gupta discloses: wherein the instructions are further operative to: aggregate token saliency scores associated with at least one word in an item description to generate word-scores. ([0006] The content saliency neural network may be trained to provide a saliency score for each of a set of elements in a content item. The saliency score represents the probability of a human viewing the content looking at the element within a predetermined time and is based on the visual features of the element and the content in general. [0031] discusses normalization and summing of saliency scores. [Behtash disclosed tokenization])
Behtash, Liu and Gupta are considered analogous art because they are all in the related art of text similarity comparison and/or use of a neural network for saliency prediction. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Behtash, in view of Liu, to combine the teaching of Gupta, to incorporate the above-mentioned claim limitations, because saliency represents the likelihood that an element stands out from others (Gupta, summary).

Regarding claims 14 and 20, although different in scope from claim 7 and each other, they recite elements of the system of claim 7 as a method and computer storage device.  Thus, the analysis in rejecting claim 7 is equally applicable to claims 14 and 20.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Wang et al. (US Patent No: US 11461829 B1) hereinafter as Wang.  Wang discloses a method and system for comparing similarities between items using token level attributes.  
Applicant-supplied reference Rangasamy et al. (US Patent Application Publication No: US 20170186032 A1) hereinafter as Rangasamy.  Rangasamy discloses a method and apparatus for detecting spam publication by comparing item attributes with image attributes.
Osuala et al. (US Patent Application Publication No: US 20170186032 A1) hereinafter as Osuala.  Osuala discloses a method of text search that uses word embedding vector similarities to identify relevant products.
Davidson et al. (US Patent Application Publication No: US 20190370323 A1) hereinafter as Davidson.  Davidson discloses text correction training models using word pairs.
Sun et al. (Sun, F., Liu, J., Wu, J., Pei, C., Lin, X., Ou, W., & Jiang, P. (2019, November). BERT4Rec: Sequential recommendation with bidirectional encoder representations from transformer. In Proceedings of the 28th ACM international conference on information and knowledge management (pp. 1441-1450).) hereinafter as Sun.  Sun discloses a sequential recommendation system using Bidirectional Encoder Representation Transformer (BERT).
Yang et al. (Yang, X., He, X., Zhang, H., Ma, Y., Bian, J., & Wu, Y. (2020). Measurement of semantic textual similarity in clinical texts: comparison of transformer-based models. JMIR medical informatics, 8(11), e19735.) hereinafter as Yang.  Yang’s paper compares various transformer models for measuring semantic text similarity in clinical text.  The three models included in the study are BERT, XLNet, and RoBERTa.
	  
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Phillip H Lam whose telephone number is (571)272-1721. The examiner can normally be reached 10 AM-6 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta can be reached on (571) 272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/PHILIP H LAM/Examiner, Art Unit 2656

/HUYEN X VO/Primary Examiner, Art Unit 2656