DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 10/29/2018 was filed.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the Examiner.

Specification
The use of the terms THE BEATLES®, ROLLING STONES®, PINK FLOYD® and WIKIPEDIA®, which are trade names or marks used in commerce, has been noted in this application. Each term should be capitalized wherever it appears and be accompanied by the generic terminology. Each term should also include a ®, TM or SM, whichever is appropriate.
Although the use of trade names and marks used in commerce (i.e., trademarks, service marks, certification marks, and collective marks) is permissible in patent applications, the proprietary nature of the marks should be respected and every effort made to prevent their use in any manner which might adversely affect their validity as commercial marks.

Claim Objections
Claim 9 is objected to because of the following informality: there should be an “and” before the last limitation. Thus, the claim should be amended like so: “example pairs; and converting,….” Appropriate correction is required.

Claim 12 is objected to because of the following informality: there is an extra space at the end of the sentence between “structured” and the period. The extra space should be removed. Appropriate correction is required.

Claim 14 is objected to because of the following informality: there should be a comma between “11” and “wherein”. Thus, the claim should be amended like so: “claim 11, wherein”. Appropriate correction is required.

Claim 18 is objected to because of the following informality: there should be a hyphen between “computer” and “readable”. Thus, the claim should be amended like so: “computer-readable”. Appropriate correction is required.

Claim 19 is objected to because of the following informality: there should be a hyphen between “computer” and “readable”. Thus, the claim should be amended like so: “computer-readable”. Appropriate correction is required.



Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.  

Claim 1 recites in the last limitation on lines 15-17 a determination of analogical similarity between “the first relationship encoding and a second relationship encoding”. It is not clear whether Applicant intended “a second relationship encoding” to be a new second relationship encoding or intended to refer back to the second relationship encoding that was previously recited on line 13. From the claim limitation, it is likely that Applicant intended to refer back to the previously recited second relationship encoding in order to make this analogical similarity determination. If so, then the claim should be amended like so: “[[a]] the second relationship encoding”. In any case, clarification is sought regarding the claim limitation.
Claims 2-9 are rejected by virtue of their dependency either directly or indirectly from the independent claim above and they do not provide clarification regarding the ambiguity recited above.

Claim 2 recites the limitation “the second relationship encoding” on line 11. It is not clear which instance of second relationship encoding is being referred back to in claim 1, on line 13 or line 17.  
Claims 3-9 are rejected by virtue of their dependency either directly or indirectly from claim 2 and they do not provide clarification regarding the ambiguity recited above. 

Claim 4 recites the limitation “the first word-based attention mechanism” on line 1.  There is insufficient antecedent basis for this limitation in the claim.

Claim 5 recites the limitation “the first sentence-based attention mechanism” on line 1.  There is insufficient antecedent basis for this limitation in the claim.

Claim 9 recites the limitations “the relationship” on lines 8 and 12.  There are insufficient antecedent bases for these limitations in the claim.

Claim 10 recites in the last limitation on lines 20-23 a determination of analogical similarity between “the first relationship encoding and a second relationship encoding”. It is not clear whether Applicant intended “a second relationship encoding” to be a new second relationship encoding or intended to refer back to the second relationship encoding that was previously recited on lines 18-19. From the claim limitation, it is likely that Applicant intended to refer back to the previously recited second relationship encoding in order to make this analogical similarity determination. If so, then the claim should be amended like so: “[[a]] the second relationship encoding”. In any case, clarification is sought regarding the claim limitation.
Claim 10 also recites the limitation “the one or more storage devices” on lines 3-4. There is insufficient antecedent basis for this limitation in the claim. The claim previously recites “one or more computer-readable storage devices”.
Claims 11-19 are rejected by virtue of their dependency either directly or indirectly from the independent claim above and they do not provide clarification regarding the ambiguities recited above.

Claim 11 recites the limitation “the second relationship encoding” on line 11. It is not clear which instance of second relationship encoding is being referred back to in claim 10, on lines 18-19 or lines 22-23.  
Claims 12-17 are rejected by virtue of their dependency and they do not provide clarification regarding the ambiguity recited above. 

Claim 13 recites the limitation “the first word-based attention mechanism” on line 2.  There is insufficient antecedent basis for this limitation in the claim.

Claim 14 recites the limitation “the first sentence-based attention mechanism” on line 2.  There is insufficient antecedent basis for this limitation in the claim.

Claim 18 recites the limitations “the computer usable code” on lines 2 and 4. There are insufficient antecedent bases for these limitations in the claim.

Claim 19 recites the limitations “the computer usable code” on lines 2 and 4. There are insufficient antecedent bases for these limitations in the claim.

Claim 20 recites in the last limitation on lines 22-25 a determination of analogical similarity between “the first relationship encoding and a second relationship encoding”. It is not clear whether Applicant intended “a second relationship encoding” to be a new second relationship encoding or intended to refer back to the second relationship encoding that was previously recited on lines 20-21. From the claim limitation, it is likely that Applicant intended to refer back to the previously recited second relationship encoding in order to make this analogical similarity determination. If so, then the claim should be amended like so: “[[a]] the second relationship encoding”. In any case, clarification is sought regarding the claim limitation.
Claim 20 also recites the limitation “the one or more storage devices” on line 4. There is insufficient antecedent basis for this limitation in the claim. The claim previously recites “one or more computer-readable storage devices”.
Claim 20 also recites the limitation “the one or more memories” on line 6. There is insufficient antecedent basis for this limitation in the claim. The claim previously recites “one or more computer-readable memories”.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 10-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. 
Claim 10 does not fall within at least one of the four categories of patent eligible subject matter because it encompasses impermissible signals per se. Claim 10 recites “A computer usable program product comprising one or more computer-readable storage devices” and “one or more storage devices”, which are not defined in the specification as excluding signals per se. Specification paragraph [0095] discusses the exclusion of transitory signals per se by describing that computer readable storage medium is not to be construed as transitory signals per se. However, there is no such description regarding the computer usable program product, and it is not explicitly clear that the “computer-readable storage devices” and “storage devices” recited in claim 10 are the same as the “computer readable storage medium” as recited in the specification. Accordingly, the claim encompasses impermissible signals per se and therefore, does not fall within at least one of the four categories of patent eligible subject matter.
Claims 11-19 are rejected for having a similar issue as stated above and by virtue of their dependency from claim 10, either directly or indirectly. 
Claim 20 does not fall within at least one of the four categories of patent eligible subject matter because it encompasses impermissible signals per se. Claim 20 recites “one or more computer-readable memories, and one or more computer-readable storage devices”, “storage devices”, and “memories”, which are not defined in the specification as excluding signals per se. Specification paragraph [0095] discusses the exclusion of transitory signals per se by describing that computer readable storage medium is not to be construed as transitory signals per se. However, there is no such description regarding the “computer-readable memories”, “computer-readable storage devices”, “storage devices”, and “memories”, and it is not explicitly clear that the “computer-readable memories”, “computer-readable storage devices”, “storage devices”, and “memories” as recited in claim 20 are the same as the “computer readable storage medium” as recited in the specification. Accordingly, the claim encompasses impermissible signals per se and therefore, does not fall within at least one of the four categories of patent eligible subject matter.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
It is noted that claims 10-20 are already rejected as being directed to non-statutory subject matter and thus are already considered ineligible. However, since the claims can potentially be amended to fall within a statutory category, they would then be subjected to an analysis under the subsequent steps of §101 and would be rejected for being directed to an abstract idea without significantly more. As such, claims 10-20 are also analyzed below for compact prosecution, subject to the caveat described above.

Step 1 for all claims
	Under the first part of the analysis, claims 1-9 recite a method. Accordingly, these claims fall within the four statutory categories and the analysis now proceeds to Step 2A, Prongs 1 and 2 and then Step 2B.
	As stated above, claims 10-20 do not fall within the four statutory categories since they encompass impermissible signals per se. However, since the claims can potentially be amended to fall within a statutory category, they would then be subjected to an analysis under the subsequent steps of §101. As such, claims 10-20 are also analyzed below for compact prosecution, subject to the previously noted caveat.
Claim 1
Step 2A, prong 1: the following limitations on lines 1-17 recite mental processes:
“A method comprising: 
… encode a first natural language string into a first sentence encoding comprising a set of word encodings; 
adjusting, …, a weight value for a word encoding within the first sentence encoding to form an adjusted first sentence encoding; 
determining, …, a first relationship encoding corresponding to the adjusted first sentence encoding; 
computing an absolute difference between the first relationship encoding and a second relationship encoding; and 
determining, …, a degree of analogical similarity between the first relationship encoding and a second relationship encoding.”
The above limitations describe mental processes because, under a broadest reasonable interpretation (BRI), they involve method steps for: encoding the first natural language string into the first sentence encoding; adjusting weight values to form the adjusted first sentence encoding; determining the first relationship encoding; computing the absolute difference between the encodings; and determining the degree of analogical similarity between the encodings.
Thus, the claim recites mental processes because the limitations are based on observations, evaluations, judgments, or opinions that are performable in the human mind or with the aid of pencil and paper (see MPEP 2106.04(a)(2)(III)). Indeed, the claim limitations mainly relate to encoding data, adjusting weight values, determining the first relationship encoding, computing the absolute difference, and determining the degree of analogical similarity. For example, one can mentally or with the aid of pencil and paper encode the first natural language string into the first sentence encoding comprising the set of word encodings by converting the first natural language string via segmenting or proportioning or designating portions of the first natural language string according to the set of words, resulting in the first sentence encoding. That is, the encoding is simply a conversion of data from a natural language format to a set of words format, wherein such a process can be done mentally or with the aid of pencil and paper via segmenting/proportioning/designating of the natural language format. In addition, one can mentally or with the aid of pencil and paper adjust weight values, determine the corresponding relationship between the encodings, compute the absolute difference between the encodings, and determine the analogical similarity between the encodings via observations, evaluations, judgments, or opinions.
As such, these limitations are conceivably performed mentally or with the aid of paper and pencil and thus are considered as mental processes.
Step 2A, prong 2: the following limitations recite additional elements:
Lines 2-3: “operating a first neural network on a processor and a memory to”.
Lines 5-6: “using a word-based attention mechanism with a context vector”.
Lines 9-10: “using a sentence-based attention mechanism”.
Line 15: “using a multi-layer perceptron”. 
The limitations recite additional elements related to mere instructions for applying the judicial exception, i.e., the steps related to the method, on a generic computing device such as the processor and memory (see MPEP 2106.05(f)), as well as a generic field of use via application to the processor and memory (see MPEP 2106.05(h)). That is, the recitation of the first neural network, various attention mechanisms, and the multi-layer perceptron recites mere instructions for applying the judicial exception since they are merely software algorithms that can be executed on the processor and memory (see MPEP 2106.05(f)). The use of a generic computing device such as the processor and memory to execute mere instructions for applying the mental processes amounts to just mental processes that can be performed on a generic computing device and nothing more (see MPEP 2106.04(a)(2)(III)(C)).
Thus, the limitations taken together do not integrate the judicial exception into a practical application. 
Step 2B: the limitations recited above do not amount to significantly more than the judicial exception. As stated above, the limitations relate to mere instructions to apply the judicial exception on a generic computing device, wherein such application does not amount to significantly more than the judicial exception because the use of generic computing tools to execute the mere instruction for the judicial exception does not denote anything significantly more than the judicial exception (see MPEP 2106.05(f)). 
Furthermore, specifying that the claim limitation be applied to the processor and memory is an implementation in a generic computer environment, which has been held in FairWarning IP, LLC v. Iatric Sys., Inc. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)).
As such, the limitations do not amount to significantly more than the judicial exception. 
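For illustration only, the sequence of operations recited in claim 1 (word-level weighting with a context vector, sentence-level aggregation into a relationship encoding, an absolute difference, and a perceptron-based similarity score) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application: the function names, the dot-product scoring, the softmax normalization, the separate sentence-level context vector, and the single hidden layer are all hypothetical choices made to give one possible reading of the claim language.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def adjust_word_weights(word_encodings, context):
    """Assumed word-based attention: scale each word encoding by a
    softmax weight derived from its dot product with a context vector."""
    weights = softmax([dot(w, context) for w in word_encodings])
    return [[wt * x for x in w] for wt, w in zip(weights, word_encodings)]

def relationship_encoding(adjusted, sentence_context):
    """Assumed sentence-based attention: pool the adjusted word
    encodings into a single vector."""
    weights = softmax([dot(v, sentence_context) for v in adjusted])
    dim = len(adjusted[0])
    return [sum(wt * v[i] for wt, v in zip(weights, adjusted)) for i in range(dim)]

def absolute_difference(a, b):
    # Element-wise |a - b| between two relationship encodings.
    return [abs(x - y) for x, y in zip(a, b)]

def mlp_similarity(diff, hidden_w, hidden_b, out_w, out_b):
    """Hypothetical perceptron with one hidden layer (tanh) and a
    sigmoid output giving a degree of similarity in (0, 1)."""
    hidden = [math.tanh(dot(row, diff) + b) for row, b in zip(hidden_w, hidden_b)]
    z = dot(out_w, hidden) + out_b
    return 1.0 / (1.0 + math.exp(-z))

# Toy usage with made-up two-dimensional word encodings.
words = [[1.0, 0.0], [0.0, 1.0]]
adjusted = adjust_word_weights(words, [0.0, 0.0])
first = relationship_encoding(adjusted, [0.0, 0.0])
second = [0.5, 0.0]  # hypothetical second relationship encoding
score = mlp_similarity(absolute_difference(first, second),
                       [[1.0, 1.0]], [0.0], [1.0], 0.0)
```

Every arithmetic step in the sketch is an elementary operation on small lists of numbers; the weights and encodings shown are invented values used only to make the example executable.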

Claim 2
Step 2A, prong 1: the following limitations on lines 1-12 recite mental processes:
“The method of claim 1, further comprising: 
… encode a second natural language string into a second sentence encoding comprising a second set of word encodings; 
adjusting, …, a weight value for a word encoding within the second sentence encoding to form an adjusted second sentence encoding; and 
determining, …, the second relationship encoding corresponding to the adjusted second sentence encoding.”
The above limitations describe mental processes because, under a broadest reasonable interpretation (BRI), they involve method steps for: encoding the second natural language string into the second sentence encoding; adjusting weight values to form the adjusted second sentence encoding; and determining the second relationship encoding.
Thus, the claim recites mental processes because the limitations are based on observations, evaluations, judgments, or opinions that are performable in the human mind or with the aid of pencil and paper (see MPEP 2106.04(a)(2)(III)). Indeed, the claim limitations mainly relate to encoding data, adjusting weight values, and determining the second relationship encoding. For example, one can mentally or with the aid of pencil and paper encode the second natural language string into the second sentence encoding comprising the second set of word encodings by converting the second natural language string via segmenting or proportioning or designating portions of the second natural language string according to the set of words, resulting in the second sentence encoding. That is, the encoding is simply a conversion of data from a natural language format to a set of words format, wherein such a process can be done mentally or with the aid of pencil and paper via segmenting/proportioning/designating of the natural language format. In addition, one can mentally or with the aid of pencil and paper adjust weight values and determine the corresponding relationship between the encodings via observations, evaluations, judgments, or opinions.
As such, these limitations are conceivably performed mentally or with the aid of paper and pencil and thus are considered as mental processes.
Step 2A, prong 2: the following limitations recite additional elements:
Lines 2-3: “operating a second neural network on a processor and a memory to”.
Lines 6-7: “using a second word-based attention mechanism with a context vector”.
Lines 10-11: “using a second sentence-based attention mechanism”. 
The limitations recite additional elements related to mere instructions for applying the judicial exception, i.e., the steps related to the method, on a generic computing device such as the processor and memory (see MPEP 2106.05(f)), as well as a generic field of use via application to the processor and memory (see MPEP 2106.05(h)). That is, the recitation of the second neural network and various attention mechanisms recites mere instructions for applying the judicial exception since they are merely software algorithms that can be executed on the processor and memory (see MPEP 2106.05(f)). The use of a generic computing device such as the processor and memory to execute mere instructions for applying the mental processes amounts to just mental processes that can be performed on a generic computing device and nothing more (see MPEP 2106.04(a)(2)(III)(C)).
Thus, the limitations taken together do not integrate the judicial exception into a practical application. 
Step 2B: the limitations recited above do not amount to significantly more than the judicial exception. As stated above, the limitations relate to mere instructions to apply the judicial exception on a generic computing device, wherein such application does not amount to significantly more than the judicial exception because the use of generic computing tools to execute the mere instruction for the judicial exception does not denote anything significantly more than the judicial exception (see MPEP 2106.05(f)). 
Furthermore, specifying that the claim limitation be applied to the processor and memory is an implementation in a generic computer environment, which has been held in FairWarning IP, LLC v. Iatric Sys., Inc. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)).
As such, the limitations do not amount to significantly more than the judicial exception. 

Claim 3
Step 2A, prong 1: the claim inherits the mental processes from the independent claim and dependent claim 2. The claim does not recite additional mental processes. 
Step 2A, prong 2: the following limitation recites additional elements: “wherein the first neural network and the second neural network are identically structured”. 
The limitation recites additional elements related to mere instructions for applying the judicial exception, i.e., the steps related to the method (see MPEP 2106.05(f)). That is, the recitation of the first and second neural networks being identically structured recites mere instructions for software algorithms, as denoted by the first and second neural networks, for applying the judicial exception (see MPEP 2106.05(f)). Thus, the limitation does not integrate the judicial exception into a practical application.
Step 2B: the limitation recited above does not amount to significantly more than the judicial exception. As stated above, the limitation relates to mere instructions to apply the judicial exception, wherein such application does not amount to significantly more than the judicial exception because the mere instruction for executing the judicial exception does not denote anything significantly more than the judicial exception (see MPEP 2106.05(f)). 
As such, the limitation does not amount to significantly more than the judicial exception. 

Claim 4
Step 2A, prong 1: the claim inherits the mental processes from the independent claim and dependent claim 2. The claim does not recite additional mental processes. 
Step 2A, prong 2: the following limitation recites additional elements: “wherein the first word-based attention mechanism with a context vector and the second word-based attention mechanism with a context vector are identically structured”. 
The limitation recites additional elements related to mere instructions for applying the judicial exception, i.e., the steps related to the method (see MPEP 2106.05(f)). That is, the recitation of the first and second word-based attention mechanisms being identically structured recites mere instructions for software algorithms, as denoted by the first and second word-based attention mechanisms, for applying the judicial exception (see MPEP 2106.05(f)). Thus, the limitation does not integrate the judicial exception into a practical application.
Step 2B: the limitation recited above does not amount to significantly more than the judicial exception. As stated above, the limitation relates to mere instructions to apply the judicial exception, wherein such application does not amount to significantly more than the judicial exception because the mere instruction for executing the judicial exception does not denote anything significantly more than the judicial exception (see MPEP 2106.05(f)). 
As such, the limitation does not amount to significantly more than the judicial exception. 

Claim 5
Step 2A, prong 1: the claim inherits the mental processes from the independent claim and dependent claim 2. The claim does not recite additional mental processes. 
Step 2A, prong 2: the following limitation recites additional elements: “wherein the first sentence-based attention mechanism and the second sentence-based attention mechanism are identically structured”. 
The limitation recites additional elements related to mere instructions for applying the judicial exception, i.e., the steps related to the method (see MPEP 2106.05(f)). That is, the recitation of the first and second sentence-based attention mechanisms being identically structured recites mere instructions for software algorithms, as denoted by the first and second sentence-based attention mechanisms, for applying the judicial exception (see MPEP 2106.05(f)). Thus, the limitation does not integrate the judicial exception into a practical application.
Step 2B: the limitation recited above does not amount to significantly more than the judicial exception. As stated above, the limitation relates to mere instructions to apply the judicial exception, wherein such application does not amount to significantly more than the judicial exception because the mere instruction for executing the judicial exception does not denote anything significantly more than the judicial exception (see MPEP 2106.05(f)). 
As such, the limitation does not amount to significantly more than the judicial exception. 





Claim 6
Step 2A, prong 1: the following limitations recite a mental process: “The method of claim 2, further comprising: determining, …, that the first relationship encoding and the second relationship encoding correspond to an analogous relationship.”
This is a mental process because determining that there is an analogous relationship between the first and second relationship encodings is a process that, under a broadest reasonable interpretation, relates to observations, evaluations, judgments, or opinions that can be performed mentally or with the aid of a pencil and paper (see MPEP 2106.04(a)(2)(III)). As such, the claim denotes a mental process. 
Step 2A, prong 2: the following limitation recites additional elements: “using an output unit including a sigmoid activation function”. 
The limitation recites additional elements related to mere instructions for applying the judicial exception, i.e., the determination steps, via the output unit (see MPEP 2106.05(f)). That is, the recitation of using the output unit is merely reciting instructions for using software as denoted by the output unit with the sigmoid activation function to apply the judicial exception (see MPEP 2106.05(f)). Thus, the limitation does not integrate the judicial exception into a practical application.
Step 2B: the limitation recited above does not amount to significantly more than the judicial exception. As stated above, the limitation relates to mere instructions to apply the judicial exception, wherein such application does not amount to significantly more than the judicial exception because execution of the mere instruction for the judicial exception does not denote anything significantly more than the judicial exception (see MPEP 2106.05(f)). 
As such, the limitation does not amount to significantly more than the judicial exception. 
Claim 7
Step 2A, prong 1: the following limitations recite a mental process: “The method of claim 2, further comprising: determining, …, that the first relationship encoding and the second relationship encoding do not correspond to an analogous relationship.”
This is a mental process because determining that there is not an analogous relationship between the first and second relationship encodings is a process that, under a broadest reasonable interpretation, relates to observations, evaluations, judgments, or opinions that can be performed mentally or with the aid of a pencil and paper (see MPEP 2106.04(a)(2)(III)). As such, the claim denotes a mental process. 
Step 2A, prong 2: the following limitation recites additional elements: “using an output unit including a sigmoid activation function”. 
The limitation recites additional elements related to mere instructions for applying the judicial exception, i.e., the determination steps, via the output unit (see MPEP 2106.05(f)). That is, the recitation of using the output unit is merely reciting instructions for using software as denoted by the output unit with the sigmoid activation function to apply the judicial exception (see MPEP 2106.05(f)). Thus, the limitation does not integrate the judicial exception into a practical application.
Step 2B: the limitation recited above does not amount to significantly more than the judicial exception. As stated above, the limitation relates to mere instructions to apply the judicial exception, wherein such application does not amount to significantly more than the judicial exception because execution of the mere instruction for the judicial exception does not denote anything significantly more than the judicial exception (see MPEP 2106.05(f)). 
As such, the limitation does not amount to significantly more than the judicial exception. 
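For illustration only, the determinations recited in claims 6 and 7 (analogous versus not analogous, using an output unit with a sigmoid activation function) can be sketched as a single thresholded computation. This is a minimal sketch under stated assumptions: the function names and the 0.5 decision threshold are hypothetical and are not recited in the claims.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def is_analogous(pre_activation, threshold=0.5):
    """Return True when the sigmoid output meets the assumed threshold
    (the claim 6 outcome) and False otherwise (the claim 7 outcome)."""
    return sigmoid(pre_activation) >= threshold
```

Under this reading, the two claims differ only in which side of the threshold the sigmoid output falls on.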
Claim 8
Step 2A, prong 1: the claim inherits the mental processes from the independent claim and dependent claim 2. The claim does not recite additional mental processes. 
Step 2A, prong 2: the following limitation recites additional elements: “training, using a set of pairs of natural language strings, wherein each natural language string in the set of pairs of natural language strings expresses a relationship between entities included in the natural language string, the first neural network and the second neural network.”
The limitation recites, at a high level of generality, the additional element of a well-understood, routine, and conventional insignificant extra-solution activity related to training neural networks using data, e.g., natural language data (see MPEP 2106.05(g)). That is, the concept of training neural networks using data, such as natural language data, involving a relationship between entities in the natural language data merely denotes a natural language processing technique that is commonly utilized in training neural networks that is well-understood, routine, and conventional. Thus, the limitation does not integrate the judicial exception into a practical application.
Step 2B: the limitation recited above does not amount to significantly more than the judicial exception. As stated above, training neural networks using natural language data is a well-understood, routine, and conventional activity (see MPEP 2106.05(d)(II)). 
In an example, Liu et al., “Matching Natural Language Sentences with Hierarchical Sentence Factorization” teaches training of a Siamese model comprising a first and a second neural network (Liu Sections 4, 5.1, and 5.3). The training includes transforming natural language sentence strings into their concepts/entities via hierarchical sentence factorization to show a relationship, such as a semantic alignment, amongst the concepts/entities (Liu Sections 2-2.1).
In another example, Ma et al., “Matching Descriptions to Spatial Entities using a Siamese Hierarchical Attention Network” teaches training of a Siamese hierarchical attention network comprising a first and a second neural network (Ma Sections IV(A) and (B)). The training involves using natural language sentences that have a spatial relationship between the entities in the natural language sentences (Ma Sections III(A) and (B), IV(C), and V).
In yet another example, Mueller et al., “Siamese Recurrent Architectures for Learning Sentence Similarity” teaches training of a Siamese long short-term memory (LSTM) model comprising a first and a second neural network (Mueller pg. 2788). The training involves using natural language sentences that have a semantic relationship between the entities in the natural language sentences (Mueller pgs. 2788-2790).
Additionally, it is noted that specification paragraphs [0020]-[0023] generally describe machine learning training techniques, including the use of a neural network, that involve natural language data having a relationship between entities in the natural language data.
Therefore, the training of the neural networks as recited in the claim limitation denotes a well-understood, routine, and conventional activity. As such, the limitation does not amount to significantly more than the judicial exception. 

Claim 9
Step 2A, prong 1: the following limitations recite a mental process:
“… generating a set of relation pairs, wherein each relation pair in the set of relation pairs comprises a pair of entities and a relationship relating the pair of entities; 
generating a set of positive example pairs, wherein each positive example pair comprises two relation pairs, the relationship of each relation pair being equivalent to each other;
generating a set of negative example pairs, wherein each negative example pair comprises two relation pairs, the relationship of each relation pair not being equivalent to each other;
combining, forming a training set of example pairs, the set of positive example pairs and the set of negative example pairs; 
converting, by extracting from a text corpus a natural language string expressing a relationship between entities included in the natural language string, the training set of example pairs to a training set of pairs of natural language strings.”
The above limitations describe mental processes because, under a broadest reasonable interpretation (BRI), they involve method steps for: generating sets of relation pairs, positive example pairs, and negative example pairs; combining the positive and negative example pairs; and converting the training set from example pairs to natural language via data extraction. 
Thus, the claim recites mental processes because the limitations are based on observations, evaluations, judgments, or opinions that are performable in the human mind or with the aid of pencil and paper (see MPEP 2106.04(a)(2)(III)). Indeed, the claim limitations mainly relate to generating various data pairs, combining the positive and negative example pairs, and converting the training set from example pairs to natural language. For example, one can mentally or with the aid of pencil and paper generate various data pairs based on the desired relationship using observations, evaluations, judgments, or opinions. Similarly, one can mentally or with the aid of pencil and paper combine positive and negative example pairs using observations, evaluations, judgments, or opinions. Likewise, one can mentally or with the aid of pencil and paper convert the training set from example pairs to natural language by extraction of relevant data using observations, evaluations, judgments, or opinions. 
As such, these limitations are conceivably performed mentally or with the aid of paper and pencil and thus are considered as mental processes.
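For illustration only, the generating and combining steps quoted above may be sketched as follows. This is a minimal sketch and not the Applicant's implementation: the entity names, the relationship labels, the helper build_example_pairs, the 1/0 labels, and the treatment of "equivalent" as exact label equality are all assumptions made for the example. The converting step, which draws natural language strings from a text corpus, is omitted.

```python
from itertools import combinations

# Hypothetical relation pairs: ((entity_a, entity_b), relationship).
relation_pairs = [
    (("Paris", "France"), "capital_of"),
    (("Tokyo", "Japan"), "capital_of"),
    (("Seine", "Paris"), "flows_through"),
]


def build_example_pairs(relations):
    """Split every unordered pair of relation pairs into positives and negatives."""
    positives, negatives = [], []
    for first, second in combinations(relations, 2):
        # Assumption: two relationships are "equivalent" when their labels match.
        if first[1] == second[1]:
            positives.append((first, second, 1))
        else:
            negatives.append((first, second, 0))
    return positives, negatives


positives, negatives = build_example_pairs(relation_pairs)

# Combining step: the training set is the union of both sets of example pairs.
training_set = positives + negatives
print(len(positives), len(negatives), len(training_set))
```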
Step 2A, prong 2: the claim does not recite any additional elements that integrate the judicial exception into a practical application. 
Step 2B: the claim does not recite any additional elements that amount to significantly more than the judicial exception.

Claim 10 is substantially similar to independent claim 1 and thus is rejected for similar reasons as claim 1. Claim 10 merely adds “A computer usable program product comprising one or more computer-readable storage devices, and program instructions stored on at least one of the one or more storage devices, the stored program instructions comprising” and “program instructions” to perform the method and mental processes, which are additional elements related to mere instructions for applying the judicial exception (see MPEP 2106.05(f)) and do not integrate the judicial exception into a practical application. The recitation of the computer usable program product and computer-readable storage devices with program instructions denotes generic computing components in a generic computing environment (see MPEP 2106.05(h)). Furthermore, the recitation of generic computing components to perform the mental process still amounts to a mental process that can be performed on a generic computer. See MPEP 2106.04(a)(2)(III)(C). Utilization of these generic computing components denotes an implementation in a generic computer environment that has been held in FairWarning v. Iatric Sys. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)). As such, the limitations taken together do not amount to significantly more than the judicial exception. 

Claim 11 is substantially similar to claim 2 and thus is rejected for similar reasons as claim 2. Claim 11 recites a “computer usable program product” and “program instructions” to perform the method, which amounts to a program instruction that is performable on a generic computer. That is, the computer usable program product and program instructions denote additional elements. These additional elements are related to mere instructions for applying the judicial exception (see MPEP 2106.05(f)) and denote a generic computing environment (see MPEP 2106.05(h)), which does not integrate the judicial exception into a practical application. The utilization of these generic computing components is an implementation in a generic computer environment that has been held in FairWarning v. Iatric Sys. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)). As such, the limitations taken together do not amount to significantly more than the judicial exception.

Claim 12 is substantially similar to claim 3 and thus is rejected for similar reasons as claim 3. Claim 12 recites a “computer usable program product” to perform the method, which amounts to performing the method on a generic computing component. That is, the computer usable program product denotes an additional element of a generic computing component in a generic computing environment (see MPEP 2106.05(h)), which does not integrate the judicial exception into a practical application. The utilization of this generic computing component is an implementation in a generic computer environment that has been held in FairWarning v. Iatric Sys. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)). As such, the limitations taken together do not amount to significantly more than the judicial exception.

Claim 13 is substantially similar to claim 4 and thus is rejected for similar reasons as claim 4. Claim 13 recites a “computer usable program product” to perform the method, which amounts to performing the method on a generic computing component. That is, the computer usable program product denotes an additional element of a generic computing component in a generic computing environment (see MPEP 2106.05(h)), which does not integrate the judicial exception into a practical application. The utilization of this generic computing component is an implementation in a generic computer environment that has been held in FairWarning v. Iatric Sys. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)). As such, the limitations taken together do not amount to significantly more than the judicial exception.

Claim 14 is substantially similar to claim 5 and thus is rejected for similar reasons as claim 5. Claim 14 recites a “computer usable program product” to perform the method, which amounts to performing the method on a generic computing component. That is, the computer usable program product denotes an additional element of a generic computing component in a generic computing environment (see MPEP 2106.05(h)), which does not integrate the judicial exception into a practical application. The utilization of this generic computing component is an implementation in a generic computer environment that has been held in FairWarning v. Iatric Sys. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)). As such, the limitations taken together do not amount to significantly more than the judicial exception.

Claim 15 is substantially similar to claim 6 and thus is rejected for similar reasons as claim 6. Claim 15 recites a “computer usable program product” and “program instructions” to perform the method, which amounts to a program instruction that is performable on a generic computer. That is, the computer usable program product and program instructions denote additional elements. These additional elements are related to mere instructions for applying the judicial exception (see MPEP 2106.05(f)) and denote a generic computing environment (see MPEP 2106.05(h)), which does not integrate the judicial exception into a practical application. The utilization of these generic computing components is an implementation in a generic computer environment that has been held in FairWarning v. Iatric Sys. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)). As such, the limitations taken together do not amount to significantly more than the judicial exception.

Claim 16 is substantially similar to claim 7 and thus is rejected for similar reasons as claim 7. Claim 16 recites a “computer usable program product” and “program instructions” to perform the method, which amounts to a program instruction that is performable on a generic computer. That is, the computer usable program product and program instructions denote additional elements. These additional elements are related to mere instructions for applying the judicial exception (see MPEP 2106.05(f)) and denote a generic computing environment (see MPEP 2106.05(h)), which does not integrate the judicial exception into a practical application. The utilization of these generic computing components is an implementation in a generic computer environment that has been held in FairWarning v. Iatric Sys. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)). As such, the limitations taken together do not amount to significantly more than the judicial exception.

Claim 17 is substantially similar to claim 8 and thus is rejected for similar reasons as claim 8. Claim 17 recites a “computer usable program product” and “program instructions” to perform the method, which amounts to a program instruction that is performable on a generic computer. That is, the computer usable program product and program instructions denote additional elements. These additional elements are related to mere instructions for applying the judicial exception (see MPEP 2106.05(f)) and denote a generic computing environment (see MPEP 2106.05(h)), which does not integrate the judicial exception into a practical application. The utilization of these generic computing components is an implementation in a generic computer environment that has been held in FairWarning v. Iatric Sys. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)). As such, the limitations taken together do not amount to significantly more than the judicial exception.

Claim 18
Step 2A, prong 1: the claim inherits the mental processes from the independent claim. The claim does not recite additional mental processes. 
Step 2A, prong 2: the following limitation recites additional elements: “The computer usable program product of claim 10, wherein the computer usable code is stored in a computer readable storage device in a data processing system, and wherein the computer usable code is transferred over a network from a remote data processing system.”
The preamble with the “computer usable program product” denotes the additional element of a generic computing component indicative of a generic computing environment or field of use (see MPEP 2106.05(h)). The limitation reciting the storage of the computer usable code denotes, at a high level of generality, the additional element of an insignificant extra-solution activity related to mere data storage (see MPEP 2106.05(g)). The description of the computer readable storage device and data processing system for storing the computer usable code denotes generic computing components indicative of a generic computing environment or field of use (see MPEP 2106.05(h)). The limitation reciting the transfer of the computer usable code over the network denotes, at a high level of generality, the additional element of an insignificant extra-solution activity related to mere data transmission/output (see MPEP 2106.05(g)). The recitation of the remote data processing system denotes a generic computing component indicative of a generic computing environment or field of use (see MPEP 2106.05(h)).
Thus, the limitations taken together do not integrate the judicial exception into a practical application. 
Step 2B: the limitations recited above do not amount to significantly more than the judicial exception. As stated above, the computer usable program product, computer readable storage device, data processing system, and remote data processing system are generic computing components which denote an implementation in a generic computer environment that has been held in FairWarning v. Iatric Sys. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)). Likewise, the courts have found that the limitations regarding storing the computer usable code and transferring the computer usable code over the network, when recited at a high level of generality, denote mere data storage and data transmission/output indicative of an insignificant extra-solution activity (see MPEP 2106.05(g)). Moreover, “receiving or transmitting data over a network” and “storing and retrieving information in memory” are known to be well-understood, routine, and conventional activities when recited at a high level of generality (see MPEP 2106.05(d)(II)). 
Thus, the limitations taken together do not amount to significantly more than the judicial exception. 

Claim 19
Step 2A, prong 1: the claim inherits the mental processes from the independent claim. The claim does not recite additional mental processes. 
Step 2A, prong 2: the following limitation recites additional elements: “The computer usable program product of claim 10, wherein the computer usable code is stored in a computer readable storage device in a server data processing system, and wherein the computer usable code is downloaded over a network to a remote data processing system for use in a computer readable storage device associated with the remote data processing system.”
The preamble with the “computer usable program product” denotes the additional element of a generic computing component indicative of a generic computing environment or field of use (see MPEP 2106.05(h)). The limitation reciting the storage of the computer usable code denotes, at a high level of generality, the additional element of an insignificant extra-solution activity related to mere data storage (see MPEP 2106.05(g)). The description of the computer readable storage device and server data processing system for storing the computer usable code denotes generic computing components indicative of a generic computing environment or field of use (see MPEP 2106.05(h)). The limitation reciting the download of the computer usable code over the network denotes, at a high level of generality, the additional element of an insignificant extra-solution activity related to mere data transmission/output (see MPEP 2106.05(g)). The recitation of the remote data processing system denotes a generic computing component indicative of a generic computing environment or field of use (see MPEP 2106.05(h)).
Thus, the limitations taken together do not integrate the judicial exception into a practical application. 
Step 2B: the limitations recited above do not amount to significantly more than the judicial exception. As stated above, the computer usable program product, computer readable storage device, server data processing system, and remote data processing system are generic computing components which denote an implementation in a generic computer environment that has been held in FairWarning v. Iatric Sys. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)). Likewise, the courts have found that the limitations regarding storing the computer usable code and downloading the computer usable code over the network, when recited at a high level of generality, denote mere data storage and data transmission/output indicative of an insignificant extra-solution activity (see MPEP 2106.05(g)). Moreover, “receiving or transmitting data over a network” and “storing and retrieving information in memory” are known to be well-understood, routine, and conventional activities when recited at a high level of generality (see MPEP 2106.05(d)(II)). Thus, the limitations taken together do not amount to significantly more than the judicial exception. 

Claim 20 is substantially similar to independent claim 1 and thus is rejected for similar reasons as claim 1. Claim 20 merely adds “A computer system comprising one or more processors, one or more computer-readable memories, and one or more computer-readable storage devices, and program instructions stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, the stored program instructions comprising” and “program instructions” to perform the method and mental processes, which are additional elements related to mere instructions for applying the judicial exception (see MPEP 2106.05(f)) and do not integrate the judicial exception into a practical application. The recitation of the computer system, processors, computer-readable memories, and computer-readable storage devices with program instructions denotes generic computing components in a generic computing environment (see MPEP 2106.05(h)). Furthermore, the recitation of generic computing components to perform the mental process still amounts to a mental process that can be performed on a generic computer. See MPEP 2106.04(a)(2)(III)(C). Utilization of these generic computing components denotes an implementation in a generic computer environment that has been held in FairWarning v. Iatric Sys. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)). As such, the limitations taken together do not amount to significantly more than the judicial exception. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-7, 10-16, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Agarwal et al. (U.S. Pat. App. Pre-Grant Pub. No. 2019/0080225, hereinafter Agarwal) in view of Chi et al., “A Sentence Similarity Estimation Method Based on Improved Siamese Network” (hereinafter Chi) and Zhu et al., “A Semantic Similarity Computing Model based on Siamese Network for Duplicate Questions Identification” (hereinafter Zhu). 

Regarding claim 1, Agarwal teaches: 
	A method comprising: 
	operating a first neural network on a processor and a memory ([0031], [0033], and [0034]: describing a “Bidirectional Long-Short Term Memory (BiLSTM)-Siamese network based classifier system 100” comprising a Siamese model that is implemented on a processor and a memory. The Siamese model comprises twin neural networks (NNs) ([0034] and [0040]). This is shown in Fig. 3, wherein the twin NNs each have a base network and various types of layers and the top one of the twin NNs can represent the first neural network.) to encode a first natural language string into a first sentence encoding comprising a set of word encodings ([0034]-[0035] and [0040]: describing that queries (i.e., natural language strings), which include a first query, are encodable via an embedding layer into respective sequences of word vector encodings, which include a first sequence of word vector encodings, that comprise vector encodings for each word in the queries (i.e., set of word encodings). This is shown in Fig. 3. See also [0024]: describing examples of the queries.); 
	…; 
	determining, …, a first relationship encoding corresponding to ([0034]-[0035] and [0040]: describing a determination/generation of query embeddings, which include a first query embedding xi, via subsequent layers such as the BiLSTM layers. Wherein the query embeddings are based on/correspond to respective query inputs and encodings. That is, a first query embedding is based on/corresponds to a first query input and encoding. As such, the first query embedding denotes a first relationship encoding since it represents a relationship with the first query input and encoding.)…;
	computing an absolute difference between the first relationship encoding and a second relationship encoding ([0040]-[0041]: describing a Euclidean distance measurement between respective query embeddings, xi and xj, as well as a Jaccard similarity measurement. The respective query embeddings denote first and second relationship encodings. It is known that Euclidean distance and Jaccard similarity measurements involve difference computations with absolute values.); and 
	determining, …, a degree of analogical similarity between the first relationship encoding and a second relationship encoding ([0025] and [0040]-[0041]: describing a determination that the respective query embeddings, xi and xj, are semantically similar if they are in the same class C. Wherein C values have a degree range such that a value closer to 1 denotes similarity while a value closer to 0 denotes dissimilarity.).
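For illustration only, the two quantities discussed above for the “computing an absolute difference” limitation may be sketched as follows. This is a minimal sketch and not code from Agarwal or from the application: the function names and the toy encodings are hypothetical. As written here, the element-wise absolute difference yields a vector, while the Euclidean distance is the square root of the sum of the squared element-wise differences and yields a single number.

```python
import math


def absolute_difference(first, second):
    """Element-wise absolute difference between two equal-length encodings."""
    return [abs(a - b) for a, b in zip(first, second)]


def euclidean_distance(first, second):
    """Square root of the sum of squared element-wise differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(first, second)))


# Hypothetical toy encodings standing in for the embeddings xi and xj.
x_i = [1.0, -2.0, 0.5]
x_j = [0.0, 2.0, 0.5]

abs_diff = absolute_difference(x_i, x_j)
distance = euclidean_distance(x_i, x_j)
print(abs_diff, distance)
```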
While the cited reference Agarwal teaches the above limitations of claim 1, it does not explicitly teach: “adjusting, using a word-based attention mechanism with a context vector, a weight value for a word encoding within the first sentence encoding to form an adjusted first sentence encoding” on lines 5-8 and “the adjusted first sentence encoding” on lines 10-11. Chi teaches: a Siamese model with twin NNs, wherein each NN of the twin NNs has an attention layer comprising an attention mechanism with a context vector, wherein the attention mechanism calculates/adjusts a weight for each word embedding vector based on an importance metric to generate a final sentence representation (Chi Section 3.2 and Figure 1). Specifically, “[t]he final sentence representation is the weighted sum of all the word annotations using the attention weight” (Chi Sections 3.2 and 4.3.3). The final sentence representation denotes an adjusted first sentence encoding since it is derived from the weighted sum as described above. The attention mechanism is a word-based attention mechanism since it operates on word embedding vectors obtained as part of an encoding of input sentences that include a first input sentence (Chi Section 3.2 and Figure 1). 
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network process in the cited reference to include the word-based attention mechanism in Chi. Doing so would enable “improved Siamese neural network to assess the semantic similarity between sentences. Our model implements the function of inputting two sentences to obtain the similarity score. We design our model based on the Siamese network using deep Long Short-Term Memory (LSTM) Network. And we add the special attention mechanism to let the model give different words different attention while modeling sentences.” (Chi Abstract). 
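For illustration only, a word-level attention step of the general kind described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Chi or of the application: scoring each word vector by a dot product with the context vector and normalizing the scores with a softmax is an assumed formulation, and the function name attend and the toy vectors are hypothetical. The sentence representation is formed as the weighted sum of the word vectors, consistent with the passage quoted from Chi.

```python
import math


def attend(word_vectors, context_vector):
    """Return (attention weights, weighted-sum sentence vector)."""
    # Assumed scoring rule: dot product of each word vector with the context vector.
    scores = [
        sum(w * c for w, c in zip(vector, context_vector))
        for vector in word_vectors
    ]
    # Softmax over the scores; subtracting the maximum keeps exp() stable.
    max_score = max(scores)
    exps = [math.exp(score - max_score) for score in scores]
    total = sum(exps)
    weights = [value / total for value in exps]
    # Sentence representation: weighted sum of the word vectors.
    dimension = len(word_vectors[0])
    sentence = [
        sum(weight * vector[i] for weight, vector in zip(weights, word_vectors))
        for i in range(dimension)
    ]
    return weights, sentence


# Hypothetical two-word sentence with two-dimensional word vectors.
words = [[1.0, 0.0], [0.0, 1.0]]
context = [1.0, 0.0]
weights, sentence = attend(words, context)
print(weights, sentence)
```

In this toy input the first word aligns with the context vector, so it receives the larger weight and dominates the resulting sentence vector.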
While the cited references in combination teach the above limitations of claim 1, they do not explicitly teach: “using a sentence-based attention mechanism” on lines 9-10 and “using a multi-layer perceptron” on line 15. Zhu teaches: 
“using a sentence-based attention mechanism”: describing use of attention/self-attention mechanisms that comprise a first attention/self-attention mechanism to encode query sentences, i.e., a sentence-based attention mechanism (Zhu Section 3.2 and Figs. 1 and 2). Self-attention is also a type of attention mechanism. 
“using a multi-layer perceptron”: describing a multi-layer perceptron (MLP) as part of the Siamese model/network (Zhu Section 3.2). 
	Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network process and the word-based attention mechanism in the combined cited references to include the MLP and sentence-based attention mechanisms in Zhu. Doing so would enable “a new deep learning method, which combines the attention mechanism with Bi-LSTM [long short-term memory] based on Siamese network to achieve the semantic similarity matching for given question pairs” (Zhu Abstract). 

Regarding claim 2, the rejection of claim 1 is incorporated. Agarwal teaches: 
The method of claim 1, further comprising: 
operating a second neural network on a processor and a memory ([0031], [0033], and [0034]: describing a “Bidirectional Long-Short Term Memory (BiLSTM)-Siamese network based classifier system 100” comprising a Siamese model that is implemented on a processor and a memory. The Siamese model comprises twin neural networks (NNs) ([0034] and [0040]). This is shown in Fig. 3, wherein the twin NNs each have a base network and various types of layers and the bottom one of the twin NNs can represent the second neural network.) to encode a second natural language string into a second sentence encoding comprising a second set of word encodings ([0034]-[0035] and [0040]: describing that queries (i.e., natural language strings), which include a second query, are encodable via an embedding layer into respective sequences of word vector encodings, which include a second sequence of word vector encodings, that comprise vector encodings for each word in the queries (i.e., set of word encodings). This is shown in Fig. 3. The second sequence of word vector encodings denotes a second sentence encoding. See also [0024]: describing examples of the queries.); 
…; and 
determining, …, the second relationship encoding corresponding to ([0034]-[0035] and [0040]: describing a determination/generation of query embeddings, which include a second query embedding xj, via subsequent layers such as the BiLSTM layers, wherein xj is determined/generated in a similar manner to xi. The query embeddings are based on/correspond to respective query inputs and encodings. That is, a second query embedding is based on/corresponds to a second query input and encoding. As such, the second query embedding denotes a second relationship encoding since it represents a relationship with the second query input and encoding.)….

While the cited reference Agarwal teaches the above limitations of claim 2, it does not explicitly teach: “adjusting, using a second word-based attention mechanism with a context vector, a weight value for a word encoding within the second sentence encoding to form an adjusted second sentence encoding” on lines 6-9 and “the adjusted second sentence encoding” on line 12. Chi further teaches: a Siamese model with twin NNs, wherein each NN of the twin NNs has an attention layer comprising an attention mechanism with a context vector, wherein the attention mechanism calculates/adjusts a weight for each word embedding vector based on an importance metric to generate a final sentence representation (Chi Section 3.2 and Figure 1). Wherein “[t]he final sentence representation is the weighted sum of all the word annotations using the attention weight” (Chi Sections 3.2 and 4.3.3). The final sentence denoting an adjusted second sentence encoding since it is derived from the weighted sum as described above. The attention mechanism being a word-based attention mechanism since it operates on word embedding vectors obtained as part of an encoding of input sentences that includes a second input sentence (Chi Section 3.2 and Figure 1). 
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network process in the cited reference to include the word-based attention mechanism in Chi. Doing so would enable “improved Siamese neural network to assess the semantic similarity between sentences. Our model implements the function of inputting two sentences to obtain the similarity score. We design our model based on the Siamese network using deep Long Short-Term Memory (LSTM) Network. And we add the special attention mechanism to let the model give different words different attention while modeling sentences.” (Chi Abstract). 
While the cited references in combination teach the above limitations of claim 2, they do not explicitly teach: “using a second sentence-based attention mechanism” on lines 10-11. Zhu teaches: use of attention/self-attention mechanisms that comprises a second attention/self-attention mechanism to encode query sentences, i.e., sentence-based attention mechanism (Zhu Section 3.2 and Figs. 1 and 2). Wherein self-attention is also a type of attention mechanism. 
	Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network process and the word-based attention mechanism in the combined cited references to include the MLP and sentence-based attention mechanisms in Zhu. Doing so would enable “a new deep learning method, which combines the attention mechanism with Bi-LSTM [long short-term memory] based on Siamese network to achieve the semantic similarity matching for given question pairs” (Zhu Abstract). 
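
For illustration only, the general idea of a word-based attention mechanism with a context vector, as characterized above (a weight per word vector, followed by a weighted sum that forms the sentence representation), might be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of Chi or of any other cited reference: the dot-product scoring, the softmax normalization, the function name attention_pool, and the sample vectors are all hypothetical.

```python
import math

def attention_pool(word_vectors, context_vector):
    """Return (attention weights, weighted-sum sentence vector).

    Assumption: importance is scored as a dot product between each word
    vector and the context vector; a cited reference may score differently.
    """
    # Importance score of each word relative to the context vector.
    scores = [sum(w * c for w, c in zip(vec, context_vector))
              for vec in word_vectors]
    # Softmax over the scores, shifted by the max for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Sentence representation: attention-weighted sum of the word vectors.
    dim = len(word_vectors[0])
    sentence = [sum(wt * vec[i] for wt, vec in zip(weights, word_vectors))
                for i in range(dim)]
    return weights, sentence

# Hypothetical three-word sentence with two-dimensional word vectors.
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = [2.0, 0.0]
weights, sentence = attention_pool(words, context)
```

In this sketch the first and third words align with the context vector and therefore receive larger weights than the second word, so they contribute more to the resulting sentence vector.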

Regarding claim 3, the rejection of claim 2 is incorporated. Agarwal teaches: 
The method of claim 2, wherein the first neural network and the second neural network are identically structured ([0034] and [0040]: describing that the Siamese model comprises twin NNs. Wherein the twin NNs have an identical structure with each NN of the twin NNs comprising a base network and various layers as shown in Fig. 3.).

Regarding claim 4, the rejection of claim 2 is incorporated. The cited references in combination do not explicitly teach: “wherein the first word-based attention mechanism with a context vector and the second word-based attention mechanism with a context vector are identically structured.” Chi further teaches: that each of the twin NNs in the Siamese model has an identical attention mechanism comprising context vectors (Chi Section 3.2 and Figure 1). That is, first and second attention mechanisms. Wherein the attention mechanisms are word-based attention mechanisms since they operate on word embedding vectors (Chi Section 3.2 and Figure 1).
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network process, the MLP, and the sentence-based attention mechanisms in the combined cited references to include the identically structured word-based attention mechanisms in Chi. Doing so would enable “improved Siamese neural network to assess the semantic similarity between sentences. Our model implements the function of inputting two sentences to obtain the similarity score. We design our model based on the Siamese network using deep Long Short-Term Memory (LSTM) Network. And we add the special attention mechanism to let the model give different words different attention while modeling sentences.” (Chi Abstract). 

Regarding claim 5, the rejection of claim 2 is incorporated. The cited references in combination do not explicitly teach: “wherein the first sentence-based attention mechanism and the second sentence-based attention mechanism are identically structured.” Zhu further teaches: that each of the twin NNs in the Siamese model has identical attention/self-attention mechanisms to encode query sentences, i.e., sentence-based attention mechanism (Zhu Section 3.2 and Figs. 1 and 2). That is, first and second sentence-based attention mechanisms. 
	Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network process and the word-based attention mechanism in the combined cited references to include the sentence-based attention mechanisms in Zhu. Doing so would enable “a new deep learning method, which combines the attention mechanism with Bi-LSTM [long short-term memory] based on Siamese network to achieve the semantic similarity matching for given question pairs” (Zhu Abstract).

Regarding claim 6, the rejection of claim 2 is incorporated. Agarwal teaches: 
The method of claim 2, further comprising: determining, using an output unit including a sigmoid activation function ([0035]-[0036]: describing an output gate (i.e., output unit) with a logistic sigmoid function (i.e., sigmoid activation function).), that the first relationship encoding and the second relationship encoding correspond to an analogous relationship ([0040]-[0041]: describing a determination that the respective query embeddings, xi and xj, (i.e., first and second relationship encodings) are in the same class (i.e., have an analogous relationship) if C is closest to 1. The respective query embeddings were previously described.).

Regarding claim 7, the rejection of claim 2 is incorporated. Agarwal teaches: 
The method of claim 2, further comprising: determining, using an output unit including a sigmoid activation function ([0035]-[0036]: describing an output gate (i.e., output unit) with a logistic sigmoid function (i.e., sigmoid activation function).), that the first relationship encoding and the second relationship encoding do not correspond to an analogous relationship ([0040]-[0041]: describing a determination that the respective query embeddings, xi and xj, (i.e., first and second relationship encodings) are not in the same class (i.e., have a non-analogous relationship) if C is closest to 0. The respective query embeddings were previously described.).
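
For illustration only, the determination mapped for claims 6 and 7 (an output value closest to 1 indicating an analogous relationship, and closest to 0 indicating a non-analogous relationship) might be sketched as follows. This is a minimal sketch under stated assumptions and is not Agarwal's implementation: the function names, the raw input score, and the treatment of an exact tie at 0.5 as non-analogous are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic sigmoid, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def is_analogous(raw_score):
    """Return (decision, C), where C is the sigmoid output in (0, 1).

    Assumption: "closest to 1" is read as C > 0.5 and "closest to 0" as
    C < 0.5; an exact tie is treated here as non-analogous.
    """
    c = sigmoid(raw_score)
    return c > 0.5, c

# Hypothetical raw scores from comparing two relationship encodings.
same_class, c_high = is_analogous(3.0)
different_class, c_low = is_analogous(-3.0)
```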

Regarding claim 10, claim 10 is substantially similar to independent claim 1 and therefore is rejected on the same grounds as claim 1. Claim 10 is a program product claim that corresponds to method claim 1.
A mapping is shown below for the limitations of claim 10 that differ from claim 1. Agarwal teaches:
A computer usable program product comprising one or more computer-readable storage devices, and program instructions stored on at least one of the one or more storage devices, the stored program instructions comprising ([0031], [0033]-[0034], [0078]-[0079], and [0081]: describing hardware and software components (i.e., computer-usable program product) comprising computer-usable/computer-readable storage medium with program instructions stored thereupon.): ….

Regarding claim 11, claim 11 is substantially similar to claim 2 and therefore is rejected on the same grounds as claim 2. Claim 11 is a program product claim that corresponds to method claim 2.

Regarding claim 12, claim 12 is substantially similar to claim 3 and therefore is rejected on the same grounds as claim 3. Claim 12 is a program product claim that corresponds to method claim 3.

Regarding claim 13, claim 13 is substantially similar to claim 4 and therefore is rejected on the same grounds as claim 4. Claim 13 is a program product claim that corresponds to method claim 4.

Regarding claim 14, claim 14 is substantially similar to claim 5 and therefore is rejected on the same grounds as claim 5. Claim 14 is a program product claim that corresponds to method claim 5.

Regarding claim 15, claim 15 is substantially similar to claim 6 and therefore is rejected on the same grounds as claim 6. Claim 15 is a program product claim that corresponds to method claim 6.

Regarding claim 16, claim 16 is substantially similar to claim 7 and therefore is rejected on the same grounds as claim 7. Claim 16 is a program product claim that corresponds to method claim 7.

Regarding claim 20, claim 20 is substantially similar to independent claim 1 and therefore is rejected on the same grounds as claim 1. Claim 20 is a system claim that corresponds to method claim 1.
A mapping is shown below for the limitations of claim 20 that differ from claim 1. Agarwal teaches:
A computer system comprising one or more processors ([0031], [0034], [0076], and [0078]: describing computing systems and processors.), one or more computer-readable memories ([0031], [0033]-[0034], [0078]-[0079], and [0081]: describing various computer-readable memories.), and one or more computer-readable storage devices ([0031], [0033]-[0034], [0078]-[0079], and [0081]: describing various computer-readable storage media/devices.), and program instructions stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, the stored program instructions comprising ([0031], [0033]-[0034], [0078]-[0079], and [0081]: describing program instructions that are storable in the computer-readable storage media/devices, wherein the program instructions are executable by processors via memories.): ….

Claims 8 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Agarwal et. al. (U.S. Pat. App. Pre-Grant Pub. No. 2019/0080225, hereinafter Agarwal), Chi et. al., “A Sentence Similarity Estimation Method Based on Improved Siamese Network” (hereinafter Chi), and Zhu et. al. “A Semantic Similarity Computing Model based on Siamese Network for Duplicate Questions Identification” (hereinafter Zhu) in view of Liu et. al., “Matching Natural Language Sentences with Hierarchical Sentence Factorization” (hereinafter Liu). 


Regarding claim 8, the rejection of claim 2 is incorporated. The cited references in combination do not explicitly teach: “training, using a set of pairs of natural language strings, wherein each natural language string in the set of pairs of natural language strings expresses a relationship between entities included in the natural language string, the first neural network and the second neural network.” Liu teaches: training with training sets for a Siamese network architecture comprising twin NNs, i.e., first and second neural networks (Liu Sections 4, 5.1, and 5.3). The Siamese network architecture is shown in Fig. 5. Wherein the training includes transforming natural language sentence strings into their concepts/entities via hierarchical sentence factorization to show a relationship such as a semantic alignment amongst the concepts/entities (Liu Sections 2-2.1). The transforming is shown in Fig. 1 for a pair of natural language sentence strings A and B.
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network process, the word-based attention mechanism, the MLP, and the sentence-based attention mechanisms in the combined cited references to include the training of the neural networks with entity relationships in the natural language strings in Liu. Doing so would enable “for supervised semantic matching, we extend the existing Siamese network architectures (both for CNN and LSTM) to multi-scaled models, where each scale adopts an individual Siamese network, taking as input the vector representations of the two sentences at the corresponding depth in the factorization trees, ranging from the coarse-grained scale to fine-grained scales…. Our proposed multi-scaled deep neural networks can effectively improve existing deep models by measuring the similarity between a pair of sentences at different semantic granularities.” (Liu Section 1). Wherein CNN denotes convolutional neural network and LSTM denotes long short-term memory.

Regarding claim 17, claim 17 is substantially similar to claim 8 and therefore is rejected on the same grounds as claim 8. Claim 17 is a program product claim that corresponds to method claim 8.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Agarwal et. al. (U.S. Pat. App. Pre-Grant Pub. No. 2019/0080225, hereinafter Agarwal), Chi et. al., “A Sentence Similarity Estimation Method Based on Improved Siamese Network” (hereinafter Chi), Zhu et. al., “A Semantic Similarity Computing Model based on Siamese Network for Duplicate Questions Identification” (hereinafter Zhu), and Liu et. al., “Matching Natural Language Sentences with Hierarchical Sentence Factorization” (hereinafter Liu) in view of Ruvini et. al. (U.S. Pat. App. Pre-Grant Pub. No. 2020/0065873, hereinafter Ruvini).

Regarding claim 9, the rejection of claim 8 is incorporated. The cited references in combination do not explicitly teach on lines 2-5: “generating a set of relation pairs, wherein each relation pair in the set of relation pairs comprises a pair of entities and a relationship relating the pair of entities;….” Liu further teaches: generating relation pairs of words or phrases of natural language sentences as shown in the hierarchical sentence factorization tree (Liu Sections 2-2.1). Wherein the relation pairs comprise entities/concepts pairs and a relationship between the entities/concepts pairs such as a semantic alignment relationship (see previous citation). This is shown in Fig. 1. 
	Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network process, the word-based attention mechanism, the MLP, and the sentence-based attention mechanisms in the combined cited references to include the relation pairs in Liu. Doing so would enable the use of “our Hierarchical Sentence Factorization techniques to transform a sentence into a hierarchical tree structure, which also naturally produces a reordering of the sentence at the root node. This multi-scaled representation form proves to be effective at improving both unsupervised and supervised semantic matching ….” (Liu Section 2).  

While the cited reference Liu teaches the above limitations of claim 9, it does not explicitly teach: “generating a set of positive example pairs, wherein each positive example pair comprises two relation pairs, the relationship of each relation pair being equivalent to each other; generating a set of negative example pairs, wherein each negative example pair comprises two relation pairs, the relationship of each relation pair not being equivalent to each other; combining, forming a training set of example pairs, the set of positive example pairs and the set of negative example pairs; converting, by extracting from a text corpus a natural language string expressing a relationship between entities included in the natural language string, the training set of example pairs to a training set of pairs of natural language strings” on lines 6-21. Ruvini teaches:
“generating a set of positive example pairs, wherein each positive example pair comprises two relation pairs, the relationship of each relation pair being equivalent to each other (Ruvini [0025]-[0028]: describing positive example pairs of sentences, wherein the positive example pairs comprise relation pairs such as semantic parts/components or grammatical parts/components. The relation pairs being similar when a function F computing the similarity between the relation pairs and sentences is close to 1 ([0027]-[0028] and [0050]).);
generating a set of negative example pairs, wherein each negative example pair comprises two relation pairs, the relationship of each relation pair not being equivalent to each other (Ruvini [0025]-[0028]: describing negative example pairs of sentences, wherein the negative example pairs comprise relation pairs such as semantic parts/components or grammatical parts/components. The relation pairs being dissimilar when a function F computing a dissimilarity between the relation pairs and sentences is close to 0 ([0027]-[0028] and [0050]).);
combining, forming a training set of example pairs, the set of positive example pairs and the set of negative example pairs (Ruvini [0028] and [0050]: describing that the Siamese neural networks are trained via training data sets that comprise the positive and negative example pairs, i.e., a combination of the positive and negative example pairs.);
converting, by extracting from a text corpus a natural language string expressing a relationship between entities included in the natural language string (Ruvini [0017], [0024], [0026], and [0048]-[0049]: describing an extraction module for extracting key textual natural language statements from a source, such as textual corpus of natural language reviews, using natural language processing. Wherein the key textual natural language statements comprise a relationship of the entities/words (Ruvini [0025]).), the training set of example pairs to a training set of pairs of natural language strings (Ruvini [0018], [0028], and [0050]: describing that the training example pairs can be converted to a training set of natural language sentences since the training example pairs are based on natural language sources such as reviews and dictionaries, etc. Thus, enabling the training example pairs to be convertible to a training set of natural language sentences.)”.
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the relation pairs in the cited reference to include the positive and negative examples and their use in Ruvini. Doing so would enable “[s]ystems and methods for improving an information provisioning system using a natural language conversational assistant is provided. A machine agent initiates an interactive natural language conversation with a user to provide the user with guidance on one or more products…. Based on the accessed textual statements and an overall empirical utility of each of the accessed textual statements, the machine agent determines one or more statements of the accessed textual statements to convey to the user.” (Ruvini Abstract). Wherein a “natural language conversation system 104 communicates with the extraction system 102 to obtain the information (e.g., textual statements), determines, in real time, which statements to convey (without the use of a script), and presents the statements in a natural language response” (Ruvini [0017]). 
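
For illustration only, the generating and combining steps recited in claim 9 (positive example pairs whose relationships are equivalent, negative example pairs whose relationships are not, and a combined training set) might be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Liu, Ruvini, or the application: the tuple layout (entity, entity, relationship), the function name build_training_set, the 1/0 labels, and the sample data are hypothetical, and the subsequent conversion to natural language strings from a text corpus is not shown.

```python
from itertools import combinations

def build_training_set(relation_pairs):
    """Pair up relation pairs and label each pairing.

    Assumption: each relation pair is (entity_a, entity_b, relationship),
    and two relationships are "equivalent" when their labels are equal.
    """
    positives, negatives = [], []
    for first, second in combinations(relation_pairs, 2):
        if first[2] == second[2]:
            positives.append((first, second, 1))  # equivalent relationships
        else:
            negatives.append((first, second, 0))  # non-equivalent relationships
    # Combine both sets into a single training set of example pairs.
    return positives + negatives

# Hypothetical relation pairs.
relation_pairs = [
    ("Paris", "France", "capital_of"),
    ("Tokyo", "Japan", "capital_of"),
    ("Seine", "Paris", "flows_through"),
]
training_set = build_training_set(relation_pairs)
```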

Claims 18 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Agarwal et. al. (U.S. Pat. App. Pre-Grant Pub. No. 2019/0080225, hereinafter Agarwal), Chi et. al., “A Sentence Similarity Estimation Method Based on Improved Siamese Network” (hereinafter Chi), and Zhu et. al. “A Semantic Similarity Computing Model based on Siamese Network for Duplicate Questions Identification” (hereinafter Zhu) in view of Woods et. al. (U.S. Pat. App. Pre-Grant Pub. No. 2014/0359691, hereinafter Woods). 

Regarding claim 18, the rejection of claim 10 is incorporated. Agarwal teaches:
The computer usable program product of claim 10, wherein the computer usable code is stored in a computer readable storage device in a data processing system ([0031]-[0033], [0078]-[0079], and [0081]: describing computer-readable instructions (i.e., computer usable code) that are stored in memory/computer-readable storage medium as part of a computing system with a processor to process data, wherein the computing system denotes a data processing system.), and ….

While the cited reference Agarwal teaches the above limitations of claim 18, it does not explicitly teach: “wherein the computer usable code is transferred over a network from a remote data processing system”. Woods teaches: that the instructions/program code (i.e., computer usable code) can be “downloaded [i.e., transferred] over a network from a remote data processing system” (Woods [0062]). Similarly, see also Woods [0029] and [0032]: describing transfer of the instructions/program code via the network. 
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network process in the cited reference to include the instruction transfer in Woods. Doing so would enable “[p]rogram code 216 [] located in a functional form on computer-readable media 218 that is selectively removable and may be loaded onto or transferred to data processing system 200 for execution by processor unit 204.” (Woods [0029]). Wherein the program code/instructions are used for implementing natural language processing (Woods Abstract and [0032]).  

Regarding claim 19, the rejection of claim 10 is incorporated. The cited references in combination do not explicitly teach: “wherein the computer usable code is stored in a computer readable storage device in a server data processing system, and wherein the computer usable code is downloaded over a network to a remote data processing system for use in a computer readable storage device associated with the remote data processing system.” Woods teaches:
“wherein the computer usable code is stored in a computer readable storage device in a server data processing system (Woods [0062]: describing that “instructions or [computer usable] code may be stored in a computer readable storage medium in a server data processing system”.), and 
wherein the computer usable code is downloaded over a network to a remote data processing system for use in a computer readable storage device associated with the remote data processing system (Woods [0062]: describing that the “instructions or [computer usable] code may be … adapted to be downloaded over a network to a remote data processing system for use in a computer readable storage medium within the remote system”.).” 
Thus, it would have been obvious to Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network process, the word-based attention mechanism, the MLP, and the sentence-based attention mechanisms in the combined cited references to include the storage and downloading of the instructions along with the storage device and various systems in Woods. Doing so would enable “[p]rogram code 216 [] located in a functional form on computer-readable media 218 that is selectively removable and may be loaded onto or transferred to data processing system 200 for execution by processor unit 204.” (Woods [0029]). Wherein the program code/instructions are used for implementing natural language processing (Woods Abstract and [0032]).  

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Platt et. al. (U.S. Pat. App. Pre-Grant Pub. No. 2010/0324883): describing a Siamese neural network for translingual text representation, which comprises converting document text into a bag-of-words vector.
Monti et. al. (U.S. Pat. No. 11,055,355): describing “a Siamese neural network may be trained to associate similar words to similar vectors in the vector space”. Wherein the Siamese neural network is used to analyze text queries. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SELENE A HAEDI whose telephone number is (571)270-5762. The examiner can normally be reached M-F 11 AM - 7 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, OMAR FERNANDEZ RIVAS, can be reached on (571)272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SELENE A. HAEDI/Examiner, Art Unit 2128