Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Applicant's Response
In Applicant's Response dated 11/3/2021, Applicant amended the Claims and argued against all objections and rejections set forth in the previous Office Action.
All objections and rejections not reproduced below are withdrawn. 
The prior art rejections of the Claims under 35 U.S.C. 103 previously set forth are withdrawn. 
	The examiner appreciates the applicant noting where the support for the amendments is located in the specification. 
The Application was filed on 7/15/2020.
Claim(s) 1-7, 9-17, 19-22 are pending for examination. Claim(s) 1, 11 is/are independent claim(s).

Examiner's Interpretation of Claim(s) 11-17, 19, 20, 22: 
Claim(s) 11-17, 19, 20, 22 is/are interpreted as being statutory. For purposes of 35 U.S.C. 101, the examiner has regarded the “server computing device” as necessarily including hardware. That is to say, the “server computing device” is interpreted as being hardware only or having hardware parts, and not as being software only. 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-3, 5-7, 11-13, 15-17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Harpale; Abhay US Pub. No. 2020/0364270 (Harpale) in view of Manning, Christopher D. et al. “Introduction to Information Retrieval”, Cambridge University Press. 2008 (Manning) in view of Lukas Galke et al. “Evaluating the Impact of Word Embeddings on Similarity Scoring in Practical Information Retrieval” DOI: 10.18420/in2017_215 (Galke).

Claim 1: 
	Harpale teaches: 
	A method for identifying data strings in electronic documents using pattern recognition, the method comprising [¶ 0001] (determining patterns and trends):
…
receiving, by the server computing device, a feedback score from a user, wherein the feedback score corresponds to an accuracy of the calculated cosine similarity between the first processed data string and the second processed data string [¶ 0032, 34-35, 43] (calculated cosine similarity score is approved or rejected by a user, this is a “feedback score”); and
calculating, by the server computing device, an adjusted cosine similarity between the first processed data string and the second processed data string based on the calculated cosine similarity and the feedback score [¶ 0031-32, 35, 42-43] (comparing the predicted similarity score to the true score) [¶ 0007, 16, 31, 33-36, 42, Figs. 1-2] (updating the weights of the cosine similarity on the text based on the user feedback is “adjusted cosine similarity”). 

Harpale also teaches: [¶ 0002, 07, 14-16, 23] (words or elements are represented as vectors). 

	Harpale fails to teach, but Manning teaches: 
receiving, by a server computing device, a first data string corresponding to a first sentence of a first plurality of sentences of an electronic reference document from a first database [Section: 8.3, pg. 153] (test collections);
[Examiner’s Interpretation: the examiner interprets “legal” as being nonfunctional descriptive material (see MPEP 2111.05); that is, there is no functional relationship or functional difference between an “electronic document” and an “electronic legal document”.]
receiving, by the server computing device, a second data string corresponding to a second sentence of a second plurality of sentences of an electronic legal document from a second database [Section: 8.3, pg. 153] (test collections) [Section: 1.4, 8.3, pg. 15, 156] (although not needed, Manning teaches: legal databases, law, paralegal search, and lawsuit search);
processing, by the server computing device, the first data string corresponding to the first sentence into a first processed data string, wherein processing the first data string comprises at least one of removing stop words [Section: 2.2.2, pg. 27-28] (dropping common words or stop words), removing punctuation [Section: 2.2.1, pg. 22-23] (tokenization involves removing punctuation from the string), removing digits, converting all characters to lower-case [Section: 2.2, pg. 30] (converting all characters to lower-case, or case folding), or lemmatization [Section: 2.2.4, pg. 32-34] (lemmatization of words for information retrieval);
processing, by the server computing device, the second data string corresponding to the second sentence into a second processed data string, wherein processing the second data string comprises at least one of removing stop words [Section: 2.2.2, pg. 27-28] (dropping common words or stop words), removing punctuation [Section: 2.2.1, pg. 22-23] (tokenization involves removing punctuation from the string), removing digits, converting all characters to lower-case [Section: 2.2, pg. 30] (converting all characters to lower-case, or case folding), or lemmatization [Section: 2.2.4, pg. 32-34] (lemmatization of words for information retrieval).
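
For illustration only, the text-processing operations mapped above (case folding, removing punctuation, removing digits, and removing stop words) can be sketched as follows. This sketch is not taken from Manning or from the application; the function name preprocess and the stop-word list STOP_WORDS are hypothetical, and lemmatization is omitted because it would require a linguistic resource outside the standard library.

```python
import re
import string

# Hypothetical stop-word list chosen for illustration; not from any cited reference.
STOP_WORDS = {"a", "an", "and", "of", "or", "the", "to"}


def preprocess(sentence: str) -> list[str]:
    """Lower-case the sentence, strip punctuation and digits, then drop stop words."""
    text = sentence.lower()                                            # case folding
    text = text.translate(str.maketrans("", "", string.punctuation))   # remove punctuation
    text = re.sub(r"\d+", "", text)                                    # remove digits
    return [tok for tok in text.split() if tok not in STOP_WORDS]      # remove stop words
```

Under these assumptions, a sentence such as "The Lessee shall pay 3 installments, and the fee." would reduce to the tokens lessee, shall, pay, installments, fee.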

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine the method of using feedback for improvements to cosine similarity in Harpale and the method of retrieving and analyzing text in Manning, with a reasonable expectation of success. 

	
Harpale and Manning fail to teach, but Galke teaches:
computing, by the server computing device, a term frequency-inverse document frequency (TF-IDF) matrix based upon the first processed data string and the second processed data string [Pages 2158-2159] (Word centroid similarity (WCS) using IDF Re-weighted Aggregation of Word Vectors);
generating, by the server computing device, a real-valued vector for each word in the first processed data string and for each word in the second processed data string [Pages 2155-2161, 2165] (a word embedding is a distributed vector representation for words [Mi13], each word is represented by a low-dimensional, compared to the vocabulary size, dense vector);
calculating, by the server computing device, a centroid-based cosine similarity score using the TF-IDF matrix and the real-valued vectors for the words in each of the first processed data string and the second processed data string [Pages 2155-2165] (IDF re-weighted aggregation of word vectors and the associated word centroid similarity);
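
For illustration only, a centroid-based cosine similarity of the general kind mapped above (IDF re-weighted aggregation of word vectors, followed by a cosine comparison of the resulting centroids) can be sketched as follows. This is a minimal sketch under stated assumptions and is not Galke's implementation: the function names, the smoothed IDF formula, and the toy two-dimensional word vectors are all hypothetical.

```python
import math


def idf_weights(docs):
    """Smoothed inverse document frequency for every term in the tokenized documents."""
    n = len(docs)
    vocab = {tok for doc in docs for tok in doc}
    return {
        tok: math.log((1 + n) / (1 + sum(tok in doc for doc in docs))) + 1.0
        for tok in vocab
    }


def centroid(tokens, idf, vectors):
    """IDF-weighted sum of word vectors; repeated tokens contribute term frequency."""
    dim = len(next(iter(vectors.values())))
    acc = [0.0] * dim
    for tok in tokens:
        if tok in vectors:  # tokens without a vector are skipped
            weight = idf.get(tok, 0.0)
            for i, x in enumerate(vectors[tok]):
                acc[i] += weight * x
    return acc


def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy word vectors (assumed values, for illustration only).
VECTORS = {"lease": [1.0, 0.0], "rent": [0.9, 0.1], "tenant": [0.0, 1.0]}
DOCS = [["lease", "tenant"], ["rent", "tenant"], ["tenant"]]
IDF = idf_weights(DOCS)
CENTROIDS = [centroid(doc, IDF, VECTORS) for doc in DOCS]
```

In this toy example, the first two token lists share a common term and contain terms with nearby vectors, so cosine(CENTROIDS[0], CENTROIDS[1]) is higher than cosine(CENTROIDS[0], CENTROIDS[2]).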

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine the method of using feedback for improvements to cosine similarity in Harpale and the method of retrieving and analyzing text in Manning and the method of similarity scoring in Galke, with a reasonable expectation of success. 
	The motivation for this combination would have been to improve information retrieval by improving the “word centroid similarity” retrieval model [Galke: pages 2155, 2156, 2162, 2164-2165].

Claim 2: 
	Manning teaches: 
	The method of claim 1, wherein the server computing device is configured to process the first data string and the second data string by removing stop words [Section: 2.2.2, pg. 27-28] (dropping common words or stop words). 

Claim 3: 
	Manning teaches: 
	The method of claim 1, wherein the server computing device is configured to process the first data string and the second data string by removing punctuation [Section: 2.2.1, pg. 22-23] (tokenization involves removing punctuation from the string).

Claim 5: 
	Manning teaches: 
	The method of claim 1, wherein the server computing device is configured to process the first data string and the second data string by converting all characters to lower-case [Section: 2.2, pg. 30] (converting all characters to lower-case, or case folding).

Claim 6: 
	Manning teaches: 
	The method of claim 1, wherein the server computing device is configured to process the first data string and the second data string through lemmatization [Section: 2.2.4, pg. 32-34] (lemmatization of words for information retrieval).

Claim 7: 
	Harpale teaches: 
The method of claim 1, wherein the term frequency-inverse document frequency algorithm comprises comparing a plurality of words of the first processed data string with a plurality of words of the second processed data string one word at a time [¶ 0021-22] (in the formula, each word or element, x, y, is compared; this means it is done “one word at a time”).

Claims 11-13, 15-17: 
Claim(s) 11 is/are substantially similar to Claim 1 and are rejected using the same art and the same rationale as Claim 1. 
Claim 1 is a “method” claim, Claim 11 is a “system” claim, but the steps or elements of each claim are essentially the same. 

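Claim(s) 12 is/are substantially similar to Claim 2 and are rejected using the same art and the same rationale as Claim 2. 
Claim(s) 13 is/are substantially similar to Claim 3 and are rejected using the same art and the same rationale as Claim 3. 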
Claim(s) 15 is/are substantially similar to Claim 5 and are rejected using the same art and the same rationale as Claim 5. 
Claim(s) 16 is/are substantially similar to Claim 6 and are rejected using the same art and the same rationale as Claim 6. 
Claim(s) 17 is/are substantially similar to Claim 7 and are rejected using the same art and the same rationale as Claim 7. 

Claim(s) 4, 9, 10, 14, 19, 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Harpale; Abhay US Pub. No. 2020/0364270 (Harpale) in view of Manning, Christopher D. et al. “Introduction to Information Retrieval”, Cambridge University Press. 2008 (Manning) in view of Lukas Galke et al. “Evaluating the Impact of Word Embeddings on Similarity Scoring in Practical Information Retrieval” DOI: 10.18420/in2017_215 (Galke) in view of Kowolenko; Michael et al. US Pub. No. 2020/0409951 (Kowolenko).
Claim 4: 
Harpale, Manning, and Galke teach all the elements shown above.
Harpale, Manning, and Galke fail to teach, but Kowolenko teaches:
The method of claim 1, wherein the server computing device is configured to process the first data string and the second data string by removing digits [¶ 0061].

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine the method of using feedback for improvements to cosine similarity in Harpale and the method of retrieving and analyzing text in Manning and the method of similarity scoring in Galke and the method of data analysis in Kowolenko, with a reasonable expectation of success. 
	The motivation for this combination would have been “improved specificity and content (accuracy and precision) with regard to the results obtained” [Kowolenko: Abstract, ¶ 0036, 39, 40, 56].

Claim 9: 
Kowolenko teaches: 
The method of claim 1, wherein the server computing device is configured to calculate the adjusted cosine similarity based on a random forest machine learning algorithm [¶ 0075] (machine learning using random forest).
Manning also teaches: [Section: 15, intro, pg. 319] (random forest). 

Claim 10: 
Kowolenko teaches:
The method of claim 1, wherein the server computing device is configured to generate for display the first sentence of the first plurality of sentences, the second sentence of the second plurality of sentences, and at least one of the calculated cosine similarity, the feedback score, or the calculated adjusted cosine similarity [¶ 0062] (phrases are displayed to user) [¶ 0059-60, 65] (frequency analysis, term frequency-inverse document frequency vectorization) [¶ 0036, 39, 42-43, 57, 62, 70] (feedback loop).

Claims 14, 19, 20: 
Claim(s) 14 is/are substantially similar to Claim 4 and are rejected using the same art and the same rationale as Claim 4. 
Claim(s) 19 is/are substantially similar to Claim 9 and are rejected using the same art and the same rationale as Claim 9. 
Claim(s) 20 is/are substantially similar to Claim 10 and are rejected using the same art and the same rationale as Claim 10. 

Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Please See PTO-892: Notice of References Cited.

Bhatt; Himanshu Sharad et al. US Pub. No. 2017/0337266 (Bhatt) [¶ 0029] (machine learning using random forest) [¶ 0025, 41-42, 71, 74-82] (centroid of cluster) [¶ 0040-42, 68] (Term frequency-Inverse document frequency (TF-IDF) algorithm) [¶ 0042, 79] (cosine similarity using TF-IDF and centroid cluster).

Response to Arguments
Applicant’s arguments with respect to claim(s) 1-7, 9-17, 19-22 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Allowable Subject Matter
Claims 21 and 22 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BENJAMIN J SMITH whose telephone number is (571)270-3825.  The examiner can normally be reached on Monday - Friday 11:00 - 7:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Scott Baderman, can be reached on (571)272-3644.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/Benjamin Smith/
Examiner, Art Unit 2144
Direct Phone: 571-270-3825
Direct Fax: 571-270-4825
Email: benjamin.smith@uspto.gov