DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This communication is in response to the application filed on January 28, 2021. Claims 1-20 are pending.

Allowable Subject Matter
Claims 1-20 are allowed.

The following is an examiner’s statement of reasons for allowance: 
	The primary reason for the allowance of the independent claims is the inclusion of the following features: retrieving from a hash index database a second set of attribute values based on the first set of attribute values, wherein at least one first attribute value from the first set of attribute values is associated with at least one second attribute value from the second set of attribute values; generating a third set of attribute values based on i) the second set of attribute values and ii) the plurality of second data records, wherein the third set of attribute values is distinct from the first set of attribute values; generating a set of similarity scores that comprises similarity scores between each attribute value of the first set of attribute values and each attribute value of the third set of attribute values; generating a final similarity score indicative of a latent similarity between the first data record and a second data record of the plurality of second data records, by inputting into a trained latent similarity identification machine learning model i) the first set of attribute values, ii) the third set of attribute values, and iii) the set of similarity scores; and identifying a similar second data record of the plurality of second data records that is related to the first data record based on the final similarity score. These features, in combination with all claimed limitations, are not taught by the prior art.
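
For illustration only, the data flow recited in the features above can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the in-memory `HASH_INDEX` and `RECORDS` structures, all function names, the use of Python's `difflib.SequenceMatcher` as the per-pair similarity measure, and the averaging function standing in for the trained latent similarity identification machine learning model are hypothetical. The sketch also reads "distinct from the first set" as excluding values already present in the first set, which is only one possible reading.

```python
from difflib import SequenceMatcher

# Hypothetical stand-ins for the hash index database and the records database.
HASH_INDEX = {"jon smith": ["john smith", "j. smith"]}
RECORDS = [
    {"id": 1, "name": "john smith", "city": "austin"},
    {"id": 2, "name": "mary jones", "city": "boston"},
]


def retrieve_second_set(first_values):
    """Look up the attribute values associated with each first attribute value."""
    second = []
    for value in first_values:
        second.extend(HASH_INDEX.get(value, []))
    return second


def retrieve_records(second_values):
    """Fetch the second data records that contain any second attribute value."""
    return [r for r in RECORDS if any(v in r.values() for v in second_values)]


def generate_third_set(second_values, records, first_values):
    """Combine second-set values with record attributes, minus the first set."""
    candidates = set(second_values)
    for record in records:
        candidates.update(v for k, v in record.items() if k != "id")
    return sorted(candidates - set(first_values))


def similarity_scores(first_values, third_values):
    """Score every (first value, third value) pair; one row per first value."""
    return [
        [SequenceMatcher(None, a, b).ratio() for b in third_values]
        for a in first_values
    ]


def final_similarity_score(first_values, third_values, scores):
    """Placeholder for the trained model: mean of the best score per first value.

    A trained model would also consume the attribute values themselves;
    this stand-in uses only the pairwise scores.
    """
    if not scores or not third_values:
        return 0.0
    best = [max(row) for row in scores]
    return sum(best) / len(best)


if __name__ == "__main__":
    first = ["jon smith", "austin"]
    second = retrieve_second_set(first)
    records = retrieve_records(second)
    third = generate_third_set(second, records, first)
    scores = similarity_scores(first, third)
    print(third, final_similarity_score(first, third, scores))
```

In this sketch the final score is computed once for the retrieved candidate; identifying the similar second data record would then amount to comparing that score across candidate records.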


The closest prior art Matthews (US 2019/0065470) discloses:
receiving a first data record from a computing device, wherein the first data record comprises a first set of attribute values [wherein labeled change records are converted to feature vectors, para 3, 62; feature vectors include topic labels linked to change records, para 93]; 
retrieving from a records database a plurality of second data records based on the second set of attribute values [feature vectors generated from new/additional change records, para 3, 63, 93];
generating a set of similarity scores, wherein the set of similarity scores comprises similarity scores between each attribute value of the first set of attribute values [semantic similarity score calculated between the additional change record and one or more of the labeled change records, para 4-5, 67-70];
generate a final similarity score indicative of a latent similarity between the first data record and a second data record of the plurality of second data records, by inputting into a trained latent similarity identification machine learning model:
i) the first set of attribute values,
iii) the set of similarity scores [supervised learning implemented to compute a probabilistic classification and corresponding confidence score, para 68; probabilistic classification of new change record generated by topic model through latent semantic analysis (LSA), para 6, 73]; and
identifying a similar second data record of the plurality of second data records that is related to the first data record based on the final similarity score [para 117-119].

	However, Matthews’ semantic similarity scores are not calculated using a third set of attribute values generated based on i) the second set of attribute values and ii) the plurality of second data records, wherein the third set of attribute values is distinct from the first set of attribute values, as claimed. Matthews also does not disclose retrieval of the second set of feature vectors from a hash index database, as claimed. Furthermore, Matthews’ calculation of the probabilistic classification and corresponding confidence score does not take the third set of attribute values as an input to the trained model, as claimed.

The closest prior art Gopalan et al (US 2021/0065042) discloses machine learning model training of input sentences to compute multiple similarity scores between the first sentence and previously classified sentences, to identify a second sentence from the previously classified sentences with a similarity score greater than or equal to a similarity score threshold, to identify a sentence type that is associated with the second sentence, and to associate the first sentence with the sentence type [Abstract; Fig 3 and related portions of specification]. However, Gopalan does not teach the specifics of how the final (second) similarity score is generated with respect to the third set of attribute values, as claimed. Furthermore, Gopalan does not disclose retrieval of the second set of feature vectors from a hash index database, as claimed.
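
For illustration only, the threshold-based matching attributed to Gopalan above can be sketched as follows. This is a minimal sketch, not Gopalan's method: the function names, the default threshold of 0.5, and the token-overlap (Jaccard) similarity used in place of a trained machine learning model are assumptions made for the example.

```python
def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; a stand-in for a learned score."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def classify_sentence(first_sentence, classified, threshold=0.5):
    """Return the type of the best previously classified sentence whose
    similarity score meets the threshold, or None if no sentence does."""
    best_type, best_score = None, -1.0
    for text, sentence_type in classified:
        score = jaccard(first_sentence, text)
        if score >= threshold and score > best_score:
            best_type, best_score = sentence_type, score
    return best_type
```

Nothing in this sketch retrieves attribute values from a hash index database or generates a third set of attribute values, which is the distinction drawn above.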

It is for these reasons that the claims distinguish over the prior art. 
Claims 2-8, 10-16 and 18-20 are also allowed by virtue of their dependencies. 

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Lee et al (US 2020/0394455) directed to real-time machine learning modeling and evaluation for insurance policy classification;
Faruquie et al (US 2021/0142191) directed to entity resolution and entity activity indexing using machine learning models for prediction;
Pyati et al (US 2019/0311301) directed to machine learning and visualization of output of multiple evaluations of machine learning models.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHERYL M SHECHTMAN whose telephone number is (571)272-4018.  The examiner can normally be reached on Mon-Fri: 8am-4pm.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Robert Beausoliel, can be reached at 571-272-3645. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.



Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

CHERYL M SHECHTMAN
Patent Examiner
Art Unit 2167

/C.M.S/
/ROBERT W BEAUSOLIEL JR/
Supervisory Patent Examiner, Art Unit 2167