DETAILED ACTION
Claims 1-20 are pending in this office action.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Claims 1, 9, and 15 similarly recite the limitations: receiving a plurality of data records; generating a first comparison vector by comparing a first and a second data record of the plurality of data records, wherein the first comparison vector indicates differences between the first and second data records; training a machine learning model based at least in part on the first comparison vector; evaluating the plurality of data records using the machine learning model; and linking at least two of the plurality of data records based on the evaluation.
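For illustration only, the recited steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation or that of any cited reference: the field names, the binary agree/differ encoding of the comparison vector, the perceptron standing in for the machine learning model, and every function name are hypothetical.

```python
from itertools import combinations

FIELDS = ("name", "dob", "email")  # hypothetical attribute names


def comparison_vector(a, b):
    """Return 1.0 for each field where the two records differ, else 0.0."""
    return [0.0 if a[f] == b[f] else 1.0 for f in FIELDS]


def predict(model, x):
    """Return 1 (match) if the weighted score is positive, else 0."""
    w, bias = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + bias > 0 else 0


def train(vectors, labels, epochs=20, lr=0.5):
    """Fit a tiny perceptron on comparison vectors (label 1 = match)."""
    w, bias = [0.0] * len(FIELDS), 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            err = y - predict((w, bias), x)
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            bias += lr * err
    return w, bias


def link(records, model):
    """Evaluate every record pair and return index pairs predicted to match."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(records), 2)
        if predict(model, comparison_vector(a, b)) == 1
    ]


# Hypothetical records: the first two describe the same person.
records = [
    {"name": "Ann Lee", "dob": "1990-01-01", "email": "ann@x.org"},
    {"name": "Ann Lee", "dob": "1990-01-01", "email": "a.lee@x.org"},
    {"name": "Bob Ray", "dob": "1985-05-05", "email": "bob@y.org"},
]
training_vectors = [
    comparison_vector(records[0], records[1]),  # labeled as a match
    comparison_vector(records[0], records[2]),  # labeled as a non-match
]
model = train(training_vectors, [1, 0])
links = link(records, model)
```

In this sketch the comparison vector is the only input the model sees, so the model learns which field disagreements are tolerable in a match rather than learning anything about the field values themselves.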
a) In analyzing under step 2A Prong One, does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes.
Claims 1, 9, and 15 recite abstract limitations (generating a first comparison vector by comparing a first and a second data record of the plurality of data records; training a machine learning model based at least in part on the first comparison vector; evaluating the plurality of data records using the machine learning model; and linking at least two of the plurality of data records based on the evaluation) that, as drafted, amount to a process, system, or medium that, under its broadest reasonable interpretation, covers performance of the limitations in the mind but for the recitation of generic computer components. That is, other than reciting “by a processor,” nothing in the claim elements precludes the steps from practically being performed in the mind. For example, but for the “at least one processor” language, “generating; training; evaluating; and linking” in the context of these claims encompasses the user manually calculating. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. The human mind can perform the steps of generating; training; evaluating; and linking. Accordingly, the claims recite an abstract idea.

b) In analyzing under step 2A Prong Two, does the claim recite additional elements that integrate the judicial exception into a practical application? No.
This judicial exception is not integrated into a practical application because the claims recite limitations such as one or more computer-readable storage media, computer-readable program code, and one or more computer processors (claims 9, 15) that are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using a generic computer component. The additional limitations “wherein the first comparison vector indicates differences between the first and second data records,” which merely indicates differences between records, and “receiving a plurality of data records,” which is a mere data gathering step, represent well-understood, routine, conventional activity previously known to the industry and are specified at a high level of generality (see MPEP 2106.05(g) or 2106.05(d) for receiving or transmitting data over a network, e.g., Intellectual Ventures v. Symantec; storing and retrieving information in memory: Versata; analyzing data: Genetic Techs; determining: OIP Techs; electronic recordkeeping: Alice Corp). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea.

c) In analyzing under step 2B, does the claim recite additional elements that amount to significantly more than the judicial exception? No.
The claims do not recite any additional elements except generic components such as one or more computer-readable storage media, computer-readable program code, and one or more computer processors, which are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using a generic computer component for generating; training; evaluating; and linking, which are well-understood, routine, and conventional activities. Accordingly, the additional generic elements do not amount to significantly more than the judicial exception because the generic processor and software module are recited at a high level of generality and merely perform the recited functions. The claims are not patent eligible.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional limitations “wherein the first comparison vector indicates differences between the first and second data records,” which merely indicates differences between records, and “receiving a plurality of data records,” which is a mere data gathering step, represent well-understood, routine, conventional activity previously known to the industry and are specified at a high level of generality (see MPEP 2106.05(g) or 2106.05(d) for receiving or transmitting data over a network, e.g., Intellectual Ventures v. Symantec; storing and retrieving information in memory: Versata; analyzing data: Genetic Techs; determining: OIP Techs; electronic recordkeeping: Alice Corp). The claims are not patent eligible.
Dependent claims 2-8, 10-14, and 16-20 depend from claims 1, 9, and 15 and include all the limitations of those claims. Therefore, claims 2-8, 10-14, and 16-20 recite the same abstract ideas being performed in the mind, and the analysis must therefore proceed to Step 2A Prong Two.
Dependent claims 2-8, 10-14, 16-20 recite no additional elements that are sufficient to amount to significantly more than the judicial exception as defined in independent claims 1, 9, 15. 
Claim 2 recites abstract limitations (generating a second comparison vector by comparing third and fourth data records of the plurality of data records; labeling the third and fourth data records as not matching, based on evaluating the second comparison vector using the machine learning model; and refining the machine learning model based on the indication) that, as drafted, amount to a process, system, or medium that, under its broadest reasonable interpretation, covers performance of the limitations in the mind but for the recitation of generic computer components. The claim is directed to an abstract idea. The additional limitation “receiving an indication that the third and fourth data records are matching,” which is a mere data gathering step, represents well-understood, routine, conventional activity previously known to the industry and is specified at a high level of generality (see MPEP 2106.05(g) or 2106.05(d) for receiving or transmitting data over a network, e.g., Intellectual Ventures v. Symantec; storing and retrieving information in memory: Versata; analyzing data: Genetic Techs; determining: OIP Techs; electronic recordkeeping: Alice Corp). Accordingly, the additional generic elements do not amount to significantly more than the judicial exception because the generic processor and software module are recited at a high level of generality and merely perform the recited functions. The claims are not patent eligible.

Claims 3, 10, and 16 recite abstract ideas (generating a second comparison vector by comparing third and fourth data records of the plurality of data records; labeling the third and fourth data records as matching, based on evaluating the second comparison vector using the machine learning model; and refining the machine learning model based on the indication) that, as drafted, amount to a process, system, or medium that, under its broadest reasonable interpretation, covers performance of the limitations in the mind but for the recitation of generic computer components. The claims are directed to an abstract idea. The additional limitation “receiving an indication that the third and fourth data records are not matching,” which is a mere data gathering step, represents well-understood, routine, conventional activity previously known to the industry and is specified at a high level of generality (see MPEP 2106.05(g) or 2106.05(d) for receiving or transmitting data over a network, e.g., Intellectual Ventures v. Symantec; storing and retrieving information in memory: Versata; analyzing data: Genetic Techs; determining: OIP Techs; electronic recordkeeping: Alice Corp). Accordingly, the additional generic elements do not amount to significantly more than the judicial exception because the generic processor and software module are recited at a high level of generality and merely perform the recited functions. The claims are not patent eligible.

Claims 4, 11, and 17 recite abstract ideas (determining that the third and fourth data records represent a false positive; evaluating the second comparison vector to identify a feature pattern indicative of the false positive; and evaluating a set of matching records using the identified feature pattern) that, as drafted, amount to a process, system, or medium that, under its broadest reasonable interpretation, covers performance of the limitations in the mind but for the recitation of generic computer components. The claims are directed to an abstract idea. The claims do not recite any additional elements except the above limitations, which are well-understood, routine, and conventional activities. Accordingly, the additional generic elements do not amount to significantly more than the judicial exception because the generic processor and software module are recited at a high level of generality and merely perform the recited functions. The claims are not patent eligible.

Claims 5, 12, and 18 recite the limitation “wherein the first and second data records each include values for a plurality of attributes, wherein the plurality of attributes include at least one of: (i) a name of a corresponding person; (ii) a numeric identifier of the corresponding person; (iii) a date of birth of the corresponding person; (iv) an email address of the corresponding person; (v) a mailing address of the corresponding person; or (vi) a phone number of the corresponding person,” which merely indicates a type of data. The claims do not recite any additional elements except the above limitations, which are well-understood, routine, and conventional activities. Accordingly, the additional generic elements do not amount to significantly more than the judicial exception because the generic processor and software module are recited at a high level of generality and merely perform the recited functions. The claims are not patent eligible.

Claim 6 recites the limitation (iteratively refining the machine learning model based on the manual review) that, as drafted, amounts to a process, system, or medium that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. The claim is directed to an abstract idea. The additional limitations “providing a subset of the plurality of data records for manual review; upon determining that the machine learning model is sufficiently accurate, deploying the machine learning model,” which are mere data gathering steps, represent well-understood, routine, conventional activity previously known to the industry and are specified at a high level of generality (see MPEP 2106.05(g) or 2106.05(d) for receiving or transmitting data over a network, e.g., Intellectual Ventures v. Symantec; storing and retrieving information in memory: Versata; analyzing data: Genetic Techs; determining: OIP Techs; electronic recordkeeping: Alice Corp). Accordingly, the additional generic elements do not amount to significantly more than the judicial exception because the generic processor and software module are recited at a high level of generality and merely perform the recited functions. The claims are not patent eligible.

Claims 7, 13, and 19 similarly recite the limitation (generating the first comparison vector comprises: identifying differences between the first and second data records; generating one or more scores based on the identified differences using a predefined default configuration; and aggregating the identified differences and the one or more scores) that, as drafted, amounts to a process, system, or medium that, under its broadest reasonable interpretation, covers performance of the limitations in the mind but for the recitation of generic computer components. The claims are directed to an abstract idea. The claims do not recite any additional elements except the above limitations, which are well-understood, routine, and conventional activities. Accordingly, the additional generic elements do not amount to significantly more than the judicial exception because the generic processor and software module are recited at a high level of generality and merely perform the recited functions. The claims are not patent eligible.
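For illustration only, the three recited sub-steps (identifying differences, scoring them under a predefined default configuration, and aggregating the differences with the scores) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the field names, the per-field weights in DEFAULT_CONFIG, the flat vector layout, and all function names are hypothetical and are not drawn from the application or the cited references.

```python
# Hypothetical default configuration: a fixed penalty per differing field.
DEFAULT_CONFIG = {"name": 2.0, "dob": 3.0, "email": 1.0}


def identify_differences(a, b, fields):
    """Map each field to True when the two records disagree on it."""
    return {f: a.get(f) != b.get(f) for f in fields}


def score_differences(diffs, config):
    """Score each identified difference using the configured weight."""
    return {f: (config[f] if differs else 0.0) for f, differs in diffs.items()}


def build_comparison_vector(a, b, config=DEFAULT_CONFIG):
    """Aggregate difference flags and scores into one flat vector."""
    fields = list(config)  # fixed field order taken from the configuration
    diffs = identify_differences(a, b, fields)
    scores = score_differences(diffs, config)
    return [float(diffs[f]) for f in fields] + [scores[f] for f in fields]


# Hypothetical records that differ only in date of birth.
rec_a = {"name": "Ann Lee", "dob": "1990-01-01", "email": "ann@x.org"}
rec_b = {"name": "Ann Lee", "dob": "1990-01-02", "email": "ann@x.org"}
vector = build_comparison_vector(rec_a, rec_b)
```

Keeping the raw difference flags alongside the scores is one possible reading of "aggregating the identified differences and the one or more scores"; other layouts would satisfy the same wording.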

Claims 8, 14, and 20 similarly recite the limitation (wherein training the machine learning model based at least in part on the first comparison vector further comprises: determining a match status of the first and second data records; and  Page 27 of 32 CONFIDENTIAL - PREPARED BY ATTORNEY FOR IBMAttorney Docket No.: 2528.108710) training the machine learning model based further on the match status) that, as drafted, amounts to a process, system, or medium that, under its broadest reasonable interpretation, covers performance of the limitations in the mind but for the recitation of generic computer components. The claims are directed to an abstract idea. The claims do not recite any additional elements except the above limitations, which are well-understood, routine, and conventional activities. Accordingly, the additional generic elements do not amount to significantly more than the judicial exception because the generic processor and software module are recited at a high level of generality and merely perform the recited functions. The claims are not patent eligible.



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5, 9, 12, 15, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Erenich et al (or hereinafter “Eren”) (US 20190188200) in view of Jeon et al (or hereinafter “Jeon”) (US 20170262445).
As to claim 1, Eren teaches the claimed limitations:
“receiving a plurality of data records” as receiving a plurality of records (paragraphs 21, 58, 85);
“generating a first comparison vector by comparing a first and a second data record of the plurality of data records” as creating a statistical model of whether a particular record relates to another record (paragraphs 76, 83) of the plurality of records (paragraph 21). The statistical model is not a first comparison vector;
“evaluating the plurality of data records using the machine learning model” as evaluating a plurality of pairs of records using a machine learning model (paragraphs 80, 83-85). In particular, a pair is informative to the machine learning model if it assists the machine learning model in determining whether two records are related to the same entity and can be based on the one or more evaluations that have been made on the pairs by feature evaluator 540 (paragraph 80); and
“linking at least two of the plurality of data records based on the evaluation” as linking the pair of the plurality of records based on evaluation via an entity (fig. 6, paragraphs 83-85, 96).
Eren does not explicitly teach the claimed limitations:
a first comparison vector, wherein the first comparison vector indicates differences between the first and second data records;
training a machine learning model based at least in part on the first comparison vector.
Jeon teaches the claimed limitations:
“generating a first comparison vector by comparing a first and a second data record of the plurality of data records, wherein the first comparison vector indicates differences between the first and second data records” as generating a difference vector as a first comparison vector based on the difference between the profile vector and the behavior vector as a first and second data record of the plurality of data records (paragraph 29); the difference vector (figs. 3-4) indicates difference scores as differences between the profile vector and the behavior vector (fig. 4, paragraphs 22, 27, 29);
“training a machine learning model based at least in part on the first comparison vector” as training a machine learning model based on the difference vector as the comparison vector (paragraphs 5, 27, 32).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Jeon’s teaching to Eren’s teaching in order to train the model using machine learning techniques for selecting content items that are most likely to be of interest to a user and further to provide an engaging user experience by presenting content items that are likely to interest the user.

As to claims 5, 12, 18, Eren and Jeon teach the claimed limitation “wherein the first and second data records each include values for a plurality of attributes, wherein the plurality of attributes include at least one of: (i) a name of a corresponding person; (ii) a numeric identifier of the corresponding person; (iii) a date of birth of the corresponding person; (iv) an email address of the corresponding person; (v) a mailing address of the corresponding person; or (vi) a phone number of the corresponding person” as a plurality of records, each record including values (e.g., user names, EIDs, city names) for a plurality of attributes that include a phone number (Eren: fig. 2, paragraph 25).

Claim 9 recites the same claimed subject matter as discussed in claim 1; thus claim 9 is rejected for the same reasons as discussed in claim 1. In addition, Eren further teaches a computer program product comprising one or more computer-readable storage media collectively containing computer-readable program code that, when executed by operation of one or more computer processors, performs an operation comprising (paragraphs 45-46).

Claim 15 recites the same claimed subject matter as discussed in claim 1; thus claim 15 is rejected for the same reasons as discussed in claim 1. In addition, Eren further teaches a system comprising: one or more computer processors; and one or more memories collectively containing one or more programs which when executed by the one or more computer processors performs an operation, the operation comprising: (a processor; and memory containing one or more programs which when executed by the processor performs an operation, the operation comprising: paragraphs 45-46).

Claims 2-3, 10, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Eren in view of Jeon and further in view of Guo et al (or hereinafter “Guo”) (US 11372900).
As to claim 2, Eren and Jeon teach the claimed limitations:
“generating a second comparison vector by comparing third and fourth data records of the plurality of data records” as generating a difference vector as a first comparison vector based on the difference between the profile vector and the behavior vector as a first and second data record of the plurality of data records (Jeon: paragraph 29; Eren: paragraphs 76, 83); the difference vector (Jeon: figs. 3-4) indicates difference scores as differences between the profile vector and the behavior vector (Jeon: fig. 4, paragraphs 22, 27, 29);
“labeling the third and fourth data records as not matching, based on evaluating the second comparison vector using the machine learning model” as labeling the two records as matching, based on relating to a similar entity using a machine learning model (Eren: paragraphs 77-79); evaluating the difference vector, e.g., if the difference vector 300C is associated with a user, then the model 240 determines a high (e.g., 80%) probability that the user will interact with candidate content items (Jeon: paragraph 32);
“receiving an indication that the third and fourth data records are matching” as receiving a matching label as indication that two records are matching (Eren: paragraphs 79-80).
Eren does not explicitly teach the claimed limitation:
refining the machine learning model based on the indication.
Guo teaches the claimed limitations:
“refining the machine learning model based on the indication” as updating the machine learning model based on an indication of when a match was properly identified (col. 16, lines 40-67);
“receiving an indication that the third and fourth data records are matching” as receiving an indication that the records are matching (col. 16, lines 40-67).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Guo’s teaching to Eren’s teaching in order to help improve the accuracy of the matching performed by the matching engine and to reduce the number of situations in which multiple unique identifiers are assigned to the same property.

As to claims 3, 10, 16, Eren and Jeon teach the claimed limitations:
“generating a second comparison vector by comparing third and fourth data records of the plurality of data records” as generating a difference vector as a first comparison vector based on the difference between the profile vector and the behavior vector as a first and second data record of the plurality of data records (Jeon: paragraph 29); the difference vector (Jeon: figs. 3-4) indicates difference scores as differences between the profile vector and the behavior vector (Jeon: fig. 4, paragraphs 22, 27, 29);
“labeling the third and fourth data records as matching, based on evaluating the second comparison vector using the machine learning model” as labeling the two records as matching, based on relating to a similar entity using a machine learning model (Eren: paragraphs 77-79); evaluating the difference vector, e.g., if the difference vector 300C is associated with a user, then the model 240 determines a high (e.g., 80%) probability that the user will interact with candidate content items (Jeon: paragraph 32);
“receiving an indication that the third and fourth data records are not matching” as receiving a matching label as indication that two records are matching (Eren: paragraphs 79-80).
Eren does not explicitly teach the claimed limitation:
refining the machine learning model based on the indication.  
Guo teaches the claimed limitations:
“refining the machine learning model based on the indication” as updating the machine learning model based on an indication of when a match was properly identified (col. 16, lines 40-67);
“receiving an indication that the third and fourth data records are not matching” as receiving an indication that the records are matching (col. 16, lines 40-67).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Guo’s teaching to Eren’s teaching in order to help improve the accuracy of the matching performed by the matching engine and to reduce the number of situations in which multiple unique identifiers are assigned to the same property.

Claims 4, 11, 17 are rejected under 35 U.S.C. 103 as being unpatentable over Eren in view of Jeon and further in view of Guo and Rege et al (or hereinafter “Rege”) (US 20220138238).
As to claims 4, 11, 17, Eren and Jeon teach the claimed limitations:
“determining that the third and fourth data records represent a false positive” as determining the records are similar (Eren: paragraphs 62, 74, 76). Similarity is not a false positive;
“evaluating a set of matching records using the identified feature pattern” as evaluating a pair of similar records using the statistical model (Eren: paragraphs 84-85, 92). The statistical model is not the identified feature pattern.
Eren does not explicitly teach the claimed limitations:
a false positive; the identified feature pattern; evaluating the second comparison vector to identify a feature pattern indicative of the false positive.
Jeon teaches the claimed limitation:
“evaluating the second comparison vector to identify a feature pattern indicative of the false positive” as evaluating the difference vector to determine a high (e.g., 80%) probability that the user will interact with candidate content items (e.g., news feed stories including text and/or multimedia, or sponsored content) that 0-20 year old users typically interact with. Determining a high probability is not identifying a feature pattern indicative of the false positive.
Rege teaches the claimed limitations:
“determining that the third and fourth data records represent a false positive” as determining r2 and r2 are a false positive (paragraph 286);
“evaluating a set of matching records using the identified feature pattern” as evaluating a set of matching records using the match rules that include a perfect match on national ID, a minimum match on national ID and surname, and a perfect match on national ID and similar match on surname (paragraphs 106-108);
“identify a feature pattern indicative of the false positive; the identified feature pattern” as identifying a matching rule to identify a false-positive unique identity, e.g., an SSN that does not contain 9 digits (paragraphs 106-108). The matching rule is represented as a feature pattern indicative of the false positive.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Jeon’s teaching and Rege’s teaching to Eren’s system in order to reduce the dimensionality of the unique identities based on contained similarities, to provide a plurality of such dimensionality reduction processes, each process focusing on one aspect of similarity contained within the unique identities, and further to provide multiple clusters of similar unique identities.

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Eren in view of Jeon and further in view of Lowe et al (or hereinafter “Lowe”) (US 9552412) and Melo et al (or hereinafter “Melo”) (US 20210342211).
As to claim 6, Eren does not explicitly teach the claimed limitations: providing a subset of the plurality of data records for manual review; iteratively refining the machine learning model based on the manual review; and upon determining that the machine learning model is sufficiently accurate, deploying the machine learning model.
Lowe teaches providing results for user selection and iteratively refining the query based on the user’s selection as manual review (fig. 1, col. 9, lines 60-67; col. 10, lines 1-10). The query is not the machine learning model.
Melo teaches the machine learning model (paragraph 43) and sending the machine learning model to client devices upon the machine learning model satisfying the validation criterion (paragraph 43).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Lowe’s teaching and Melo’s teaching to Eren’s teaching in order to allow a user to narrow search results, to allow the user to quickly visually parse whether a query requires refinement to reach their desired results, to reduce round-trip exchanges with remote servers and provide a relatively low-latency responsive graphical user interface, and further to perform additional training to improve (e.g., improve the accuracy of) the machine learning model.

Claims 7-8, 13-14, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Eren in view of Jeon and further in view of Yan et al (or hereinafter “Yan”) (US 20200349136) and Meek et al (US 20050049987).
As to claims 7, 13, 19, Eren and Jeon explicitly teach the claimed limitations:
“wherein generating the first comparison vector comprises: identifying differences between the first and second data records” as generating the difference vector based on differences between the profile vector and the behavior vector as the first and second records (Jeon: fig. 5, paragraphs 29-30).
Eren does not explicitly teach the claimed limitations:
generating one or more scores based on the identified differences using a predefined default configuration; and aggregating the identified differences and the one or more scores.
Jeon teaches training a machine learning model based on the difference vector, further comprising using training data and features stored in the training data store 270, vector module 250, and/or user profile store 280 (fig. 5, paragraphs 17, 29).
Yan teaches the claimed limitations:
“identifying differences between the first and second data records” as identifying differences such as DoB and suffix conflicts between records e.g., pair (r1, r6) (fig. 9, paragraphs 82-83, 86);
“generating one or more scores based on the identified differences using a predefined default configuration” as generating scores based on labels (paragraph 45) that are not the identified differences using a predefined default configuration.
“aggregating the identified differences and the one or more scores” as aggregating the identified differences e.g., DoB and suffix conflicts and the one or more scores (fig. 9, paragraphs 82-83).
Meek teaches the claimed limitation generating one or more scores using a predefined default configuration (as generating a score using a default configuration: paragraph 28).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Meek’s teaching, Jeon’s teaching, and Yan’s teaching to Eren’s system in order to select stories and content items that a target user and/or population of users are most likely to be interested in and interact with, to provide content items that match the interests of users, as determined from both the user profile information and behavioral data, to predict a user's possible interests in items such as products, services, and information and the like, and further to create a quick and efficient seed list for items of interest.

As to claims 8, 14, 20, Eren, Jeon, and Yan teach the claimed limitation “wherein training the machine learning model based at least in part on the first comparison vector further comprises: determining a match status of the first and second data records; and training the machine learning model based further on the match status” as training a machine learning model based on the difference vector, further comprising using training data and features stored in the training data store 270, vector module 250, and/or user profile store 280 (Jeon: fig. 5, paragraphs 17, 29; Eren: paragraphs 783-85); identifying a match status for record pairs within a set of records (Yan: paragraphs 17, 58); and training the classifier based on the moderate-match as the match status (Yan: fig. 9, paragraphs 45, 82).

Claims 2-3 are rejected under 35 U.S.C. 103 as being unpatentable over Eren in view of Jeon and further in view of Saxena et al (or hereinafter “Saxena”) (US 20190215574).
As to claim 2, Eren and Jeon teach the claimed limitations:
“generating a second comparison vector by comparing third and fourth data records of the plurality of data records” as generating a difference vector as a first comparison vector based on the difference between the profile vector and the behavior vector as a first and second data record of the plurality of data records (Jeon: paragraph 29), where the difference vector (Jeon: figs. 3-4) indicates difference scores as differences between the profile vector and a behavior vector (Jeon: fig. 4, paragraphs 22, 27, 29);
“labeling the third and fourth data records as not matching, based on evaluating the second comparison vector using the machine learning model” as labeling the two records matching, based on relating to a similar entity using machine learning model (Eren: paragraphs 77-79); evaluating the difference vector e.g., if the difference vector 300C is associated with a user, then the model 240 determines a high (e.g., 80%) probability that the user will interact with candidate content items (Jeon: paragraph 32);
“receiving an indication that the third and fourth data records are matching” as receiving a matching label as indication that two records are matching (Eren: paragraphs 79-80).
Eren does not explicitly teach the claimed limitation:
refining the machine learning model based on the indication.
However, Eren teaches the limitation “the indication” as an indication that the records are matching (col. 16, lines 40-67).
Saxena teaches the claimed limitations:
“refining the machine learning model based on the indication” as refining the machine learning model based on received additional information (paragraph 157).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Saxena’s teaching to Eren’s teaching in order to provide digital media that effectively engages individual users of the communication system, to enable the user to follow other users and engage in a user experience with other users of the communication system, and further to increase popularity of online communication and networking, as well as the increasing amount of digital media shared via users of various communication systems.

As to claims 3, 10, 16, Eren and Jeon teach the claimed limitations:
“generating a second comparison vector by comparing third and fourth data records of the plurality of data records” as generating a difference vector as a first comparison vector based on the difference between the profile vector and the behavior vector as a first and second data record of the plurality of data records (Jeon: paragraph 29), where the difference vector (Jeon: figs. 3-4) indicates difference scores as differences between the profile vector and a behavior vector (Jeon: fig. 4, paragraphs 22, 27, 29);
“labeling the third and fourth data records as matching, based on evaluating the second comparison vector using the machine learning model” as labeling the two records matching, based on relating to a similar entity using machine learning model (Eren: paragraphs 77-79); evaluating the difference vector e.g., if the difference vector 300C is associated with a user, then the model 240 determines a high (e.g., 80%) probability that the user will interact with candidate content items (Jeon: paragraph 32);
“receiving an indication that the third and fourth data records are not matching” as receiving a matching label as indication that two records are matching (Eren: paragraphs 79-80).
Eren does not explicitly teach the claimed limitation:
refining the machine learning model based on the indication.  
Saxena teaches the claimed limitations:
“refining the machine learning model based on the indication” as refining the machine learning model based on received additional information (paragraph 157).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Saxena’s teaching to Eren’s system in order to provide digital media that effectively engages individual users of the communication system, to enable the user to follow other users and engage in a user experience with other users of the communication system, and further to increase popularity of online communication and networking, as well as the increasing amount of digital media shared via users of various communication systems.

Claims 4, 11, 17 are rejected under 35 U.S.C. 103 as being unpatentable over Eren in view of Jeon and further in view of Saxena and Rege et al (or hereinafter “Rege”) (US 20220138238).
As to claims 4, 11, 17, Eren and Jeon teach the claimed limitations:
“determining that the third and fourth data records represent a false positive” as determining the records are similar (Eren: paragraphs 62, 74, 76).  Similarity is not a false positive;
“evaluating a set of matching records using the identified feature pattern” as evaluating a pair of similarity records using the statistical model (Eren: paragraphs 84-85, 92).  The statistical model is not the identified feature pattern.
Eren does not explicitly teach the claimed limitations:
a false positive; the identified feature pattern; evaluating the second comparison vector to identify a feature pattern indicative of the false positive.
Jeon teaches the claimed limitation:
“evaluating the second comparison vector to identify a feature pattern indicative of the false positive” as evaluating the difference vector to determine a high (e.g., 80%) probability that the user will interact with candidate content items (e.g., news feed stories including text and/or multimedia, or sponsored content) that 0-20 year old users typically interact with.  Determining a high probability is not a feature pattern indicative of the false positive.
Rege teaches the claimed limitations:
“determining that the third and fourth data records represent a false positive” as determining r2 and r2 are a false positive (paragraph 286);
“evaluating a set of matching records using the identified feature pattern” as evaluating a set of matching records using the match rule that includes a perfect match on national ID, a minimum match on national ID and surname, a perfect match on national ID and similar match on surname (paragraphs 106-108);
“evaluating the second comparison vector to identify a feature pattern indicative of the false positive; the identified feature pattern” as identifying a matching rule to identify a false positive unique identity, e.g., an SSN that does not contain 9 digits (paragraphs 106-108).  The matching rule is not the second comparison vector.  The false positive unique identity is represented as a feature pattern indicative of the false positive.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Jeon’s teaching and Rege’s teaching to Eren’s system in order to reduce the dimensionality of the unique identities based on contained similarities, to provide a plurality of such dimensionality reduction processes, each process focusing on one aspect of similarity contained within the unique identities, and further to provide multiple clusters of similar unique identities.

Claims 4, 11, 17 are rejected under 35 U.S.C. 103 as being unpatentable over Eren in view of Jeon and further in view of Saxena and Trabelsi et al (or hereinafter “Tra”) (US 20210240834).
As to claims 4, 11, 17, Eren and Jeon teach the claimed limitations:
“determining that the third and fourth data records represent a false positive” as determining the records are similar (Eren: paragraphs 62, 74, 76).  Similarity is not a false positive;
“evaluating a set of matching records using the identified feature pattern” as evaluating a pair of similarity records using the statistical model (Eren: paragraphs 84-85, 92).  The statistical model is not the identified feature pattern.
Eren does not explicitly teach the claimed limitations:
a false positive; the identified feature pattern; evaluating the second comparison vector to identify a feature pattern indicative of the false positive.
Jeon teaches the claimed limitation:
“evaluating the second comparison vector to identify a feature pattern indicative of the false positive” as evaluating the difference vector to determine a high (e.g., 80%) probability that the user will interact with candidate content items (e.g., news feed stories including text and/or multimedia, or sponsored content) that 0-20 year old users typically interact with.  Determining a high probability is not a feature pattern indicative of the false positive.
Tra teaches the claimed limitations:
“to identify a feature pattern indicative of the false positive” as a machine model that can identify patterns in code indicative of the false positive (paragraph 23).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Jeon’s teaching and Tra’s teaching to Eren’s system in order to allow the user to confirm or reject any of the identified issues as being problematic and further to predict the existence of false positives within the set of potential hardcoded credentials.

Claims 4, 11, 17 are rejected under 35 U.S.C. 103 as being unpatentable over Eren in view of Jeon and further in view of Guo and Trabelsi et al (or hereinafter “Tra”) (US 20210240834).
As to claims 4, 11, 17, Eren and Jeon teach the claimed limitations:
“determining that the third and fourth data records represent a false positive” as determining the records are similar (Eren: paragraphs 62, 74, 76).  Similarity is not a false positive;
“evaluating a set of matching records using the identified feature pattern” as evaluating a pair of similarity records using the statistical model (Eren: paragraphs 84-85, 92).  The statistical model is not the identified feature pattern.
Eren does not explicitly teach the claimed limitations:
a false positive; the identified feature pattern; evaluating the second comparison vector to identify a feature pattern indicative of the false positive.
Jeon teaches the claimed limitation:
“evaluating the second comparison vector to identify a feature pattern indicative of the false positive” as evaluating the difference vector to determine a high (e.g., 80%) probability that the user will interact with candidate content items (e.g., news feed stories including text and/or multimedia, or sponsored content) that 0-20 year old users typically interact with.  Determining a high probability is not a feature pattern indicative of the false positive.
Tra teaches the claimed limitations:
“to identify a feature pattern indicative of the false positive” as a machine model that can identify patterns in code indicative of the false positive (paragraph 23).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Jeon’s teaching and Tra’s teaching to Eren’s system in order to allow the user to confirm or reject any of the identified issues as being problematic and further to predict the existence of false positives within the set of potential hardcoded credentials.


Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CAM-Y T TRUONG whose telephone number is (571)272-4042. The examiner can normally be reached (571) 272 4042.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached on (571) 272 4046. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CAM Y T TRUONG/          Primary Examiner, Art Unit 2169