DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA and is in response to communications filed on 3/22/2022, in which claims 1-16 and 19-22 are presented for examination.

Drawings
Drawings have been acknowledged and are acceptable for examination purposes.

Specification
Specification has been acknowledged and is acceptable for examination purposes.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-16 and 19-22 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite obtaining and identifying entities from both structured and unstructured data, analyzing the entities through a knowledge graph, determining, based on the analysis, a probability that entities correspond to one another, using machine learning to determine a second probability, and performing some action based on the determination.  These limitations describe steps that amount to a mental process (including observation, evaluation, judgment, and opinion; see MPEP 2106.04).  This judicial exception is not integrated into a practical application because it covers performance of the limitation in the mind but for the recitation of generic computer components.  That is, other than reciting “processors” or “memories”, nothing in the claim element precludes the step from practically being performed in the mind.  For example, but for the use of “processors” or “memories”, the “determining” step in this context encompasses a user manually deciding how similar two items are based on the relationships they share.  Accordingly, the claim recites an abstract idea.
The claim generally requires data to be received.  The data is then analyzed with a knowledge graph to identify characteristics.  There is also a training algorithm or model used in order to accurately determine if two entities in the data are associated.  This could be a mental process since the mind can take in data, analyze the data, and make a conclusion based on a learned method, model, or algorithm.
The claim is also not integrated into a practical application because “training, by the device and using a machine learning module, the entity resolution model” in order to determine that an entity is associated with a second entity based on a probability is a limitation that can still practically be performed in the mind, because one characteristic of mental capacity is the ability to learn in a general sense.  This limitation doesn’t go into detail on how the learning takes place in such a way as to distinguish it from a general learning ability.  Nor is the limitation integrated into a practical application beyond better performing the mental activity of comparing different items.  Also, in the dependent claims that display an output of the determination using an interface, the display is performed using a generic computing component such as a computer screen and is not a meaningful function for applying the determination that two entities are similar.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because, as discussed above with respect to integration of the abstract idea into a practical application, the additional element of using processors and memories to perform the determination step amounts to no more than mere instructions to apply the exception using a generic computer component.  Although the claim recites a limitation for identifying relationships using a machine learning technique, this is also insufficient to amount to significantly more than the judicial exception because it is also recited at a high level of generality.  There’s also no instruction on how the machine learning technique develops to properly identify the relationships.  Furthermore, learning techniques and models in general can be performed in the mind.  An obvious example is that people learn through experience.  Therefore, a general recitation of a machine learning technique is a mere instruction to apply an exception at a high level of generality.  Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.  Therefore, the claim is not patent eligible.
Applicant has amended the claim in an attempt to overcome this rejection by adding limitations which include using specific models.  However, these steps do not break away from the abstract idea because models are rules applied to data in order to analyze and judge the data in a specific way.  Storage of unstructured data to structured data is organization of data, which is a mental process.  Using models in order to define rules on how the data is analyzed is also a mental process.
Similar claims 8 and 15 are rejected for similar reasons.

Dependent claims go into further detail about analysis of relationships between the entities using a knowledge graph as well as applying a threshold to determine a probability for how similar the entities are to each other.  However, these are further mental processes that don’t amount to significantly more than the judicial exception of a mental process which is an abstract idea.
Other dependent claims go into detail about what kind of sources are used as well as using a machine learning model.  However, these do not amount to significantly more than the judicial exception of a mental process which is an abstract idea either since one can be trained with examples to perform a mental process more efficiently using various sources.

Claim 2 is dependent on claim 1 and includes all limitations of claim 1.  Claim 2 also includes a clarification for unstructured data, but doesn’t add significantly more than the judicial exception, and therefore doesn’t break away from the reasons for the identified abstract idea.  Similar claims 12 and 13 are rejected for similar reasons.

Claim 3 is dependent on claim 1 and includes all limitations of claim 1.  Claim 3 also includes a limitation for an entity recognition model which is trained using machine learning, but doesn’t add significantly more than the judicial exception, and therefore doesn’t break away from the reasons for the identified abstract idea.  Similar claims 9-11 and 20 are rejected for similar reasons.

Claim 4 is dependent on claim 1 and includes all limitations of claim 1.  Claim 4 also includes a clarification for structured data to include data from a data structure that is associated with a characteristic of the unstructured data, but doesn’t add significantly more than the judicial exception, and therefore doesn’t break away from the reasons for the identified abstract idea.

Claim 5 is dependent on claim 1 and includes all limitations of claim 1.  Claim 5 also includes a clarification for determining the probability with linking and weighting associated with links, but doesn’t add significantly more than the judicial exception, and therefore doesn’t break away from the reasons for the identified abstract idea.

Claim 6 is dependent on claim 1 and includes all limitations of claim 1.  Claim 6 also includes a clarification for performing the action including sending information identifying the probability, but doesn’t add significantly more than the judicial exception, and therefore doesn’t break away from the reasons for the identified abstract idea.  

Claim 7 is dependent on claim 1 and includes all limitations of claim 1.  Claim 7 also includes a clarification for performing the action including satisfying a threshold, but doesn’t add significantly more than the judicial exception, and therefore doesn’t break away from the reasons for the identified abstract idea.  Similar claims 9, 14, and 18 are rejected for similar reasons.

Claim 16 is dependent on claim 15 and includes all limitations of claim 15.  Claim 16 also includes a clarification for calculating a probability based on representations of the set of characteristics, the entity and other entity, but doesn’t add significantly more than the judicial exception, and therefore doesn’t break away from the reasons for the identified abstract idea.

Claim 19 is dependent on claim 15 and includes all limitations of claim 15.  Claim 19 also includes a clarification for obtaining unstructured data with a web crawler, but doesn’t add significantly more than the judicial exception, and therefore doesn’t break away from the reasons for the identified abstract idea.

Claims 20 and 21 are dependent on claim 15 and include all limitations of claim 15.  Claims 20 and 21 also include associating entities with a sentiment or characteristic, but don’t add significantly more than the judicial exception, and therefore don’t break away from the reasons for the identified abstract idea.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1-7, 15, 16, and 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over Michalak et al. US 9535902 B1 (hereinafter referred to as “Michalak”) in view of Roque et al. US 20160203130 A1 (hereinafter referred to as “Roque”).

As per claim 1, Michalak teaches:
A method, comprising: 
receiving, by a device, unstructured data (Michalak, column 1, lines 40-50 –Includes obtaining unstructured text data); 
identifying, by the device, a first plurality of entities in the unstructured data (Michalak, column 1, lines 40-50 – Including a plurality of references corresponding to entities); 
identifying, by the device, first relationships associated with the first plurality of entities using an entity relation model (Michalak, column 6, lines 1-5 – an entity-centric approach to data analytics, focused on uncovering the interesting facts, concepts, events, and relationships defined in the data); 
receiving, by the device, structured data (Michalak, column 2, lines 1-5 and fig. 8A – Obtaining structured data); 
identifying, by the device, a second plurality of entities in the structured data (Michalak column 6, lines 9-15 – Entity extraction can be done on both structured and unstructured data.  Column 23, lines 44-50 and fig. 8A – With respect to coreference unit B, the attributes may be associated with structured entities stored in a structured data source); 
identifying, by the device, second relationships associated with the second plurality of entities using the entity relation model (Michalak, column 5, lines 28-35 – Provide for implementing analytics using both supervised and unsupervised machine learning techniques. Supervised mathematical models can encode a variety of different data "features" and associated weight information, which can be stored in a data file and used to reconstruct a model at run-time.  Column 6, lines 1-5 – An entity-centric approach to data analytics, focused on uncovering the interesting facts, concepts, events, and relationships defined in the data, wherein relationships in the structured data is interpreted as second relationships in column 6, lines 1-5); 
generating, by the device, a first knowledge graph that is representative of the first plurality of entities and the first relationships (Michalak, column 5, lines 64-67 – Building a graph of global enterprise knowledge from data.  Column 6, lines 9-15 – Assemble a rich Knowledge Graph from both unstructured data and structured data.  Column 6, lines 1-5 – an entity-centric approach to data analytics, focused on uncovering the interesting facts, concepts, events, and relationships defined in the data);
generating, by the device, a second knowledge graph, that is representative of the second plurality of entities and the second relationships (Michalak column 10, lines 47-57 – A knowledge graph from both structured and unstructured data as well as a knowledge graph which can be viewed as two sub-graphs which are separate);
determining, by the device, a first probability that a first node from the first knowledge graph and a second node from the knowledge graph correspond to a same entity that is a first entity (Michalak, column 24, lines 32-50 – A measure of similarity between feature vector A and feature vector B is computed and compared to a threshold degree or amount of similarity. The measure of similarity may represent a degree or amount by which coreference unit A and coreference unit B correspond to the same entity); 
generating, by the device and based on determining the first probability, an entity resolution model to determine a second probability that a third node, from a third knowledge graph, and a fourth knowledge graph, correspond to another same entity that is a second entity, wherein the third knowledge graph is representative of a third plurality of entities identified in additional unstructured data, the fourth knowledge graph is representative of a fourth plurality of entities identified in additional structured data, and the second entity is different from the first entity (Michalak, column 10, lines 47-57 – A Knowledge Graph can be viewed as two separate and related sub-graphs: the knowledge sub-graph identifying the entities present in text and the relationships between them; and the information sub-graph which identifies the specific pieces of information that act as evidence/support for the knowledge sub-graph, wherein the more evidence/support for relationships between entities is interpreted as probabilities that nodes correspond to entities.  Also column 24, lines 32-50 for structured and unstructured data), and 
wherein the entity resolution model includes at least one of:
an item-based collaborative filtering model that identifies a relationship between at least two entities based on one or more characteristics of the at least two entities, or a single value decomposition model that identifies the relationship between the at least two entities based the one or more characteristics of the at least two entities (Michalak, column 7, lines 53-65 – Calculations can include similarity or dissimilarity between different classes or sets of objects.  Column 24, lines 32-50 – A measure of similarity between feature vector A and feature vector B is computed and compared to a threshold degree or amount of similarity. The measure of similarity may represent a degree or amount by which coreference unit A and coreference unit B correspond to the same entity.  Michalak, column 23, lines 17-32 – Attributes are used for comparing coreference units, wherein attributes are interpreted as characteristics);
training, by the device and using a machine learning module, the entity resolution model to determine the second probability (Michalak, column 5, lines 28-35 – Provide for implementing analytics using both supervised and unsupervised machine learning techniques. Supervised mathematical models can encode a variety of different data "features" and associated weight information, which can be stored in a data file and used to reconstruct a model at run-time.  Column 24, lines 32-50 – If the computed measure of similarity exceeds the threshold amount or degree, then coreference unit A and coreference unit B are resolved to the same entity.  However, if the threshold isn’t exceeded, further coreference resolution processes may occur in order to move towards a more definitive determination, wherein this is interpreted as determining more probabilities that are based on a first probability to determine if two entities are associated); and
automatically determining, by the device and based on training the entity resolution model and determining that the second probability exceeds a thresholds probability, that the third node and the fourth node correspond to the second entity (Michalak, column 24, lines 32-50 – A measure of similarity between feature vector A and feature vector B is computed and compared to a threshold degree or amount of similarity. The measure of similarity may represent a degree or amount by which coreference unit A and coreference unit B correspond to the same entity).
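For illustration only, the threshold comparison described in the cited passage (Michalak, column 24, lines 32-50) can be sketched as follows. The passage as summarized above does not name a particular similarity measure, so the use of cosine similarity, the function names, the example vectors, and the 0.8 threshold are all assumptions made for this sketch and are not taken from Michalak.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric feature vectors.

    Cosine similarity is an assumed measure chosen for illustration.
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector carries no evidence of similarity
    return dot / (norm_a * norm_b)


def resolve_same_entity(a, b, threshold=0.8):
    """Return (similarity, resolved); resolved is True when the
    similarity exceeds the (hypothetical) threshold."""
    similarity = cosine_similarity(a, b)
    return similarity, similarity > threshold


# Hypothetical feature vectors for coreference units A and B.
feature_vector_a = [1.0, 0.0, 1.0, 1.0]
feature_vector_b = [1.0, 0.0, 1.0, 0.0]
similarity, resolved = resolve_same_entity(feature_vector_a, feature_vector_b)
```

With these example vectors the similarity is 2/sqrt(6), roughly 0.816, which exceeds the assumed threshold, so the two units would be resolved to the same entity in this sketch.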
Michalak doesn’t go into detail about storing probabilities or scores inside the nodes to determine how close two nodes are in similarity, however, Roque teaches:
storing, by the device and in an edge between the first node and the second node, the first probability that the first node and the second node correspond to the same entity (Roque, [0192] – The edges of said semantic graph may include the amount or type of evidence in said document of dependencies or relations that support said edges as a weight parameter, and said distance may be calculated using a variation of a Dijkstra algorithm, where said distance is inversely correlated with the link weight or amount of evidence, such that connections between said nodes that are supported by a large amount of evidence have a lower calculated distance);
utilizing, by the device, the first probability, that is stored in the edge between the first node and the second node, to determine whether the first node and/or the second node corresponds to another entity identified in the unstructured data (Roque, [0102] – optionally also store longer paths (indirect connections) between nodes if the path has a lot of evidential support, wherein indirect connections are interpreted as determining correspondence with other entities based on an edge connecting two nodes that aren’t directly related to the other node);
It would have been obvious for one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify Michalak’s invention in view of Roque in order to store, in an edge, a probability that two entities are related; this is a simple substitution from storing the similarity score in a general computer component to storing the score directly between the nodes in the edge.  This is also obvious because distance between items in a knowledge graph is a known technique which yields predictable results such as showing a user which items are most related to each other (Roque, paragraph [0192]).
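For illustration only, the edge-weight technique attributed to Roque at [0192] (an evidence weight stored on each edge, with distance inversely correlated with that weight and computed by a variation of Dijkstra's algorithm) can be sketched as follows. The cited paragraph as summarized above does not specify the variation, so the sketch uses standard Dijkstra with an assumed edge cost of 1 / weight; the graph representation, function names, node labels, and weights are hypothetical.

```python
import heapq


def add_edge(graph, u, v, weight):
    """Store an undirected edge together with its evidence weight.

    The weight must be positive so that 1 / weight is a valid cost.
    """
    if weight <= 0:
        raise ValueError("weight must be positive")
    graph.setdefault(u, {})[v] = weight
    graph.setdefault(v, {})[u] = weight


def shortest_distances(graph, source):
    """Standard Dijkstra where each edge costs 1 / weight (an assumption),
    so strongly supported edges contribute a smaller distance."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + 1.0 / weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Hypothetical graph: the weights stand in for stored similarity scores.
graph = {}
add_edge(graph, "A", "B", 0.5)    # cost 2
add_edge(graph, "B", "C", 0.25)   # cost 4
add_edge(graph, "A", "C", 0.125)  # cost 8 (weak direct evidence)
distances = shortest_distances(graph, "A")
```

In this sketch the indirect path from A to C through B (total cost 6) is shorter than the weakly supported direct edge (cost 8), which is one way an indirect connection with more evidential support could be preferred.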

As per claim 2, Michalak as modified teaches:
The method of claim 1, wherein receiving the unstructured data comprises: 
obtaining at least one of a webpage, a social media post, or electronic article via a network, wherein the unstructured data includes data from at least one of the webpage, the social media post, or the electronic article (Michalak, column 6, lines 9-20 – Unstructured data (e.g., web, email, instant messaging, or social media data)).

As per claim 3, Michalak as modified teaches:
The method of claim 1, wherein the first plurality of entities are identified in the unstructured data using an entity recognition model and the second plurality of entities are identified in the structured data using the entity recognition model that was trained using a machine learning technique (Michalak, column 5, lines 28-35 – Provide for implementing analytics using both supervised and unsupervised machine learning techniques. Supervised mathematical models can encode a variety of different data "features" and associated weight information, which can be stored in a data file and used to reconstruct a model at run-time).

As per claim 4, Michalak as modified teaches:
The method of claim 1, wherein the structured data comprises data from a data structure that is associated with a characteristic of the unstructured data (Michalak, column 6, lines 15-22 – Structured data (e.g., customer information, orders or trades, transactions, or reference data)).

As per claim 5, Michalak as modified teaches:
The method of claim 1, wherein determining the probability that the first node and the second node correspond to the same entity comprises: 
determining a fifth node, from the first knowledge graph, to which the first node is linked;
determining a sixth node, from the second knowledge graph, to which the second node is linked;
determining a third probability weighting associated with a second link between the first node and the fifth node (Michalak, column 15, lines 65-67 and column 16, lines 1-10 – Component of the system performs the weighting of features based on statistics associated with a global graph.  "Similarity" reasoning (see, e.g., "Similarity" at block 118 of FIG. 1) can relate to, given a set of features, using an operator to compare concepts, relationships, or messages and generate a ranked order, wherein similarity reasoning is interpreted as determining the probability that the first and second node correspond to the same entity by comparing relationships); 
determining a fourth probability weighting associated with a second link between the second node and the sixth node (Michalak, column 15, lines 65-67 and column 16, lines 1-10); and 
determining the first probability that the first node and the second node correspond to the same entity based on the third probability weighting and the fourth probability weighting (Michalak, column 15, lines 65-67 and column 16, lines 1-10.  See also column 20, lines 41-50 and column 22, lines 1-20).

As per claim 6, Michalak as modified teaches:
The method of claim 1, further comprising: 
sending, based on determining the second probability, information identifying the second probability to a user device to permit the user device to display the information identifying the second probability (Michalak, column 9, lines 5-15 – A Particularized API layer can enable organizations and developers to integrate aspects of the present disclosure with third party applications (e.g., business intelligence tools, search and discovery tools) or create customized user interfaces that utilize results of the analysis, wherein an interface is inherently on a user device and the results of the analysis is interpreted as the probability that the entities are the same).

As per claim 7, Michalak as modified teaches:
The method of claim 1, further comprising: 
analyzing, based on automatically determining that the third node and the fourth node correspond to the second entity, the unstructured data to determine a characteristic of the unstructured data (Michalak, column 24, lines 32-50); and 
designating that the second entity is associated with the characteristic (Michalak, column 24, lines 32-50 – If the computed measure of similarity exceeds the threshold amount or degree, then coreference unit A and coreference unit B are resolved to the same entity).

Claims 15, 16 and 20 are directed to a non-transitory computer-readable medium performing steps recited in claims 1-8 with substantially the same limitations.  Therefore, the rejections made to claims 1-8 are applied to claims 15, 16 and 20.

As per claim 21, Michalak as modified teaches:
The non-transitory computer-readable medium of claim 15, wherein the one or more instructions, when executed by the one or more processors, further cause the one or more processors to: 
determine, from the unstructured data and based on automatically determining that the third entity is the same entity as the fourth entity, a sentiment of the third entity and the fourth entity; and 
indicate that the third entity is associated with the sentiment (Michalak, column 18, lines 62-67 – Sentiment analysis).

As per claim 22, Michalak as modified teaches:
The non-transitory computer-readable medium of claim 15, wherein the one or more instructions, when executed by the one or more processors, further cause the one or more processors to: 
designate, based on automatically determining that the third entity is the same entity as the fourth entity, the fourth entity as being associated with the characteristic (Michalak, column 24, lines 32-50 – If the computed measure of similarity exceeds the threshold amount or degree, then coreference unit A and coreference unit B are resolved to the same entity).

Claims 8-14 are rejected under 35 U.S.C. 103 as being unpatentable over Michalak in view of Roque and further in view of Bonin et al. US 20180197088 A1 (hereinafter referred to as “Bonin”).

As per claim 8, Michalak teaches:
A device, comprising: 
one or more memories (Michalak, column 1, lines 59-65 – One or more processors and a memory device that is operatively coupled to the one or more processors); and 
one or more processors, communicatively coupled to the one or more memories, configured to: 
receive unstructured data (Michalak, column 1, lines 40-50 – Includes obtaining unstructured text data); 
identify a first entity in the unstructured data (Michalak, column 1, lines 40-50 – Including a plurality of references corresponding to entities); 
determine a first set of characteristics of the first entity based on entity information in the unstructured data (Michalak, column 6, lines 15-22 – Structured data (e.g., customer information, orders or trades, transactions, or reference data)); 
generate a first knowledge graph associated with the first entity, wherein the first knowledge graph includes a first internal node, for the first entity (Michalak, column 5, lines 64-67 and fig. 3 – Building a graph of global enterprise knowledge from data.  Column 6, lines 9-15 – Assemble a rich Knowledge Graph from both unstructured data and structured data), 
receive structured data (Michalak, column 2, lines 1-5 and fig. 8A – Obtaining structured data); 
identify a second entity in the structured data (Michalak column 6, lines 9-15 – Entity extraction can be done on both structured and unstructured data.  Column 23, lines 44-50 and fig. 8A – With respect to coreference unit B, the attributes may be associated with structured entities stored in a structured data source); 
determine a second set of characteristics of the second entity based on the structured data (Michalak, column 6, lines 15-22 – Structured data (e.g., customer information, orders or trades, transactions, or reference data)); 
wherein the second knowledge graph includes a second internal node, for the second entity, that is linked by corresponding second edges to second external nodes corresponding to the second set of characteristics (Michalak, column 5, lines 64-67 and fig. 3 – Building a graph of global enterprise knowledge from data.  Column 6, lines 9-15 – Assemble a rich Knowledge Graph from both unstructured data and structured data); 
determine, based on analyzing the first knowledge graph and the second knowledge graph a first probability that the first entity corresponds to the second entity (Michalak, column 24, lines 32-50 – A measure of similarity between feature vector A and feature vector B is computed and compared to a threshold degree or amount of similarity. The measure of similarity may represent a degree or amount by which coreference unit A and coreference unit B correspond to the same entity);
generate, based on determining the first probability, an entity resolution model to determine a second probability that a third entity corresponds to a fourth entity,
wherein the third entity corresponds to a third internal node of a third knowledge graph, and the fourth entity corresponds to a fourth internal node of a fourth knowledge graph, and 
wherein the entity resolution model includes at least one of:
an item-based collaborative filtering model that identifies a relationship between at least two entities based on one or more characteristics of the at least two entities, or a single value decomposition model that identifies the relationship between the at least two entities based the one or more characteristics of the at least two entities (Michalak, column 7, lines 53-65 – Calculations can include similarity or dissimilarity between different classes or sets of objects.  Column 24, lines 32-50 – A measure of similarity between feature vector A and feature vector B is computed and compared to a threshold degree or amount of similarity. The measure of similarity may represent a degree or amount by which coreference unit A and coreference unit B correspond to the same entity.  Michalak, column 23, lines 17-32 – Attributes are used for comparing coreference units, wherein attributes are interpreted as characteristics);
train, using a machine learning module, the entity resolution model to determine the second probability (Michalak, column 5, lines 28-35 – Provide for implementing analytics using both supervised and unsupervised machine learning techniques. Supervised mathematical models can encode a variety of different data "features" and associated weight information, which can be stored in a data file and used to reconstruct a model at run-time.  Column 22, lines 57-67 – Attributes can come from a variety of sources which are interpreted as entities that contribute to accuracy of connection data.  Column 24, lines 32-50 – If the computed measure of similarity exceeds the threshold amount or degree, then coreference unit A and coreference unit B are resolved to the same entity.  However, if the threshold isn’t exceeded, further coreference resolution processes may occur in order to move towards a more definitive determination, wherein this is interpreted as determining more probabilities that are based on a first probability to determine if two entities are associated); and
automatically determine, based on training the entity resolution model and determining that the second probability exceeds a thresholds probability, that the third entity is a same entity as the fourth entity (Michalak, column 24, lines 32-50 – A measure of similarity between feature vector A and feature vector B is computed and compared to a threshold degree or amount of similarity. The measure of similarity may represent a degree or amount by which coreference unit A and coreference unit B correspond to the same entity).
generate a second knowledge graph for the second entity (Michalak column 10, lines 47-57 – A knowledge graph from both structured and unstructured data as well as a knowledge graph which can be viewed as two sub-graphs which are separate), 
Michalak doesn’t go into detail about the unstructured data being from an external source, however, Bonin teaches:
that is linked by corresponding first edges to first external nodes corresponding to the first set of characteristics (Bonin, [0083] – The text data may be annotated to include identified or detected mentions and/or provide references to the entities in an external resource); 
It would have been obvious for one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify Michalak’s invention in view of Bonin in order to include external sources of unstructured data; this is a simple substitution in how sources are compiled, from internal to external (Bonin, paragraph [0083]).
Michalak doesn’t go into detail about storing probabilities or scores inside the nodes to determine how close two nodes are in similarity, however, Roque teaches:
store, in an edge, of the first edges between the first node and the second node, the first probability that the first node and the second node correspond to a same entity (Roque, [0192] – The edges of said semantic graph may include the amount or type of evidence in said document of dependencies or relations that support said edges as a weight parameter, and said distance may be calculated using a variation of a Dijkstra algorithm, where said distance is inversely correlated with the link weight or amount of evidence, such that connections between said nodes that are supported by a large amount of evidence have a lower calculated distance);
utilizing the first probability, that is stored in the edge between the first node and the second node, to determine whether the first node and/or the second node corresponds to another entity identified in the unstructured data (Roque, [0102] – optionally also store longer paths (indirect connections) between nodes if the path has a lot of evidential support, wherein indirect connections are interpreted as determining correspondence with other entities based on an edge connecting two nodes that aren’t directly related to the other node);
It would have been obvious for one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify Michalak’s invention in view of Roque in order to store, in an edge, a probability that two entities are related; this is a simple substitution from storing the similarity score in a general computer component to storing the score directly between the nodes in the edge.  This is also obvious because distance between items in a knowledge graph is a known technique which yields predictable results such as showing a user which items are most related to each other (Roque, paragraph [0192]).
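By way of illustration only, the distance computation described in the cited Roque passage (paragraph [0192]) can be sketched in Python as follows.  This is a minimal sketch under stated assumptions and is not Roque's implementation: the function name evidence_distance, the sample graph, and the use of 1/weight as the inverse relation are hypothetical.  Roque states only that the distance is inversely correlated with the link weight or amount of evidence.

```python
import heapq


def evidence_distance(graph, source):
    """Dijkstra variant in which each edge costs 1 / evidence weight.

    `graph` maps a node to {neighbor: evidence_weight}. The 1/weight cost is
    an assumed way to make distance inversely correlated with evidence.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for neighbor, weight in graph.get(node, {}).items():
            if weight <= 0:
                continue  # no supporting evidence, so treat as no edge
            candidate = d + 1.0 / weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Hypothetical semantic graph; the edge weights stand in for amounts of evidence.
graph = {
    "A": {"B": 4.0, "C": 1.0},
    "B": {"A": 4.0, "C": 2.0},
    "C": {"A": 1.0, "B": 2.0},
}
distances = evidence_distance(graph, "A")
```

In this hypothetical graph the indirect path A-B-C (distance 0.25 + 0.5 = 0.75) is shorter than the weakly supported direct edge A-C (distance 1.0), which mirrors the cited statement that connections supported by a large amount of evidence have a lower calculated distance.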

Claims 9-14 are directed to a device performing steps recited in claims 1-8 with substantially the same limitations.  Therefore, the rejections made to claims 1-8 are applied to claims 9-14.

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Michalak in view of Roque and further in view of Cohen et al. US 20140067656 A1 (hereinafter referred to as “Cohen”).

As per claim 19, Michalak doesn’t go into detail about a web crawler; however, Cohen teaches:
The non-transitory computer-readable medium of claim 15, wherein the unstructured data is obtained via a web crawler that obtains the unstructured data from one or more online platforms (Cohen, [0033] – A data crawler may be used to obtain information from various sources).
It would have been obvious for one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify Michalak’s invention as modified in view of Cohen in order to use a web crawler; this is a known function which has been used to improve similar devices such as data gathering devices.  This would have also yielded predictable results of collecting information in a fast and efficient manner (Cohen, paragraph [0033]).

Response to Arguments
Applicant’s arguments filed 3/22/2022 have been fully considered but they are not persuasive.  Applicant’s arguments against the 101 rejection begin on page 16 of Remarks.  Each specific argument is addressed below.

Argument:  Applicant argues in Remarks on pages 17-18 that limitations recited in the claims do not describe a mental process.  Reasoning for this is that claims do not recite a mental process when they don’t contain limitations that can practically be performed in the human mind.  Claims recite generating an entity resolution model which includes item-based collaborative filtering that identifies a relationship between entities based on characteristics, or a singular value decomposition model that identifies a relationship between entities based on characteristics.  These are specific models rather than a general computing component.
In Response:  As pointed out in the rejection, other than general use of computer components, the limitations meet the definition of a mental activity according to MPEP 2106.04.  Models, even when they are recited in a specific manner or detail, are abstract ideas because they are a set of rules which compare two different entities.  This is often done in the mind when comparing two different items.  Therefore, the use of models recited in the limitations is itself an abstract idea that corresponds to the mental processes of at least evaluation and judgment.

Argument:  Applicant argues in Remarks on page 18 that limitations recited in the claims provide significantly more even if the claims are directed to an abstract idea because the claims integrate the alleged abstract idea into a practical application.  Reasoning for this is that the claims improve the functionality of fraud detection, data mining, and/or e-discovery.  Paragraphs [0011] and [0049] of the specification give details for possible improvements.
In Response:  These are general improvements, but the argument does no more than generally link the use of the judicial exception, which is the mental process of evaluation and judgment, to a particular technological environment, and it isn’t clear to which specific computer field these steps are directed.  These improvements also aren’t claimed in a specific way because they are mentioned as general use applications in the specification.  The mental process of comparing two items can be used for multiple reasons which have practical applications, but without applying the comparison with a specific output in the claims, this isn’t sufficient for providing a basis that the claims integrate the judicial exception into a practical application.
There’s no output step in the independent claims for displaying data to the user in such a way that the user can utilize the analyzed data and the probabilities attached to it.  Claim 6 clarifies that the data is sent to a user device “to permit the user device to display the information…”, but this is general storage with a possibility of a user device to display the information.  Therefore, the claims do not recite any specific way in which the user device displays the data (e.g. displaying, on a user interface, the data in a ranked list by order of higher-to-lower probability).

Applicant’s arguments with respect to the 103 rejections on claims have been considered but are generally moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Olmstead et al. US 20190057310 A1 teaches an entity recognition unit as well as a relationship graph model which work together to form relationships with nodes between entities and topics in at least paragraph [0049].
Subramanian et al. US 20190354887 A1 teaches knowledge graph based learning content generation (title).
Chai et al. US 20160063106 A1 teaches “a search system identifies a collection of entities associated with a search query. The search system identifies entities related to those entities, and determines the relationships between them.” (abstract).

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Matthew Ellis whose telephone number is (571)270-3443.  The examiner can normally be reached on Monday-Friday 8AM-5PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Neveen Abel-Jalil, can be reached on (571)270-0474.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

June 13, 2022
/MATTHEW J ELLIS/Primary Examiner, Art Unit 2152