Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claim 1 is objected to because of the following informalities: the limitation “identifying one or more embeddings for the subset of entities” should be clarified to read “identifying one or more embeddings for the first subset of entities” as at this point in the claim the second subset of entities has no embeddings.  Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 2-3 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. The limitation “the first subset of the set of entities includes entities that are both in the set and outside the strict subset” is unclear as an earlier limitation in claim 2 states that “identifying the first subset of the set of entities
Claim 3 is dependent on claim 2 and is therefore objected to for failing to overcome the deficiencies of claim 2 on which it depends. Claim 13 is a non-transitory computer-readable medium claim and its limitation is included in claim 2. Claim 13 is rejected for the same reasons as claim 2. Claim 14 is dependent on claim 13 and is therefore objected to for failing to overcome the deficiencies of claim 13 on which it depends.	
Claims 9-10 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. It is unclear how “identifying a third subset of the second set of entities and a fourth subset of the second set of entities; for each entity in the third subset of the second set of entities, using the one or more machine learning techniques to generate a second machine-learned embedding for said each entity” results in new machine-learned embeddings for entities in the first subset since the second plurality of nodes is not clearly stated to include the first subset. The claim only states that the “second plurality of nodes representing a second set of entities that includes at least some of entities in the set of
Claim 10 is dependent on claim 9 and is therefore objected to for failing to overcome the deficiencies of claim 9 on which it depends. Claim 18 is a non-transitory computer-readable medium claim and its limitation is included in claim 9. Claim 18 is rejected for the same reasons as claim 9. Claim 19 is dependent on claim 18 and is therefore objected to for failing to overcome the deficiencies of claim 18 on which it depends.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101. Claims 1-11 are directed to a method and claims 12-20 are directed to a non-transitory computer-readable medium; therefore, claims 1-20 fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter). However, claims 1-20 fall within the judicial exception of an abstract idea, specifically the abstract ideas of “Mental Processes” (including observation, evaluation, 
	Claim 1:
Step 1: Claim 1 is directed to a method; therefore the claim does fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2A, Prong 1: Claim 1 recites the following abstract ideas:
identifying a first subset of the set of entities and a second subset of the set of entities (mental step directed to observation – a person could identify a first and second subset of entities in their mind); 
for each entity in the second subset of the set of entities: identifying a subset of entities in the first subset that are associated with said each entity (mental step directed to observation – a person could identify a subset of entities that are associated with a given entity in their mind); 
identifying one or more embeddings for the subset of entities (mental step directed to observation – a person could identify one or more embeddings for a subset of entities in their mind); 
based on the one or more embeddings, generating an inferred embedding for said each entity (given that the broadest reasonable interpretation of generating an inferred embedding includes a pooling operation, this limitation is interpreted as a mathematical calculation).
Step 2A, Prong 2: Claim 1 recites the following additional elements:
storing a graph that comprises a plurality of nodes representing a set of entities (storing a graph is interpreted as storing and receiving data);

Step 2B, Prong 2: Claim 1 recites the following additional elements:
storing a graph that comprises a plurality of nodes representing a set of entities (storing a graph is interpreted as storing and receiving data);
for each entity in the first subset of the set of entities, using one or more machine learning techniques to generate a machine-learned embedding for said each entity (using 
Claim 12 is a non-transitory computer-readable medium claim and its limitation is included in claim 1. The only difference is that claim 12 requires a non-transitory computer-readable medium. Therefore, claim 12 is rejected for the same reasons as claim 1.
The independent claims are not patent eligible.
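For illustration only, the inferred-embedding step that the claim 1 analysis treats as a pooling operation can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: mean pooling is assumed as the pooling operation, and the function name `infer_embeddings`, the adjacency dictionary, and the sample entities are all hypothetical.

```python
from typing import Dict, List, Set


def infer_embeddings(
    neighbors: Dict[str, Set[str]],
    learned: Dict[str, List[float]],
    second_subset: Set[str],
) -> Dict[str, List[float]]:
    """Mean-pool the machine-learned embeddings of each entity's neighbors.

    `learned` holds embeddings for the first subset; entities in
    `second_subset` receive an inferred embedding instead (assumption:
    mean pooling, one of several possible pooling operations).
    """
    inferred: Dict[str, List[float]] = {}
    for entity in sorted(second_subset):
        # Associated entities that belong to the first subset.
        related = [n for n in sorted(neighbors.get(entity, set())) if n in learned]
        if not related:
            continue  # nothing to pool from, so no inferred embedding
        dim = len(learned[related[0]])
        inferred[entity] = [
            sum(learned[n][d] for n in related) / len(related) for d in range(dim)
        ]
    return inferred


# Hypothetical graph: one rare entity linked to two well-represented entities.
graph = {"rare_skill": {"alice", "bob"}, "orphan": set()}
learned = {"alice": [1.0, 0.0], "bob": [3.0, 2.0]}
result = infer_embeddings(graph, learned, {"rare_skill", "orphan"})
```

In this sketch an entity with no associated first-subset entities simply receives no inferred embedding; the claim language quoted above does not address that case.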
Dependent claims 2-11 and 13-20 when analyzed as a whole are held to be patent ineligible under 35 U.S.C. 101 because the additional recited limitations fail to establish that the 
	Claim 2:
Step 1: Claim 2 is directed to a method; therefore the claim does fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2A, Prong 1: Claim 2 recites the following abstract ideas:
identifying the first subset of the set of entities comprises selecting a strict subset of the set of entities based on each entity in the strict subset being an entity of the first entity type (mental step directed to evaluation – a person could select a strict subset of entities based on identifying the type of each entity in their mind).
Step 2A, Prong 2: Claim 2 recites the following additional elements:
the set of entities includes a first entity of a first entity type; the set of entities includes a second entity of a second entity type that is different than the first entity type; the second subset of the set of entities includes the strict subset of the set of entities; the first subset of the set of entities includes at least some entities that are both in the set of entities and outside the strict subset. These are interpreted as insignificant extra-solution activity, which does not integrate the abstract idea into a practical application.
Step 2B, Prong 2: Claim 2 recites the following additional elements:
the set of entities includes a first entity of a first entity type; the set of entities includes a second entity of a second entity type that is different than the first entity type; the second subset of the set of entities includes the strict subset of the set of entities; the first subset of the set of entities includes at least some entities that are both in the set of entities and outside 
	Claim 3:
Step 1: Claim 3 is directed to a method; therefore the claim does fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2A, Prong 1: Claim 3 recites the abstract ideas from claim 2 on which it depends.
Step 2A, Prong 2: Claim 3 recites the following additional elements:
the first entity type is a non-attribute entity type. These are interpreted as insignificant extra-solution activity, which does not integrate the abstract idea into a practical application.
Step 2B, Prong 2: Claim 3 recites the following additional elements:
the first entity type is a non-attribute entity type. These are interpreted as insignificant extra-solution activity, which does not amount to significantly more (see MPEP 2106.05(g)).
	Claim 4:
Step 1: Claim 4 is directed to a method; therefore the claim does fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2A, Prong 1: Claim 4 recites the following abstract ideas:
determining a frequency of each entity in a subset of the set of entities (mental step directed to evaluation – a person could determine a frequency of each entity in a subset in their mind);
wherein identifying the first subset of the set of entities comprises selecting a strict subset of the set of entities based on the frequency of each entity in the subset of the set of entities; wherein the first subset of the set of entities includes the strict subset of the set of 
Step 2A, Prong 2: Claim 4 does not recite any additional elements and therefore does not integrate the abstract idea into a practical application.
Step 2B, Prong 2: Claim 4 does not recite any additional elements and therefore does not amount to significantly more.
	Claim 5:
Step 1: Claim 5 is directed to a method; therefore the claim does fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2A, Prong 1: Claim 5 recites the following abstract ideas:
for each attribute entity in the second subset of the set of entities: identifying a plurality of non-attribute entities that are associated with said each attribute entity (mental step directed to observation, evaluation – a person could identify a plurality of non-attribute entities that are associated with an attribute entity in their mind); 
identifying a plurality of attribute entities that are associated with the plurality of non-attribute entities (mental step directed to observation, evaluation – a person could identify a plurality of attribute entities that are associated with a plurality of non-attribute entities in their mind); 
identifying a plurality of machine-learned embeddings, each of which is associated with at least one attribute entity in the plurality of attribute entities (mental step directed to 
generating a particular inferred embedding based on the plurality of machine-learned embeddings (given that the broadest reasonable interpretation of generating an inferred embedding includes a pooling operation, this limitation is interpreted as a mathematical calculation).
Step 2A, Prong 2: Claim 5 does not recite any additional elements and therefore does not integrate the abstract idea into a practical application.
Step 2B, Prong 2: Claim 5 does not recite any additional elements and therefore does not amount to significantly more.
	Claim 6:
Step 1: Claim 6 is directed to a method; therefore the claim does fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2A, Prong 1: Claim 6 recites the following abstract ideas:
for each entity in the subset of the set of entities: determining a frequency of said each entity (mental step directed to observation, evaluation – a person could determine the frequency of each entity in their mind); 
determining, from among the plurality of entity types, a particular entity type of said each entity (mental step directed to evaluation – a person could determine a particular type of a specific entity in their mind); 
selecting, from among the plurality of frequency thresholds, a particular frequency threshold that corresponds to the particular entity type (mental step directed to evaluation – a 
if the frequency of said each entity is higher than the particular frequency threshold, then assigning said each entity to the first subset of the set of entities (mental step directed to evaluation – a person could compare the frequency of an entity to a threshold and assign it to a subset based on the comparison result in their mind);   
if the frequency of said each entity is lower than the particular frequency threshold, then assigning said each entity to the second subset of the set of entities (mental step directed to evaluation – a person could compare the frequency of an entity to a threshold and assign it to a subset based on the comparison result in their mind).
Step 2A, Prong 2: Claim 6 recites the following additional elements:
storing a plurality of frequency thresholds, each associated with a different entity type of a plurality of entity types. This is interpreted as sending and receiving data, which does not integrate the abstract idea into a practical application.
Step 2B, Prong 2: Claim 6 recites the following additional elements:
storing a plurality of frequency thresholds, each associated with a different entity type of a plurality of entity types. This is interpreted as sending and receiving data, which does not amount to significantly more (see MPEP 2106.05(d)(II)).
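As an illustration of the claim 6 limitations discussed above, the per-type threshold comparison can be sketched as below. This is a hedged sketch: the function name `split_by_frequency`, the data layout, and the sample values are hypothetical, and the handling of a frequency exactly equal to its threshold is an assumption because the quoted limitations only address "higher than" and "lower than".

```python
from typing import Dict, Set, Tuple


def split_by_frequency(
    entities: Dict[str, Tuple[str, int]],
    thresholds: Dict[str, int],
) -> Tuple[Set[str], Set[str]]:
    """Assign each entity to the first or second subset.

    `entities` maps an entity name to (entity type, frequency);
    `thresholds` maps an entity type to its frequency threshold.
    """
    first: Set[str] = set()
    second: Set[str] = set()
    for name, (entity_type, frequency) in entities.items():
        threshold = thresholds[entity_type]  # threshold for this entity's type
        if frequency > threshold:
            first.add(name)   # frequent: gets a machine-learned embedding
        elif frequency < threshold:
            second.add(name)  # infrequent: gets an inferred embedding
        # Assumption: a frequency equal to the threshold is left unassigned.
    return first, second


# Hypothetical thresholds and entities.
thresholds = {"skill": 100, "company": 10}
entities = {
    "python": ("skill", 5000),
    "obscure_skill": ("skill", 3),
    "acme": ("company", 10),
}
first_subset, second_subset = split_by_frequency(entities, thresholds)
```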
	Claim 7:
Step 1: Claim 7 is directed to a method; therefore the claim does fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2A, Prong 1: Claim 7 recites the following abstract ideas:

Step 2A, Prong 2: Claim 7 recites the following additional elements:
the one or more embeddings are a plurality of embeddings. These are interpreted as insignificant extra-solution activity, which does not integrate the abstract idea into a practical application.
Step 2B, Prong 2: Claim 7 recites the following additional elements:
the one or more embeddings are a plurality of embeddings. These are interpreted as insignificant extra-solution activity, which does not amount to significantly more (see MPEP 2106.05(g)).
	Claim 8:
Step 1: Claim 8 is directed to a method; therefore the claim does fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2A, Prong 1: Claim 8 recites the following abstract ideas:
after generating the first plurality of inferred embeddings, generating a second plurality of inferred embeddings, one for each entity in the second subset (given that the broadest reasonable interpretation of generating an inferred embedding includes a pooling operation, this limitation is interpreted as a mathematical calculation); 
based on a comparison between the first plurality of inferred embeddings and the second plurality of inferred embeddings, determining an amount of difference between the first plurality of embeddings and the second plurality of embeddings (mental step directed to 
Step 2A, Prong 2: Claim 8 does not recite any additional elements and therefore does not integrate the abstract idea into a practical application.
Step 2B, Prong 2: Claim 8 does not recite any additional elements and therefore does not amount to significantly more.
	Claim 9:
Step 1: Claim 9 is directed to a method; therefore the claim does fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2A, Prong 1: Claim 9 recites the following abstract ideas:
determining whether to generate new machine-learned embeddings for at least some of the entities in the first subset of the set of entities (mental step directed to evaluation – a person could determine whether to generate new machine-learned embeddings for some of the entities in the subset in their mind); 
in response to determining to generate new machine-learned embeddings for at least some of the entities in the first subset of the set of entities: identifying a third subset of the second set of entities and a fourth subset of the second set of entities (mental step directed to observation – a person could identify a third and fourth subset of entities in their mind); 
for each entity in the fourth subset of the second set of entities: identifying a particular subset of entities in the third subset that are associated with said each entity (mental step directed to observation – a person could identify a particular subset of entities that are associated with each entity in their mind); 

based on the one or more second embeddings, generating a second inferred embedding for said each entity (given that the broadest reasonable interpretation of generating an inferred embedding includes a pooling operation, this limitation is interpreted as a mathematical calculation).
Step 2A, Prong 2: Claim 9 recites the following additional elements:
after generating the inferred embedding for each entity in the second subset, updating the graph to include a second plurality of nodes representing a second set of entities that includes at least some of entities in the set of entities, wherein the second plurality of nodes is different than the first plurality of nodes (adding a new set of nodes is interpreted as receiving and transmitting data); and
for each entity in the third subset of the second set of entities, using the one or more machine learning techniques to generate a second machine-learned embedding for said each entity (using machine learning techniques to generate embeddings for entities in a knowledge graph is considered to be well-understood, routine, conventional activity as taught by US 20200349919 A1 (Wanas et al). Wanas paragraph [0084] recites “feature extraction operation can extract features in a variety of ways. In one embodiment, stop words (e.g., common words that convey little or no meaning like articles, “a”, “an”, “the” etc.) can be removed and the remaining words can be assembled into a training vector. In other embodiments, other mechanisms can be used to extract features. For example, words can be stemmed, n-grams can 
Step 2B, Prong 2: Claim 9 recites the following additional elements:
after generating the inferred embedding for each entity in the second subset, updating the graph to include a second plurality of nodes representing a second set of entities that includes at least some of entities in the set of entities, wherein the second plurality of nodes is different than the first plurality of nodes (adding a new set of nodes is interpreted as receiving and transmitting data); and
for each entity in the third subset of the second set of entities, using the one or more machine learning techniques to generate a second machine-learned embedding for said each entity (using machine learning techniques to generate embeddings for entities in a knowledge graph is considered to be well-understood, routine, conventional activity as taught by US 20200349919 A1 (Wanas et al). Wanas paragraph [0084] recites “feature extraction operation can extract features in a variety of ways. In one embodiment, stop words (e.g., common words that convey little or no meaning like articles, “a”, “an”, “the” etc.) can be removed and the remaining words can be assembled into a training vector. In other embodiments, other 
	Claim 10:
Step 1: Claim 10 is directed to a method; therefore the claim does fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2A, Prong 1: Claim 10 recites the following abstract ideas:
determining whether to generate new machine-learned embeddings is performed based on one or more criteria that comprises a lapse of a particular amount of time since the inferred embedding was generated or an amount of change between different sets of inferred embeddings (mental step directed to evaluation – a person could determine whether to generate new embeddings in their mind based on criteria that include a particular amount of time or an amount of change between sets of embeddings).
Step 2A, Prong 2: Claim 10 does not recite any additional elements and therefore does not integrate the abstract idea into a practical application.
Step 2B, Prong 2: Claim 10 does not recite any additional elements and therefore does not amount to significantly more.

	Claim 11:
Step 1: Claim 11 is directed to a method; therefore the claim does fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2A, Prong 1: Claim 11 recites the abstract ideas from claim 1 on which it depends.
Step 2A, Prong 2: Claim 11 recites the following additional elements:
for each entity in the particular set of entities, storing a plurality of embeddings in association with said each entity, wherein the plurality of embeddings includes (1) a second plurality of embeddings, each corresponding to a different attribute of said each entity, and (2) a particular inferred embedding that is an aggregation of the second plurality of embeddings. These are interpreted as transmitting and receiving data, which does not integrate the abstract idea into a practical application.
Step 2B, Prong 2: Claim 11 recites the following additional elements:
for each entity in the particular set of entities, storing a plurality of embeddings in association with said each entity, wherein the plurality of embeddings includes (1) a second plurality of embeddings, each corresponding to a different attribute of said each entity, and (2) a particular inferred embedding that is an aggregation of the second plurality of embeddings. These are interpreted as transmitting and receiving data, which does not amount to significantly more (see MPEP 2106.05(d)(II)).
Claim 13 is a non-transitory computer-readable medium claim and its limitation is included in claim 2. Claim 13 is rejected for the same reasons as claim 2.

Claim 15 is a non-transitory computer-readable medium claim and its limitation is included in claim 5. Claim 15 is rejected for the same reasons as claim 5.
Claim 16 is a non-transitory computer-readable medium claim and its limitation is included in claim 6. Claim 16 is rejected for the same reasons as claim 6.
Claim 17 is a non-transitory computer-readable medium claim and its limitation is included in claim 8. Claim 17 is rejected for the same reasons as claim 8.
Claim 18 is a non-transitory computer-readable medium claim and its limitation is included in claim 9. Claim 18 is rejected for the same reasons as claim 9.
Claim 19 is a non-transitory computer-readable medium claim and its limitation is included in claim 10. Claim 19 is rejected for the same reasons as claim 10.
Claim 20 is a non-transitory computer-readable medium claim and its limitation is included in claim 11. Claim 20 is rejected for the same reasons as claim 11.
Viewed as a whole, these additional claim elements do not provide meaningful limitations to transform the abstract idea into a patent eligible application of the abstract idea such that the claims amount to significantly more than the abstract idea itself. Therefore, the claims are rejected under 35 U.S.C. 101 as being directed to non-statutory subject matter.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:


(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-3, 7, 11, 13, 18, and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Shi (“Improving Knowledge Graph Quality with Network Representation Learning”, herein Shi).
Regarding claim 1, Shi teaches a method comprising: 
storing a graph that comprises a plurality of nodes representing a set of entities (pg. 34, para. 3 (from section 4.1) recites “in the present work we borrow the idea of open-world assumption from probabilistic database literature and relax the closed-world assumption to develop an Open-World Knowledge Graph Completion model (i.e. a knowledge graph completion model would store a graph comprised of a plurality of nodes that represent a plurality of entities) capable of predicting relationships involving unseen entities or those entities that have only a few connections”); 
identifying a first subset of the set of entities and a second subset of the set of entities (pg. 21 para. 3-4 (from section 3.2.3) recites “Although ProjE limits the number of additional parameters, the projection operation may be costly due to the large number of candidate-entities (i.e., the number of rows in Wc). If we reduce the number of candidate-entities in the training phrase, we could create a smaller working set that only contains a subset of the embedding matrix WE. With this in mind, we use candidate sampling to reduce the number of candidate-entities. Candidate sampling is not a new problem; many recent works have addressed this problem in interesting ways. We experimented with many choices, and found that the negative sampling used in Word2Vec resulted the best performance. For a given entity e, relationship r, and a binary label vector y, we compute the projection with all of the positive candidates and only a sampled subset of negative candidates from Py following the convention of Word2Vec” (i.e. the positive candidates are the first subset and the negative candidates are the second subset)); 
for each entity in the first subset of the set of entities, using one or more machine learning techniques to generate a machine-learned embedding for said each entity (pg. 21 para. 4 (from section 3.2.3) recites for a given entity e, relationship r, and a binary label vector y, we compute the projection with all of the positive candidates and only a sampled subset of negative candidates from Py following the convention of Word2Vec. For simplicity, Py can be replaced by a (0; 1) binomial distribution B(1; Py) shared by all training instances, where Py is the probability that a negative candidate is sampled and 1 – Py is the probability that a negative candidate is not sampled (i.e. generating a machine-learned embedding for each entity in the first subset)); 
for each entity in the second subset of the set of entities: identifying a subset of entities in the first subset that are associated with said each entity (pg. 22 para. 1 (from section 3.2.3) recites for every negative candidate in y we sample a value from B(1; Py) to determine whether we include this candidate in the candidate-entity matrix Wc or not (i.e. identifying an entity in the second subset that is associated with one or more entities in the first subset)); 
identifying one or more embeddings for the subset of entities; based on the one or more embeddings, generating an inferred embedding for said each entity (pg. 85 para. 2 (from section 6.2.2) recites the proposed ConMask model is able to add unobserved entities to existing knowledge graphs, but it still assumed a fixed schema and can therefore only complete unobserved relationships that are similar to relationships presented during training. For example, if ConMask is trained with the observed relationship type SpouseOf, then it might be able to infer the unseen relationship MarriedTo because the pre-trained word embedding of spouse and married have some degree of similarity (i.e. identifying an existing embedding and generating an inferred embedding based on the existing embedding)); 
wherein the method is performed by one or more computing devices (pg. 48 para. 3 (from section 4.4.1) recites that ConMask is implemented in TensorFlow (i.e. performed by a computing device, as TensorFlow is well known in the art)).
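For reference, the candidate sampling described in the Shi passages cited for claim 1 (all positive candidates kept, each negative candidate kept according to a draw from B(1; Py)) can be sketched as follows. This is an illustrative sketch, not code from Shi: the function name `sample_candidates`, the seeded generator, and the sample entity names are hypothetical.

```python
import random
from typing import List, Sequence


def sample_candidates(
    positives: Sequence[str],
    negatives: Sequence[str],
    p_y: float,
    rng: random.Random,
) -> List[str]:
    """Keep every positive candidate and a Bernoulli(p_y) sample of negatives."""
    # One independent draw per negative candidate decides whether it is kept.
    kept_negatives = [n for n in negatives if rng.random() < p_y]
    return list(positives) + kept_negatives


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
positives = ["e1", "e2"]
negatives = ["e3", "e4", "e5", "e6"]
working_set = sample_candidates(positives, negatives, 0.25, rng)
none_kept = sample_candidates(positives, negatives, 0.0, rng)
all_kept = sample_candidates(positives, negatives, 1.0, rng)
```

The working set always contains the positive candidates; only the number of negative candidates varies with the sampling probability.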
	
Regarding claim 2, Shi teaches the method according to claim 1, wherein the set of entities includes a first entity of a first entity type; the set of entities includes a second entity of a second entity type that is different than the first entity type; identifying the first subset of the set of entities comprises selecting a strict subset of the set of entities based on each entity in the strict subset being an entity of the first entity type; the second subset of the set of entities includes the strict subset of the set of entities; the first subset of the set of entities includes at least some entities that are both in the set of entities and outside the strict subset (pg. 57 para. 1 (from section 5.1) recites the present work focuses on the design of a scalable HIN (i.e. heterogeneous information network) representation learning model called Star2Vec that requires little human supervision. We evaluate Star2Vec on well-known social networks, LinkedIn and Facebook, which can be viewed as HINs containing dozens of node types such as person, school, company, job title, and so on (i.e. entity subsets where entities of a second type will be different than entities of a first type). Oftentimes, HINs are dominated by a single node type wherein non-dominating node types serve to describe alternate means of connecting the primary type. Because of this interesting topology, many previous HIN data mining methods focus on mining star-structures in the network, where the dominating type (e.g., paper in DBLP, person in Facebook) anchors the star structure and other node types surround it (i.e. the dominating type is considered the first subset and the strict subset (see 112(b) rejection for interpretation of the relationship between the first entity subset and the strict subset), and the non-dominating type is considered the second entity subset)).
Regarding claim 3, Shi teaches the method according to claim 2, wherein the first entity type is a non-attribute entity type (pg. 59 para. 3-4 (from section 5.2) recites we define an HIN as a network where nodes are labeled with a type and edges are labeled with a type consistent with the connecting nodes. More than one edge type can exist between two node types. For example, an edge between person and company may be labeled as alumnus or works-at or any number of other relationship types (i.e. the first entity is a non-attribute entity type)).
Regarding claim 7, Shi teaches the method according to claim 1, wherein the one or more embeddings are a plurality of embeddings; generating the inferred embedding comprises performing a pooling operation on the plurality of embeddings (fig. 4.2 and 4.3 and pg. 45 para. 4 (from section 4.3.2) recites figure 4.3 shows the overall architecture of the target fusion process and its dependent content masking process. The target fusion process has three FCN layers. In each layer, we first use two 1-D convolution operators to perform affine transformation, then we apply sigmoid as the activation function to the convoluted output followed by batch normalization and max-pooling. The last FCN layer uses mean-pooling instead of max-pooling to ensure the output of the target fusion layer always return a single k-dimensional embedding (i.e. performing a pooling operation on a plurality of embeddings)).
Regarding claim 11, Shi teaches the method according to claim 1, wherein the second subset of the set of entities includes a particular set of entities of a non-attribute type: for each entity in the particular set of entities, storing a plurality of embeddings in association with said each entity, wherein the plurality of embeddings includes (1) a second plurality of embeddings, each corresponding to a different attribute of said each entity, and (2) a particular inferred embedding that is an aggregation of the second plurality of embeddings (fig. 4.2 and 4.3 and pg. 45 para. 4 (from section 4.3.2) recites figure 4.3 shows the overall architecture of the target fusion process and its dependent content masking process. The target fusion process has three FCN layers. In each layer, we first use two 1-D convolution operators to perform affine transformation, then we apply sigmoid as the activation function to the convoluted output followed by batch normalization and max-pooling. The last FCN layer uses mean-pooling instead of max-pooling to ensure the output of the target fusion layer always return a single k-dimensional embedding (i.e. performing a pooling operation on a plurality of embeddings). Pg. 46 para. 2 (from section 4.3.3.) recites although it is possible to use target fusion to generate all entity embeddings used in ConMask, such a process would result in a large number of parameters. Furthermore, because the target fusion function is an extraction function it would be odd to apply it to entity names where no extraction is necessary. So, we also employ a simple semantic averaging function $\eta(W) = \frac{1}{k_l} \sum_{i}^{k_l} W_{[i,:]}$ that combines word embeddings to represent entity names and for generating background representations of other textual features, where $W \in \mathbb{R}^{k_l \times k}$ is the input embedding matrix from the entity description $\phi(\cdot)$ or the entity or relationship name $\psi(\cdot)$ (i.e. a plurality of embeddings associated with the attributes of the first entity and an inferred embedding that is an aggregation of the plurality of embeddings)).
Claim 13 is a non-transitory computer-readable medium claim and its limitation is included in claim 2. Claim 13 is rejected for the same reasons as claim 2.
Claim 18 is a non-transitory computer-readable medium claim and its limitation is included in claim 9. Claim 18 is rejected for the same reasons as claim 9.
Claim 20 is a non-transitory computer-readable medium claim and its limitation is included in claim 11. Claim 20 is rejected for the same reasons as claim 11.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 4-6, 12, and 14-16 are rejected under 35 U.S.C. 103 as being unpatentable over Shi (“Improving Knowledge Graph Quality with Network Representation Learning”, herein Shi) in view of Parkala Srinivas (US 20210034591 A1, herein Parkala Srinivas). 
Regarding claim 4, Shi teaches the method according to claim 1, further comprising: 
wherein the first subset of the set of entities includes the strict subset of the set of entities; wherein the second subset of the set of entities includes at least some entities that are both in the set of entities and outside the strict subset (pg. 57 para. 1 (from section 5.1) recites the present work focuses on the design of a scalable HIN (i.e. heterogeneous information network) representation learning model called Star2Vec that requires little human supervision. We evaluate Star2Vec on well-known social networks, LinkedIn and Facebook, which can be viewed as HINs containing dozens of node types such as person, school, company, job title, and so on (i.e. entity subsets where entities of a second type will be different than entities of a first type). Oftentimes, HINs are dominated by a single node type wherein non-dominating node types serve to describe alternate means of connecting the primary type. Because of this interesting topology, many previous HIN data mining methods focus on mining star-structures in the network, where the dominating type (e.g., paper in DBLP, person in Facebook) anchors the star structure and other node types surround it (i.e. the dominating type is considered the first subset and the strict subset (see 112(b) rejection for interpretation of the relationship between the first entity subset and the strict subset), and the non-dominating type is considered the second entity subset)).
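However, Shi does not explicitly teach determining a frequency of each entity in a subset of the set of entities; wherein identifying the first subset of the set of entities comprises selecting a strict subset of the set of entities based on the frequency of each entity in the subset of the set of entities.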
Parkala Srinivas teaches determining a frequency of each entity in a subset of the set of entities; wherein identifying the first subset of the set of entities comprises selecting a strict subset of the set of entities based on the frequency of each entity in the subset of the set of entities (para. [0015] recites after each incoming record has been matched to candidate records, all indexing, matching and payload attributes of the incoming and matched records are evaluated. In an embodiment, only the attributes of matched records exceeding a defined threshold of matching are evaluated. For each evaluated attribute, the frequency of occurrence (f1) among the selected candidates is determined. Frequency of occurrence is defined as the total number of times the attribute is present in the set of candidates divided by the total number of candidates considered (i.e. determining the frequency of each entity in the subset and selecting a strict subset based on the frequency of the entities)).
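As a minimal sketch of the frequency of occurrence defined in the cited paragraph (the number of candidates in which an attribute is present, divided by the total number of candidates considered), assuming each candidate record is represented as a set of attribute names. The function name `frequency_of_occurrence` and the sample records are hypothetical and are not drawn from Parkala Srinivas.

```python
def frequency_of_occurrence(attribute, candidates):
    """Fraction of candidate records in which `attribute` is present.

    `candidates` is assumed to be a list of sets of attribute names.
    An empty candidate list yields 0.0 (an assumption; the reference
    does not address that case).
    """
    if not candidates:
        return 0.0
    present = sum(1 for record in candidates if attribute in record)
    return present / len(candidates)


# Hypothetical candidate records for illustration only.
candidates = [
    {"name", "phone"},
    {"name"},
    {"email"},
    {"name", "email"},
]
name_frequency = frequency_of_occurrence("name", candidates)
email_frequency = frequency_of_occurrence("email", candidates)
```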
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these teachings by using the methods of separating entities by frequency from Parkala Srinivas with the entity relationship prediction methods from Shi. Parkala Srinivas and Shi are both directed to predicting and completing entity relationships, and though Shi does recite that a frequency-based distribution may yield better results, Shi only uses a simple uniform distribution (see pg. 47 paragraph 1). One of ordinary skill would benefit from using the hashing and bucketing methods from Parkala Srinivas to divide up the entities from Shi based on how frequently the 
Regarding claim 5, the combination of Shi and Parkala Srinivas teaches the method according to claim 4, wherein the set of entities includes attribute entities and non-attribute entities, further comprising: 
for each attribute entity in the second subset of the set of entities: identifying a plurality of non-attribute entities that are associated with said each attribute entity; identifying a plurality of attribute entities that are associated with the plurality of non-attribute entities (Shi fig. 5.1, fig. 5.2, and pg. 63 para. 1 (from section 5.3.1.) recite here we borrow the concept of Approximate Functional Dependency (AFD) from the literature on probabilistic databases to describe those metapaths that (approximately) functionally determine another metapath. For example, in fig. 5.1 the edge Alice→Bob can be represented by metapath “person  co-worker  person”, which is approximately functionally determined by the metapath “person  works-at  company  works-at  person” because (in most cases) knowing that two people work at the same company means that they are co-workers. Hence the co-worker edge is an in-scope connection. Of course, the two persons may also be friends or spouses, but those connections may not be in-scope as indicated by paths in the HIN (i.e. identifying attribute entities that are associated with non-attribute entities that were associated with an original non-attribute entity)); 
identifying a plurality of machine-learned embeddings, each of which is associated with at least one attribute entity in the plurality of attribute entities; generating a particular inferred embedding based on the plurality of machine-learned embeddings (Shi pg. 64 para. 3 (from section 5.3.1.) recites rather than walking from $u_i$ to $u_{i+1}$, we introduce a type-aware transition function P that extends paths based on the target node type and S. This modification is important because it allows the walker to traverse both observed in-scope edges as well as unobserved edges between highly similar nodes (i.e. identifying machine-learned embeddings for the attribute entities). Pg. 85 para. 2 (from section 6.2.2) recites the proposed ConMask model is able to add unobserved entities to existing knowledge graphs, but it still assumed a fixed schema and can therefore only complete unobserved relationships that are similar to relationships presented during training. For example, if ConMask is trained with the observed relationship type SpouseOf, then it might be able to infer the unseen relationship MarriedTo because the pre-trained word embedding of spouse and married have some degree of similarity (i.e. identifying an existing embedding and generating an inferred embedding based on the existing embedding)).
Regarding claim 6, Shi teaches the method according to claim 1.
However, Shi does not teach storing a plurality of frequency thresholds, each associated with a different entity type of a plurality of entity types; for each entity in the subset of the set of entities: determining a frequency of said each entity; determining, from among the plurality of entity types, a particular entity type of said each entity; selecting, from among the plurality of frequency thresholds, a particular frequency threshold that corresponds to the particular entity type; if the frequency of said each entity is higher than the particular frequency threshold, then assigning said each entity to the first subset of the set of entities; if the frequency of said each entity is lower than the particular frequency threshold, then assigning said each entity to the second subset of the set of entities.
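Parkala Srinivas teaches storing a plurality of frequency thresholds, each associated with a different entity type of a plurality of entity types (Parkala Srinivas para. [0019] recites larger or smaller thresholds can be used resulting in the use of more and fewer resources respectively, for processing the records (i.e. a plurality of thresholds)); 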
for each entity in the subset of the set of entities: determining a frequency of said each entity (Parkala Srinivas para. [0015] recites after each incoming record has been matched to candidate records, all indexing, matching and payload attributes of the incoming and matched records are evaluated. In an embodiment, only the attributes of matched records exceeding a defined threshold of matching are evaluated. For each evaluated attribute, the frequency of occurrence (f1) among the selected candidates is determined. Frequency of occurrence is defined as the total number of times the attribute is present in the set of candidates divided by the total number of candidates considered (i.e. determining the frequency of each entity in the subset)); 
determining, from among the plurality of entity types, a particular entity type of said each entity (Parkala Srinivas fig. 3 and para. [0036] recite at 310 the method profiles the data from the received record to determine the type or domain of the fields for the record. The method reviews selected candidates regarding the respective fields of the record to determine the frequency with which the fields occur in the candidates at 320 and the frequency with which the fields match for matching records at 330 (i.e. determining the type of each entity)); 
selecting, from among the plurality of frequency thresholds, a particular frequency threshold that corresponds to the particular entity type (Parkala Srinivas para. [0033] recites for each entity attribute, the method determines a frequency of occurrence and frequency of matching. The method then uses the frequencies to weight the entity attributes according to each frequency and determine an overall weight for each entity attribute according to the combination of the two frequency weights. The analysis also includes a statistical analysis of the entity attributes to determine the significance of each attribute-how many records each attribute relates to, as well as the size of the index or bucket associated with each attribute, and the number of possible values each attribute has. The method does not consider, and anonymizes, attributes associated with too large a bucket-those associated with a large number of records (i.e., a number of records above a threshold level), as well as attributes having too few possible values (i.e., a number of values below a threshold level) when selecting candidates (i.e. selecting a frequency threshold for the entity type)); 
if the frequency of said each entity is higher than the particular frequency threshold, then assigning said each entity to the first subset of the set of entities; if the frequency of said each entity is lower than the particular frequency threshold, then assigning said each entity to the second subset of the set of entities (Parkala Srinivas fig. 3 and para. [0036] recite at 350, the method recommends new bucket roles based upon the calculated weights from 340. At 360, the method statistically analyzes the recommended attributes to determine the significance of the fields among the records of the data set. At 370 the method makes a recommendation based upon the combination of the originally recommended fields in view of the statistical analysis of the fields. The method uses this filtered recommendation of new bucket roles to revise the bucket role listing of element 306. In an embodiment, the method adds fields to the bucket roles of 306 and existing bucket roles of element 306 are removed (i.e. assigning entities to different subsets based on a frequency threshold)).
See claim 4 for motivation to combine.
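For reference, the limitations of claim 6 as recited above can be sketched as follows, assuming each entity is given as an (identifier, entity type, frequency) tuple and the stored thresholds are a mapping from entity type to threshold. The function name `split_by_type_threshold`, the sample data, and the handling of a frequency exactly equal to its threshold (left unassigned here, since the recited limitations address only the higher and lower cases) are assumptions for illustration and do not come from either reference.

```python
def split_by_type_threshold(entities, thresholds):
    """Assign entities to a first or second subset using per-type thresholds.

    entities   : iterable of (entity_id, entity_type, frequency) tuples
    thresholds : dict mapping entity_type -> frequency threshold
    Returns (first_subset, second_subset, unassigned).
    """
    first_subset, second_subset, unassigned = [], [], []
    for entity_id, entity_type, frequency in entities:
        # Select the threshold that corresponds to this entity's type.
        threshold = thresholds[entity_type]
        if frequency > threshold:
            first_subset.append(entity_id)
        elif frequency < threshold:
            second_subset.append(entity_id)
        else:
            # Assumption: the equal case is not addressed, so it is set aside.
            unassigned.append(entity_id)
    return first_subset, second_subset, unassigned


# Hypothetical entities and thresholds for illustration only.
thresholds = {"person": 5, "company": 4}
entities = [
    ("alice", "person", 12),
    ("acme", "company", 3),
    ("bob", "person", 2),
    ("initech", "company", 5),
]
first, second, tied = split_by_type_threshold(entities, thresholds)
```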
Claim 12 is a non-transitory computer-readable medium claim and its limitation is included in claim 1. The only difference is that claim 12 requires a non-transitory computer-readable medium (Parkala Srinivas para. [0062] recites the computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. Para. [0062] also recites a computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire (i.e. a storage medium containing instructions executed by one or more processors)). Therefore, claim 12 is rejected for the same reasons as claim 1.
Claim 14 is a non-transitory computer-readable medium claim and its limitation is included in claim 4. Claim 14 is rejected for the same reasons as claim 4.	
Claim 15 is a non-transitory computer-readable medium claim and its limitation is included in claim 5. Claim 15 is rejected for the same reasons as claim 5.
Claim 16 is a non-transitory computer-readable medium claim and its limitation is included in claim 6. Claim 16 is rejected for the same reasons as claim 6.

Claims 8-10, 17, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Shi (“Improving Knowledge Graph Quality with Network Representation Learning”, herein Shi) in view of Li et al (US 20200293902 A1, herein Li). 
Regarding claim 8, Shi teaches the method according to claim 1, wherein generating the inferred embedding for said each entity in the second subset results in a first plurality of inferred embeddings, one for each entity in the second subset (pg. 22 para. 1 (from section 3.2.3) recites for every negative candidate in y we sample a value from B(1; Py) to determine whether we include this candidate in the candidate-entity matrix Wc or not. Pg. 85 para. 2 (from section 6.2.2) recites the proposed ConMask model is able to add unobserved entities to existing knowledge graphs, but it still assumed a fixed schema and can therefore only complete unobserved relationships that are similar to relationships presented during training. For example, if ConMask is trained with the observed relationship type SpouseOf, then it might be able to infer the unseen relationship MarriedTo because the pre-trained word embedding of spouse and married have some degree of similarity (i.e. generating an inferred embedding for each entity in the second subset based on the existing embedding, which results in a plurality of inferred embeddings)).
However, Shi does not explicitly teach after generating the first plurality of inferred embeddings, generating a second plurality of inferred embeddings, one for each entity in the second subset; based on a comparison between the first plurality of inferred embeddings and the second plurality of inferred embeddings, determining an amount of difference between the first plurality of embeddings and the second plurality of embeddings.
Li teaches after generating the first plurality of inferred embeddings, generating a second plurality of inferred embeddings, one for each entity in the second subset (Li fig. 4 and para. [0069] recite word embeddings in TMSAword is first fixed to update topic modeling TMSAtopic. With the updated topics, TMSAword is then run to learn better word embeddings. This iterative process continues until converge is achieved. The whole procedure is illustrated in Algorithm 1 (i.e. generating a second plurality of embeddings)); 
based on a comparison between the first plurality of inferred embeddings and the second plurality of inferred embeddings, determining an amount of difference between the first plurality of embeddings and the second plurality of embeddings (Li para. [0069] recites after initializing (410) the residual matrix A, the topic matrix Z and topic embedding matrix T, bias vector c for decoder, weight matrix φ, embedding bias vector b, word embeddings in TMSAword is first fixed to update (415) topic modeling TMSAtopic. TMSAword is then updated (420) with the updated topics to learn better word embeddings. An overall objective function as in Equation 10 is then calculated (425). The weight matrix φ is updated (430) with backpropagation using the overall objective function. The word embedding matrix V is then updated (435) based on the updated weight matrix φ. Such updates are repeated until the topic difference is smaller than the pre-defined E or the given epoch number is reached (i.e. comparing and determining the amount of change between different sets of embeddings)).
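As a minimal sketch of determining an amount of difference between two sets of embeddings and comparing it against a pre-defined tolerance, assuming both sets are dictionaries keyed by the same entity identifiers. The use of mean Euclidean distance, the names `embedding_difference` and `should_regenerate`, and the parameter `epsilon` are assumptions for illustration; neither Li nor the claims specify this particular metric.

```python
import math


def embedding_difference(first, second):
    """Mean Euclidean distance between matching embeddings in two sets.

    `first` and `second` are assumed to be dicts mapping the same entity
    identifiers to equal-length vectors.
    """
    if first.keys() != second.keys():
        raise ValueError("both sets must cover the same entities")
    if not first:
        return 0.0
    total = sum(math.dist(first[key], second[key]) for key in first)
    return total / len(first)


def should_regenerate(first, second, epsilon):
    """True while the change between the two sets is not below epsilon."""
    return embedding_difference(first, second) >= epsilon


# Hypothetical embeddings for illustration only.
first_set = {"a": [0.0, 0.0], "b": [1.0, 1.0]}
second_set = {"a": [3.0, 4.0], "b": [1.0, 1.0]}
difference = embedding_difference(first_set, second_set)
```

A caller could repeat an update step while `should_regenerate` returns True, which mirrors the stopping condition in the cited paragraph only in outline.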
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these teachings by using the methods of updating the embeddings from Li with the entity relationship prediction methods from Shi. Shi and Li are both directed to discovering relationships between embedded entities – as noted 
Regarding claim 9, the combination of Shi and Li teaches the method according to claim 1, wherein the plurality of nodes is a first plurality of nodes, the method further comprising: after generating the inferred embedding for each entity in the second subset, updating the graph to include a second plurality of nodes representing a second set of entities that includes at least some of entities in the set of entities, wherein the second plurality of nodes is different than the first plurality of nodes (Shi pg. 83 para. 3 (from section 6.1.2.) recites I then relax the closed-world assumption in KGC tasks and propose an open world KGC task in Chap. 4, which permits knowledge graphs to add new entities (i.e. updating the graph to include a second plurality of nodes, wherein the second plurality is different than the first). I presented an open-world KGC model called ConMask, which is a context-based model that converts contextual information into structural embeddings using relationship dependent content masking and fully convolutional networks); 
determining whether to generate new machine-learned embeddings for at least some of the entities in the first subset of the set of entities (Li para. [0069] recites word embeddings in TMSAword is first fixed to update topic modeling TMSAtopic. With the updated topics, TMSAword is then run to learn better word embeddings. This iterative process continues until converge is achieved. The whole procedure is illustrated in Algorithm 1. Para. [0069] also recites after initializing (410) the residual matrix A, the topic matrix Z and topic embedding matrix T, bias vector c for decoder, weight matrix φ, embedding bias vector b, word embeddings in TMSAword is first fixed to update (415) topic modeling TMSAtopic. TMSAword is then updated (420) with the updated topics to learn better word embeddings. An overall objective function as in Equation 10 is then calculated (425). The weight matrix φ is updated (430) with backpropagation using the overall objective function. The word embedding matrix V is then updated (435) based on the updated weight matrix φ. Such updates are repeated until the topic difference is smaller than the pre-defined E or the given epoch number is reached (i.e. determining whether to generate new machine-learned embeddings)); 
in response to determining to generate new machine-learned embeddings for at least some of the entities in the first subset of the set of entities: identifying a third subset of the second set of entities and a fourth subset of the second set of entities (Shi pg. 21 para. 3-4 (from section 3.2.3) recites “Although ProjE limits the number of additional parameters, the projection operation may be costly due to the large number of candidate-entities (i.e., the number of rows in Wc). If we reduce the number of candidate-entities in the training phrase, we could create a smaller working set that only contains a subset of the embedding matrix WE. With this in mind, we use candidate sampling to reduce the number of candidate-entities. Candidate sampling is not a new problem; many recent works have addressed this problem in interesting ways. We experimented with many choices, and found that the negative sampling used in Word2Vec resulted the best performance. For a given entity e, relationship r, and a binary label vector y, we compute the projection with all of the positive candidates and only a sampled subset of negative candidates from Py following the convention of Word2Vec” (i.e. the positive candidates are the third subset and the negative candidates are the fourth subset));   
for each entity in the third subset of the second set of entities, using the one or more machine learning techniques to generate a second machine-learned embedding for said each entity (Shi pg. 21 para. 4 (from section 3.2.3) recites for a given entity e, relationship r, and a binary label vector y, we compute the projection with all of the positive candidates and only a sampled subset of negative candidates from Py following the convention of Word2Vec. For simplicity, Py can be replaced by a (0; 1) binomial distribution B(1; Py) shared by all training instances, where Py is the probability that a negative candidate is sampled and 1 – Py is the probability that a negative candidate is not sampled (i.e. generating a machine-learned embedding for each entity in the third subset)); 
for each entity in the fourth subset of the second set of entities: identifying a particular subset of entities in the third subset that are associated with said each entity (Shi pg. 22 para. 1 (from section 3.2.3) recites for every negative candidate in y we sample a value from B(1; Py) to determine whether we include this candidate in the candidate-entity matrix Wc or not (i.e. identifying an entity in the fourth subset that is associated with one or more entities in the third subset)); identifying one or more second embeddings for the particular subset of entities; based on the one or more second embeddings, generating a second inferred embedding for said each entity (Shi pg. 85 para. 2 (from section 6.2.2) recites the proposed ConMask model is able to add unobserved entities to existing knowledge graphs, but it still assumed a fixed schema and can therefore only complete unobserved relationships that are similar to relationships presented during training. For example, if ConMask is trained with the observed relationship type SpouseOf, then it might be able to infer the unseen relationship MarriedTo because the pre-trained word embedding of spouse and married have some degree of similarity (i.e. identifying an existing embedding and generating an inferred embedding based on the existing embedding)).
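As a minimal sketch of the candidate sampling described in the cited passages from section 3.2.3 of Shi, in which all positive candidates are kept and each negative candidate is kept with probability Py by drawing from B(1; Py), assuming the binary label vector y is a list of 0/1 values. The function name `sample_candidates` and the optional `rng` parameter are assumptions for illustration and are not Shi's implementation.

```python
import random


def sample_candidates(labels, p_y, rng=None):
    """Return indices of candidates retained in the working set.

    labels : list of 0/1 values (1 = positive candidate, 0 = negative)
    p_y    : probability that a negative candidate is sampled
    rng    : optional random.Random instance for reproducibility
    """
    if rng is None:
        rng = random.Random()
    kept = []
    for index, label in enumerate(labels):
        # Positives are always kept; each negative is one Bernoulli(p_y) draw.
        if label == 1 or rng.random() < p_y:
            kept.append(index)
    return kept


# Hypothetical label vector for illustration only.
labels = [1, 0, 0, 1, 0]
only_positives = sample_candidates(labels, 0.0)
everything = sample_candidates(labels, 1.0)
```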
Regarding claim 10, the combination of Shi and Li teaches the method according to claim 1, wherein determining whether to generate new machine-learned embeddings is performed based on one or more criteria that comprises a lapse of a particular amount of time since the inferred embedding was generated or an amount of change between different sets of inferred embeddings (Li fig. 4 and para. [0069] recite word embeddings in TMSAword is first fixed to update topic modeling TMSAtopic. With the updated topics, TMSAword is then run to learn better word embeddings. This iterative process continues until converge is achieved. The whole procedure is illustrated in Algorithm 1. Para. [0069] also recites after initializing (410) the residual matrix A, the topic matrix Z and topic embedding matrix T, bias vector c for decoder, weight matrix φ, embedding bias vector b, word embeddings in TMSAword is first fixed to update (415) topic modeling TMSAtopic. TMSAword is then updated (420) with the updated topics to learn better word embeddings. An overall objective function as in Equation 10 is then calculated (425). The weight matrix φ is updated (430) with backpropagation using the overall objective function. The word embedding matrix V is then updated (435) based on the updated weight matrix φ. Such updates are repeated until the topic difference is smaller than the pre-defined E or the given epoch number is reached (i.e. determining whether to generate new embeddings is based on the amount of change between different sets of embeddings)). See claim 8 for motivation to combine.
Claim 17 is a non-transitory computer-readable medium claim and its limitation is included in claim 8. Claim 17 is rejected for the same reasons as claim 8.
Claim 19 is a non-transitory computer-readable medium claim and its limitation is included in claim 10. Claim 19 is rejected for the same reasons as claim 10.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20190122111 A1 (Min et al) teaches predicting new relationships in a knowledge graph by embedding a partial triplet including a head entity description and a relationship or a tail entity description to produce a separate vector for each of the head, relationship, and tail; then using the vectors as input to train a convolutional neural network to predict new relationships in the knowledge graph.
“Discriminative Predicate Path Mining for Fact Checking in Knowledge Graphs” (Shi et al) teaches link-prediction in a knowledge graph by mining paths that alternatively define a generalized statement and using a set of mined rules to evaluate the veracity of the statement. 
“Fact Checking in Heterogeneous Information Networks” (Shi et al) teaches using link prediction, similarity search and network closure to gauge the truthfulness of an assertion by mining heterogeneous connectivity patterns within a network of factual statements.
“Open-World Knowledge Graph Completion” (Shi et al) teaches filling in the missing connections of a knowledge graph by learning embeddings of an entity and parts of a text-description related to the entity to connect to unseen entities in the knowledge graph, extracting relevant snippets and then training a fully convolutional neural network to fuse the extracted snippets with entities in the knowledge graph.
“ProjE: Embedding Projection for Knowledge Graph Completion” (Shi et al) teaches filling in missing information in a knowledge graph by learning joint embeddings of the knowledge graph’s entities and edges, and through changes to the standard loss function.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEAH M FEITL whose telephone number is (571)272-8350. The examiner can normally be reached on M-F 0800-1700.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B. Zhen can be reached on (571) 272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR 

/L.M.F./Examiner, Art Unit 2121           




/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121