DETAILED ACTION
This Office Action is in response to the amendment filed on December 14, 2021 in Application No. 16/113,089.


Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Response to Arguments
Applicant's arguments filed December 14, 2021 have been fully considered but they are not persuasive. Applicant makes a plurality of arguments, which are addressed as seen below:


Applicant argues on page 7 of the Remarks that a predicate token and temporal tokens necessarily represent different concepts rather than a single concept, and points to the present specification for an example. However, applicant’s current claim language does not reflect this difference; it instead recites determining the temporal triples and predicate tokens individually, which means that possible embodiments contain triples without temporal tokens/information. Further, applicant is pointed to the citations used in claim 2 from Michalak, which distinctly show that predicate tokens can include modifier tokens and represent relationships and/or temporal data. [ (Column 4, Lines 43-46) “As used herein, a “modifier” may provide additional determination and specification of the entity, predicate, or relationship. A modifier may be necessarily bound in a relationship.” ] Michalak’s first citation teaches the modifiers.
[ (Column 16, Lines 11-14) “‘Temporal and Spatial’ reasoning can relate to the assignment of space (locale) and time as a set of ranges used to constrain relationships and resolved entities.” ]
[ (Column 13, Lines 56-58) “Other analytics can include identifying and cataloging temporal and spatial references found in the text, including indirect references to time and location.” ]
These two citations, in tandem, further elaborate on representations of predicate sequences that carry temporal information. The first explicitly states that temporal and spatial reasoning is used to constrain relationships and entities, and the second explicitly mentions time as another analytic that can be identified and catalogued.
Additionally, applicant recites a method claim 1 that contains contingent limitations. Because a method claim does not require steps that are contingent on a condition that need not occur, the broadest reasonable interpretation can exclude them [ MPEP 2111.04(II) ]. Specifically, the claim language recites “a predicate sequence including a concatenation of a predicate token and, for the triples having the temporal information available, a sequence of temporal tokens,”. The underlined portion represents the optional sequence within the claim.
In summation, applicant’s argument that the reference does not teach forming predicate sequences by concatenating a sequence of temporal tokens with predicate tokens is moot (as this is not required by the broadest reasonable interpretation of the claim language), and the argument that the reference does not teach predicate sequences which result in learned representations which carry temporal information is not persuasive due to the citations and reasoning presented above and in the previous office action.
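
For illustration only, the reading of the quoted claim language discussed above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name and the example tokens are hypothetical and are not taken from the application or from the cited references. It shows only that a sequence built "for the triples having the temporal information available" reduces to the predicate token alone when no temporal information exists.

```python
from typing import List, Optional, Sequence


def build_predicate_sequence(
    relation_token: str,
    temporal_tokens: Optional[Sequence[str]] = None,
) -> List[str]:
    """Concatenate a predicate token with temporal tokens when they exist.

    Hypothetical helper: when no temporal tokens are supplied, the
    sequence contains only the predicate (relation type) token.
    """
    sequence = [relation_token]
    if temporal_tokens:
        sequence.extend(temporal_tokens)
    return sequence


# Hypothetical example tokens, chosen only for demonstration.
with_time = build_predicate_sequence("bornIn", ["1y", "9y", "8y", "6y"])
without_time = build_predicate_sequence("bornIn")
```

Under this sketch, a triple without temporal information still yields a valid one-token predicate sequence.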



Applicant further argues on page 8 that Li, alone or in combination with Michalak, fails to disclose a predicate sequence made up of predicate tokens, and asserts that Li does not teach inputting a predicate sequence (made up of predicate tokens) or a sequence of temporal tokens into said recursive network. The examiner respectfully disagrees. [ Li (¶0043) “In embodiments, the model comprises an embedding layer 210 which transforms the discrete input sequence into a sequence of continuous vectors (word embeddings)” ] teaches a consistent sequence of inputs for the embeddings, and [ Li (¶0055) ] describes in general how the relations (the term Li uses for the predicates in its triples) are queried after they have been put into the neural network, and how the model takes learned relation embeddings and uses them to identify the correct subject relationship. Although the temporal tokens are not explicitly taught in Li, they are taught by Michalak and are included in the predicate tokens, as seen in the previous office action and in the arguments above. The rejection is maintained.



Applicant’s arguments on page 9 regarding dependent claim 2 have been considered, and the examiner respectfully disagrees. [ Michalak (Column 13, Lines 29-33) “Each token can then be analyzed and assigned a grammatical part of speech (POS) tag (e.g., proper noun, adjective, adverb). The tokens can be further analyzed to determine if adjacent tokens should be cojoined together if they describe the same concept.” ] The citation from Michalak does teach conjoining of tokens as long as they relate to the same concept or subject. [ Michalak (Column 3, Lines 58-60) “As used herein, a “concept” may consist of an aggregate of either entities, predicates, or modifiers.” ] and [ Michalak (Column 4, Lines 43-46) “As used herein, a “modifier” may provide additional determination and specification of the entity, predicate, or relationship. A modifier may be necessarily bound in a relationship.” ]
Applicant’s arguments regarding dependent claim 8 on page 9 of the Remarks are also not persuasive. The arguments for dependent claim 8 assert that Michalak fails to teach normalizing temporal information. [ Michalak (Col. 12, Line 38 – Col. 13, Line 10) “The profile can also provide an interactive timeline that shows the number of times the concept is mentioned on any given date. A newsfeed can be tied to this timeline, and sentences may be displayed, where the concept appears as part of a subject-predicate-object triple during the selected period of time. Additionally, the newsfeed can display how long ago the action took place, the name of the document that reported the information, and the total number of documents that made the same statement. This news can also be filtered by predicate category, enabling the user easily view specific types of interactions, such as communication or travel.” ] This citation shows a timeline, which is a normalized way of representing temporal data even if it is not the same method the applicant uses. Moreover, Michalak also teaches [ (Column 21, Lines 61-63) “Transactional attributes describe a relationship between two entities, which may use a verb or timestamp or location.” ] Given the use of the timestamp in this citation, it is clear that Michalak does teach using a normalized format for temporal data.
The examiner respectfully disagrees with applicant’s arguments regarding dependent claim 9 made on page 9 of the Remarks. Applicant asserts that it would not be obvious for the normalized temporal information to be in any particular format, let alone the one the applicant has claimed. However, Subramanian is clear that vocabulary size is adjustable in [ Subramanian (¶0071) ] via the different equations, and explicitly points out that the sizes can be different. Given that the size of the vocabulary can be adjusted as needed, one of ordinary skill in the art, prior to the effective filing date, would have had no trouble determining an ideal vocabulary size for their project. Furthermore, although Applicant does not state what the context of "32" is in terms of size measurements, the example provided in the specification appears to simply be the traditional MM/DD/YY date format (or any two-digit date format), which is very common throughout society in general, even predating computers. There are only 12 months in the year and only 10 different digits that can be used to describe a day or a year, hence 12 + 10 + 10 = 32. The argument is not persuasive.
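
For illustration only, the arithmetic above can be sketched as a small token vocabulary. This is a minimal sketch under stated assumptions: the token spellings and the helper name are hypothetical and are not taken from the specification, Subramanian, or Michalak. Only the counts (12 month tokens, 10 day-digit tokens, 10 year-digit tokens) follow the reasoning above.

```python
from typing import List

# Hypothetical token spellings; only the counts follow the text above.
MONTH_TOKENS = [f"{month:02d}m" for month in range(1, 13)]  # 12 month tokens
DAY_DIGIT_TOKENS = [f"{digit}d" for digit in range(10)]     # 10 day-digit tokens
YEAR_DIGIT_TOKENS = [f"{digit}y" for digit in range(10)]    # 10 year-digit tokens

TEMPORAL_VOCAB = MONTH_TOKENS + DAY_DIGIT_TOKENS + YEAR_DIGIT_TOKENS


def encode_date(month: int, day: int, two_digit_year: int) -> List[str]:
    """Encode an MM/DD/YY date as one month token plus digit tokens."""
    if not (1 <= month <= 12 and 1 <= day <= 31 and 0 <= two_digit_year <= 99):
        raise ValueError("date component out of range")
    tokens = [f"{month:02d}m"]
    tokens += [f"{digit}d" for digit in f"{day:02d}"]
    tokens += [f"{digit}y" for digit in f"{two_digit_year:02d}"]
    return tokens


vocab_size = len(TEMPORAL_VOCAB)
example_tokens = encode_date(12, 14, 21)
```

Every date in the two-digit format is expressible with tokens drawn from this fixed set, which is why the vocabulary size does not grow with the number of distinct dates.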




Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 2, 3, 5, 7, 8, 13, 14, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Michalak (US 9535902 B1) in view of Li (US 2017/0109355 A1).
Regarding claim 1, Michalak (US 9535902 B1) teaches 
a method of incorporating temporal information into a knowledge graph comprising triples in a form of subject, predicate and object 
[“In a grammatical sense, these can be looked at as subject-predicate-object triples, as they describe specific activities that occur between entities (e.g., a person, place, or thing). “ (Column 13, Lines 50-53)]
	Michalak teaches the triples in the form of a subject, predicate and object and teaches incorporating temporal information in a citation below.
for link prediction,
[“Such approaches can be used to capture linguistic phenomena by utilizing the models to label sequences of characters/tokens/elements with the correct linguistic information that a model was created to predict. “  (Column 5, Lines 39-42)]
Triples are used to predict linked relationships via linguistic models, as they are the basis for the models. The linguistic models represent trained networks that analyze the triples and the connections they may have in order to make a decision.
Method comprising: determining for each of the triples, a predicate sequence
[“Such machine-learning training algorithms can learn the weights of features and persist them in a model so that inference algorithms can use the model to predict a correct label sequence to assign to the terms as they are being processed. “    (Column 5, Lines 52-55)]
[“This news can also be filtered by predicate category, enabling the user easily view specific types of interactions, such as communication or travel. “  (Column 12, Lines 51-53)]
Here Michalak discloses that the model is able to utilize algorithms that are learned through machine learning to predict correct label sequences. Labels here could possibly be subject, predicate, and object.
  	Including a concatenation of a predicate token
[“This can include the text content of messages, the categorized individual tokens and semantic token groups comprising those messages and metadata such as properties, relationships and events. “   (Column 7, Lines 9-12)]
[“Each token can consist of a word, punctuation mark, or special character. Each token can then be analyzed and assigned a grammatical part of speech (POS) tag (e.g., proper noun, adjective, adverb). The tokens can be further analyzed to determine if adjacent tokens should be cojoined together if they describe the same concept. “  (Column 13, Lines 29-33)]
	Michalak shows here that the tokens can be made up of a variety of different things including relationships or predicates and explicitly states that adjacent tokens can be conjoined together if they are the same concept to make a concatenation.
	and, for the triples having temporal information available, a sequence of temporal tokens
[“Other analytics can include identifying and cataloging temporal and spatial references found in the text, including indirect references to time and location. For example, if the date of a document is known, a temporal reference to “next Thursday” can be assigned the correct date based on the document date. “   (Column 13, Lines 56-61)]
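
For illustration only, the kind of date resolution described in the citation above can be sketched as follows. This is a minimal sketch, assuming that "next Thursday" means the first Thursday strictly after the document date; that interpretation, the function name, and the example date are assumptions and are not taken from Michalak.

```python
from datetime import date, timedelta

THURSDAY = 3  # date.weekday() numbers Monday as 0, so Thursday is 3


def resolve_next_weekday(document_date: date, target_weekday: int) -> date:
    """Return the first date strictly after document_date on target_weekday."""
    days_ahead = (target_weekday - document_date.weekday()) % 7
    if days_ahead == 0:
        # The document date already falls on the target weekday,
        # so move to the following week (assumed interpretation).
        days_ahead = 7
    return document_date + timedelta(days=days_ahead)


# Hypothetical document date used only for demonstration.
resolved = resolve_next_weekday(date(2021, 12, 14), THURSDAY)
```

The relative reference is thereby converted into an absolute calendar date anchored to the known document date.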

	The predicate tokens including at least a relation type token;
[“Additionally, these nodes may have relationships to other nodes in the graph (see “edges”). For example, a node may represent a single word of text (or “token”). That node may then have a child relationship to a node representing the phrase of which the word is a part (a “chunk”). The chunk node may have other children, representing other words in the phrase. “   (Column 12, Lines 8-14)]
	This directly shows that nodes can have relationships to other nodes and nodes can be tokens. 
	So as to learn representations of the predicate sequences which carry the temporal information
[“In the Reason phase, spatial and temporal reasoning may be applied and relationships uncovered that can allow resolved entities to be compared and correlated using various graph analysis techniques. “    (Column 6, Lines 33-36)]
[“As used herein, a “predicate” may refer to the type of action or activity and reference to that activity independent of the subjects or objects of that activity. “    (Column 4, Lines 52-54)]
	With regard to learning representations of the predicate sequences which carry the temporal information, Michalak has already taught that there can be predicate sequences, and further teaches that temporal reasoning may be applied to various entities, which would include those predicate sequences.
	What Michalak does not explicitly teach is inputting the predicate sequences into a recursive neural network and using the learned representations of the predicate sequences with embeddings of the subjects and objects in a scoring function for the link prediction. 
On the other hand, Li (US 2017/0109355 A1) does teach:
Inputting the predicate sequences into a recursive neural network
[“Recurrent Neural Networks (including basic RNN and its variations such as Bi-directional RNN, Bi-directional-LSTM, and Stacked-Bi-directional-GRU) is used to identify the subject string. In embodiments, the model comprises an embedding layer 210 which transforms the discrete input sequence into a sequence of continuous vectors (word embeddings) “    (¶0039)]
[“in embodiments, based on the predicted subject and relation, a structured query is generated and sent to a KG server. Then, the KG server executes the structure query to obtain the object, i.e., answer to the question. In embodiments, the KG includes data in the format of N-Triples RDF and each RDF triple has the form (subject, relation, object). “    (¶0064)]
	Li shows putting sequences into recursive neural networks in the first citation and then specifies that sequences can be of a certain type such as the relation which is equivalent to the predicate.
	Using the learned representations of the predicate sequences with embeddings of the subjects and objects in a scoring function for the link prediction
[“In embodiments, the relation ranking module aims at identifying the correct relation implied by the question in natural language. In embodiments, as the name of the module suggests, instead of using classification to choose the best relation, this problem is formulated as a ranking problem. Essentially, if a candidate relation is semantically more similar to the question, it should have a higher rank. In embodiments in this disclosure, an embedding approach is taken to measure the semantic similarity between a relation and a question. “    (¶0045)]
	With regard to the learned representations of predicate sequences being used with embeddings of the subjects and objects in a scoring function, Li explicitly teaches using embeddings for ranking and for measuring semantic similarity, which serves a purpose similar to that of the scoring function for the link prediction.
[“the model comprises an embedding layer 210 which transforms the discrete input sequence into a sequence of continuous vectors (word embeddings), an S-Bi-GRU 212 which learns to 
	The citation in paragraph 39 talks about the embeddings being created by transformation of input sequences which are formed of subjects and objects and those being used to calculate and predict the probability. 
	Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method and apparatus for incorporating information into a knowledge graph including a plurality of attributes, entities and associations, as taught by Michalak, with the systems and methods for inputting various attributes into a recursive neural network and receiving a link prediction for answering questions, as taught by Li. One of ordinary skill in the art would have been motivated to do so because utilizing a recursive neural network and predicate sequences to determine a scoring function for link prediction, in tandem with a method for incorporating information into a knowledge graph, would provide variable reduction and machine-learned pattern recognition through the recursive nature of the model, which would increase accuracy over the time of the model’s use. This higher accuracy can be utilized in improvements to query retrieval, through the reduction of false positives and unknown answers returned to the user, along with a suite of other uses.

Regarding claim 2, the method according to claim 1 is taught by Michalak/Li as in the rejection of claim 1 above, with the rest of the claim taught by Michalak below.
	wherein at least some of the predicate tokens include a temporal modifier token
[“As used herein, a “modifier” may provide additional determination and specification of the entity, predicate, or relationship. A modifier may be necessarily bound in a relationship. “    (Column 4, Lines 43-46)]   
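	Michalak’s first citation teaches the modifiers.
[“Other analytics can include identifying and cataloging temporal and spatial references found in the text, including indirect references to time and location.”    (Column 13, Lines 56-58)]
	The second citation teaches that a possible analytic is a temporal reference.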
[“The profile can also provide an interactive timeline that shows the number of times the concept is mentioned on any given date. A newsfeed can be tied to this timeline, and sentences may be displayed, where the concept appears as part of a subject-predicate-object triple during the selected period of time. “   (Column 12, Lines 42-47)]
[“This news can also be filtered by predicate category, enabling the user easily view specific types of interactions, such as communication or travel. “    (Column 12, Lines 51-53)]
	The citations in column 12 include both an example of a temporal modifier token being used in tandem with predicate tokens and an explicit mention of being able to filter by predicate category specifically, as in the claim above.

Regarding claim 3, the method according to claim 2 is taught by Michalak/Li as in the rejection of claim 2 above, with the rest of the claim taught by Michalak below.
wherein the temporal modifier token in combination with a temporal token indicates a temporal range applicable to the relation type token
[“As used herein, a “modifier” may provide additional determination and specification of the entity, predicate, or relationship. A modifier may be necessarily bound in a relationship. “ (Column 4, Lines 43-46)]
	The first citation listed here teaches modifiers in combination with a relationship.
[“Other analytics can include identifying and cataloging temporal and spatial references found in the text, including indirect references to time and location.”  (Column 13, Lines 56-58)]
	The second citation teaches that the relationship can be temporal or time related in nature.

Regarding claim 5, the method according to claim 1 is taught by Michalak/Li as in the rejection of claim 1 above, with the rest of the claim taught by Li (US 2017/0109355 A1) as seen below.
wherein the recursive neural network is a long short-term memory network
[“according to embodiment of the present disclosure, in which a sequential labeling model based on word-embedding and Recurrent Neural Networks (including basic RNN and its variations such as Bi-directional RNN, Bi-directional-LSTM, and Stacked-Bi-directional-GRU) is used to identify the subject string. “    (¶0064)]
	LSTM is an acronym for long short-term memory (network).
	See the motivation to combine in claim 1.

Regarding claim 7, the method according to claim 1 is taught by Michalak/Li as in the rejection of claim 1 above, with the rest of the claim taught by Li (US 2017/0109355 A1) as seen below.
Wherein each token of the predicate sequence is mapped to an embedding via a linear layer
[“the embedding layer 210 transforms the one or more words of the input query into one or more embeddings, where each embedding is a vector that represents the corresponding word. Then, at step 2444, the stacked-bidirectional RNN 212, to produce one or more tokens corresponding to the one or more embeddings“    (¶0042)]
[“the embedding layer 302 transforms the one or more words of the input question into one or more embeddings 303, where each embedding is a vector representing a corresponding word. At step 3444, the S-Bi-GRU generates a vector 306 that is a neural representation of the query question. Then, at step 3446, the linear projection layer 307 projects the vector “   (¶0052)]
	Here Li explicitly teaches mapping vectors via a linear layer and then how embeddings are directly translated from vectors.
So as to generate a sequence of embeddings which is used as input to the recursive neural network
[“In embodiments, the model comprises an embedding layer 210 which transforms the discrete input sequence into a sequence of continuous vectors (word embeddings), “   (¶0039)]
	To further explain, the citations above from Li teach that the embedding layer can transform the words of the input query into one or more embeddings, with tokens produced corresponding to those embeddings, and explicitly reference a linear layer for similar processes. The words chosen can be any part of the input query, including the predicate.
See the motivation to combine in claim 1.
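
For illustration only, the claimed mapping of each token to an embedding via a linear layer can be sketched as follows. This is a minimal sketch, assuming a bias-free linear layer applied to one-hot token vectors; the vocabulary, the dimensions, and the weight values are hypothetical and are not drawn from Li, Michalak, or the application.

```python
from typing import Dict, List

# Hypothetical three-token vocabulary and 3x2 weight matrix.
VOCAB: Dict[str, int] = {"bornIn": 0, "1y": 1, "9y": 2}
WEIGHTS: List[List[float]] = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
]


def one_hot(index: int, size: int) -> List[float]:
    """Return a vector of zeros with a single 1.0 at the given index."""
    return [1.0 if position == index else 0.0 for position in range(size)]


def linear(vector: List[float], weights: List[List[float]]) -> List[float]:
    """Bias-free linear layer: output[j] = sum_i vector[i] * weights[i][j]."""
    return [
        sum(vector[i] * weights[i][j] for i in range(len(vector)))
        for j in range(len(weights[0]))
    ]


def embed_sequence(tokens: List[str]) -> List[List[float]]:
    """Map each token to its embedding, producing a sequence of embeddings."""
    return [linear(one_hot(VOCAB[token], len(VOCAB)), WEIGHTS) for token in tokens]


embeddings = embed_sequence(["bornIn", "9y"])
```

With one-hot inputs, the linear layer selects the weight row for each token, so the output is a sequence of embeddings of the kind that could then be supplied to a recurrent network.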
	

Regarding claim 8, the method according to claim 1 is taught by Michalak/Li as in the rejection of claim 1 above, with the rest of the claim taught by Michalak below.
wherein the temporal information is only available for some of the triples, the method further comprising framing the temporal information in a same relative time system
[“Other analytics can include identifying and cataloging temporal and spatial references found in the text, including indirect references to time and location. For example, if the date of a document is known, a temporal reference to “next Thursday” can be assigned the correct date based on the document date. “  (Column 13, Lines 56-61)]
[“Among other possible formats, structured data may have the format of tabular data in which rows correspond to entities and columns correspond to attributes associated with the entities, or vice versa. For an entity that is a person, some examples of attributes are birth date, death date, parent, relative, employer, political affiliation, nationality, and social security number. “   (Column 20, Lines 51-57)]


Regarding claim 13, Michalak teaches a system of incorporating temporal information into a knowledge graph comprising triples in a form of subject, predicate and object 
[“In a grammatical sense, these can be looked at as subject-predicate-object triples, as they describe specific activities that occur between entities (e.g., a person, place, or thing). “ (Column 13, Lines 50-53)]
for link prediction,
[“Such approaches can be used to capture linguistic phenomena by utilizing the models to label sequences of characters/tokens/elements with the correct linguistic information that a model was created to predict. “  (Column 5, Lines 39-42)]
	determining for each of the triples, a predicate sequence
[“Such machine-learning training algorithms can learn the weights of features and persist them in a model so that inference algorithms can use the model to predict a correct label sequence to assign to the terms as they are being processed. “    (Column 5, Lines 52-55)]
[“This news can also be filtered by predicate category, enabling the user easily view specific types of interactions, such as communication or travel. “  (Column 12, Lines 51-53)]
  	Including a concatenation of a predicate token
[“This can include the text content of messages, the categorized individual tokens and semantic token groups comprising those messages and metadata such as properties, relationships and events. “   (Column 7, Lines 9-12)]

	and, for the triples having temporal information available, a sequence of temporal tokens
[“Other analytics can include identifying and cataloging temporal and spatial references found in the text, including indirect references to time and location. For example, if the date of a document is known, a temporal reference to “next Thursday” can be assigned the correct date based on the document date. “   (Column 13, Lines 56-61)]
	The predicate tokens including at least a relation type token;
[“Additionally, these nodes may have relationships to other nodes in the graph (see “edges”). For example, a node may represent a single word of text (or “token”). That node may then have a child relationship to a node representing the phrase of which the word is a part (a “chunk”). The chunk node may have other children, representing other words in the phrase. “   (Column 12, Lines 8-14)]
So as to learn representations of the predicate sequences which carry the temporal information
[“In the Reason phase, spatial and temporal reasoning may be applied and relationships uncovered that can allow resolved entities to be compared and correlated using various graph analysis techniques. “    (Column 6, Lines 33-36)]
[“As used herein, a “predicate” may refer to the type of action or activity and reference to that activity independent of the subjects or objects of that activity. “    (Column 4, Lines 52-54)]
	What Michalak does not explicitly disclose is a system comprising one or more computer processors, inputting the predicate sequences into a recursive neural network, and using the learned representations of the predicate sequences with embeddings of the subjects and objects in a scoring function for the link prediction.
On the other hand, Li (US 2017/0109355 A1) does teach these limitations as seen below.
The system comprising one or more computer processors which, alone or in combination, are configured to provide for execution of the following steps
[“The computing system may include random access memory (RAM), one or more processing resources such as a central processing unit (CPU) or hardware or software control logic, ROM, and/or other types of memory. “    (¶0049)]
	Inputting the predicate sequences into a recursive neural network
[“Recurrent Neural Networks (including basic RNN and its variations such as Bi-directional RNN, Bi-directional-LSTM, and Stacked-Bi-directional-GRU) is used to identify the subject string. In embodiments, the model comprises an embedding layer 210 which transforms the discrete input sequence into a sequence of continuous vectors (word embeddings) “    (¶0039)]
[“in embodiments, based on the predicted subject and relation, a structured query is generated and sent to a KG server. Then, the KG server executes the structure query to obtain the object, i.e., answer to the question. In embodiments, the KG includes data in the format of N-Triples RDF and each RDF triple has the form (subject, relation, object). “    (¶0064)]
	Using the learned representations of the predicate sequences with embeddings of the subjects and objects in a scoring function for the link prediction
[“In embodiments, the relation ranking module aims at identifying the correct relation implied by the question in natural language. In embodiments, as the name of the module suggests, instead of using classification to choose the best relation, this problem is formulated as a ranking problem. Essentially, if a candidate relation is semantically more similar to the question, it should have a higher rank. In embodiments in this disclosure, an embedding approach is taken to measure the semantic similarity between a relation and a question. “    (¶0045)]

	With respect to Claim 13, it is substantially similar to Claim 1 and is rejected in the same manner, the same art and reasoning applying.
See the motivation to combine in claim 1.

Regarding claim 14, the system according to claim 13 is taught by Michalak/Li as in the rejection of claim 13 above, with the rest of the claim taught by Michalak below.
	wherein at least some of the predicate tokens include a temporal modifier token
[“As used herein, a “modifier” may provide additional determination and specification of the entity, predicate, or relationship. A modifier may be necessarily bound in a relationship. “    (Column 4, Lines 43-46)]
[“Other analytics can include identifying and cataloging temporal and spatial references found in the text, including indirect references to time and location. “    (Column 13, Lines 56-58)]
[“The profile can also provide an interactive timeline that shows the number of times the concept is mentioned on any given date. A newsfeed can be tied to this timeline, and sentences may be displayed, where the concept appears as part of a subject-predicate-object triple during the selected period of time. “   (Column 12, Lines 42-47)]
[“This news can also be filtered by predicate category, enabling the user easily view specific types of interactions, such as communication or travel“ (Column 12, Lines 51-53)]
	With respect to Claim 14, it is substantially similar to Claim 2 and is rejected in the same manner, the same art and reasoning applying.

Regarding claim 15, Michalak (US 9535902 B1) teaches A tangible, non-transitory computer-readable medium having instructions thereon which when executed on one or more processors provide for execution of a method of incorporating temporal information into a knowledge graph comprising triples in a form of subject, predicate and object 
[“In yet another aspect, the present disclosure relates to a non-transitory computer-readable medium. In one embodiment, the computer-readable medium stores instructions which, when executed by one or more processors, “  (¶0004)]
[“In a grammatical sense, these can be looked at as subject-predicate-object triples, as they describe specific activities that occur between entities (e.g., a person, place, or thing). “ (Column 13, Lines 50-53)]
for link prediction,
[“Such approaches can be used to capture linguistic phenomena by utilizing the models to label sequences of characters/tokens/elements with the correct linguistic information that a model was created to predict. “  (Column 5, Lines 39-42)]
	determining for each of the triples, a predicate sequence
[“Such machine-learning training algorithms can learn the weights of features and persist them in a model so that inference algorithms can use the model to predict a correct label sequence to assign to the terms as they are being processed. “    (Column 5, Lines 52-55)]
[“This news can also be filtered by predicate category, enabling the user easily view specific types of interactions, such as communication or travel. “  (Column 12, Lines 51-53)]
  	Including a concatenation of a predicate token
[“This can include the text content of messages, the categorized individual tokens and semantic token groups comprising those messages and metadata such as properties, relationships and events. “   (Column 7, Lines 9-12)]

	and, for the triples having temporal information available, a sequence of temporal tokens
[“Other analytics can include identifying and cataloging temporal and spatial references found in the text, including indirect references to time and location. For example, if the date of a document is known, a temporal reference to “next Thursday” can be assigned the correct date based on the document date. “   (Column 13, Lines 56-61)]
	The predicate tokens including at least a relation type token;
[“Additionally, these nodes may have relationships to other nodes in the graph (see “edges”). For example, a node may represent a single word of text (or “token”). That node may then have a child relationship to a node representing the phrase of which the word is a part (a “chunk”). The chunk node may have other children, representing other words in the phrase. “   (Column 12, Lines 8-14)]
So as to learn representations of the predicate sequences which carry the temporal information
[“In the Reason phase, spatial and temporal reasoning may be applied and relationships uncovered that can allow resolved entities to be compared and correlated using various graph analysis techniques. “    (Column 6, Lines 33-36)]
[“As used herein, a “predicate” may refer to the type of action or activity and reference to that activity independent of the subjects or objects of that activity. “    (Column 4, Lines 52-54)]
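	What Michalak does not explicitly teach is inputting the predicate sequences into a recursive neural network, and using the learned representations of the predicate sequences with embeddings of the subjects and objects in a scoring function for the link prediction. However, Li teaches these limitations as seen below.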
	Inputting the predicate sequences into a recursive neural network
[“Recurrent Neural Networks (including basic RNN and its variations such as Bi-directional RNN, Bi-directional-LSTM, and Stacked-Bi-directional-GRU) is used to identify the subject string. In embodiments, the model comprises an embedding layer 210 which transforms the discrete input sequence into a sequence of continuous vectors (word embeddings) “    (¶0039)]
[“in embodiments, based on the predicted subject and relation, a structured query is generated and sent to a KG server. Then, the KG server executes the structure query to obtain the object, i.e., answer to the question. In embodiments, the KG includes data in the format of N-Triples RDF and each RDF triple has the form (subject, relation, object). “    (¶0064)]
	Using the learned representations of the predicate sequences with embeddings of the subjects and objects in a scoring function for the link prediction
[“In embodiments, the relation ranking module aims at identifying the correct relation implied by the question in natural language. In embodiments, as the name of the module suggests, instead of using classification to choose the best relation, this problem is formulated as a ranking problem. Essentially, if a candidate relation is semantically more similar to the question, it should have a higher rank. In embodiments in this disclosure, an embedding approach is taken to measure the semantic similarity between a relation and a question. “    (¶0045)]
[“the model comprises an embedding layer 210 which transforms the discrete input sequence into a sequence of continuous vectors (word embeddings), an S-Bi-GRU 212 which learns to produce the features for classification, and a logistic regression (binary classification) layer 214 to predict the probability of each token being part of the subject chunk“   (¶0039)]
With respect to Claim 15, it is substantially similar to Claims 1 and 13 and is rejected in the same manner, the same art and reasoning applying.
See the motivation to combine in claim 1.


Claims 4 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Michalak/Li in view of Minervini (US 2018/0144252 A1).
In regards to claim 4, the method according to claim 1 is taught by Michalak/Li as in the rejection for claim 1 above. What is not taught by Michalak/Li is wherein the scoring function is TransE or distMult.
However, Minervini does teach wherein the scoring function is TransE or distMult.
[“Non-limiting examples of schema unaware models to which the present invention may be applied include the Translating Embeddings model (TransE), Bilinear-Diagonal model (DistMult) and Complex Embeddings model (ComplEx).” (¶0044)]
	This citation shows that Minervini's disclosure includes both the TransE and DistMult algorithms.
[“TransE is an energy based model, and this updating can also be equated to minimising the value of a loss function across all of the triples, thereby attempting to reach minimum of the loss function. This minimum is indicative of the representation of the data deduced to most accurately represent reality“    (¶0047)]
	This citation explains what one of the algorithms (TransE) would be used for in reference to a knowledge graph. As implied in the previous citation, this use and function would apply to DistMult as well.
	Therefore, it would be obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to utilize different scoring functions when incorporating information, attributes and data into a knowledge graph. The reason it would be obvious is that one of ordinary skill in the art would know, before the effective filing date of the invention, to apply known formulas used for calculating distance on a graph to determine a score.
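	For illustration only, the commonly published forms of the two scoring functions named in the claim can be sketched as follows. This sketch is not taken from Minervini or from the present application; the function names and the example vectors are hypothetical, and the L2 norm is assumed for TransE (an L1 norm is also commonly used).

```python
import math

def transe_score(s, p, o):
    """TransE: negative L2 distance between (s + p) and o.

    A score closer to zero means the triple is considered more plausible.
    """
    return -math.sqrt(sum((si + pi - oi) ** 2 for si, pi, oi in zip(s, p, o)))

def distmult_score(s, p, o):
    """DistMult: sum over dimensions of s_i * p_i * o_i.

    A higher score means the triple is considered more plausible.
    """
    return sum(si * pi * oi for si, pi, oi in zip(s, p, o))

# Hypothetical 2-dimensional embeddings, chosen only to make the arithmetic easy.
subject = [1.0, 0.0]
predicate = [0.0, 1.0]
matching_object = [1.0, 1.0]      # equals subject + predicate
mismatched_object = [-1.0, -1.0]

better = transe_score(subject, predicate, matching_object)
worse = transe_score(subject, predicate, mismatched_object)
```

	In both functions the predicate vector could be any learned representation of the predicate, which is the role the claim assigns to the learned representations of the predicate sequences.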

In regards to claim 10, the method according to claim 1 is taught by Michalak/Li as in the rejection for claim 1 above. 
The rest of the claim recites wherein the knowledge graph is based on a company graph, and wherein the link prediction is performed to complete a query directed to predicting which of the subjects have performed a transaction for a particular one of the objects representing a company at a predetermined time or range of times.
Michalak teaches the following two pieces: 
wherein the knowledge graph is based on a company graph,
at a predetermined time or range of times

[“Examples of such descriptive attributes include person or organization name, location, phone number, email address, or job title. Transactional attributes describe a relationship between two entities, which may use a verb or timestamp or location. Types of transactional attributes include interactions with another entity, and the actions of an entity by themselves; “    (Column 21, Lines 59-65)]
	This citation shows that the relationships in Michalak's knowledge graph already include relationships between two entities and that the descriptive attributes include organization names. It also shows that timestamps are recorded, which corresponds to the predetermined time attribute.
	What Michalak does not explicitly disclose is wherein the link prediction is performed to complete a query directed to predicting which of the subjects have performed a transaction for a particular one of the objects representing a company
	However, that limitation is taught by Minervini as seen below.
and wherein the link prediction is performed to complete a query directed to predicting which of the subjects have performed a transaction for a particular one of the objects representing a company
[“Each of the models discussed herein assigns a prediction score to each statement (corresponding to a triple). The statement likelihood is directly correlated with the prediction score of the statement “    (¶0044)]
	Here Minervini discloses that the models utilize prediction score in order to determine statement likelihood which is used when completing a query. 
[“Triples with higher prediction scores are considered more likely than triples with a lower prediction score. As the embedding vectors are not known in advance, the model is typically configured to initialise the embedding vectors at random, and then incrementally update the vectors so to increase the prediction score of triples in the Knowledge Graph (equating to statements verified as true), “    (¶0047)]
	This second citation discloses that the predictions are still in the form of triples and would still be able to represent a company (subject), a transaction (predicate), and an object representing the company.
With respect to Claim 10, it is substantially similar to Claim 4 and is rejected in the same manner, the same art and reasoning applying.
See the motivation in claim 4.

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Michalak/Li in view of Wu (US 2018/0174020 A1).
In regards to claim 6, the method according to claim 1 is taught by Michalak/Li as in the rejection for claim 1 above. What is not taught by Michalak/Li is the following:
wherein each of the representations of the predicate sequences is determined from a last hidden state of the recursive neural network

However, Wu does disclose wherein each of the representations of the predicate sequences is determined from a last hidden state of the recursive neural network
[“Then the hidden layer will further make use of GRU to compute the sequence level representations for the query and two responses. “    (¶0132)]
	Wu discloses that the hidden layer will be the one to compute the sequence representations for the neural network.
	Therefore, it would be obvious to one of ordinary skill in the art, prior to the earliest effective filing date, to utilize a primary mode for carrying out machine learning as taught by Wu in tandem with a method and system that puts information and attributes into knowledge graphs that utilize recursive neural networks as taught by Michalak/Li. The reason it would be obvious is that one of ordinary skill in the art would know that the hidden layer of a recurrent neural network is where many assumptions, dimension reductions/expansions, and predictions happen in a model. It would follow that utilizing the last hidden layer would provide the most favorable chance of receiving an accurate prediction for a knowledge graph link.
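	As a minimal sketch of what determining a representation from a last hidden state means, the following uses a plain tanh recurrent cell. It is not Wu's GRU and not the network of the present application; the function name, weights, and inputs are hypothetical and chosen only for illustration.

```python
import math

def rnn_last_hidden(inputs, w_in, w_rec, h0):
    """Run a simple tanh recurrent cell over a sequence of input vectors.

    Returns only the hidden state after the final step, which then serves
    as a fixed-size representation of the whole sequence.
    """
    h = list(h0)
    for x in inputs:
        h = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(len(x)))
                + sum(w_rec[i][k] * h[k] for k in range(len(h)))
            )
            for i in range(len(h))
        ]
    return h

# Hypothetical two-step sequence of 2-dimensional token embeddings.
sequence = [[1.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]   # input weights
no_memory = [[0.0, 0.0], [0.0, 0.0]]  # recurrent weights set to zero
representation = rnn_last_hidden(sequence, identity, no_memory, [0.0, 0.0])
```

	With the recurrent weights set to zero, as here, the result depends only on the final input; non-zero recurrent weights are what let the last hidden state summarize earlier tokens as well.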

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Michalak/Li in view of Verdejo (US 2018/0150750 A1).
In regards to claim 11, the method according to claim 1 is taught by Michalak/Li as in the rejection for claim 1 above, with the rest of the claim taught by Verdejo as seen below.
wherein the knowledge graph is based on criminal records

	and wherein the link prediction is performed to complete a query directed to predicting which of the subjects have committed a crime
[“By linking the phone numbers to other suspicious elements, as well as regular activity, criminal or enemy elements could be identified as well as the activities the criminal or enemy elements have been involved in. “   (¶0010)]
[“FIG. 5 is a flow chart of an example process for automatically predicting an event based on data“    (¶0009)]
	These two citations from Verdejo disclose how the knowledge graph can be used to predict events based on data, and an example of data that would be put in would be phone numbers, relationships to suspicious elements, criminal elements as well as other data. This can be utilized to make a prediction for criminal activity.
in a particular one of the objects representing a geographical areas at a predetermined time or range of times
[“For example, comparison engine 130 could perform a graph traversal to connect any mobile phone used at the time and place of an incident to any known suspect, with a shorter distance from the incident being more probable and a larger distance away being less probable. Hops (changes from one cell tower to another) could be used instead of distance. “  (¶0014)]
	This citation shows an example of how distance or geographical data could be input into the model and it also discloses that time and place are input into the model explicitly.
or to complete a query directed to predicting which of the objects representing the geographical areas are most likely to see a criminal activity by a particular one of the subjects at a predetermined time or range of times

[“For example, the analytics system may identify thousands, millions, or billions of pre-events, post-events, and/or contexts associated with thousands, millions, or billions of historical events. A pre-event may include an event that occurs before the historical event, a post-event may include an event that occurs after the historical event, and the context of the historical event may include a location where the historical event occurred, a device identifier for a device that provided the data associated with the historical event, a time of day or day of the week when the historical event occurred, a classification of the historical event, and/or the like. “   (¶0015)]
	To elaborate further, Verdejo teaches that linking data to the knowledge graph can be used to predict criminal elements or the activities they have been involved in, and includes a reference figure that outlines the prediction process. The claim limitation that recites objects representing geographical areas is taught by Verdejo through the citation explaining that device identifiers can provide data associated with the historical event. This in tandem with the citation above from 
	Therefore, it would be obvious to one of ordinary skill in the art, prior to the earliest effective filing date, to combine methods and systems for inputting data and attributes into knowledge graphs such as Michalak/Li, directed towards a goal such as predicting, outlining or analysis on one particular aspect like crime prevention or prediction. This would be obvious because combining the two would provide a targeted use case for the information apparatus but also provide an incredibly accurate prediction and analysis system through the use of the model’s triple system and recurrent ability for pattern recognition. The triple system allows for an enhanced breakdown of search or analysis by different category such as what the crime was, what was the subject of the crime or attributes that can be connected to the crime and the model would be able to provide a search of combined features or entities through link prediction that would return a possible result or answer that the user was looking for. 

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Michalak/Li in view of Dotan-Cohen (WO 2017/083210 A1).

In regards to claim 12, the method according to claim 1 is taught by Michalak/Li as in the rejection for claim 1 above. Li also teaches the following:
and wherein the link prediction is performed to complete a query
[“In embodiments, based on the predicted subject and relation, a structured query is generated and sent to a KG server. Then, the KG server executes the structure query to obtain the object, i.e., answer to the question. In embodiments, the KG includes data in the format of N-Triples RDF and each RDF triple has the form (subject, relation, object). “    (¶0064)]
	What is not distinctly disclosed by Michalak/Li is the following:
wherein the knowledge graph is based on information taken from a sensor integrated management system,
directed to predicting which of the subjects representing a component of the system have performed a communication for a particular one of the objects at a predetermined time or range of times
However, Dotan-Cohen teaches those limitations as seen below.
wherein the knowledge graph is based on information taken from a sensor integrated management system
[“Data sources 104a and 104b through 104n may comprise data sources and/or data systems, which are configured to make data available to any of the various constituents of operating environment 100, or system 200 described in connection to FIG. 2. (For example, in one embodiment, one or more data sources 104a through 104n provide (or make available for accessing) user data to user-data collection component 210 of FIG. 2.) Data sources 104a and 
	This citation from Dotan-Cohen explicitly discloses that the data can come from various systems, sensors, and integrated devices, all of which can be combined into a sensor integrated management system.
	directed to predicting which of the subjects representing a component of the system have performed a communication for a particular one of the objects at a predetermined time or range of times
[“Data corresponding to user activity may be gathered over time using sensors of one or more of the user's computing devices. From this historical user activity information, a computer system may learn user activity patterns associated with the computing devices. By analyzing a user activity pattern, future user actions may be predicted or user intent may be inferred. “  (¶0005)] 
[“For example, computing devices associated with a user ("user devices") may employ one or more sensors to generate data relevant to a user's activity on a user device(s). The user activity may be monitored, tracked, and used for determining patterns of user activity. Examples of user activity patterns may include, without limitation, activity patterns based on time, location, content, or other context, as described herein “   (¶0004), (¶0005)]
	Next, Dotan-Cohen teaches that the data gathered can be tracked and used to predict or infer actions and intent. This falls in line with the claim’s limitation of analyzing the data to predict if a component of the system had communicated for a particular one of the objects at a 
	Therefore, it would be obvious to one of ordinary skill in the art, prior to the earliest effective filing date, to combine a system and method for inputting vast amounts of data with various attributes with a method and system for gathering that data. It would be obvious to one of ordinary skill in the art, prior to the effective filing date, because doing so would meet the need for the vast amounts of data that the knowledge graph requires in order to reliably build a data set and knowledge base to answer queries, and to train and learn from in order to make accurate predictions. It also meets a need for gathering data directly rather than requiring the data to be sanitized and formatted before being fed in. This would allow for much faster prediction time in certain embodiments and even the ability to create feedback systems with the data and the systems.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Michalak/Li in view of Subramanian (US 2019/0354887 A1).
In reference to claim 9, the method according to claim 1 is taught by Michalak/Li as in the rejection for claim 1 above, with the rest of the claim taught by Subramanian as seen below.
wherein the temporal tokens have a vocabulary size of 32
[“the hardware processor 1304 of FIG. 13) may determine a word embedding similarity 110 (e.g., word2vec, GLOVE, etc.) between each concept of the plurality of concepts 106. In this regard, word embedding may provide for mapping of words or phrases from a vocabulary to vectors of real numbers. “    (¶0039)]
[“For Equation (2) and Equation (3), wci may represent the weight of concept ci in contents Cj, wcj may represent the weight of concept cj in contents Ci, f may represent the term frequency, 
	To explain further, Subramanian teaches a vocabulary for knowledge graphs and shows that they can be mapped to vectors or embeddings so they are analogous to the claim limitation in respect to temporal tokens. Subramanian also teaches that the vocabulary size is adjustable and can be any number represented by the equation above.
	Therefore, it would be obvious to one of ordinary skill in the art, prior to the earliest effective filing date, to utilize a vocabulary size in tandem with a method and system for incorporating data into a knowledge graph. The reason it would be obvious is because of MPEP §2144.05 – Obviousness of Similar and Overlapping Ranges, Amounts, and Proportions [R-10.2019]. Since the claim shows an obvious subrange that falls within the range of the prior art, and was achieved through routine experimentation, a prima facie case of obviousness exists. It is obvious to one of ordinary skill in the art to utilize a size of 32 for the vocabulary of the temporal tokens given the format desired for 10 possible digits for a day, 12 for the month and 10 for the year.
	In the alternative, this limitation can also be seen as a design choice, as there is an endless number of ways to divide the possibilities of representing a date in token form, and it is therefore still considered obvious.
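	The arithmetic behind a vocabulary of 32 temporal tokens (10 day digits, 12 months and 10 year digits) can be sketched as follows. The token spellings and the function name are hypothetical and are not taken from the claims or from Subramanian; this is only one possible way to divide a date into tokens.

```python
from datetime import date

# Hypothetical token spellings: a suffix marks which part of the date a token encodes.
YEAR_DIGITS = [f"{d}y" for d in range(10)]      # 10 tokens
MONTHS = [f"{m:02d}m" for m in range(1, 13)]    # 12 tokens
DAY_DIGITS = [f"{d}d" for d in range(10)]       # 10 tokens
VOCAB = YEAR_DIGITS + MONTHS + DAY_DIGITS       # 10 + 12 + 10 = 32 tokens

def temporal_tokens(when):
    """Split a date into year-digit, month, and day-digit tokens."""
    year = [f"{c}y" for c in f"{when.year:04d}"]
    month = [f"{when.month:02d}m"]
    day = [f"{c}d" for c in f"{when.day:02d}"]
    return year + month + day

tokens = temporal_tokens(date(2021, 12, 14))
```

	Under these assumptions every date from year 0001 to 9999 maps to a sequence of seven tokens drawn from the same 32-token vocabulary.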



THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after 
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 10055450 B1 – Efficient management of temporal knowledge that includes the teachings of triples, time based information within the triples and tokenization.  
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL MERABI whose telephone number is (571)272-9685. The examiner can normally be reached Mon-Fri 7:30am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/M.A.M./Examiner, Art Unit 2123                                                                                                                                                                                                        
/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123