Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 02/03/2021 has been entered.

Status of Claims
This action is in reply to the amendments and remarks filed on 02/03/2021.
Claims 1-20 are pending.
Claims 1, 8, and 15 have been amended.

Response to Arguments
Applicant’s arguments with respect to the rejections of claims 1, 8, and 15 under 35 U.S.C. 103 have been considered, but they are not persuasive. More specifically, the applicant argues that no art of record teaches the limitations of amended claims 1, 8, and 15, and further that Michalak’s “annotations and tags are based not only on relationships identified from a natural language scan, but 
As a first matter, for all high-level arguments that the prior art does not teach the newly added claim limitations, see the 35 U.S.C. 103 section for the full mapping of the claim limitations necessitated by applicant’s amendments.
Next, Applicant’s spec paragraph 0040 states “[a]nnotations can take the form of tags, tokens, or any other solution for annotating a document that is now known or later developed”. Further, applicant’s specification does not explicitly define the term “automatically”, so the Examiner will interpret this to be not requiring human interaction for completing the claimed operations. Further, Applicant does not clarify what is specifically meant by the claimed “machine learning annotations”, and thus this can be interpreted broadly as long as the annotations are applied to a machine learning technique.
Michalak, Col. 5, lines 28-56, Col. 6, lines 9-25, Col. 6, line 47-Col. 7, line 42, Col. 13, lines 22-38, and Col. 15, lines 47-64 teach “reading messages and enriching them with semantic annotations” through system operations (automatically annotating the unstructured documents with machine learning model annotations), and further assigning “grammatical part of speech (POS) tag[s]” and categories to tokenized text chunks based on a “library (e.g., lexicon)”, NLP of the ingested data, “cojoin[ing]” determinations of searched entity relationships (based on both the attributes retrieved from the structured data and the relationships identified from the natural language scan). It is further taught that “a machine-learning training algorithm” creates a machine learning model by utilizing the “annotated data (machine learning model annotations)” sections of the text in document form (to form a set of machine learning model documents), and that the algorithms can be trained with “supervised” data to thus create “annotated data” on unseen data (that have a format and types of information that simulate manually generated machine learning model annotated documents). 
See the 35 U.S.C. 103 section for the full mapping of the claim limitations necessitated by applicant’s amendments.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 3-8, 10-15, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Michalak et al (US Patent 9535902) hereinafter Michalak, in view of Deshpande et al (US Pub 20130325881) hereinafter Deshpande.
Regarding claims 1, 8, and 15, Michalak teaches a method, a system, and a computer program product for creating an artificial intelligence machine learning model, comprising: 
a memory medium comprising instructions; a bus coupled to the memory medium; and a processor coupled to the bus that when executing the instructions; a computer readable storage media, and program instructions stored on the computer readable storage media, that cause the system/at least one computer device to (Col. 1, line 41-Col. 2, line 31, Col. 27, line 11-Col. 28, line 52, and Fig. 11 teach a method, system with “one or more processors”, and “computer-readable medium” connected to a bus with stored executable “instructions” to perform the embodiments of the disclosure):
selecting a set of unstructured documents stored in an unstructured format in an intelligence database (Examiner note: Applicant’s spec paragraph 0033 states unstructured documents being “in a text-based format”.
Michalak, Col. 1, lines 41-58 and Col. 6, line 63-Col. 7, line 15 teach obtaining “unstructured data” as “text documents” (selecting a set of unstructured documents stored in an unstructured format) and it is further taught these can be stored in a knowledge base (intelligence database), where the knowledge base is further taught as being an “intelligent data caching”), each unstructured document of the set of unstructured documents being archived data that has been retained after a set of records of structured data has been created in a relational database based on the unstructured document (Michalak, Col. 1, lines 41-58 and Col. 6, line 63-Col. 7, line 15 teach “unstructured data” as “text documents” being stored in knowledge base further taught as being an “intelligent data caching” (each unstructured document of the set of unstructured documents being archived data that has been retained), and “obtaining structured data including predefined (after a set of records of structured data has been created) attributes associated with the entities (based on the unstructured document)”. It is further taught in Col. 6, line 63-Col. 7, line 42, Col. 12, lines 38-53, and Col. 13, line 62-Col. 14, line 10 that the unstructured/structured data can be in document form, and stored in a knowledge base (intelligence database) including an “intelligent data caching” and “an annotated message store” (structured data has been created in a relational database) locally, where these can further include “text content of messages…and metadata such as properties, relationships and events” such as a “Knowledge Graph” or “hashing column” (in a relational database)); 
retrieving attributes associated with a set of entities in the set of unstructured documents from the structured data that has previously been created based on the unstructured documents and stored separately within the intelligence database by searching the relational database for each entity of the set of entities and returning relational attributes from the relational database that are structurally related to the entity within the relational database,  (Col. 1, lines 41-58 and Fig. 9 teach “obtaining unstructured text data including a plurality of references corresponding to entities (set of entities in the set of unstructured documents), and determining (retrieving), from the unstructured text data, attributes associated with the entities”; and further “obtaining structured data including predefined (previously been created) attributes (for each entity of the set of entities) associated with the entities (set of entities in the set of unstructured documents/based on the unstructured documents)”. It is further taught in Col. 5, lines 3-4, Col. 5, line 64-Col. 6, line 8, Col. 6, line 63-Col. 7, line 52, Col. 12, lines 18-53, and Col. 13, line 62-Col. 14, line 10 that the structured data “may refer to attribute-value pairs and relationships with pre-defined meaning (for each entity of the set of entities)”, and the unstructured/structured data can be in document form, and stored in a knowledge base (intelligence database) including an “intelligent data caching” and “an annotated message store” (relational database) locally, where these can further include “text content of messages…and metadata such as properties, relationships and events” such as a “Knowledge Graph” or “hashing column” (structured data stored separately) that entities can be searched for (by searching the relational database) to provide attribute “relationships defined in the data” (and returning relational ; 
performing a natural language scan of the unstructured documents to identify relationships between the entities (Col. 6, lines 9-25 and Col. 6, line 63-Col. 7, line 15 teach NLP (performing a natural language scan) of the ingested data including unstructured data (unstructured documents); where the obtained “unstructured data” as “text documents” can be stored in an intelligent cached knowledge base. Col. 13, lines 22-44 and Col. 14, lines 18-27 further teach utilizing natural language processing in order to determine (identify) if different text can “be cojoined together if they describe the same concept (relationships between the entities)” and which “predefined category” different texts belong (relationships between the entities). Additionally, NLP is used in order to determine (identify) if different extracted instances of entities refer to the same entity (for example: “‘John’ and ‘Smith’”).); 
automatically annotating the unstructured documents with machine learning model annotations based on both the attributes retrieved from the structured data and the relationships identified from the natural language scan to form a set of machine learning model documents that have a format and types of information that simulate manually generated machine learning model annotated documents (Examiner note: Applicant’s spec paragraph 0040 states “[a]nnotations can take the form of tags, tokens, or any other solution for annotating a document that is now known or later developed”. Further, applicant’s specification does not explicitly define the term “automatically”, so the Examiner will interpret this to be not requiring human interaction for completing the claimed operations.
Michalak, at the passages cited in the Response to Arguments section above, teaches “reading messages and enriching them with semantic annotations” through system operations (automatically annotating the unstructured documents with machine learning model annotations), and further assigning “grammatical part of speech (POS) tag[s]” and categories to tokenized text chunks based on a “library (e.g., lexicon)”, NLP of the ingested data, “cojoin[ing]” determinations of searched entity relationships (based on both the attributes retrieved from the structured data and the relationships identified from the natural language scan). It is further taught that “a machine-learning training algorithm” creates a machine learning model by utilizing the “annotated data (machine learning model annotations)” sections of the text in document form (to form a set of machine learning model documents), and that the algorithms can be trained with “supervised” data to thus create “annotated data” on unseen data (that have a format and types of information that simulate manually generated machine learning model annotated documents)); and 
forming the machine learning model based on the annotated documents (Col. 5, lines 51-56 and Col. 6, lines 47-62 teach “passing annotated data (based on the annotated documents) to a machine-learning training algorithms that creates (forms) an appropriate model (machine learning model)”; and also training (forming) a model (machine learning) after annotation).
However Michalak does not explicitly teach wherein the searching of the relational database includes a fuzzy logic search.
Deshpande teaches wherein the searching of the relational database includes a fuzzy logic search (paragraphs 0006, 0032, 0035, 0038, 0041, 0055, and 
Further, Michalak at least implies selecting a set of unstructured documents stored in an intelligence database (see mapping above); and retrieving attributes…from the structured data that has previously been created based on the unstructured documents and stored separately within the intelligence database (see mapping above), however Deshpande teaches selecting a set of unstructured documents stored in an intelligence database; retrieving attributes…from the structured data that has previously been created based on the unstructured documents and stored separately within the intelligence database (paragraphs 0006, 0035, 0038, and 0088 teach unstructured and structured data retrieved from the data system and that “unstructured data may…be stored along with the structured data…in a database (stored separately within the intelligence database)”, where the system “analyzes a document with unstructured data and extracts attribute values from the unstructured data for one or more entities of the data system associated with structured information (structured data that has previously been  
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to incorporate Deshpande’s teachings of storing unstructured and structured data separately in the same database and using fuzzy matching into Michalak’s teaching of creating machine learning models by annotating unstructured documents using structured data, in order to optimize data recall and storage efficiency (Deshpande, paragraphs 0006, 0032, 0035, 0038, 0041, 0055, and 0088).
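For illustration only, and not as a characterization of how Michalak or Deshpande implement their systems, the combined lookup discussed above (searching a relational store for each entity with a fuzzy match, then annotating the document with the returned attributes) can be sketched as follows. The table name, column names, sample rows, and the 0.6 similarity cutoff are all hypothetical, and Python's standard difflib module merely stands in for whatever fuzzy logic search an actual system would use.

```python
import difflib
import sqlite3


def fetch_attributes(conn, entity, cutoff=0.6):
    """Return attribute rows for the stored entity name closest to `entity`."""
    names = [row[0] for row in conn.execute("SELECT DISTINCT entity FROM attributes")]
    # Fuzzy step: tolerate spelling variation between document text and stored keys.
    matches = difflib.get_close_matches(entity, names, n=1, cutoff=cutoff)
    if not matches:
        return []
    return conn.execute(
        "SELECT entity, name, value FROM attributes WHERE entity = ? ORDER BY name",
        (matches[0],),
    ).fetchall()


def annotate(text, entities, conn):
    """Attach the retrieved attributes to each entity mention found in `text`."""
    annotations = []
    for entity in entities:
        start = text.find(entity)
        if start == -1:
            continue  # entity is not mentioned in this document
        attrs = {name: value for _, name, value in fetch_attributes(conn, entity)}
        annotations.append({"entity": entity, "start": start, "attributes": attrs})
    return annotations


# Hypothetical structured records that were created earlier from the documents.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attributes (entity TEXT, name TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO attributes VALUES (?, ?, ?)",
    [("John Smith", "title", "Director"), ("John Smith", "employer", "Acme Corp")],
)

# The document spells the name "Jon Smith"; the fuzzy match still finds the record.
result = annotate("Jon Smith signed the lease.", ["Jon Smith"], conn)
```

In this sketch an exact-match query would return nothing for the misspelled name, which is the recall benefit that the stated motivation to combine relies on.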

Regarding claims 3 and 10, the combination of Michalak and Deshpande teaches all the claim limitations of claims 1, 8, and 15 above; and further teaches the attributes retrieved from the intelligence database include attribute names for the entities in the structured data (Michalak, Col. 4, lines 24-31 and Col. 13, lines 39-44 teach using “[a] library (e.g., lexicon) of predefined categories (attribute names)” to “known entities and their respective category types”. Col. 1, lines 41-58 and Fig. 9 additionally teach “obtaining structured data including predefined attributes associated with the entities (entities in the set of structured data)”. It is further taught in Col. 6, line 63-Col. 7, line 42, Col. 12, lines 38-53, and Col. 13, line 62-Col. 14, line 10 that the unstructured/structured data can be in document form, and stored in a knowledge base (intelligence database) including an “intelligent data caching” and “an annotated message store” locally.).

Regarding claims 4, 11, and 17, the combination of Michalak and Deshpande teaches all the claim limitations of claims 3 and 10 above; and further teaches the attributes further include an entity to which an entity belongs, an attribute type, a relationship to a document, a semantic of an attribute, a semantic of the entity, and a value of an attribute (Michalak, Col. 12, line 38-Col. 13, line 10, Col. 7, lines 16-52, and Col. 20, lines 51-65 teach entities can be classified as a person, organizations, etc. (an entity to which an entity belongs) and entity attributes further include columns of “aliases, date of birth and death, places of residence, organization memberships, titles, spouses, siblings, and/or children (all above listed being an attribute type, a semantic of an attribute, a semantic of the entity)…number of times the concept is mentioned on any given date (value of an attribute)…the name of the document that reported the information (relationship to a document), and the total number of documents that made the same statement (relationship to a document/value of an attribute)”).

Regarding claims 5, 12, and 18, the combination of Michalak and Deshpande teaches all the claim limitations of claims 1, 8, and 15 above; and further teaches the identifying of the relationship further comprises analyzing a set of words in an unstructured document that connect a first entity and a second entity within the unstructured document (Michalak, Col. 14, lines 45-61 teach local analysis of unstructured documents forming “coreference chain (connecting)” for entities (first and second entities) in one document (within the unstructured document) and aggregating context forming a “chain signature” for entities (first and second entities) to compare , and 
wherein the annotating further comprises documenting the relationship in a first token associated with the first entity and in a second token associated with a second entity (Examiner note: Applicant’s spec paragraph 0040 states “[a]nnotations can take the form of tags, tokens, or any other solution for annotating a document that is now known or later developed”.
Michalak, Col. 5, lines 51-56, Col. 6, lines 47-62, Col. 13, lines 22-44, and Col. 14, lines 45-61 teach “reading messages and enriching them with semantic annotations”, and further assigning (documenting) “grammatical part of speech (POS) tag[s]” and categories to tokenized text chunks (first and second tokens) based on a “library (e.g., lexicon)” of annotations. An example taught, “‘John (first entity)’ and ‘Smith (second entity)’” are tagged as nouns (first and second tokens) and conjoined since they reference the same person in the text. Another example taught is determining that “‘She’, ‘her’, ‘Barbara’, ‘Ms. Streisand’, ‘famous singer’” all refer to the same person in the same document and are all equal to each other.).

Regarding claims 6, 13, and 19, the combination of Michalak and Deshpande teaches all the claim limitations of claims 1, 8, and 15 above; and further teaches training an artificial intelligence using the machine learning model (Michalak, Col. 5, lines 28-56, Col. 6, lines 47-62, and Col. 7, line 66-Col. 8, line 8 teach using “supervised or semi-supervised” machine learning prediction and classification algorithms (AI) with learned (trained) weights and features persisted through models (using the machine learning model), and further continually training the models through back-propagation of results (training the artificial intelligence using the machine learning model)).

Regarding claims 7, 14, and 20, the combination of Michalak and Deshpande teaches all the claim limitations of claims 1, 8, and 15 above; and further teaches parsing, prior to the forming of the machine language model, the annotated documents to remove from a document unannotated portions of the document (Michalak, Col. 5, lines 51-56 teach “a machine-learning training algorithm” utilizing the “annotated data (parsing the annotated documents)” sections of the text in order to create a machine learning model (prior to the forming of the machine language model); therefore, this is interpreted as the system not using the unannotated sections (removing from a document unannotated portions of the document) for model creation).
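For illustration only, the interpretation applied above (only annotated sections reach model creation, so unannotated portions are effectively removed) can be sketched as a filtering step that runs before training. The document structure, field names, and sample sentences below are hypothetical and are not drawn from Michalak.

```python
def keep_annotated(document):
    """Return only the sentences that carry at least one annotation."""
    return [sentence for sentence in document if sentence["annotations"]]


# Hypothetical annotated document: one sentence has an annotation, one does not.
doc = [
    {
        "text": "John Smith joined Acme Corp.",
        "annotations": [{"entity": "John Smith", "type": "person"}],
    },
    {"text": "The weather was mild.", "annotations": []},
]

# Only the annotated sentence is passed on to model creation.
training_input = keep_annotated(doc)
```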

Claims 2, 9, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Michalak et al (US Patent 9535902) hereinafter Michalak, in view of Deshpande et al (US Pub 20130325881) hereinafter Deshpande, and further in view of Feldman et al (US Pub 20060253273), hereinafter Feldman.
Regarding claims 2, 9, and 16, the combination of Michalak and Deshpande teaches all the claim limitations of claims 1, 8, and 15 above; and further teaches forwarding the unstructured documents to an external tokenizer (Examiner note: Applicant’s spec, paragraph 0036 lists possible types of “external tokenizer[s]” but nowhere does the applicant explicitly explain how the tokenizer is “external”.
Michalak, Col. and Col. 13, lines 22-38 teach reading the received (forwarded) unstructured documents for “tokenization” by “breaking up the text into ‘tokens’” in the “NLP process” (tokenizer). Col. 1, line 59-Col. 2, line 12 and claim 13 further teach that the instructions can be executed by “one or more processors”; thus, due to the broadness of the claim language and as is well known in the art, different processes (such as tokenizing) can be delegated to separate processors (external tokenizer) in order to improve processing efficiency.); 
retrieving, from the external tokenizer, a set of extracted words that are nouns from the unstructured documents (Michalak, Col. 13, lines 22-44 teach assigning a known “grammatical part of speech (POS) tag (e.g., proper noun, adjective, adverb)” and “predefined category” to text that has been broken up, such as “‘John’ and ‘Smith’” (extracted words that are nouns from the unstructured documents), in the “NLP process” (tokenizer). Col. 1, line 59-Col. 2, line 12 and claim 13 further teach that the instructions can be executed by “one or more processors”; thus, due to the broadness of the claim language and as is well known in the art, different processes (such as tokenizing) can be delegated to separate processors (external tokenizer) in order to improve processing efficiency.); and 
designating the set of extracted words as the set of entities (Michalak, Col. 13, lines 22-44 teach the “NLP process” of tokenizing the documents into broken up text (extracted words), assigning grammatical tags, and conjoining adjacent tokens “creates (designates) the elements (or entities) that can be used in downstream analytics.”).
Michalak at least implies an external tokenizer (see mapping above); however, Feldman teaches an external tokenizer (paragraphs 0028-0029 and 0065 teach using “[e]xternal tokenizers” for “handling different languages, as well as for special domains” within the “unstructured documents” for natural language processing). 
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify the creation of machine learning models by annotating unstructured documents using structured data, as taught by Michalak as modified by storing unstructured and structured data in the same database as taught by Deshpande, to include “external tokenizers” as taught by Feldman in order to increase diversity in language processing capabilities and improve accuracy (Feldman, paragraphs 0028-0029 and 0065).
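For illustration only, the three operations mapped above for claims 2, 9, and 16 (forwarding documents to a tokenizer that sits outside the main routine, retrieving the words it tags as nouns, and designating those words as the set of entities) can be sketched as follows. The noun lexicon, function names, and sample sentence are hypothetical; an actual tokenizer such as those described by Feldman would use a trained part-of-speech tagger rather than a fixed word list.

```python
import re

# Hypothetical noun lexicon standing in for a real part-of-speech tagger.
NOUN_LEXICON = {"john", "smith", "lease", "acme"}


def external_tokenizer(text):
    """Stand-in for a tokenizer outside the main system: returns (token, tag) pairs."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return [(t, "NOUN" if t.lower() in NOUN_LEXICON else "OTHER") for t in tokens]


def designate_entities(documents, tokenizer=external_tokenizer):
    """Forward each document to the tokenizer and collect returned nouns as entities."""
    entities = []
    for doc in documents:
        for token, tag in tokenizer(doc):
            if tag == "NOUN" and token not in entities:
                entities.append(token)
    return entities


entities = designate_entities(["John Smith signed the lease with Acme."])
```

Passing the tokenizer in as a parameter keeps it replaceable, which loosely mirrors the point that a separate tokenizer can be swapped in for different languages or special domains.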

Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Hosokawa et al (US Pub 20170344625) teaches annotating documents with entities, tokenizing documents, and machine learning model techniques.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CLINT MULLINAX whose telephone number is 571-272-3241.  The examiner can normally be reached on Mon - Fri 8:00-4:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov, can be reached on 571-270-3428.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/C.M./Examiner, Art Unit 2123

/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123