Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claims 1-20 are pending, of which claims 1, 8 and 15 are independent.
The application was filed on 12/5/17 and does not claim any foreign priority or domestic benefit.  This application is currently assigned to International Business Machines Corporation. 

Information Disclosure Statement
The information disclosure statement (IDS) filed on 12/5/17 has been considered.

Oath/ADS
An Application Data Sheet was submitted 12/5/17, and an Oath/declaration was submitted on 12/5/17.
Comments
It is noted that, during examination, a claim must be given its broadest reasonable interpretation consistent with the specification.  Under a broadest reasonable interpretation, words of the claim must be given their plain meaning, unless such meaning is inconsistent with the specification.  M.P.E.P. 2173.01(I).  It is respectfully submitted that each claim is to be interpreted based on the language of the claim itself, so long as that interpretation is consistent with the specification.  Further, "though understanding of the claim language may be aided by explanations contained in the written description, it is important not to import into a claim limitations that are not part of the claim.  For example, a particular embodiment appearing in the written description may not be read into a claim when the claim language is broader than the embodiment."  M.P.E.P. 2111.01(II).  
It is noted that care should be taken such that the claims themselves explicitly recite all the claimed elements relied upon in overcoming the rejections set forth herein.  That is, for any additional limitations discussed in the specification to be considered, the claims should be amended such that the limitations are explicitly recited in the claims themselves.  Appropriate consideration of each and every feature of the claims has been made.  
Applicants’ representative is welcome and encouraged to contact the examiner (Maryam Ipakchi) to discuss the application in an attempt to expedite prosecution.  The examiner may be reached by email (Maryam.ipakchi@uspto.gov), provided that written authorization to communicate by email is given.  Any email communication must include written authorization for the USPTO to communicate with the Examiner concerning any subject matter of this application via electronic mail (see MPEP 502.03).  Sample authorization language:  “Recognizing that Internet communications are not secure, I hereby authorize the USPTO to communicate with me concerning any subject matter of this application by electronic mail. I understand that a copy of these communications will be made of record in the application file.”
Interview requests may be made via an Interview Agenda setting forth proposed participants, items to be discussed and proposed interview times (see MPEP 713.01(III.)).  The Interview Agenda may be submitted via the AIR Form (http://www.uspto.gov/patent/uspto-automated-interview-request-air-form.html) and/or faxed to the examiner at (571)270-4237 so that the Examiner may review the materials in advance to provide meaningful discussion in order to advance prosecution.

Computer Program Product
With regard to computer program product independent claim 15, [0020] of Applicants’ originally filed specification is noted, which sets forth:  “A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.”  Thus, as the language disqualifies transitory items, it is understood that the computer program product of claim 15 and dependent claims 16-20 include a tangible non-transitory component and are not software per se.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over US 9710544 to Smith.

Regarding independent claims 1, 8, and 15,  Smith teaches:

(Claim 1) A method for generating a context-aware knowledge base, the method comprising: | (Claim 8) A computer system for generating a context-aware knowledge base, comprising: one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage devices, and program instructions stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, wherein the computer system is capable of performing a method comprising: | (Claim 15) A computer program product for generating a context-aware knowledge base, comprising: one or more computer-readable storage devices and program instructions stored on at least one of the one or more tangible storage devices, the program instructions executable by a processor, the program instructions comprising: (Smith, FIGS. 1, 4, 5; col. 5, line 42 – col. 6, line 9: documents may be excerpted, for example, excluding all but the first and last paragraph of the document, or first and last paragraphs following a heading, as indicated by a markup language of the document. In some embodiments, documents may be excerpted by crawling a document object model and extracting unstructured text based on the location and context of the unstructured text within the document object model, for example, text within a bracketed set of tags indicating a title or body of an article.)

extracting document object model (DOM) tag elements associated with one or more webpages; identifying and extracting webpage data associated with the extracted DOM tags (Smith, FIGS. 1, 4, 5; col. 5, line 42 – col. 6, line 9 documents may be excerpted, for example, excluding all but the first and last paragraph of the document, or first and last paragraphs following a heading, as indicated by a markup language of the document. In some embodiments, documents may be excerpted by crawling a document object model and extracting unstructured text based on the location and context of the unstructured text within the document object model, for example, text within a bracketed set of tags indicating a title or body of an article.);

determining a context associated with the identified and extracted webpage data by detecting and extracting resource description framework (RDF) triplets in candidate DOM tag elements; (Smith, FIGS. 1, 4, 5; col. 5, line 42 – col. 6, line 9 documents may be excerpted, for example, excluding all but the first and last paragraph of the document, or first and last paragraphs following a heading, as indicated by a markup language of the document. In some embodiments, documents may be excerpted by crawling a document object model and extracting unstructured text based on the location and context of the unstructured text within the document object model, for example, text within a bracketed set of tags indicating a title or body of an article. In other examples, the external dataset may be or include structured data, for example, data in a relational database having a plurality of fields of information about given key values, like business names, product names, entity names, and the like, and the external dataset may be a collection of responses to queries corresponding to the key values. triples in a resource description framework (RDF) format,)

ranking the extracted RDF triplets (Smith, FIGS. 1, 4, 5; col. 5, line 42 – col. 6, line 9; col. 7, line 40 – col. 8, line 15: in some cases, the vectors may be appended to one another in the same order as documents are listed across rows or columns in the adjacency matrix (e.g., as tuples) to facilitate linear algebra operations and conserve memory over systems that label these values independent of sequence; examiner notes: rank based on what? Rank vs. order?);

validating one or more RDF triplets associated with the ranked RDF triplets (Smith, FIGS. 1, 4, 5; col. 5, line 42 – col. 6, line 9; col. 7, line 40 – col. 8, line 15; col. 10, lines 18-30: edge weights for respective pairs of the second-graph nodes (e.g., a plurality of nodes of the second graph) may be determined. In some embodiments, for each pair of the second-graph nodes, a respective edge weight (indicating similarity, or other relationship, between a first attribute corresponding to a first node of the respective pair and a second attribute corresponding to a second node of the respective pair) may be determined, e.g., in accordance with one or more of steps 108-114; examiner notes: validate based on what/how?); and

connecting the validated RDF triplets to a knowledge graph associated with a knowledge base of the one or more webpages (Smith, FIGS. 1, 4, 5; col. 5, line 42 – col. 6, line 9; col. 7, line 40 – col. 8, line 15; col. 10, lines 18-65: embodiments may probabilistically walk the first graph and measure the probability of traveling from a node associated with the first attribute to a node associated with a second attribute. For instance, some embodiments may determine the probability of randomly walking in a document similarity graph from a document that mentions a first person to another document that mentions another person. Higher probabilities are expected to indicate a similarity relationship between the two people, or other attributes; examiner notes: ‘associated’?).

Smith pertains to various features of derivative graphs and, more specifically, to pivoting from a graph of semantic similarity of documents to a derivative graph of relationships between entities mentioned in the documents or other features (e.g., other features of unstructured text in the documents) (Smith, col. 1, lines 10-20). It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, with the teachings of the various exemplary embodiments before them to build upon and combine features described with regard to different embodiments of Smith together as an exemplary embodiment in order to better draw inferences and group documents according to preferences and needs.

Regarding dependent claims 2, 9 and 16,  Smith teaches:

2. The method of claim 1, wherein extracting the DOM tag elements associated with the one or more webpages further comprises: determining a relationship between the extracted DOM tag elements. (Smith, FIGS. 1, 4, 5; col. 5, line 42 – col. 6, line 9, col. 7, lines 40 – col. 8, line 15 some cases, col. 10, lines 18-65, col. 19, line 51; col. 23, line 37 – col. 23, line 65  variety of types of relationships may be processed with some embodiments. For instance, semantic similarity or relatedness of entities mentioned in documents, sentiments expressed in documents, or terminology in documents may be determined with computational natural language processing of unstructured plain text corpora. In some embodiments, a corresponding graph may be constructed, with documents, paragraphs, entities, sentiments, or terms as nodes, and weighted edges indicating relationships, like similarity, relatedness, 

Regarding dependent claims 3 and 10,  Smith teaches:

3. The method of claim 1, wherein identifying and extracting the webpage data associated with the extracted DOM tags further comprises: extracting text associated with the extracted DOM tag elements. (Smith, FIGS. 1, 4, 5; col. 5, line 42 – col. 6, line 9; col. 6, lines 38-63: when evaluating the quality of a connection between documents in the internal dataset indicated by the graph (taken as input for the processes 100, 200, or 300), the corresponding subsets of information from the external dataset may be retrieved and serve as the external dataset for purposes of subsequent steps. This correspondence may be determined before subsequent processes (e.g., by extracting entities and searching for every document in an analyzed corpus) or after subsequent processes in different embodiments (e.g., by searching within an external dataset based on the below-described adjacent nodes identified during evaluation of graph quality after nodes are identified as adjacent); col. 7, line 40 – col. 8, line 15; col. 10, lines 18-65; col. 19, line 51; col. 24, line 50 – col. 25, line 14; examiner notes: ‘associated’?).

Regarding dependent claims 4, 11 and 17,  Smith teaches:

4. The method of claim 2, wherein determining the context associated with the identified webpage data further comprises: detecting and extracting the RDF triplets in the candidate DOM tag elements based on an order associated with the determined relationship. (Smith, FIGS. 1, 4, 5; col. 5, line 42 – col. 6, line 9 documents may be excerpted, for example, excluding all but the first and last paragraph of the document, or first and last paragraphs following a heading, as indicated by a markup language of the document. In some embodiments, documents may be excerpted by crawling a document object model and extracting unstructured text based on the location and context of the unstructured text within the document object model, for example, text within a bracketed set of tags indicating a title or body of an article. In other examples, the external dataset may be or include structured data, for example, data in a relational database having a plurality of fields of information about given key values, like business names, product names, entity names, and the like, and the external dataset may be a collection of responses to queries corresponding to the key values. In another example, the external dataset may be triples in a resource description framework (RDF) format,), col. 7, lines 40 – col. 8, line 15 some cases, col. 10, lines 18-65, col. 19, line 51; col. 23, line 37 – col. 23, line 65  variety of types of relationships may be processed with some embodiments. For instance, semantic similarity or relatedness of entities mentioned in documents, sentiments expressed in documents, or terminology in documents may be determined with computational natural language processing of unstructured plain text corpora. 
In some embodiments, a corresponding graph may be constructed, with documents, paragraphs, entities, sentiments, or terms as nodes, and weighted edges indicating relationships, like similarity, relatedness, species-genus relationships, synonym relationships, possession relationships, relationships in which one node acts on another node, relationships in which one node is an attribute of another, and the like; col. 24, line 50  -- col. 25, line 14 As adjacent n-grams are encountered during 


Regarding dependent claims 5, 12 and 18,  Smith teaches:

5. The method of claim 1, wherein ranking the extracted RDF triplets further comprises: determining a confidence score for the extracted RDF triplets, wherein the confidence score represents a level of connection between an extracted subject and an extracted object associated with the extracted RDF triplets. (Smith, FIGS. 1, 4, 5; col. 5, line 42 – col. 6, line 9: documents may be excerpted, for example, excluding all but the first and last paragraph of the document, or first and last paragraphs following a heading, as indicated by a markup language of the document. In some embodiments, documents may be excerpted by crawling a document object model and extracting unstructured text based on the location and context of the unstructured text within the document object model, for example, text within a bracketed set of tags indicating a title or body of an article. In other examples, the external dataset may be or include structured data, for example, data in a relational database having a plurality of fields of information about given key values, like business names, product names, entity names, and the like, and the external dataset may be a collection of responses to queries corresponding to the key values. In another example, the external dataset may be triples in a resource description framework (RDF) format; col. 7, line 40 – col. 8, line 15; col. 10, lines 18-65: some embodiments may probabilistically walk the first graph and measure the probability of traveling from a node associated with the first attribute to a node associated with a second attribute. For instance, some embodiments may determine the probability of randomly walking in a document similarity graph from a document that mentions a first person to another document that mentions another person. Higher probabilities are expected to indicate a similarity relationship between the two people, or other attributes; col. 19, line 51; col. 24, line 50 – col. 25, line 14; examiner notes: ‘associated’ based on what? Confidence of what? How is level of connection related to confidence?)

Regarding dependent claims 6, 13 and 19,  Smith teaches:

6. The method of claim 1, wherein validating the one or more RDF triplets associated with the ranked RDF triplets further comprises: generating and setting one or more threshold confidence scores; and enabling a user to edit and validate the one or more RDF triplets associated with the ranked RDF triplets. (Smith, FIGS. 1, 4, 5; col. 5, line 42 – col. 6, line 9 documents may be excerpted, for example, excluding all but the first and last paragraph of the document, or first and last paragraphs following a heading, as indicated by a markup language of the document. In some embodiments, documents may be excerpted by crawling a document object model and extracting unstructured text based on the location and context of the unstructured text within the document object model, for example, text within a bracketed set of tags indicating a title or body of an article. In other examples, the external dataset may be or include structured data, for example, data in a relational database having a plurality of fields of information about given key values, like business names, product names, entity names, and the like, and the external dataset may be a collection of responses to queries corresponding to the key values. In another example, the external dataset may be triples in a resource description framework (RDF) format,);  col. 6, lines 38-60  In some cases, when evaluating the quality of a connection between documents in the internal dataset indicated by the graph (taken as input for the processes 100, 200, or 

Regarding dependent claims 7, 14 and 20,  Smith teaches:

7. The method of claim 1, further comprising: tracking changes to the validated RDF triplets. (Smith, FIGS. 1, 4, 5; col. 5, line 42 – col. 6, line 9: documents may be excerpted, for example, excluding all but the first and last paragraph of the document, or first and last paragraphs following a heading, as indicated by a markup language of the document. In some embodiments, documents may be excerpted by crawling a document object model and extracting unstructured text based on the location and context of the unstructured text within the document object model, for example, text within a bracketed set of tags indicating a title or body of an article. In other examples, the external dataset may be or include structured data, for example, data in a relational database having a plurality of fields of information about given key values, like business names, product names, entity names, and the like, and the external dataset may be a collection of responses to queries corresponding to the key values. In another example, the external dataset may be triples in a resource description framework (RDF) format; col. 7, line 40 – col. 8, line 15; col. 10, lines 18-65; col. 19, line 51; col. 24, line 50 – col. 25, line 14: as adjacent n-grams are encountered during parsing, corresponding rows or columns of n-grams in the co-occurrence matrix may be updated by summing current values of the row or column with corresponding values of the adjacent n-gram vector. Similarity of n-grams (and corresponding entities) may be determined based on similarity of resulting vectors in the co-occurrence matrix, e.g., based on cosine similarity; examiner notes: validated based on what? Tracking what changes?)



Double Patenting
Applicant appears to have multiple co-pending related applications.  Applicant should take caution to ensure that related applications do not include claims of identical scope or of obvious variants thereof.  In view of this notice to the Applicant and Applicant’s own superior knowledge of pending and/or issued related applications, the examiner retains the ability to issue a double patenting rejection in a Final rejection if appropriate without establishing a new grounds of rejection.




Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.  For the prior art applied to the claims, as set forth above, the Examiner has cited particular columns and line numbers (or paragraphs) in the references for the convenience of the applicant.  Although the specified citations are representative of the teachings of the art and are applied to specific limitations within the individual claim, other passages and figures may apply as well. More particularly, e.g., in the instances where the Examiner has identified Figures of the applied prior art reference, it is understood that the corresponding portions of the written description describing the identified Figures are relied upon.  The Applicant is respectfully requested, in preparing responses, to fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or identified by the Examiner. The entire reference(s) is/are to be considered to provide disclosure relating to the claimed invention.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARYAM M IPAKCHI whose telephone number is (571)270-3237.  The examiner can normally be reached on M-F Flex 6-3pm (AltFriOff).  Any interview requests should be made via an Interview Agenda setting forth proposed participants, items to be discussed and proposed interview times (see MPEP 713.01(III.)).  The Interview Agenda may be submitted via the AIR Form (http://www.uspto.gov/patent/uspto-automated-interview-request-air-form.html) and/or faxed to the examiner at (571)270-4237 so that the Examiner may review the materials in advance to provide meaningful discussion in order to advance prosecution.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar, can be reached on (571)270-3169.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MARYAM M IPAKCHI/               Primary Examiner, Art Unit 2171