DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Drawings
The drawings are objected to because there are duplicate reference numerals in Figure 1.  For each of the two instances identified below, Applicants should change one of the duplicated reference numerals to a different reference numeral.  Applicants should then change each occurrence of the old reference numeral in the Specification to the new reference numeral, so that the Specification is consistent with the amended drawing.
Figure 1 includes two reference numerals 122 for database service 122 and configuration rules 122.
Figure 1 includes two reference numerals 124 for reference hierarchy 124 and results (final set of concept labels) 124.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office Action to avoid abandonment of the application.  Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended.  The figure or figure number of an amended drawing should not be labeled as “amended.”  If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings 

Specification
The disclosure is objected to because of the following informalities:
In ¶[0001], “U.S. Patent Application No.” has a blank space that should be filled in as “16/784,000”, which appears to correspond to Docket No. 058083-1160489.  (two occurrences)
In ¶[0005], “if content labels” should be “of content labels”.
In ¶[0027], “text fragment labeling system 104” should be “text fragment labeling system 1104”.  See Figure 11.
In ¶[0041], “a usefulness scores” should be “a usefulness score”.
In ¶[0041], “that contain at least semantic unit” should be “that contain at least one semantic unit”.   
In ¶[0044], “for the set of document” should be “for the sets of documents”.
In ¶[0050], “UI 114” should be “UI 104”.  See Figure 1.
In ¶[0053], “U.S. Patent Application No.” has a blank space that should be filled in as “16/784,000”, which appears to correspond to Docket No. 058083-1160489.  (three occurrences)

In ¶[0060], “U.S. Patent Application No.” has a blank space that should be filled in as “16/784,000”, which appears to correspond to Docket No. 058083-1160489.  (three occurrences)
In ¶[0064], “a sematic unit” should be “a semantic unit”.
In ¶[0064], it appears that “high certainty/low probability” should be “high certainty/high probability” for strong specificity of a low entropy value, as this corresponds to “low certainty/low probability” for weak specificity for a high entropy value.
In ¶[0081], “reference hierarchy 602” should be “reference hierarchy 124”.  See Figure 1.
In ¶[0083], “reference hierarchy 122” should be “reference hierarchy 124”.  See Figure 1.
In ¶[0084], “reference hierarchy 122” should be “reference hierarchy 124”.  See Figure 1.
In ¶[0085], “knowledge source 122” should be “knowledge source 112”.  See Figure 1.
In ¶[0096], “is more relevant (i.e., is a better” should be “is more relevant to (i.e., is a better”.
In ¶[0103], “FIG. 2” should be “FIG. 12”.
In ¶[0112], a closing double quote is missing in ““the)”.

In ¶[0117], there is already an Equation (2) at ¶[0066], so it appears this should be Equation (16).
In ¶[0118], it appears that “a million concept labels” should be “a hundred thousand concept labels”.  Here, if there are a million concept labels, then the log ratio would be “log(1000000/1000) = 3” instead of “log(100000/1000) = 2”, and then the corresponding product of ‘tf’ and ‘idf’ would be “0.03*3=0.09”.
In ¶[0122], there is already an Equation (3) at ¶[0087], so it appears this should be Equation (17).
In ¶[0131], there is already an Equation (4) at ¶[0088], so it appears this should be Equation (18).
In ¶[0131], it appears that “thus” should be “thus is”.
In ¶[0136], “Higher” should be “The higher”.
In ¶[0142], “text fragment labeling system 104” should be “text fragment labeling system 1104”.  See Figure 11.
In ¶[0143], “are reference and identify” should be “as reference and identify”.
In ¶[0146], there is no step 306 in Figure 12.  
In ¶[0147], there is no step 308 in Figure 3.
In ¶[0147], it appears that “documents 122” should be “documents 1122”.  See Figure 11.
In ¶[0147], it appears that “reference information 120” should be “reference information 1120”.  See Figure 11.

In ¶[0148], “FIG. 2” should be “FIG. 12”.
In ¶[0158], “distinct” should be “are distinct”.
In ¶[0163], “text fragment labeling system 104” should be “text fragment labeling system 1104”.  See Figure 11.
In ¶[0169], “text fragment labeling system 104” should be “text fragment labeling system 1104”.  See Figure 11.
In ¶[0170], “the one or more servers providing” should be “the one or more servers provide”.
In ¶[0173], “that enable” should be “that enables”.
Appropriate correction is required.
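For illustration only, the arithmetic underlying the objection to ¶[0118] can be checked with the short sketch below.  It assumes a base-10 logarithm, a term frequency of 0.03, and a denominator of 1,000 in the log ratio, which are the values implied by the figures quoted above.  The function name tf_idf and the variable names are hypothetical and are not taken from the Specification.

```python
import math


def tf_idf(tf, total, matching):
    """Product of a term frequency and a base-10 log ratio (hypothetical helper)."""
    return tf * math.log10(total / matching)


TF = 0.03        # term frequency implied by "0.03*3=0.09" (assumption)
MATCHING = 1000  # denominator of the log ratio quoted in the objection

# "a million concept labels", as currently written in the paragraph
as_written = tf_idf(TF, 1_000_000, MATCHING)
# "a hundred thousand concept labels", as proposed in the objection
as_proposed = tf_idf(TF, 100_000, MATCHING)

print(round(as_written, 2), round(as_proposed, 2))
```

Under these assumptions, the first value corresponds to a log ratio of 3 and the second to a log ratio of 2, so only the second is consistent with “log(100000/1000) = 2”.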

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 to 4 and 15 to 20 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (U.S. Patent Publication 2015/0293905) in view of Franceschini et al. (U.S. Patent Publication 2016/0012336).
Concerning independent claims 1 and 19 to 20, Wang et al. discloses a method, system, and computer program product for summarization of a document, comprising:
e.g., Google (¶[0012]); concept detection unit 124 may detect a concept(s) in a sentence of a document; a concept is defined as words and phrases that present some semantics of a sentence; concept detection unit 124 can detect concepts in each and every sentence of a document (¶[0014]); here, sentences of a document can be ‘semantic units’ (“a plurality of semantic units”); implicitly, a summary can be generated for one document or for a plurality of documents (“from a plurality of documents”);
 “for each semantic unit in the plurality of semantic units, determining, by the computer system, from a reference set of concept labels, one or more concept labels applicable to the semantic unit” – server 102 also has access to a concept library 130 (“a reference set of concept labels”); concept library 130 can be but not limited to some publicly available concept libraries including Wikipedia, Baidu Baike, BabelNet, etc.; Wikipedia has more than 350 million manually edited concepts and a Wikipedia concept is represented as an article name in Wikipedia (¶[0012]); a concept is detected for each sentence in a document (“for each semantic unit in the plurality of semantic units, determining . . . one or more concept labels applicable to the semantic unit”); a concept in a sentence is detected based on a predefined concept library (“determining . . . from a reference set of concept labels, one or more concept labels applicable to the semantic unit”); said predefined concept library can be but not limited to Wikipedia, and each 
“based on the concept labels determined for the plurality of semantic units, determining, by the computer system, an initial set of concept labels for the plurality of documents” – a term is compared with an article name of a Wikipedia page to see if they match with each other; if they match each other, then this Wikipedia page, i.e., a Wikipedia concept, is a candidate concept for this term (“an initial set of concept labels”) (¶[0016]: Figure 2: Step 201); “On Oct. 31, 1999, a plane carrying 217 mostly Egyptian passengers crashed into the Atlantic Ocean off Massachusetts” is the sentence, and detected concepts include “Atlantic Ocean” and “Massachusetts” (¶[0019] - ¶[0022]); a threshold can be set, and if a weight assigned to a detected concept is less than said threshold, this concept is ignored (¶[0025]); broadly, “Atlantic Ocean” and “Massachusetts” are “an initial set of concept labels”.
Concerning independent claims 1 and 19 to 20, Wang et al. discloses a general idea of extracting concept labels from a plurality of documents using a reference set of concept labels from article titles of Wikipedia.  Broadly, this generates an initial set of concept labels.  Compare, e.g., Specification, ¶[0036].  Additionally, Wang et al. discloses that Wikipedia is used as an ontology.  (¶[0016])  Generally, an ontology is represented as “a reference hierarchy”.  However, Wang et al. does not disclose using a reference hierarchy to identify hierarchical relationships between two or more concept Franceschini et al.
Concerning independent claims 1 and 19 to 20, Franceschini et al. teaches linking text to concepts in a knowledge base.  (Abstract)  A concept graph is a knowledge base in which knowledge is represented with nodes in the graph as concepts and edges in the graph representing known relations between the concepts.  An estimate of how relevant a concept in a concept graph is to another concept in a concept graph is computed.  (¶[0038] - ¶[0039])  A document 102 is input from a corpus of documents and text from document 102 is automatically linked to concepts in a knowledge base 106.  Concepts from the knowledge base that are found in document 102 are referred to as ‘extracted concepts’.  (¶[0046]: Figure 1)  Wikipedia can be used as a knowledge base, and Wikipedia articles can be assigned to nodes in the concept graph.  (¶[0092])  Specifically, Franceschini et al. teaches:
“based on the concept labels determined for the plurality of semantic units, determining, by the computer system, an initial set of concept labels for the plurality of documents” – a text string is received, where the text string can be a collection of words, a sentence, a paragraph, or a whole document (“a plurality of semantic units”) (¶[0087]: Figure 2: Step 202); reference is made to raw data that is extracted from a document and that has a role of picking out mentions of concepts in the text; the raw data extracted is referred to as a priori data about the document; a document D includes concepts c1, c2, c3, . . . , cK (¶[0116] - ¶[0117]); here, an a priori set of concepts c1, c2, c3, . . . , cK is “an initial set of concept labels”;
e.g., all or a subset of a concept graph; this representation can connect document D with concepts not necessarily present in the original description of it, via exploitation of deep conceptual connections as seen in a concept graph; this representation is referred to as including a posteriori inferences; an embodiment includes taking a priori data about a document and then employing a concept graph to improve the conceptual understanding of the document (¶[0116] - ¶[0118]); implicitly, a concept graph is “a reference hierarchy associated with the reference set of concept labels” and a concept graph is “the reference hierarchy identifying hierarchical relationships between two or more concept labels in the reference set of concept labels”; here, Wikipedia provides a concept graph that is “associated with the reference set of concept labels”;

“outputting, by the computer system, information identifying the final set of concept labels for the plurality of documents” – a task of summarizing the most relevant concepts in a document includes extracting concepts from those documents; concepts within the documents can be ranked, and a desired number of them can be chosen for a word concept cloud display; extracted concepts can be displayed in a cloud, or they could be displayed in the context of the text that contains them by rendering the text and the concepts in differentiating ways (¶[0236]).
Concerning independent claims 1 and 19 to 20, Franceschini et al. teaches refining an initial set of concepts extracted from documents as a priori information into a final set of concepts as a posteriori information that accounts for relationships between concepts provided by a concept graph.  Here, a concept graph can be based on Wikipedia, and is “a reference hierarchy”.  An advantage is to provide semantic Wang et al. by using a concept graph that accounts for relationships between concepts to obtain a final set of concepts as taught by Franceschini et al. for a purpose of taking advantage of large volumes of crowd sourced data including Wikipedia and does not rely upon Latent Semantic Analysis.

Concerning claim 2, Wang et al. discloses that concept library 130 can be but not limited to some publicly available concept libraries including Wikipedia, Baidu Baike, BabelNet, etc.; Wikipedia has more than 350 million manually edited concepts and a Wikipedia concept is represented as an article name in Wikipedia (“titles of a plurality of reference documents”) (¶[0012]); said predefined concept library can be but not limited to Wikipedia, and each Wikipedia page is a reference concept (¶[0016]: Figure 2: Step 201); a concept is detected in each sentence of the document based on a predefined concept library including Wikipedia, which includes a number of reference concepts as pages of Wikipedia (“wherein the plurality of reference documents comprise Wikipedia articles”) (¶[0018]: Figure 3: Step 304).
Concerning claim 3, Franceschini et al. teaches that a text string is received, where the text string can be a collection of words, a sentence, a paragraph, or a whole document (“wherein the plurality of semantic units comprise a plurality of paragraphs in the plurality of documents”).  (¶[0087]: Figure 2: Step 202)
Concerning claim 4, Wang et al. discloses that relevance measures between sentences are computed according to reference concepts in a concept library corresponding to detected concepts (“for each concept label in the reference set of concepts, computing . . . a relevance score for the concept label . . . , the relevance score for the concept label indicative of a degree of relevance of the concept label”) (¶[0017]: Figure 2: Step 202); if concepts are detected in a sentence, then respective weights are assigned to the detected concepts; a weight represents a degree of similarity between the detected concept and its corresponding reference concept in the concept library (¶[0018]: Figure 3: Step 307); a threshold can be set for the weights, and if a weight assigned to a detected concept is less than the threshold, this detected concept is ignored (“based on the relevance score computed for the concept labels in the reference set of concept labels . . . selecting . . . the one or more concept labels . . .”) (¶[0025]: Figure 3); once all sentences are processed, relevance measures between the sentences are computed according to detected concepts (¶[0026]: Figure 3: Step 309).  Wang et al., then, expressly discloses a degree of relevance (“a relevance score”) as a weight, and this weight is used to select a set of concepts according to a threshold.  Similarly, Franceschini et al. teaches that a priori data includes concepts c1, c2, c3, . . . , cK extracted from a document and confidence scores s1, s2, s3, . . . , sK constituting what is referred to as a priori information about the document (¶[0117]).  These confidence scores s1, s2, s3, . . . , sK of a priori information, then, are “a relevance score” for concept labels c1, c2, c3, . . . , cK.
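For illustration only, the threshold-based selection attributed to Wang et al. at ¶[0025] can be sketched as follows.  The concept names “Atlantic Ocean” and “Massachusetts” come from the example sentence cited above; the third concept, the weights, the threshold value, and the function name select_concepts are hypothetical and are not taken from the reference.

```python
def select_concepts(weights, threshold):
    """Keep detected concepts whose weight is not less than the threshold."""
    return [concept for concept, w in weights.items() if w >= threshold]


# Hypothetical weights for concepts detected in one sentence.
detected = {
    "Atlantic Ocean": 0.9,
    "Massachusetts": 0.7,
    "Passenger": 0.2,   # hypothetical low-weight concept
}

THRESHOLD = 0.5  # hypothetical threshold value
selected = select_concepts(detected, THRESHOLD)
print(selected)
```

In this sketch, a concept whose weight falls below the threshold is ignored, so that the weight operates as a relevance score used for selection.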
Concerning claim 15, Franceschini et al. teaches that a document may have nonzero scores for concepts that are actually not present in the document, but that are 
Concerning claim 16, Franceschini et al. teaches that information about the connections between all of these concepts in the list comes from the concept graph; incorporation of knowledge in the concept graph may lead to refinement including increasing confidence scores for each concept, or downgrading of confidence scores 
Concerning claim 17, Franceschini et al. teaches an embodiment where one or more concepts can be selected as a query, e.g., a single concept q can be selected as a query (“receiving information identifying selection of a particular concept label from the final set of concept labels”), computing a score that is assigned to every document for the query, and returning documents in the order implied by that score (“responsive to receiving, outputting information identifying all documents from the plurality of documents that include at least one semantic unit for which the particular concept label is identified as being applicable”).  (¶[0153])  That is, a user enters a concept, and documents corresponding to that concept are returned in a manner similar to a standard search query.
Concerning claim 18, Franceschini et al. teaches an embodiment where a document is received (“receiving information identifying selection of a particular document from the plurality of documents”), and concepts are extracted from the document (“responsive to the receiving outputting information identifying all concept labels in the final set of concept labels applicable to the particular document”) (¶[0152]).  That is, a user enters a document, and a procedure is performed of extracting and outputting concepts from a priori and a posteriori information. 
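For illustration only, the two selections recited in claims 17 and 18 can be sketched as lookups over a labeling result.  The data layout, the document identifiers, and the function names below are hypothetical; the sketch does not attempt the score-based ordering of ¶[0153] and is not asserted to reflect the implementation of either the reference or the Specification.

```python
from collections import defaultdict

# Hypothetical labeling result: document -> list of label sets, one per semantic unit.
labels_by_doc = {
    "doc_a": [{"Atlantic Ocean"}, {"Massachusetts", "Atlantic Ocean"}],
    "doc_b": [{"Massachusetts"}],
}


def build_index(labels_by_doc):
    """Invert the mapping so that each concept label points to its documents."""
    index = defaultdict(set)
    for doc, units in labels_by_doc.items():
        for unit_labels in units:
            for label in unit_labels:
                index[label].add(doc)
    return index


def documents_for(label, index):
    """Claim 17 style lookup: documents with at least one unit carrying the label."""
    return sorted(index.get(label, set()))


def labels_for(doc, labels_by_doc):
    """Claim 18 style lookup: all labels applicable to the selected document."""
    return sorted(set().union(*labels_by_doc.get(doc, [])))


index = build_index(labels_by_doc)
print(documents_for("Massachusetts", index))
print(labels_for("doc_a", labels_by_doc))
```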


Allowable Subject Matter
Claims 5 to 14 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Concerning claims 5 to 10, the prior art of record does not appear to disclose or reasonably suggest computing an entropy value for a semantic unit that indicates a degree of specificity of a concept label, and ordering the semantic units based on the entropy values, where the ordered list of semantic units is then used to determine an initial set of concept labels.
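For illustration only, the general relationship between an entropy value and specificity can be sketched as follows.  The sketch assumes a Shannon entropy computed over a hypothetical distribution of relevance across concept labels for each semantic unit; the Specification's own entropy equation may differ, and the unit names, probabilities, and function name are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical distributions of relevance over four concept labels.
peaked = [0.97, 0.01, 0.01, 0.01]  # one label dominates: strong specificity
flat = [0.25, 0.25, 0.25, 0.25]    # no label stands out: weak specificity

units = {"unit_1": flat, "unit_2": peaked}

# Order the semantic units from lowest entropy to highest entropy.
ordered = sorted(units, key=lambda u: entropy(units[u]))
print(ordered)
```

Under this assumption, the semantic unit with the peaked distribution has the lower entropy value and is ordered first, which is consistent with a low entropy value corresponding to strong specificity.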
Concerning claims 11 to 14, the prior art of record does not appear to disclose or reasonably suggest updating a graph to add nodes corresponding to a set of ancestor concept labels, where the updating comprises adding connections to a graph to represent hierarchical relationships between the nodes representing the set of ancestor concept labels and the nodes representing the concept labels in the initial set of concept labels.  Even if it were obvious that a concept graph is a directed acyclic graph (DAG) in Franceschini et al., this reference does not clearly teach updating a graph by adding connections representing a set of ancestor nodes.

Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicants’ disclosure.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARTIN LERNER whose telephone number is (571) 272-7608. The examiner can normally be reached Monday-Thursday 8:30 AM-6:00 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached on (571) 272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center.  Unpublished application information in Patent Center is available to registered users.   To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.  Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MARTIN LERNER/Primary Examiner
Art Unit 2657
November 8, 2021