DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 10 and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 10 recites the limitation "…wherein information gain is determined by determining a probability that the answer provided by the user…" in the first limitation.  There is insufficient antecedent basis for this limitation in the claim. There is no recitation of “an answer” earlier in claim 10, in parent claim 8, which claim 10 depends from, or in claim 1, which claim 8 depends from.
Claim 20 corresponds to claim 10 and is rejected accordingly.
Claim 1 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
The claim 1 limitations “a labelling engine to apply to each datum within the corpus…a clarification engine to: generate a decision tree…” invoke 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. Claim 1 recites the generic placeholder “engine” coupled with the functional language “to apply to each datum within the corpus a label corresponding to each one of a plurality of predetermined indicator variables” and “generate a decision tree using the set of search results…prune the decision tree in response to a question posed to a user…” without reciting sufficient structure to achieve the function. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. Therefore, the claim is indefinite and is rejected under 35 U.S.C. 112(b) or pre-AIA 35 U.S.C. 112, second paragraph. Claims 2-9 depend from claim 1, include all the limitations of claim 1, and are rejected accordingly.
Applicant may:
(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph; 
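(b)        Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(c)        Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).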
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: 
(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(b)        Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.
Claim Objections
Claims 8 and 18 are objected to because of the following informalities:
Grammatical oversight – claim 8 recites “…wherein maximizing information gain comprises…split the search results into subsets according its value to produce a node…”
Claim 18 corresponds to claim 8 and is objected to accordingly.
Appropriate correction is required.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-5, 8, 9, 11-15, 18 and 19 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Shah et al. (Pub. No. US 2018/0218374 A1, hereinafter “Shah”). 
Regarding claim 1, Shah teaches:
a labelling engine to apply to each datum within the corpus a label corresponding to each one of a plurality of predetermined indicator variables, each indicator variable relating to context of the respective data (Shah – the memory 204 in Fig. 2 includes a knowledge base 210 (i.e. corpus) that serves as a store of user queries (i.e. indicator variables) that are anticipated at the service desk of the enterprise. The system queries are stored along with corresponding answers (i.e. labels) in the knowledge base 210 [0034]. The queries in the knowledge base are tagged [0056]. The processor 
and a clarification engine to: generate a decision tree using the set of search results, the decision tree comprising nodes corresponding to the indicator variables and edges corresponding to the labels, the decision tree generated to maximize information gain based on pruning the decision tree in response to obtaining a desired label for a selected indicator variable; and prune the decision tree in response to a question posed to a user to obtain a label for an indicator variable (Shah – the information stored in the knowledge base configures a knowledge graph (i.e. decision tree) including a network of interconnected nodes and branches. The knowledge graph may be systematically pruned to match a user query to a system query [0034]. More specifically, if the user’s response to the displayed system queries indicates no match with the system query, then the processor may be configured to prune the knowledge graph (i.e. remove parts of the knowledge graph associated with the previous set of system queries from a current search domain) to identify a different set of system queries that are more likely to match the user query [0045].)
Claim 11 corresponds to claim 1 and is rejected accordingly.
Regarding claim 2, Shah teaches:
wherein each indicator variable corresponds to a question and each label of associated edges corresponds to an answer associated with the question (Shah - the memory 204 in Fig. 2 includes a knowledge base 210 (i.e. corpus) that serves as a store of user queries (i.e. indicator variables) that are anticipated at the service desk of the enterprise. The system queries are stored along with corresponding answers (i.e. labels) in the knowledge base 210 [0034].)  
Claim 12 corresponds to claim 2 and is rejected accordingly.
Regarding claim 3, Shah teaches:
wherein each indicator variable represents a category of interest to a particular field represented by the corpus of data (Shah – system queries (i.e. indicator variables) are associated with a search domain (i.e. category of interest) [0045].)  
Claim 13 corresponds to claim 3 and is rejected accordingly.
Regarding claim 4, Shah teaches:
wherein at least one of the labels is unknown (Shah – if a user query is determined to be an incident, then an answer of a matching system query (a system query matching the user query) may be provided as a reply to the user. In some embodiments, the reply
Claim 14 corresponds to claim 4 and is rejected accordingly.
Regarding claim 5, Shah teaches:
wherein each datum in the corpus comprises one or more webpages (Shah – the processor may be configured to communicate, using the communication interface, with public data sources (for example, sources like Wikipedia, technical community forums, etc.) and private data sources (for example, online technical libraries) to augment information stored in the knowledge base [0034].)  
Claim 15 corresponds to claim 5 and is rejected accordingly.
Regarding claim 8, Shah teaches:
wherein maximizing information gain comprises determining the information gained in knowing a value of each indicator variable, the indicator variable with a largest potential information gain being used to split the search results into subsets according its value to produce a node in the decision tree, and wherein the question posed to the user results in obtaining a label or value for the indicator variable (Shah – the information stored in the knowledge base configures a knowledge graph (i.e. decision tree) including a network of interconnected nodes and branches. The knowledge graph may be systematically pruned to match a user query to a system query [0034]. More specifically, if the user’s response to the displayed system queries indicates no match with the system query, then the processor may be configured to prune the knowledge graph (i.e. remove parts of the knowledge graph associated with the previous set of system queries from a current search domain) to identify a different set of system queries that are more likely to match the user query [0045].)  
Claim 18 corresponds to claim 8 and is rejected accordingly.
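For illustration only, and without characterizing the scope of the claims or the disclosure of Shah, the information-gain split recited in claim 8 can be sketched as follows. The sketch assumes that every search result is equally likely to be the desired one, so that the gain from learning an indicator variable's value is the expected reduction in log2 of the number of remaining results. The names RESULTS, information_gain, best_indicator, and prune, as well as the sample labels, are hypothetical and do not appear in the application or in the cited art.

```python
import math
from collections import Counter

# Hypothetical search results, each labelled for two indicator variables.
RESULTS = [
    {"id": 1, "topic": "a", "region": "x"},
    {"id": 2, "topic": "a", "region": "x"},
    {"id": 3, "topic": "b", "region": "x"},
    {"id": 4, "topic": "b", "region": "y"},
]


def information_gain(results, indicator):
    """Expected entropy reduction (in bits) from learning the indicator's value.

    Assumption: each result is equally likely to be the desired one, so the
    entropy of a set of n results is log2(n).
    """
    n = len(results)
    before = math.log2(n)
    group_sizes = Counter(r[indicator] for r in results).values()
    # Expected entropy after the split, weighted by the size of each subset.
    after = sum((size / n) * math.log2(size) for size in group_sizes)
    return before - after


def best_indicator(results, indicators):
    """Pick the indicator variable with the largest potential information gain."""
    return max(indicators, key=lambda name: information_gain(results, name))


def prune(results, indicator, answer):
    """Keep only the results whose label matches the user's answer."""
    return [r for r in results if r[indicator] == answer]


if __name__ == "__main__":
    chosen = best_indicator(RESULTS, ["topic", "region"])
    print(chosen, round(information_gain(RESULTS, chosen), 3))
    print(prune(RESULTS, chosen, "a"))
```

Under these assumptions the variable that partitions the results most evenly is asked about first, and the user's answer removes every result carrying a different label.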
Regarding claim 9, Shah teaches:
wherein the clarification engine iteratively performs, to prune the decision tree, determining the information gained and posing the question to the user that will provide the largest information gain (Shah – if the user’s response to the displayed system queries indicates no match with the system query, then the processor may be configured to prune the knowledge graph (i.e. remove parts of the knowledge graph associated with the previous set of system queries from a current search domain) to identify a different set of system queries that are more likely to match the user query. The system may be caused to repeat the steps (i.e. iteratively  
Claim 19 corresponds to claim 9 and is rejected accordingly.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that 
Claims 6, 7, 16 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Shah in view of Boyan et al. (Pub. No. US 2010/0145902 A1, hereinafter “Boyan”).
Regarding claim 6, Shah does not appear to teach:
wherein at least a portion of the data are manually labelled and the labelling engine applies inheritance of the labels to webpages associated with the manually labelled data
However, Boyan teaches:
wherein at least a portion of the data are manually labelled and the labelling engine applies inheritance of the labels to webpages associated with the manually labelled data (Boyan – source training documents may be tagged in response to user input. In response to tagging, a system may construct an automated agent to traverse an information source to extract its structured data into a database with a given schema. For example, tagging a few pages of a website may allow the system to construct a web-scraping agent that traverses the entire website to acquire and restructure its data [0077]. Also see [0080] where data tags are applied and a root type is defined (i.e. inheritance). A root type can be chosen arbitrarily from amongst the various entity types defined in the domain model, wherein a given  
Accordingly, it would have been obvious to a person of ordinary skill in the art at the time the invention was effectively filed, having the teachings of Shah and Boyan before them, to modify the teachings of Shah of a labelling engine to apply to each datum within the corpus a label corresponding to each one of a plurality of predetermined indicator variables, each indicator variable relating to context of the respective data, and a clarification engine to: generate a decision tree using the set of search results, the decision tree comprising nodes corresponding to the indicator variables and edges corresponding to the labels, the decision tree generated to maximize information gain based on pruning the decision tree in response to obtaining a desired label for a selected indicator variable, and prune the decision tree in response to a question posed to a user to obtain a label for an indicator variable, wherein each datum in the corpus comprises one or more webpages, with the teachings of Boyan of at least a portion of the data being manually labelled and the labelling engine applying inheritance of the labels to webpages associated with the manually labelled data. One would have been motivated to make such a modification to integrate scraped information into a comprehensive, consistently structured database (Boyan - [0005, 0007]).
Claim 16 corresponds to claim 6 and is rejected accordingly.
Regarding claim 7, Shah does not appear to teach:
wherein the labelling engine uses a trained supervised learning classifier for each of the indicator variables to label the data, the 
However, Boyan teaches:
wherein the labelling engine uses a trained supervised learning classifier for each of the indicator variables to label the data, the supervised learning classifier trained using a set of manually labelled data for training and testing (Boyan – each page in a tree may be manually or automatically assigned to a bucket of similarly formatted pages. For websites and information sources where the type of page returned by following a navigational element varies dynamically, the agent may use a classifier to automatically determine which bucket each page belongs to. To establish training data for such classifier, bucket identities can be assigned manually during the hand-tagging process, or can be inferred during that stage by an unsupervised clustering algorithm based on the features of the page [0088-0089].)  
Accordingly, it would have been obvious to a person of ordinary skill in the art at the time the invention was effectively filed, having the teachings of Shah and Boyan before them, to modify the teachings of Shah of a labelling engine to apply to each datum within the corpus a label corresponding to each one of a plurality of predetermined indicator variables, each indicator variable relating to context of the respective data, and a clarification engine to: generate a decision tree using the set of search results, the decision tree comprising nodes corresponding to the indicator variables and edges corresponding to the labels, the decision tree generated to 
Claim 17 corresponds to claim 7 and is rejected accordingly.
Claims 10 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Shah in view of Yamagami et al. (Pub. No. US 2018/0005126 A1, hereinafter “Yamagami”).
Regarding claim 10, Shah does not appear to teach:
wherein information gain is determined by determining a probability that the answer provided by the user to each question is accurate, and that a desired search result to the search query will be found in the set of documents represented within that answer
However, Yamagami teaches:
wherein information gain is determined by determining a probability that the answer provided by the user to each question is accurate, and that a desired search result to the search query will be found in the set of documents represented within that answer (Yamagami – the user’s answer reliability calculator collects the user’s answer instance data stored on the user’s answer instance data memory and calculates, as reliability, a percentage of the user’s correct answers to inquiries asking about attributes (correct answer rate). The reliability is an index that represents the user’s correct answer rate to an inquiry asking about an attribute [0074]. Also see [0076], where the information gain calculator calculates, on a per attribute basis of the classification target data included in the pre-segmentation data set, an amount of reduction in the entropy of the data set caused by the segmentation.)  
Accordingly, it would have been obvious to a person of ordinary skill in the art at the time the invention was effectively filed, having the teachings of Shah and Yamagami before them, to modify the teachings of Shah of a labelling engine to apply to each datum within the corpus a label corresponding to each one of a plurality of predetermined indicator variables, each indicator variable relating to context of the respective data, and a clarification engine to: generate a decision tree using the set of search results, the decision tree comprising nodes corresponding to the indicator variables and edges corresponding to the labels, the decision tree generated to maximize information gain based on pruning the decision tree in response to obtaining a desired label for a selected indicator variable and prune the decision tree in response to a question posed to a user to obtain a label for an indicator variable, wherein each datum in the corpus comprises one or more webpages with the teachings of Yamagami of wherein information gain is determined by determining a probability that the answer provided by the user to 
Claim 20 corresponds to claim 10 and is rejected accordingly.
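For illustration only, the two quantities cited from Yamagami above, namely a reliability expressed as the user's correct answer rate [0074] and an information gain expressed as a per-attribute reduction in entropy caused by segmentation [0076], can be sketched as follows. How the two are combined is not stated in the passages cited; the product used in weighted_gain is an assumption made solely for this sketch. All names and sample data (HISTORY, ROWS, reliability, entropy_reduction, weighted_gain) are hypothetical.

```python
import math
from collections import Counter

# Hypothetical record of whether the user's past answers were correct.
HISTORY = [True, True, True, False]

# Hypothetical classification target data with one attribute and one class.
ROWS = [
    {"color": "red", "cls": "A"},
    {"color": "red", "cls": "A"},
    {"color": "blue", "cls": "B"},
    {"color": "blue", "cls": "B"},
]


def reliability(answer_history):
    """Correct answer rate: the fraction of past answers that were correct."""
    return sum(answer_history) / len(answer_history)


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def entropy_reduction(rows, attribute, target):
    """Entropy of the data set before segmentation minus the expected entropy after."""
    before = entropy([r[target] for r in rows])
    after = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        after += len(subset) / len(rows) * entropy(subset)
    return before - after


def weighted_gain(rows, attribute, target, answer_history):
    """ASSUMPTION: discount the entropy reduction by the user's reliability."""
    return reliability(answer_history) * entropy_reduction(rows, attribute, target)


if __name__ == "__main__":
    print(reliability(HISTORY))
    print(entropy_reduction(ROWS, "color", "cls"))
    print(weighted_gain(ROWS, "color", "cls", HISTORY))
```

The design choice of a simple product is one of several plausible ways to discount a question whose answer the user is less likely to get right; it is not presented as the method of Yamagami or of the application.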
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RANJIT P DORAISWAMY whose telephone number is (571)270-5759. The examiner can normally be reached Monday-Friday 9:00 AM - 5:30 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mark Featherstone, can be reached at (571) 270-3750. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit 





/R.P.D./Examiner, Art Unit 2166
/MARK D FEATHERSTONE/Supervisory Patent Examiner, Art Unit 2166