Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicants’ submission filed on 1/28/2021 has been entered.

DETAILED ACTION
The instant application having Application No. 16029052 has a total of 20 claims pending in the application. 


Claim Rejections - 35 USC § 103
	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 6, 8-11, 13, 15-18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Agrawal et al (US 6233575 B1) in view of Zheng (US 20150058320 A1), Swaminathan et al (US 20120011150 A1), Nissen (US 20150227590 A1), and Sirat et al (US 5218646 A).
	As per claims 1, 8 and 15, Agrawal discloses, “A computer-implemented method comprising: deploying… to search in documents retrieved from a plurality of web-based data sources for, based on one or more domain keywords, a domain knowledge dataset comprising a plurality of domain knowledge data instances” (C10, particularly L24-38; EN: this denotes gathering training data for the system via a webcrawler, which will make use of key words, to find examples of topic documents to use. Each document is a domain knowledge data instance). “Each domain knowledge data instance in the plurality of domain knowledge data instances comprising a plurality of property values for a plurality of properties” (C10, particularly L47-58; EN: this denotes taking statistics based on terms (i.e. property values for a plurality of properties)). “Each property value in the plurality of property values corresponding to a respective property in the plurality of properties” (C10, particularly L47-58; EN: this denotes taking statistics based on terms (i.e. property values for a plurality of properties)).
	“Using the plurality of domain knowledge instances in the domain knowledge dataset to determine a plurality of combination of frequently cooccurring properties” (C10, particularly L47-58; EN: this denotes going through the documents to find the features that appear together to best represent the classes the documents represent). “By learning, from the domain knowledge dataset in the documents retrieved from the plurality of web-based data source, through machine learning … implemented by a computing device, each combination of frequently cooccurring properties in the plurality of combinations of frequently cooccurring EN: this denotes testing each term and grouping them together based on what is effective and removing the ones that aren’t. As each term is added or removed, new combinations are made and considered). “Wherein each combination in the plurality of combinations of frequently cooccurring properties includes a set of multiple properties” (C11, particularly L3-15; EN: This denotes the set having numerous words (i.e. properties) that occur together to define a class).  “Wherein each property in each such combination of frequently cooccurring properties has a support computed from frequencies of occurrences in the plurality of domain knowledge data instances” (C11, particularly L16-22; EN: This denotes each term being measured on its own frequency, and thus is a frequently occurring property. Here each term has its own individual frequency for the single term/item).
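For illustration only, the claimed notion of a combination of frequently cooccurring properties with a frequency-based support can be sketched as below. This is a minimal sketch and is not taken from Agrawal or from the application: the function name, the sample instances, the fixed combination size, and the threshold value are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurring_combinations(instances, size, min_support):
    """Return each property combination of the given size whose support
    (fraction of instances containing every property in the combination)
    exceeds min_support. All names here are illustrative assumptions."""
    total = len(instances)
    counts = Counter()
    for instance in instances:
        # Sort so that each combination has one canonical tuple form.
        for combo in combinations(sorted(instance), size):
            counts[combo] += 1
    return {
        combo: count / total
        for combo, count in counts.items()
        if count / total > min_support
    }


# Hypothetical data instances, each reduced to the set of properties it has.
instances = [
    {"color", "size", "price"},
    {"color", "size"},
    {"color", "weight"},
    {"size", "price"},
]
pairs = cooccurring_combinations(instances, size=2, min_support=0.4)
```

In this example only the pairs (color, size) and (price, size) appear together in more than 40% of the instances, so only those two combinations are kept.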
	“selecting, based on one or more artifact significant score thresholds” (C11, particularly L3-15; EN: this denotes identifying the most discriminatory terms and taking the top discriminating terms with the cutoff point of the ordering being the threshold). “A specific combination of frequently cooccurring properties from among the plurality of combinations of frequently cooccurring properties” (C11, particularly L3-22; EN: this denotes identifying the most discriminatory terms via the statistics, which includes the use of frequency and taking the top discriminating terms with the cutoff point of the ordering being the threshold).
	“Storing the selected combinations of frequently co-occurring properties as a knowledge artifact” (C11, particularly L16-22; EN: this denotes storing the groups as class identifiers).
EN: this denotes the system being used to respond to user queries). 
	However, Agrawal fails to explicitly disclose, “Deploying one or more search engines to search…”, “wherein each combination in the plurality of combinations of frequently cooccurring properties includes a set of multiple properties, wherein the plurality of domain knowledge data instances includes a domain knowledge data instance in which all subsets in the set of multiple properties occur concurrently” ,  “Wherein each property in each such combination of frequently cooccurring properties has a support computed from frequencies of occurrences in the plurality of domain knowledge data instances, wherein each such property exceeds a minimum support threshold”, “By learning, from the domain knowledge dataset in the documents retrieved from the plurality of web-based data source, through machine learning with a machine learning model”,  “storing … in a knowledge neuron” and “knowledge neuron.” 
	Zheng discloses, “Deploying one or more search engines to search” (Pg.1, particularly paragraph 0004; EN: this denotes using search engines with web crawlers to retrieve information from the internet). 
	“By learning, from the domain knowledge dataset in the documents retrieved from the plurality of web-based data source, through machine learning with a machine learning model” (pg.9, particularly paragraph 0079; EN: this denotes numerous machine learning algorithms which can be used to determine which words are most helpful to determining a class). 
	Swaminathan discloses, “wherein each combination in the plurality of combinations of frequently cooccurring properties includes a set of multiple properties, wherein the plurality of domain knowledge data instances includes a domain knowledge data instance in which all EN: this denotes using an AND search query in order to identify documents which contain all of the relevant keywords. When combined with the webcrawler search of Agrawal, this denotes using an AND operator to make sure each category has at least one reference (i.e. domain knowledge data instance) that has all of the relevant keywords (i.e. all subsets in the set of multiple properties occurring concurrently)). 
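For illustration only, the AND-style query described in the note above can be sketched as follows. This is a minimal sketch under stated assumptions, not Swaminathan's implementation: the function name, the whitespace tokenization, and the sample documents are hypothetical.

```python
def and_query(documents, keywords):
    """Return the documents that contain every keyword (a conjunctive query).
    Tokenization by whitespace is a simplifying assumption."""
    required = {keyword.lower() for keyword in keywords}
    return [
        doc for doc in documents
        if required <= set(doc.lower().split())
    ]


# Hypothetical documents standing in for retrieved web pages.
docs = [
    "neural network classifier training",
    "network routing tables",
    "training a neural classifier",
]
matches = and_query(docs, ["neural", "classifier"])
```

Only the first and third documents contain both keywords, so each of them is an instance in which all of the queried properties occur concurrently.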
	Nissen discloses, “Wherein each property in each such combination of frequently cooccurring properties has a support computed from frequencies of occurrences in the plurality of domain knowledge data instances, wherein each such property exceeds a minimum support threshold” (Pg.19, particularly paragraph 0167; EN: this denotes using tf and tf_idf scores above a threshold and ignoring those that don’t have the proper support).
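For illustration only, a minimum support threshold applied to individual properties can be sketched as below. This is a minimal sketch under stated assumptions: it uses a simple document-frequency support rather than the tf and tf_idf scores of Nissen, and the function name, sample instances, and threshold value are hypothetical.

```python
from collections import Counter


def properties_above_min_support(instances, min_support):
    """Return each property whose support (fraction of instances in which
    it occurs) exceeds min_support. Names and data are illustrative."""
    total = len(instances)
    # set() guards against counting a property twice within one instance.
    counts = Counter(prop for instance in instances for prop in set(instance))
    return {
        prop: count / total
        for prop, count in counts.items()
        if count / total > min_support
    }


# Hypothetical data instances, each reduced to the set of properties it has.
instances = [
    {"color", "size", "price"},
    {"color", "size"},
    {"color", "weight"},
    {"size", "price"},
]
supported = properties_above_min_support(instances, min_support=0.4)
```

Here "weight" occurs in only one of four instances (support 0.25) and is dropped, while "color", "size", and "price" are retained.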
Sirat discloses, “storing … in a knowledge neuron” and “knowledge neuron.” (C4, particularly L10-23; EN: this denotes neurons in a neural network having class differentiation information, such as those described by the Agrawal reference). 
Agrawal and Zheng are analogous art because both involve classification. 
At the time of filing it would have been obvious to one skilled in the art of classification to combine the work of Agrawal and Zheng in order to use search engines for finding information on the internet and machine learning to determine properties of a document.
	The motivation for using search engines for searching the internet is that “search engines [can] retrieve[] web pages by a web crawler” (Zheng, pg.1, paragraph 0004), or in light of Agrawal, allow a search engine to be part of the process of using their web crawler to identify the associated documents. 

Therefore at the time of invention it would have been obvious to one skilled in the art of classification to combine the work of Agrawal and Zheng in order to use search engines for finding information on the internet and machine learning to determine properties of a document.
Agrawal and Swaminathan are analogous art because both involve internet search.
At the time of invention it would have been obvious to one skilled in the art of internet search to combine the work of Agrawal and Swaminathan to identify documents that contain all relevant keywords.   
	The motivation for doing so would be to “determine a set of documents wherein each document in the set includes and every keyword forming part of the query” (Swaminathan, Pg.4, paragraph 0050) or in the case of Agrawal, allow the system to identify relevant documents that contain all the keywords of the relevant topic/classification. 
Therefore at the time of invention it would have been obvious to one skilled in the art of internet search to combine the work of Agrawal and Swaminathan to identify documents that contain all relevant keywords.   
Agrawal and Nissen are analogous art because both involve topic classification.
At the time of invention it would have been obvious to one skilled in the art of topic classification to combine the work of Agrawal and Nissen to require minimum support for used terms.  

Therefore at the time of invention it would have been obvious to one skilled in the art of topic classification to combine the work of Agrawal and Nissen to require minimum support for used terms.  
Agrawal and Sirat are analogous art because both involve classification. 
At the time of invention it would have been obvious to one skilled in the art of classification to combine the work of Agrawal and Sirat in order to store classification information in a neuron. 
	The motivation for doing so would be to allow a neuron to “adapt its synaptic coefficients for the basis of multi-class examples” (Sirat, C4, L10-23) or in the case of Agrawal, allow the classification terms to be placed within neurons of a neural network to perform the classifications based on this training data. 
Therefore at the time of invention it would have been obvious to one skilled in the art of classification to combine the work of Agrawal and Sirat in order to store classification information in a neuron.
As per claims 2, 9, and 16, Agrawal discloses, “computing a plurality of sets of one or more artifact significant scores for the plurality of combinations of frequently cooccurring properties, each set of one or more artifact significant scores in the plurality of sets of one or more artifact significance scores corresponding to a respective combination of frequently cooccurring properties in the plurality of combinations of frequently cooccurring properties;” EN: this denotes performing calculations to determine the optimal cut-off point for features to use and what features are noise or stop words for the sets. Each possible cut off point represents a different score). 
“Comparing the plurality of sets of one or more artifact significance scores with one or more artifact significant score thresholds to select the specific combination of frequently occurring properties from among the plurality of combinations of frequently co-occurring properties” (C17, particularly L17-34; EN: this denotes a minimization of the error rate, with that minimized point being the threshold).
As per claims 3, 10, and 17, Agrawal discloses, “wherein the one or more artifact significant score threshold relates to one or more of a total number of properties in a combination of frequently cooccurring properties, support based scores, similarity based scores, interlink-based scores, confidence based scores, lift-based scores, knowledge relevance scores, or natural language processing generated scores” (C17, particularly L17-34; EN: this denotes a minimization of the error rate, with that minimized point being the threshold. The examiner is interpreting this to be a confidence based score as it denotes confidence in the correct classification vs an error).
As per claims 4, 11, and 18, Agrawal discloses, “Wherein the one or more domain keywords …, and wherein one or more domain keywords include one or more of: one or more subject keywords or one or more inference keywords…” (C10, particularly L24-38; EN: this denotes gathering training data for the system via a webcrawler, which will make use of key words, to find examples of topic documents to use. Each document is a domain knowledge data instance. In this case the topics are subjects).
EN: this denotes neurons in a neural network having class differentiation information, such as those described by the Agrawal reference. It further denotes that this information can come from prior learning performed on the network).
As per claims 6, 13, and 20, Agrawal discloses, “further comprising using one or more other machine learning methods to validate the specific combination of frequently co-occurring properties, wherein the one or more other machine learning methods comprises one or more of: regression based machine learning methods, classification based machine learning methods, decision tree based machine learning methods, or random forest based machine learning methods” (C8, particularly L54-65; EN: this denotes classification learning techniques). 



Claim Rejections - 35 USC § 103
Claims 5, 12, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Agrawal et al (US 6233575 B1) in view of Zheng (US 20150058320 A1), Nissen (US 20150227590 A1) and Sirat et al (US 5218646 A) as applied to claims 1, 8, and 15 above, and further in view of Lewis et al (US 20050100209 A1). 
As per claims 5, 12, and 19, Agrawal fails to explicitly disclose, “Wherein the specific combination of frequently co-occurring properties has a total number of properties no shorter than any other combination of frequently cooccurring properties in the plurality of combination of frequently co-occurring properties.”
EN: this denotes the feature set being a minimum. When combined with the Agrawal set, which starts with the highest first and seeks to find the optimal number without overfitting, this denotes ending with the fewest features that meet the threshold, a minimum requirement as disclosed by Lewis).
Agrawal and Lewis are analogous art because both involve training classifiers. 
At the time of invention it would have been obvious to one skilled in the art of feature selection to combine the work of Agrawal and Lewis in order to select the smallest number of features possible to meet the needs of the program.
	The motivation for doing so would be to “include those features necessary to define an n-dimensional feature space in which all the output classes are distinct and well separated” (Lewis, Pg.3, paragraph 0032) or in the case of Agrawal, make sure the feature set is minimized, and is not smaller than it needs to be. 
Therefore at the time of invention it would have been obvious to one skilled in the art of feature selection to combine the work of Agrawal and Lewis in order to select the smallest number of features possible to meet the needs of the program.

Claim Rejections - 35 USC § 103
Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Agrawal et al (US 6233575 B1) in view of Zheng (US 20150058320 A1), Nissen (US 20150227590 A1) and  as applied to claims 1, 8, and 15 above, and further in view of Agarwal (US 20050223002 A1).  
As per claims 7 and 14, Agrawal discloses, “Wherein property values in the plurality of knowledge domain data instances for a specific property in the plurality of properties are aggregated…” (C12, particularly L60-68; EN: this denotes aggregating along some dimensions for the statistics of the system). 
Agrawal fails to explicitly disclose, “… aggregated based on a step function.”
Agarwal discloses, “… aggregated based on a step function” (pg.5, particularly paragraph 0052; EN: this denotes aggregating by a number of known methods, including step functions). 
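For illustration only, aggregating property values with a step function can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Agarwal or from the application: the function name, the step boundaries, and the sample values are hypothetical.

```python
import bisect
from collections import Counter


def step_aggregate(values, boundaries):
    """Map each value to the index of the step it falls in (a piecewise
    constant function defined by sorted boundaries), then count the
    values per step. A value equal to a boundary goes to the upper step."""
    return Counter(bisect.bisect_right(boundaries, value) for value in values)


# Hypothetical property values and step boundaries.
ages = [3, 17, 18, 42, 65, 70]
counts = step_aggregate(ages, boundaries=[18, 65])
```

With boundaries at 18 and 65, the six values fall into three steps (below 18, 18 up to but not including 65, and 65 or above), with two values in each.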
Agrawal and Agarwal are analogous art because both involve aggregation.
At the time of invention it would have been obvious to one skilled in the art of aggregation to combine the work of Agrawal and Agarwal in order to make use of step-functions for aggregation. 
	The motivation for doing so would be to use “any other measure that relates to the data considered” (Agarwal, Pg.5, paragraph 0052) or in the case of Agrawal, make use of a step function when it is appropriate for the data.
Therefore at the time of invention it would have been obvious to one skilled in the art of aggregation to combine the work of Agrawal and Agarwal in order to make use of step-functions for aggregation.

Response to Arguments

Applicant's arguments with respect to claims 1-20 have been considered but are moot in view of the new ground(s) of rejection.

Conclusion
The examiner requests, in response to this Office action, support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.
When responding to this Office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BEN M RIFKIN whose telephone number is (571)272-9768.  The examiner can normally be reached on Monday-Friday 9 am - 5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/BEN M RIFKIN/     Primary Examiner, Art Unit 2198