DETAILED ACTION
	This action is in response to Applicant’s Amendment (“Response”) received on May 4, 2022 in response to the Office Action dated February 24, 2022. This action is made Non-Final.
	Claims 1-20 are pending.
Claims 1, 8, and 15 are independent claims.
	Claims 1-20 are rejected.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Applicant’s Response
	In Applicant’s Response, Applicant amended claims 4, 6, 11, 13, 18, and 20, and submitted arguments against the prior art in the Office Action dated February 24, 2022.
	Based on the Applicant’s amendments, the Examiner withdraws the rejection of claims 4, 6, 11, 13, 18, and 20 based on improper Markush groupings of alternatives.

	
Claim Interpretation
	Claims 15-20 are interpreted as statutory under a 35 USC 101 CRM analysis. The Examiner notes the Specification, Paragraph 0024, recites “a computer readable storage medium, as used herein, is not to be construed as being transitory signals per se…” Accordingly, the “medium” recited in claims 15-20 is interpreted as a non-transitory medium.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 6-9, 13-16, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Basson et al., US Patent Application Publication no. US 2015/0347467 (“Basson”), in view of Alkov et al., US Patent Application Publication no. US 2014/0358928 (“Alkov”).
Claim 1:
	Basson teaches or suggests a method implemented by an information handling system that includes a processor and a memory accessible by the processor, the method comprising:
	receiving a document and a document type, wherein the document type identifies a document category to which the received document belongs (see para. 0014 - starting document 162 may be a digital document including text and/or metadata; para. 0015 – parameterized search is partially based, may be any model relating to a domain of knowledge (e.g., a business plan model relating to a business domain). the parameters on which the parameterized search is partially based may each comprise a word or phrase relating to a narrowing of the domain of the model; para. 0016 - select one or more of the candidate documents (not shown) to be added to the corpus 104 as a select document 108. The selection by the selection component 170A may be made based on one or more predefined selection criteria. phrase including a parameter and an element 116 of the model. document a sufficient number of whose topics correspond to elements 116 of the model; para. 0026 - additional documents …  may be obtained from the same or similar type of source; para. 0033 – additional documents 142 that are considered high quality documents with respect to gourmet restaurants.);
	retrieving a set of metrics corresponding to the document type (see para. 0017 - domain of the model 112 may be defined as {business}. The corresponding model 112 may be, for example, a {business plan}. The business plan may have a plurality of elements; para. 0019 - having a desired level of quality in relation to the model 112 of a domain, and in relation to the model's 112 constituent elements 116. The quality of a document in relation to the model 112 may be based on one or more selection criteria);
	automatically determining a quality of the received document, wherein the quality of the received document is based on a set of linguistic features found in the document as compared to the retrieved set of metrics (see para. 0017 - domain of the model 112 may be defined as {business}. The corresponding model 112 may be, for example, a {business plan}. The business plan may have a plurality of elements; para. 0019 - having a desired level of quality in relation to the model 112 of a domain, and in relation to the model's 112 constituent elements 116. The quality of a document in relation to the model 112 may be based on one or more selection criteria; para. 0025 - evaluate the quality of other documents (i.e., one or more additional documents); para. 0032 – document 142 that meets a desired quality measure in relation to the select topics; para. 0033 – additional documents 142 that are considered high quality documents with respect to gourmet restaurants.);
	ingesting the document into a corpus that is utilized by a question-answering (QA) system, wherein the ingesting is based on the determined quality (see para. 0032 – new corpus 130 may include the additional documents 142 that are considered high quality documents with respect to gourmet restaurants, but not the select documents 108 of the corpus 104 that relate to restaurant businesses generally. Accordingly, this new corpus 130 may be more focused, and may be used by a QA tool to search for information about restaurant businesses more quickly and efficiently; para. 0035 – additional documents 142 are added to the new corpus. additional document 142 that meets the desired quality of the comparison 146 component; para. 0038 - select documents 134 that correspond to the parameters. Based on the contents of the retrieved select documents, the QA tool may perform further analysis; para. 0040 - candidate documents according to a predefined or specified criteria, and add the selected candidate documents to the corpus 104 as select documents.).
	Basson appears to fail to explicitly disclose linguistic metrics; based on a set of linguistic features as compared to the retrieved set of linguistic metrics.
	Alkov teaches or suggests linguistic metrics corresponding to the document type; based on a set of linguistic features as compared to the retrieved set of linguistic metrics (see para. 0021 - clustering of questions in accordance with the features/attributes extracted from the questions. In one aspect of the illustrative embodiments, as part of a question analysis phase, the question is analyzed to identify various features/attributes of the question, e.g., focus, lexical answer type (LAT), question classification (QClass), and question sections (QSections). subsequently submitted questions may be similarly clustered such as by measuring the Euclidean dimensional distance of the subsequent questions from cluster centers. Depending on the training/testing objective, the subsequently submitted questions can be either accepted or rejected based on the clustering of the subsequently submitted questions with regard to the defined clusters; para. 0036 - Categorizing the questions, such as in terms of roles, type of information, tasks, or the like, associated with the question, in each document of a corpus of data may allow the QA system to more quickly and efficiently identify documents containing content related to a specific query; para. 0041 - performs deep analysis on the language of the input question and the language used in each of the portions of the corpus of data; para. 0096 - after having been generated by the separate feature/attribute extraction, clustering, and generation of the training and testing question sets; para. 0098 - separate training question sets and testing questions sets may be generated for different domains. training of the QA system, clustering may be performed on the training questions to generate training clusters associated with different question domains, e.g., topics, areas of interest, question subject matter categories, or the like. These question domains may be of various types including, for example, healthcare, financial, legal, or other types of question domains; para. 0101 - receive input questions and their extracted features/attributes for purposes of clustering; para. 0103 - receive input questions and perform analysis on these questions to extract features/attributes of the input question for use in clustering.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method, taught in Basson, to include linguistic metrics corresponding to the document type; based on a set of linguistic features as compared to the retrieved set of linguistic metrics for the purpose of efficiently identifying documents containing content related to a specific query, thereby improving accuracy, performance, and confidence in a QA or knowledge system, as taught by Alkov (para. 0038).
Claim(s) 8 and 15:
Claim(s) 8 and 15 correspond to Claim 1, and thus, Basson and Alkov teach or suggest the limitations of claim(s) 8 and 15 as well.

Claim 2:
	Basson further teaches or suggests computing a quality score corresponding to the determined quality; and comparing the quality score to a quality threshold, wherein the ingesting is performed in response to the quality score meeting the quality threshold (see para. 0027 - additional document 142. In a related embodiment, a threshold T may be specified in lieu of a distance measure indicating similarity, whereby a given additional document 142 is considered to meet a desired level of similarity where at least T % of the topics 154 found in the additional document 142 match; para. 0032 – new corpus 130 may include the additional documents 142 that are considered high quality documents with respect to gourmet restaurants, but not the select documents 108 of the corpus 104 that relate to restaurant businesses generally. Accordingly, this new corpus 130 may be more focused, and may be used by a QA tool to search for information about restaurant businesses more quickly and efficiently; para. 0035 – additional documents 142 are added to the new corpus. additional document 142 that meets the desired quality of the comparison 146 component; para. 0038 - select documents 134 that correspond to the parameters. Based on the contents of the retrieved select documents, the QA tool may perform further analysis; para. 0040 - candidate documents according to a predefined or specified criteria, and add the selected candidate documents to the corpus 104 as select documents; para. 0044 - upon a predetermined or specified threshold number or percentage of topics 154 matching the select topics 150, the corresponding additional document 142 may be added to a new corpus.).
Claim(s) 9 and 16:
Claim(s) 9 and 16 correspond to Claim 2, and thus, Basson and Alkov teach or suggest the limitations of claim(s) 9 and 16 as well.
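For illustration only, the threshold comparison recited in claim 2 and mapped to Basson (para. 0027, "at least T % of the topics") can be sketched as follows. The function names, the use of a topic-match percentage as the quality score, and the list-based corpus are assumptions made for this sketch; they are not taken from Basson, Alkov, or the claims.

```python
def topic_match_percentage(doc_topics, select_topics):
    """Return the percentage of the document's topics that appear in the select topics.

    Hypothetical stand-in for a "quality score"; an empty document scores 0.
    """
    doc_topics = set(doc_topics)
    if not doc_topics:
        return 0.0
    matched = doc_topics & set(select_topics)
    return 100.0 * len(matched) / len(doc_topics)


def ingest_if_quality_met(corpus, document, doc_topics, select_topics, threshold_pct):
    """Append the document to the corpus only when its score meets the threshold."""
    score = topic_match_percentage(doc_topics, select_topics)
    if score >= threshold_pct:
        corpus.append(document)
        return True
    return False
```

Under these assumptions, a document whose topics match two of its three topics scores roughly 66.7 and would be ingested at a threshold of 60 but not at a threshold of 75.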

Claim 6:
	Basson further teaches or suggests wherein the document type is selected from a plurality of document types, the plurality of document types comprising a business letter, a poem, an essay, a legal document, a promissory note, a medical record, a scientific article, a blog entry, a financial memo, a resume, a patent application, and a post to a social media site (see para. 0015 - any model relating to a domain of knowledge (e.g., a business plan model relating to a business domain). domain of the model 112 is {business}, the sub-domain may be {restaurant}; para. 0017 - corpus 104 represents a repository of information regarding the restaurant business that meet a criteria for quality.).
Claim(s) 13 and 20:
Claim(s) 13 and 20 correspond to Claim 6, and thus, Basson and Alkov teach or suggest the limitations of claim(s) 13 and 20 as well.

Claim 7:
	Basson further teaches or suggests wherein the type of document further includes a document subtype (see para. 0015 - the parameters on which the parameterized search is partially based may each comprise a word or phrase relating to a narrowing of the domain of the model 112 (e.g., a sub-domain), which serves to narrow the scope of the starting documents 162 available on the database 158. For example, where the domain of the model 112 is {business}, the sub-domain may be {restaurant}; para. 0030 - generate the new corpus 130 such that it contains documents deemed particularly useful to a sub-domain.).
Claim 14:
Claim 14 corresponds to Claim 7, and thus, Basson and Alkov teach or suggest the limitations of claim 14 as well.

Claims 3, 10, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Basson, in view of Alkov, and further in view of Barsony et al., US Patent no. US 10,467,252 (“Barsony”).
Claim 3:
	Basson further teaches or suggests computing a quality score corresponding to the determined quality; and ingesting, into the corpus, metadata that is associated with the ingested document (see para. 0027 - additional document 142 received by the corpus generation system 100, by comparing its topics 154 to the select topics 150, to determine whether the additional document 142 meets a desired level of quality. quality of the additional document 142, as assessed by the comparison component 146, may include determining a "distance" measure indicating similarity; para. 0035 - where one or more additional documents 142 are added to the new corpus 130, the select topics 150 may be updated to include the topics 154 of each additional document 142 that meets the desired quality of the comparison 146 component. Effectively, the select topics 150 are expanded, and may be used in assessing the quality of any additional document 142 that is subsequently evaluated by the comparison component.).
	Barsony further teaches or suggests ingesting the quality score as metadata (see col. 1, lines 50-55 – characterizing and defining groups within large corpuses of documents using a combination of one or more of human judgment, tiered similarity analysis techniques, and language/concept analysis; col. 2, lines 6-18 - data can be received that characterizes quality control review of at least a portion of the documents. types of contextual characteristics can be used, for example, similarity score, type of similarity algorithm used to characterize document, document family, document type, metadata describing properties of the document, and the like; col. 7, lines 20-26 - Each of the documents can have associated contextual information that characterizes the document. Some or all of this contextual information can initially be a part of the documents within the corpus of documents 210 or it can be added/assigned by the document classification and characterization engine 220 and/or one or more of the similarity analysis; col. 8, lines 60-62 - smart assignment functionality can group documents based on other criteria such as relevance scores.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method, taught in Basson, to include ingesting the quality score as metadata for the purpose of efficiently including contextual characteristics with documents, improving quality control of documents within a group, as taught by Barsony (col. 2).
Claim(s) 10 and 17:
Claim(s) 10 and 17 correspond to Claim 3, and thus, Basson, Alkov, and Barsony teach or suggest the limitations of claim(s) 10 and 17 as well.

Claims 4, 5, 11, 12, 18, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Basson, in view of Alkov, and further in view of Connor et al., US Patent no. US 9,773,166 (“Connor”).
Claim 4:
	Basson appears to fail to explicitly disclose wherein each of the linguistic features are weighted based on an importance of the respective linguist feature to the document type.
	Connor teaches or suggests wherein each of the linguistic features are weighted based on an importance of the respective linguist feature to the document type (see col. 3, lines 52-55 - by identifying feature weights that maximize the likelihood that a document will be correctly classified. collection of training documents include a group of positive documents identified as known longform documents; col. 4, line 62 – col. 6, line 43 - Features that can be selectively extracted from the content of the training documents include the following; col. 7, lines 6-13 - learns respective weights to apply to each input feature. iterative process attempts to find optimal weights. by identifying feature weights that maximize the likelihood that a document will be correctly classified.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method, taught in Basson, to include wherein each of the linguistic features are weighted based on an importance of the respective linguist feature to the document type for the purpose of efficiently classifying documents by tuning feature weights, improving document classification, as taught by Connor (col. 7).
Claim(s) 11 and 18:
Claim(s) 11 and 18 correspond to Claim 4, and thus, Basson, Alkov, and Connor teach or suggest the limitations of claim(s) 11 and 18 as well.
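For illustration only, the per-feature weighting recited in claim 4 can be sketched as a weighted average of feature values. The feature names, the normalized 0-to-1 feature values, and the weighted-average formula are assumptions made for this sketch; Connor is cited above for learning feature weights during training, and this sketch does not reproduce Connor's training procedure.

```python
def weighted_quality_score(features, weights):
    """Combine feature values into one score using per-document-type weights.

    `features` maps a hypothetical feature name to a value in [0, 1];
    `weights` maps the same names to non-negative importance weights.
    A feature missing from `features` contributes 0.
    """
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(w * features.get(name, 0.0) for name, w in weights.items())
    return weighted_sum / total_weight
```

Under these assumptions, a feature weighted 3.0 influences the score three times as much as a feature weighted 1.0, so the same document could score differently under the weights of two different document types.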

Claim 5:
	Basson fails to explicitly disclose prior to receiving the document, training a machine learning (ML) system of the document type and the linguistic metrics corresponding to the document type, wherein the training comprises: inputting a plurality of training document to the ML system, wherein the training documents are known to be high quality documents of the document type; extracting the linguistic metrics from the plurality of training documents; and providing a weighting of the extracted linguistic metrics based on an importance of the respective linguistic metrics to the document type.
	Connor teaches or suggests prior to receiving the document, training a machine learning (ML) system of the document type and the linguistic metrics corresponding to the document type, wherein the training comprises: inputting a plurality of training document to the ML system, wherein the training documents are known to be high quality documents of the document type; extracting the linguistic metrics from the plurality of training documents; and providing a weighting of the extracted linguistic metrics based on an importance of the respective linguistic metrics to the document type (see col. 3, lines 52-55 - by identifying feature weights that maximize the likelihood that a document will be correctly classified. collection of training documents include a group of positive documents identified as known longform documents and negative documents identified as not being longform documents; col. 4, line 62 – col. 6, line 43 - Features that can be selectively extracted from the content of the training documents include the following; col. 7, lines 6-13 - learns respective weights to apply to each input feature. iterative process attempts to find optimal weights. by identifying feature weights that maximize the likelihood that a document will be correctly classified.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method, taught in Basson, to include prior to receiving the document, training a machine learning (ML) system of the document type and the linguistic metrics corresponding to the document type, wherein the training comprises: inputting a plurality of training document to the ML system, wherein the training documents are known to be high quality documents of the document type; extracting the linguistic metrics from the plurality of training documents; and providing a weighting of the extracted linguistic metrics based on an importance of the respective linguistic metrics to the document type for the purpose of efficiently classifying documents by tuning feature weights, improving document classification, as taught by Connor (col. 7).
Claim(s) 12 and 19:
Claim(s) 12 and 19 correspond to Claim 5, and thus, Basson, Alkov, and Connor teach or suggest the limitations of claim(s) 12 and 19 as well.

Response to Arguments
Applicant’s further arguments have been considered but are not persuasive because the arguments do not correspond to the rationales as used in the current rejection.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Andrew T McIntosh whose telephone number is (571) 270-7790. The examiner can normally be reached M-Th 8:00am-5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kavita Stanley, can be reached at 571-272-8352. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ANDREW T MCINTOSH/Primary Examiner, Art Unit 2176