DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

In response to Applicant’s claims filed on February 25, 2021, claims 1-20 are now pending for examination in the application.

Information Disclosure Statement
The information disclosure statements (IDS) filed on October 11, 2021, November 11, 2021, April 4 and 11, 2022, June 10, 2022, and September 16, 2022 have been considered by the Examiner and made of record in the application file.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

	Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The judicial exception is not integrated into a practical application. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The eligibility analysis in support of these findings is provided below, in accordance with the 2019 Revised Patent Subject Matter Eligibility Guidance, hereinafter 2019 PEG.
	Step 1. In accordance with Step 1 of the eligibility inquiry (as explained in MPEP 2106), it is noted that the methods and system of claims 1-20 are directed to eligible categories of subject matter and therefore satisfy Step 1.
Step 2A. In accordance with Step 2A, prong one of the 2019 PEG: 
In claim 1, the limitations directed to additional elements include computer-readable media and processors.
In exemplary claim 1, limitations reciting the abstract idea are as follows:

1. A method comprising: 
receiving documents from one or more databases, the documents including at least one of patents or patent applications (data gathering; storing and retrieving information, which is well-understood, routine, and conventional (WURC) activity at Step 2B); 
generating first data representing the documents, the first data distinguishing components of the documents, the components including at least a title portion, an abstract portion, and a claims portion (data gathering; storing and retrieving information, which is WURC at Step 2B); 
generating a user interface configured to display (insignificant extra-solution activity/data gathering): 
the components of individual ones of the documents (data gathering; storing and retrieving information, which is WURC at Step 2B); and 
an element configured to accept user input indicating whether the individual ones of the documents are in class or out of class (data gathering; storing and retrieving information, which is WURC at Step 2B); 
generating a classification model based at least in part on user input data corresponding to the user input, the classification model trained utilizing at least a first portion of the documents indicated to be in class by the user input data (Insignificant extra-solution activity/data gathering); and 
causing the user interface to display an indication of: 
the first portion of the documents marked as in class in response to the user input (Insignificant extra-solution activity/data gathering); 
a second portion of the documents marked as out of class in response to the user input (Insignificant extra-solution activity/data gathering); 
a third portion of the documents determined to be in class utilizing the classification model (Insignificant extra-solution activity/data gathering); and 
a fourth portion of the documents determined to be out of class utilizing the classification model (Insignificant extra-solution activity/data gathering).
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas set forth in the 2019 PEG. Accordingly, the claim recites an abstract idea.
With respect to Step 2A prong two of the 2019 PEG, the judicial exception is not integrated into a practical application. The additional elements are directed to computer-readable media and processors. However, these elements fail to integrate the abstract idea into a practical application because they fail to provide an improvement to the functioning of a computer or to any other technology or technical field, fail to apply the exception with a particular machine, fail to apply the judicial exception to effect a particular treatment or prophylaxis for a disease or medical condition, fail to effect a transformation of a particular article to a different state or thing, and fail to apply/use the abstract idea in a meaningful way beyond generally linking the use of the judicial exception to a particular technological environment. 
	Furthermore, although these elements have been fully considered, they are directed to the use of generic computing elements (paragraphs 68-76 of the published instant specification make it clear that the disclosed functionality is implemented on well-known computing systems and general purpose computing devices), to perform the abstract idea, which is not sufficient to amount to a practical application (as noted in the 2019 PEG) and is tantamount to simply saying "apply it" using a general purpose computer, which merely serves to tie the abstract idea to a particular technological environment (computer based operating environment) by using the computer as a tool to perform the abstract idea.
	Since the analysis of Step 2A prong one and prong two results in the conclusion that the claims are directed to an abstract idea, additional analysis under Step 2B of the eligibility inquiry must be conducted in order to determine whether any claim element or combination of elements amounts to significantly more than the judicial exception.
	Step 2B. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional limitations are directed to computer-readable media and a processor, at a very high level of generality and without imposing meaningful limitations on the scope of the claim. In addition, paragraphs 68-76 of the published instant specification describe generic off-the-shelf computer-based elements for implementing the claimed invention, which does not amount to significantly more than the abstract idea and is not enough to transform an abstract idea into eligible subject matter. Such generic, high-level, and nominal involvement of a computer or computer-based elements for carrying out the invention merely serves to tie the abstract idea to a particular technological environment, which is not enough to render the claims patent-eligible, as noted at pg. 74624 of Federal Register/Vol. 79, No. 241, citing Alice, which in turn cites Mayo. See, e.g., Alice Corp. Pty. Ltd. v. CLS Bank Int'l, 134 S. Ct. 2347, 2359-60, 110 USPQ2d 1976, 1984 (2014). See also OIP Techs. v. Amazon.com, 788 F.3d 1359, 1364, 115 USPQ2d 1090, 1093-94 (Fed. Cir. 2015) ("Just as Diehr could not save the claims in Alice, which were directed to 'implement[ing] the abstract idea of intermediated settlement on a generic computer', it cannot save OIP's claims directed to implementing the abstract idea of price optimization on a generic computer.") (citations omitted). See also Affinity Labs of Texas LLC v. DirecTV LLC, 838 F.3d 1253, 1257-1258 (Fed. Cir. 2016) (mere recitation of a GUI does not make a claim patent-eligible); Intellectual Ventures I LLC v. Capital One Bank, 792 F.3d 1363, 1370 (Fed. Cir. 2015) ("the interactive interface limitation is a generic computer element.")
	The additional elements are broadly applied to the abstract idea at a high level of generality ("similar to how the recitation of the computer in the claims in Alice amounted to mere instructions to apply the abstract idea of intermediated settlement on a generic computer," as explained in MPEP § 2106.05(f)) and they operate in a well-understood, routine, and conventional manner. 
MPEP § 2106.05(d)(II) sets forth the following:
The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity.
Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec...; TLI Communications LLC v. AV Auto. LLC...; OIP Techs., Inc., v. Amazon.com, Inc... ; buySAFE, Inc. v. Google, Inc...;
Performing repetitive calculations, Flook ... ; Bancorp Services v. Sun Life...;
Electronic recordkeeping, Alice Corp...; Ultramercial... ;
Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc...;
Electronically scanning or extracting data from a physical document, Content Extraction and Transmission, LLC v. Wells Fargo Bank...; and
A web browser's back and forward button functionality, Internet Patent Corp. v. Active Network, Inc...
	Courts have held computer-implemented processes not to be significantly more than an abstract idea (and thus ineligible) where the claim as a whole amounts to nothing more than generic computer functions merely used to implement an abstract idea, such as an idea that could be done by a human analog (i.e., by hand or by merely thinking).
	In addition, when taken as an ordered combination, the elements add nothing that is not already present when the elements are taken individually. There is no indication that the combination of elements integrates the abstract idea into a practical application. Their collective functions merely provide conventional computer implementation. Therefore, when viewed as a whole, these additional claim elements do not provide meaningful limitations that transform the abstract idea into a practical application, nor does the ordered combination amount to significantly more than the abstract idea itself.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 5, 10, 13, and 18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Toivanen et al. (US Pub. No. 20190213407).


As to claim 1, Toivanen et al. teaches a method comprising: 
receiving documents from one or more databases, the documents including at least one of patents or patent applications (Paragraph 102 discloses the patent documents are retrieved from multiple patent offices thereafter merged to a single data structure, which in this embodiment is a database); 
generating first data representing the documents, the first data distinguishing components of the documents, the components including at least a title portion, an abstract portion, and a claims portion (Paragraph 127 discloses generating a word cloud from the full-text description of patents assigned to a given topic, or from other elements in the patent document such as technology classifications, titles, abstracts, applicants, assignees and so forth); 
generating a user interface (Paragraph 31 discloses a graphical user interface) configured to display: 
the components of individual ones of the documents (Paragraph 103 discloses Each patent document is tokenized, converting patents to vectors using a bag-of-words representation); and 

an element configured to accept user input indicating whether the individual ones of the documents are in class or out of class (Paragraph 110 discloses user can complement data anytime by inputting any similar type of item to the model); 
generating a classification model based at least in part on user input data corresponding to the user input, the classification model trained utilizing at least a first portion of the documents indicated to be in class by the user input data (Paragraph 47 discloses modelling the collection data form by using at least one of an unsupervised, semi-supervised and supervised classification algorithm to accomplish a model); and 
causing the user interface to display an indication of: 
the first portion of the documents marked as in class in response to the user input (Paragraph 63 discloses meta-information can be retrieved efficiently to be displayed at graphical user interface for users and Paragraph 136 discloses The overview display confirms also that all the records have received highest relevance scores in Topic 3, and overall their Topic relevance scores show similarities); 
a second portion of the documents marked as out of class in response to the user input (Paragraph 137 discloses In the GUI, the user can filter all records by statistical threshold value (lower, higher, or equal to) in one or more topics, and thereby isolate collections of documents for closer examination or processing); 
a third portion of the documents determined to be in class utilizing the classification model (Paragraph 149 discloses modelling generated topic area, by using modelling generated classification data, or human curation, the user creates curated collections of data that can be called MyLists or MyCollections); and 
a fourth portion of the documents determined to be out of class utilizing the classification model (Paragraph 50 discloses Using a system assigned or user selected similarity index threshold the process excludes low-scoring user record).
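
For illustration only, the four displayed portions recited in claim 1 can be pictured with the following minimal sketch. It is not taken from the application or from Toivanen et al.; the token-overlap scoring, the 0.5 threshold, and all names (train_model, score, partition) are hypothetical assumptions chosen only to keep the example self-contained.

```python
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenization (an assumed, simplistic choice)."""
    return [w.lower() for w in text.split()]

def train_model(docs, labels):
    """Build a token profile from the documents the user marked in class."""
    profile = Counter()
    for doc_id, in_class in labels.items():
        if in_class:
            profile.update(tokenize(docs[doc_id]))
    return profile

def score(profile, text):
    """Fraction of a document's tokens that occur in the in-class profile."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in profile) / len(tokens)

def partition(docs, labels, threshold=0.5):
    """Split documents into user-marked and model-determined portions."""
    profile = train_model(docs, labels)
    portions = {"user_in": [], "user_out": [], "model_in": [], "model_out": []}
    for doc_id, text in docs.items():
        if doc_id in labels:
            # First and second portions: marked directly by user input.
            key = "user_in" if labels[doc_id] else "user_out"
        else:
            # Third and fourth portions: determined by the trained model.
            key = "model_in" if score(profile, text) >= threshold else "model_out"
        portions[key].append(doc_id)
    return portions

# Hypothetical documents; only A and B carry user input.
docs = {
    "A": "battery electrode coating",
    "B": "garden hose nozzle",
    "C": "battery electrode separator",
    "D": "bicycle frame hinge",
}
labels = {"A": True, "B": False}
portions = partition(docs, labels)
```

In this sketch, documents A and B fall into the user-marked portions, while C and D are assigned by the model according to how many of their tokens overlap with the in-class profile.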

As to claim 5, Toivanen et al. teaches a system, comprising: 
one or more processors (Paragraph 184 discloses a processor); and 
non-transitory computer-readable media storing (Paragraph 98 discloses a storage 126) computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: 
generating first data representing documents received from one or more databases, the first data distinguishing components of the documents (Paragraph 127 discloses generating a word cloud from the full-text description of patents assigned to a given topic, or from other elements in the patent document such as technology classifications, titles, abstracts, applicants, assignees and so forth); 
generating a user interface (Paragraph 31 discloses a graphical user interface) configured to display: 
at least one of the components of individual ones of the documents (Paragraph 103 discloses Each patent document is tokenized, converting patents to vectors using a bag-of-words representation); and 
an element configured to accept user input indicating whether the individual ones of the documents are in class or out of class (Paragraph 50 discloses Using a system assigned or user selected similarity index threshold the process excludes low-scoring user record); 
generating a model based at least in part on user input data corresponding to the user input (Paragraph 47 discloses modelling the collection data form by using at least one of an unsupervised, semi-supervised and supervised classification algorithm to accomplish a model); 
determining, utilizing the model, a first portion of the documents that are in class and a second portion of the documents that are out of class (Paragraph 47 discloses modelling the collection data form by using at least one of an unsupervised, semi-supervised and supervised classification algorithm to accomplish a model); and 
causing the user interface to display an indication of the first portion of the documents and the second portion of the documents (Paragraph 149 discloses modelling generated topic area, by using modelling generated classification data, or human curation, the user creates curated collections of data that can be called MyLists or MyCollections).

As to claim 10, Toivanen et al. teaches the system of claim 5, the operations further comprising causing display, via the user interface, of: 

a first indication of a first number of the documents marked in class (Paragraph 63 discloses meta-information can be retrieved efficiently to be displayed at graphical user interface for users and Paragraph 136 discloses The overview display confirms also that all the records have received highest relevance scores in Topic 3, and overall their Topic relevance scores show similarities);

a second indication of a second number of the documents marked out of class (Paragraph 137 discloses In the GUI, the user can filter all records by statistical threshold value (lower, higher, or equal to) in one or more topics, and thereby isolate collections of documents for closer examination or processing);

a third indication of a third number of the documents determined to be in class utilizing the model (Paragraph 149 discloses modelling generated topic area, by using modelling generated classification data, or human curation, the user creates curated collections of data that can be called MyLists or MyCollections); and

a fourth indication of a fourth number of the documents determined to be out of class utilizing the model (Paragraph 50 discloses Using a system assigned or user selected similarity index threshold the process excludes low-scoring user record).


As to claim 13, Toivanen et al. teaches a method, comprising:

generating first data representing documents received from one or more databases, the first data distinguishing components of the documents (Paragraph 127 discloses generating a word cloud from the full-text description of patents assigned to a given topic, or from other elements in the patent document such as technology classifications, titles, abstracts, applicants, assignees and so forth);

generating a user interface (Paragraph 31 discloses a graphical user interface) configured to display:

the components of individual ones of the documents (Paragraph 103 discloses Each patent document is tokenized, converting patents to vectors using a bag-of-words representation); and 

an element configured to accept user input indicating whether the individual ones of the documents are in class or out of class (Paragraph 110 discloses user can complement data anytime by inputting any similar type of item to the model);

generating a model based at least in part on user input data corresponding to the user input (Paragraph 47 discloses modelling the collection data form by using at least one of an unsupervised, semi-supervised and supervised classification algorithm to accomplish a model);

determining, based at least in part on the model and the user input data, a first portion of the documents that are in class and a second portion of the documents that are out of class (Paragraph 137 discloses In the GUI, the user can filter all records by statistical threshold value (lower, higher, or equal to) in one or more topics, and thereby isolate collections of documents for closer examination or processing); and

causing the user interface to display an indication of first portion of the documents and the second portion of the documents (Paragraph 50 discloses Using a system assigned or user selected similarity index threshold the process excludes low-scoring user record).

With respect to claim 18, it is rejected on grounds corresponding to above rejected claim 10, because claim 18 is substantially equivalent to claim 10.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2-4, 6-8, 12, 14-16 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Toivanen et al. (US Pub. No. 20190213407) in view of Li (US Pub. No. 20200151591).

Toivanen et al. teaches all the limitations of claim 1. With respect to claim 2, Toivanen et al. does not disclose a confidence value.
However, Li teaches the method of claim 1, wherein the user input comprises first user input, the user input data comprises first user input data (Paragraph 76 discloses input from reviewer 250), and the method further comprises: 

determining a first confidence value associated with results of the classification model (Paragraph 155 discloses a confidence score associated with the test predicted output can be generated and Paragraph 157 discloses document classification models); 

receiving second user input indicating classification of at least one document determined to be in class utilizing the classification model (Paragraph 9 discloses the updated CEE formed such that the second input comprises at least the first predicted output and the second predicted output comprises document data); 

causing the classification model to be retrained based at least in part on second user input data corresponding to the second user input (Paragraph 91 discloses d document classification models from Shared Model Learning to attempt to classify the document as a globally shared document class. If this produces a high confidence document class prediction, the prediction may be saved. A human reviewer can later verify this document class prediction (in some instances the prediction may not be and/or cannot be automatically accepted) and decide whether to assign the document to a different document class or create a new document class based on the global document class);

determining a second confidence value associated with results of the classification model as retrained (Paragraph 155 discloses that when the confidence score and the further confidence score are above the predetermined threshold, the first class can be designated as being the same or similar to the second class); and

generating a user interface indicating a trendline representing a change from the first confidence value to the second confidence value, the trendline indicating an increase, a decrease, or no change in confidence associated with the use of the second user input data to retrain the classification model (Paragraph 155 discloses when the confidence score and the further confidence score are above the predetermined threshold, the first class can be designated as being the same or similar to the second class).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Toivanen et al. (natural language processing of collections of documents) with Li (information extraction from documents). This would have improved data classification by allowing the model to be retrained. See Li, Paragraphs 4-8. In addition, the references are analogous art directed to the same field of endeavor: classification models. The close relation between the references strongly suggests a reasonable expectation of success.
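
For illustration only, the trendline limitation of claim 2 (a change from a first confidence value to a second confidence value after retraining) can be pictured with the following minimal sketch. It is not drawn from the application or from Li; treating the confidence value as the mean of per-document scores, the tolerance, and the function names are all hypothetical assumptions.

```python
def mean_confidence(scores):
    """Assumed aggregate: average of per-document confidence scores."""
    return sum(scores) / len(scores)

def confidence_trend(first, second, tolerance=1e-9):
    """Label the change from the first to the second confidence value."""
    delta = second - first
    if abs(delta) <= tolerance:
        return "no change"
    return "increase" if delta > 0 else "decrease"

# Hypothetical per-document scores before and after retraining.
before = mean_confidence([0.6, 0.7, 0.8])
after = mean_confidence([0.8, 0.9, 0.7])
trend = confidence_trend(before, after)
```

A user interface could then render the returned label as the slope of a line between the two values; how the values are actually computed and displayed is left open by this sketch.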


Toivanen et al. teaches all the limitations of claim 1. With respect to claim 3, Toivanen et al. does not disclose a confidence value.
However, Li teaches the method of claim 1, wherein the user input comprises first user input, the user input data comprises first user input data, and the method further comprises:

receiving second user input indicating classification of at least one document determined to be in class utilizing the classification model (Paragraph 9 discloses the updated CEE formed such that the second input comprises at least the first predicted output and the second predicted output comprises document data);

causing the classification model to be retrained based at least in part on second user input data corresponding to the second user input (Paragraph 91 discloses d document classification models from Shared Model Learning to attempt to classify the document as a globally shared document class. If this produces a high confidence document class prediction, the prediction may be saved. A human reviewer can later verify this document class prediction (in some instances the prediction may not be and/or cannot be automatically accepted) and decide whether to assign the document to a different document class or create a new document class based on the global document class);

determining a change in a number of the third portion of the documents marked in class utilizing the classification model as retrained (Paragraph 91 discloses d document classification models from Shared Model Learning to attempt to classify the document as a globally shared document class. If this produces a high confidence document class prediction, the prediction may be saved. A human reviewer can later verify this document class prediction (in some instances the prediction may not be and/or cannot be automatically accepted) and decide whether to assign the document to a different document class or create a new document class based on the global document class); and

generating a user interface indicating an influence value of the second user input on output by the classification model, the influence value indicating that a likelihood that additional user input will have a statistical impact on performance of the classification model (Paragraph 155 discloses when the confidence score and the further confidence score are above the predetermined threshold, the first class can be designated as being the same or similar to the second class).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Toivanen et al. (natural language processing of collections of documents) with Li (information extraction from documents). This would have improved data classification by allowing the model to be retrained. See Li, Paragraphs 4-8. In addition, the references are analogous art directed to the same field of endeavor: classification models. The close relation between the references strongly suggests a reasonable expectation of success.


Toivanen et al. teaches all the limitations of claim 1. With respect to claim 4, Toivanen et al. does not disclose a confidence value.
However, Li teaches the method of claim 1, further comprising:

generating second data indicating a relationship between a first document of the documents and a second document of the documents, the relationship indicating that the first document includes at least one component that is similar to a component of the second document (Paragraph 161 teaches Once these global classes are created, new MLMs can be trained for each of them using the training datasets of the members of each global class. In this way, classes that are common to multiple customers can be inferred, including finding classes that encompass the superset of a cluster of similar classes (e.g. receipts) and classes that are more specific subsets of a global document class (e.g. meal and expense receipts));

determining that the user input data indicates that the first document is in class (Paragraph 161 teaches Once these global classes are created, new MLMs can be trained for each of them using the training datasets of the members of each global class. In this way, classes that are common to multiple customers can be inferred, including finding classes that encompass the superset of a cluster of similar classes (e.g. receipts) and classes that are more specific subsets of a global document class (e.g. meal and expense receipts));

determining that the second document is in class based at least in part on the second data indicating the relationship (Paragraph 161 teaches Once these global classes are created, new MLMs can be trained for each of them using the training datasets of the members of each global class. In this way, classes that are common to multiple customers can be inferred, including finding classes that encompass the superset of a cluster of similar classes (e.g. receipts) and classes that are more specific subsets of a global document class (e.g. meal and expense receipts)); and

wherein the first portion of the documents utilized to train the classification model includes the second document (Paragraph 161 teaches that these scenarios are illustrated in FIG. 7 where two existing document classes or global document classes with a high degree of overlap are either merged to form a new global class (the “union” scenario) or kept as a subclass and superclass (the “subclass” scenario)). 
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Toivanen et al. (natural language processing of collections of documents) with Li (information extraction from documents). This would have improved data classification by allowing the model to be retrained. See Li, Paragraphs 4-8. In addition, the references are analogous art directed to the same field of endeavor: classification models. The close relation between the references strongly suggests a reasonable expectation of success.
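
For illustration only, the relationship limitation of claim 4 (treating a second document as in class because it shares a similar component with a first document that the user marked in class) can be pictured with the following minimal sketch. It is not drawn from the application or from Li; the Jaccard token similarity, the 0.5 cutoff, the single-hop propagation, and all names are hypothetical assumptions.

```python
def jaccard(a, b):
    """Token-set overlap between two text components (assumed metric)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

def related_pairs(docs, component="abstract", min_similarity=0.5):
    """'Second data': pairs of documents whose chosen component is similar."""
    ids = sorted(docs)
    return [
        (x, y)
        for i, x in enumerate(ids)
        for y in ids[i + 1:]
        if jaccard(docs[x][component], docs[y][component]) >= min_similarity
    ]

def expand_training_set(docs, user_in_class, component="abstract",
                        min_similarity=0.5):
    """Add documents related to a user-marked in-class document (one hop)."""
    expanded = set(user_in_class)
    for x, y in related_pairs(docs, component, min_similarity):
        if x in user_in_class:
            expanded.add(y)
        if y in user_in_class:
            expanded.add(x)
    return expanded

# Hypothetical documents; the user marked only P1 as in class.
docs = {
    "P1": {"abstract": "lithium battery electrode coating"},
    "P2": {"abstract": "lithium battery electrode binder"},
    "P3": {"abstract": "garden hose nozzle"},
}
training_ids = expand_training_set(docs, {"P1"})
```

Here P2 joins the training set because its abstract overlaps with that of P1, while P3 does not. Propagation is deliberately limited to documents directly related to a user-marked document.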


With respect to claim 6, it is rejected on grounds corresponding to above rejected claim 2, because claim 6 is substantially equivalent to claim 2.

With respect to claim 7, it is rejected on grounds corresponding to above rejected claim 3, because claim 7 is substantially equivalent to claim 3.

With respect to claim 8, it is rejected on grounds corresponding to above rejected claim 4, because claim 8 is substantially equivalent to claim 4.

Toivanen et al. teaches all the limitations of claim 5. With respect to claim 12, Toivanen et al. does not disclose retraining.
However, Li teaches the system of claim 5, the operations further comprising:

searching, utilizing the model, one or more databases for additional documents determined to be in class by the model (Paragraph 78 discloses documents in various common textual (e.g. word processing, HTML) and image (e.g. JPEG, TIFF) formats can be accepted by document preprocessing engine 230 via various methods such as import from a database);

receiving an instance of at least one additional documents from the one or more databases (Paragraph 78 discloses documents in various common textual (e.g. word processing, HTML) and image (e.g. JPEG, TIFF) formats can be accepted by document preprocessing engine 230 via various methods such as import from a database);

receiving user input indicating classification of the at least one additional documents (Paragraph 100 discloses CEE 205 retrains MLM 220 using the enlarged training dataset, some predictions may be updated. As a result of such updates, a document that is waiting for review can be re-assigned to a different reviewer or a document may be automatically accepted bypassing the review); and 

retraining the model based at least in part on the user input indicating the classification of the additional documents (Paragraph 100 discloses CEE 205 retrains MLM 220 using the enlarged training dataset, some predictions may be updated. As a result of such updates, a document that is waiting for review can be re-assigned to a different reviewer or a document may be automatically accepted bypassing the review).
Therefore, it would have been obvious to a person having ordinary skill in the art, before the effective filing date of the claimed invention, to modify Toivanen et al. (natural language processing of collections of documents) with Li (information extraction from documents). This would have improved data classification by allowing the model to be retrained on newly classified documents. See Li, Paragraphs 4-8. In addition, the references are analogous art directed to the same field of endeavor: classification models. The close relation between the references strongly suggests a reasonable expectation of success.

With respect to claim 14, it is rejected on grounds corresponding to above rejected claim 2, because claim 14 is substantially equivalent to claim 2.

With respect to claim 15, it is rejected on grounds corresponding to above rejected claim 3, because claim 15 is substantially equivalent to claim 3.

With respect to claim 16, it is rejected on grounds corresponding to above rejected claim 4, because claim 16 is substantially equivalent to claim 4.

With respect to claim 20, it is rejected on grounds corresponding to above rejected claim 12, because claim 20 is substantially equivalent to claim 12.

Claim(s) 9 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Toivanen et al. (US Pub. No. 20190213407) in view of de Barsoney et al. (US Patent No. 10467252).

Toivanen et al. teaches all the limitations of claim 5. With respect to claim 9, Toivanen et al. does not disclose a ranking.
However, Barsoney et al. teaches the system of claim 5, the operations further comprising:

determining, for individual ones of the documents marked as in class, a confidence value indicating a degree of confidence that the individual ones of the documents were marked correctly (See Column 10, Lines 23-59, which discloses a confidence interval);

determining a ranking of the individual ones of the documents marked as in class based at least in part on the confidence value (See Column 9, Lines 45-62, which teaches that documents can be given a relevance score by issue during a predictive characterization ranking/scoring process. Upper and lower thresholds with regard to score/ranking can be set to automatically categorize the document as “Not Responsive” or “Responsive” or “Privileged” based on its score; documents that fall between the thresholds set, also called the Grey Zone, can be categorized appropriately or not categorized and submitted to human review); and

causing the user interface to display the individual ones of the documents marked as in class based at least in part on the ranking (See Column 9, Lines 45-62, which teaches that documents can be given a relevance score by issue during a predictive characterization ranking/scoring process. Upper and lower thresholds with regard to score/ranking can be set to automatically categorize the document as “Not Responsive” or “Responsive” or “Privileged” based on its score; documents that fall between the thresholds set, also called the Grey Zone, can be categorized appropriately or not categorized and submitted to human review).
Therefore, it would have been obvious to a person having ordinary skill in the art, before the effective filing date of the claimed invention, to modify Toivanen et al. (natural language processing of collections of documents) with Barsoney et al. (document classification). This would have improved data classification by allowing classified documents to be ranked by confidence. See Barsoney et al., Column 1, Lines 23-46. In addition, the references are analogous art directed to the same field of endeavor: classification models. The close relation between the references strongly suggests a reasonable expectation of success.

With respect to claim 17, it is rejected on grounds corresponding to above rejected claim 9, because claim 17 is substantially equivalent to claim 9.

Claim(s) 11 and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Toivanen et al. (US Pub. No. 20190213407) in view of Tagra et al. (US Pub. No. 20200104367).

Toivanen et al. teaches all the limitations of claim 5. With respect to claim 11, Toivanen et al. does not disclose a ranking.
However, Tagra et al. teaches the system of claim 5, the operations further comprising:

causing display, via the user interface, of: 
a first section indicating first keywords determined to be statistically relevant by the model for identifying the first portion of the documents, wherein the first keywords are displayed in a manner that indicates a first ranking of statistical importance of the first keywords, wherein the first keywords are selectable via user input to be removed from the first section (Paragraph 36 discloses that, more specifically, the words associated with the values selected at step are identified as keywords for the applied content (618). The identified keywords are referred to as context sensitive keywords. In one embodiment, the identified keywords may be implicitly present in the context and not expressly present. Similarly, in one embodiment, the context sensitive keywords may be applied to identify related documents, such as relative electronic messages, or to auto-capture the keywords); 
a second section indicating second keywords determined to be statistically relevant by the model for identifying the second portion of the documents, wherein the second keywords are displayed in a manner that indicates second ranking of the statistical relevance of the second keywords, wherein the second keywords are selectable via user input to be removed from the second section (Paragraph 36 discloses that, more specifically, the words associated with the values selected at step are identified as keywords for the applied content (618). The identified keywords are referred to as context sensitive keywords. In one embodiment, the identified keywords may be implicitly present in the context and not expressly present. Similarly, in one embodiment, the context sensitive keywords may be applied to identify related documents, such as relative electronic messages, or to auto-capture the keywords); and 
based at least in part on receiving the user input indicating that at least one of the first keywords or the second keywords should be removed, retraining the model to account for removal of the at least one of the first keywords or the second keywords (Paragraph 97 discloses the model is dynamically and automatically retrained on next context).

Therefore, it would have been obvious to a person having ordinary skill in the art, before the effective filing date of the claimed invention, to modify Toivanen et al. (natural language processing of collections of documents) with Tagra et al. (natural language processing). This would have improved data classification by allowing keywords to further refine classification. See Tagra et al., Paragraphs 1-8. In addition, the references are analogous art directed to the same field of endeavor: classification models. The close relation between the references strongly suggests a reasonable expectation of success.

With respect to claim 19, it is rejected on grounds corresponding to above rejected claim 11, because claim 19 is substantially equivalent to claim 11.

Relevant Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US PG-PUB 20190318109 is directed to ENTERPRISE DOCUMENT CLASSIFICATION:   [0102] providing a valuation model for automatically estimating a business value of a file. Providing the valuation model may, for example, include training a machine learning algorithm to estimate the business value based on a training set of files each having a known business value. This may include training a machine learning model to recognize files with (known) high business value based on, e.g., ownership, authorship, content, access controls, and so forth. For example, the model may be trained to recognize credit card numbers, social security numbers, or other sensitive information including financial information, personal information, and other sensitive content within files indicative of actual or potential business value. The model may also or instead be trained to recognize potentially sensitive documents based on document type. For example, the model may be trained to classify documents as patent applications, resumes, financial statements, bank statements and so forth, with the corresponding classification used to assign an estimated value as appropriate.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICHOLAS E ALLEN whose telephone number is (571)270-3562. The examiner can normally be reached Monday through Thursday 830-630.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hosain Alam, can be reached at (571) 272-3978. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/N.E.A/ Examiner, Art Unit 2154
/HOSAIN T ALAM/ Supervisory Patent Examiner, Art Unit 2154