DETAILED ACTION
The present application is being examined under the pre-AIA  first to invent provisions. 
This Office Action is in response to the remarks entered on 2/10/2021. Claims 1, 8, 10 and 15 are amended.
Response to Amendment
Applicant’s amendments necessitated new grounds of rejection.         
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of pre-AIA  35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under pre-AIA  35 U.S.C. 103(a) are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims under pre-AIA  35 U.S.C. 103(a), the examiner presumes that the subject matter of the various claims was commonly owned at the time any inventions covered therein were made absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was not commonly owned at the time a later invention was made in order for the examiner to consider the applicability of pre-AIA  35 U.S.C. 103(c) and potential pre-AIA  35 U.S.C. 102(e), (f) or (g) prior art under pre-AIA  35 U.S.C. 103(a).

Claims 1-14 are rejected under 35 U.S.C. 103(a) as being unpatentable over Davis et al. (US Patent No. 7,089,238- hereinafter Davis) in view of Cao et al. (US Patent No. 8,250,008- hereinafter Cao), in view of Goodwin et al. (US Pub. No. 2003/0135818- hereinafter Goodwin), and further in view of Hennig et al. (US Pub. No. 2012/0101965- hereinafter Hennig).
Referring to Claim 1, Davis teaches a method comprising: 
receiving training data that comprises an initial set of relevant documents (see Column 2: lines 23-30; Davis teaches an initial set of documents, which are used for creating a training set. This corresponds to the ‘initial set of relevant documents’); 
generating a coded control set of data based on the training data using coding determinations for the coded control set (see Column 2: lines 34-53; Davis teaches addition of documents by receiving documents from multiple feeds and selecting a portion of them to add to the training set used in production for automatic classification of incoming documents. A categorization engine 211 is used to identify nearest neighbors and calculate similarity and category scores. The category score is higher or lower, corresponding to a degree of confidence in assignment of a particular document to a particular category. Therefore, since the initial training data is used for training the categorization engine, this corresponds to ‘generating a coded control set based on the training data’), the coding determinations comprising automated coding of the relevant documents performed by a predictive coding system (see Column 2: lines 34-53; Davis teaches addition of documents by receiving documents from multiple feeds and selecting a portion of them to add to the training set used in production for automatic classification of incoming documents. A categorization engine 211 is used to identify nearest neighbors and calculate similarity and category scores. The category score is higher or lower, corresponding to a degree of confidence in assignment of a particular document to a particular category. Therefore, since the initial training data is used for training the categorization engine for automatic classification, this corresponds to ‘automated coding of the relevant documents by a predictive coding system’); 
automatically coding additional relevant documents with the coded control set using the predictive coding system (see Column 2: lines 34-53; Davis teaches a categorization engine 211 is used to identify nearest neighbors and calculate similarity and category scores. The category score is higher or lower, corresponding to a degree of confidence in assignment of a particular document to a particular category. Therefore, since the initial training data is used for training the categorization engine for automatic classification, this corresponds to ‘automatic coding of additional relevant documents’);
presenting a relevant document from the additional relevant documents to a human reviewer, the relevant document having automated coding from the predictive coding system (see Column 2: line 51- Column 3 line 3; Davis teaches the process of editorial review for quality control of either random documents or documents that did not meet a confidence threshold. Further, Davis teaches that this editorial review is done by a human user. Therefore, the automated coded document not meeting the threshold for forwarding/presenting it to the reviewer corresponds to the relevant document); 
allowing the human reviewer to correct at least a portion of the automated coding by performing a hard coding correction, the hard coding correction comprising changing of the at least a portion of the automated coding from a first coding to a second coding (see Column 2: lines 54-56: “[d]ocuments verified by editorial review are collected in a verified documents set 214 and used for incremental updating of the training set 223”. Therefore, this is interpreted as the human reviewer performing a correction in the editorial review for quality control);
receiving the hard coding correction to the at least a portion of the automated coding of the relevant document (see Column 2: line 51- Column 3 line 3; Davis teaches the process of editorial review for quality control of either random documents or documents that did not meet a confidence threshold. Further, Davis teaches that this editorial review is done by a human user, wherein the corrected classified document is flagged as being reviewed by the human); 
updating the coded control set with the relevant document having the hard coding correction (see Column 2: lines 54-56; Davis teaches that documents verified by editorial review are collected in a verified documents set 214 and used for incremental updating of the training set 223. Therefore, the coded control set is updated); and
applying the updated coded control set to additional documents to automatically code the additional documents (see Abstract; Davis teaches methods for incrementally updating the accuracy provided by documents in a training set used for automatic categorization. Furthermore, at Column 1: lines 49-51, Davis teaches that once the training set has been retuned, it can be used for categorization of documents. Therefore, the updated training set is used for automatic categorization/coding of additional documents).
However, Davis fails to teach:
that the received training data includes computer-suggested documents from machine learning; and 
fails to explicitly teach:
the hard coding correction comprising changing of the at least a portion of the automated coding from a first coding to a second coding;
identifying contextually similar documents to the initial set of relevant documents utilizing machine learning to automatically detect concepts within the initial set of relevant documents using a statistical analysis of word contexts; and 
supplementing the initial set of relevant documents with the contextually similar documents.
Cao teaches, in an analogous system, the received training data including computer-suggested documents from machine learning (see Column 4: lines 38-43, and Column 6: lines 6-14; Cao teaches machine learning algorithms used to train models for identifying duplicate accounts based on the attribute values of the accounts. The model training system is a data processing system, such as a personal computer or server, that can use a machine-learning algorithm to create a model for identifying duplicate accounts. Furthermore, Cao teaches a model refinement system that classifies pairs of data in the initial training set used to generate the decision tree and then filters the initial training set to remove data records representing non-duplicate accounts that are incorrectly classified as duplicate accounts to generate a filtered training set. The model refinement system then provides the filtered training set to the model training system. Therefore, the machine-learning algorithm corresponds to the claimed ‘machine learning’ and the provided filtered training set corresponds to the ‘received training data including computer-suggested documents’).
It would have been obvious to one of ordinary skill in the art at the time of the invention to modify the teachings of Davis with the above teachings of Cao by receiving training data for generating a coded control set of data, as taught by Davis, wherein the training data includes computer-suggested documents from machine learning, as taught by Cao. The modification would have been obvious because one of ordinary skill in the art would be motivated to use machine learning algorithms to increase the accuracy of the models trained (Cao at Column 5: 45-47).
Even though Davis teaches the concept of a human reviewer performing a hard coding correction during the editorial review, Goodwin explicitly teaches, in an analogous system, the hard coding correction comprising changing of the at least a portion of the automated coding from a first coding to a second coding (see [0026]; Goodwin teaches “categories may be changed or other categories may be assigned to the document by, for example, a system administrator or other user, after the document's creation” and “[a]n actions record may be maintained, step 106. The actions record may identify actions that have been performed on a document by a user, time of action, duration of action, and other criteria”. Moreover, at [0034]- “[c]ategory assigning module 302 may enable one or more categories to be assigned to a document. The categories may be, for example, assigned when the document is created. Alternatively, a system administrator and/or one or more users may be granted rights for assigning, changing, or deleting categories assigned to one or more documents”. Therefore, since Goodwin teaches that a system administrator or user can change the previous category of a document, this corresponds to the claimed hard correction comprising a change of one code to another).
It would have been obvious to one of ordinary skill in the art at the time of the invention to modify the combination of Davis and Cao with the above teachings of Goodwin by receiving training data for generating a coded control set of data wherein the training data includes computer-suggested documents from machine learning, as taught by Davis and Cao, and allowing a human reviewer to correct the coding by changing coding from a first coding to a second coding, as taught by Goodwin. The modification would have been obvious because one of ordinary skill in the art would be motivated to change the category of a previously classified document to a new one for correcting and updating the accuracy provided by documents in a training set used for automatic categorization (as suggested by Davis at Abstract) and to maintain a record of user changes to the document classifications (as suggested by Goodwin at [0026]).
Hennig teaches, in an analogous system,
identifying contextually similar documents to the initial set of relevant documents utilizing machine learning to automatically detect concepts within the initial set of relevant documents using a statistical analysis of word contexts (see [0041]; Hennig teaches “the distribution of probabilities of the corpus/topic distribution prediction may be compared with a distribution of probabilities of a document/topic distribution of an input document to determine whether the input document concerns similar topics as the document corpus. In another example, one or more documents within the document corpus having semantically similar topics may be grouped as determined by the topic model”. Moreover, at [0049]; Hennig teaches “In one example, the training component 306 may be configured to group one or more documents within the document corpus 302 having semantically similar topics as determined by the topic model 308”. Therefore, the comparison of a document with the corpus using probabilities of the corpus/topic distribution to determine if they are semantically similar corresponds to the claimed identification of contextually similar documents); and 
supplementing the initial set of relevant documents with the contextually similar documents (see [0036] and Fig. 3; Hennig teaches “an addition of a new document to the document corpus may be detected. A new document representation of the new document and new features of the new document may be processed using the topic model. In this way, the topic model may be updated based upon the processing of the new document (e.g., the topic model may be trained by updating current parameters with parameters specified during the processing of the new document representation and/or the new features)”. Moreover, Fig. 3 shows the updating of the topic model by the training component. Therefore, this updating of the topic model with new documents based on similarity corresponds to the claimed ‘supplementing the initial set of relevant documents’).
It would have been obvious to one of ordinary skill in the art at the time of the invention to modify the combination of Davis, Cao and Goodwin with the above teachings of Hennig by receiving training data for generating a coded control set of data wherein the training data includes computer-suggested documents from machine learning and allowing a human reviewer to correct the coding by changing coding from a first coding to a second coding, as taught by Davis, Cao and Goodwin, and supplementing the initial set of relevant documents, as taught by Hennig. The modification would have been obvious because one of ordinary skill in the art would be motivated to update one or more current feature/topic parameters with specified feature/topic parameters in the training of the topic model during sequential processing of document representations and/or features (as suggested by Hennig at [0032]).

Referring to Claim 2, Davis teaches the method according to claim 1, further comprising applying the coding determinations of the coded control set to contextually similar data in a corpus of documents (see Column 4: lines 60-67; Davis teaches the similar document window 503 provides information about documents similar to the selected document. For k nearest neighbor coding, this panel provides access to nearest neighbors of record. In FIG. 5A, the similar document window 503 displays the similar documents list view. In this view, the similarity column displays a similarity score, which reflects the similarity of the listed documents to the selected document; therefore, since a similarity score is determined, this corresponds to ‘applying the determinations to contextually similar data’).

Referring to Claim 3, Davis teaches the method according to claim 1, further comprising receiving a review of each document in the updated coded control set to ensure that proper coding has been implemented (see Column 4: lines 42-49; Davis teaches the status column provides information regarding confidence in coding of a document. "Okay" may be used to indicate that a document has been correctly categorized; "missing" may be used to indicate that a document with a high score has not been assigned to a topic; and "suspicious" may indicate that a document with a low score has been assigned to the topic. "Missing" and "suspicious" documents may be referred to a human for editorial review).

Referring to Claim 4, Davis teaches the method according to claim 1, wherein generating a coded control set of data further comprises generating the coded control set of data based on a plurality of uncoded documents from a corpus, the plurality of uncoded documents comprising documents that are relevant or selected randomly (see Column 2: lines 28-30; Davis teaches “[u]ncoded documents 101 are loaded and registered 102 into a workfile. A user codes the documents to create a training set”. Moreover, at Column 2: lines 41-47: “[t]he documents 201 may be coded or uncoded. An input queue 202 may be used to organize addition of documents 201 to the training set, for instance, when a news dissemination service is receiving documents from multiple feeds and selecting a portion of them to add to the training set used in production for automatic classification of incoming documents”. Therefore, the coded control set is based on uncoded documents from a corpus, and since the documents are selected as a portion from multiple feeds this corresponds to the documents being relevantly or randomly selected).

Referring to Claim 5, Davis teaches the method according to claim 4, wherein the portion of the documents are randomly sampled from an un-reviewed document population (see Column 2: lines 56-62; Davis teaches editorial review, for quality control or other purposes, may also include a random sample 212 of documents that were above a confidence threshold during coding. Selection of a random sample 212 for editorial review balances addition to the training set of difficult cases, with low confidence scores, and easier cases, with higher confidence scores. Therefore, this corresponds to ‘selecting a portion of documents that are selected randomly’).

Referring to Claim 6, Davis teaches the method according to claim 1, further comprising receiving a determination if the document coded using the coded control set was miscoded, prior to the step of receiving the hard coding correction to the document (see Column 4: lines 42-49; Davis teaches the status column provides information regarding confidence in coding of a document. "Okay" may be used to indicate that a document has been correctly categorized; "missing" may be used to indicate that a document with a high score has not been assigned to a topic; and "suspicious" may indicate that a document with a low score has been assigned to the topic. "Missing" and "suspicious" documents may be referred to a human for editorial review; therefore, the miscoded determination is made prior to the hard coding correction by the editorial reviewer).

Referring to Claim 7, Davis teaches the method according to claim 1, further comprising updating the training data with each new coded document created using the coded control set (see Abstract; Davis teaches methods for incrementally updating the accuracy provided by documents in a training set used for automatic categorization. Furthermore, at Column 1: lines 49-51, Davis teaches that once the training set has been retuned, it can be used for categorization of documents. Therefore, the retuned training set corresponds to the ‘updated training data using the coded control set’).

Referring to Claim 8, Davis teaches a method comprising: 
receiving an initial set of relevant documents (see Column 2: lines 23-30; Davis teaches an initial set of documents, which are used for creating a training set. This corresponds to the ‘initial set of relevant documents’);
receiving coded documents from document reviewer of a plurality of document reviewers (see Column 2: lines 40-47; Davis teaches adding documents to an established training set. The documents 201 may be coded or uncoded. An input queue 202 may be used to organize addition of documents 201 to the training set, for instance, when a news dissemination service is receiving documents from multiple feeds and selecting a portion of them to add to the training set used in production for automatic classification of incoming documents); 
updating the initial set of relevant documents with the coded documents (see Column 2: lines 40-47; Davis teaches adding documents to an established training set. The documents 201 may be coded or uncoded. An input queue 202 may be used to organize addition of documents 201 to the training set, for instance, when a news dissemination service is receiving documents from multiple feeds and selecting a portion of them to add to the training set used in production for automatic classification of incoming documents. Therefore, since the documents are added, the initial set is updated); 
receiving a coding correction to at least one of the coded documents from a human reviewer (see Column 2: line 51- Column 3 line 3; Davis teaches the process of editorial review for quality control of either random documents or documents that did not meet a confidence threshold. Further, Davis teaches that this editorial review is done by a human user, wherein the corrected classified document is flagged as being reviewed by the human); and 
updating the initial set of relevant documents with the coding correction (see Column 2: lines 54-56; Davis teaches that documents verified by editorial review are collected in a verified documents set 214 and used for incremental updating of the training set 223); and
automatically coding additional documents using a predictive coding system that is configured to automatically code the additional documents using the updated initial set of relevant documents with the hard coding correction (see Abstract; Davis teaches methods for incrementally updating the accuracy provided by documents in a training set used for automatic categorization. Furthermore, at Column 1: lines 49-51, Davis teaches that once the training set has been retuned, it can be used for categorization of documents. Therefore, the updated training set is used for automatic categorization/coding of additional documents).
However, Davis fails to teach that the received initial set includes computer-suggested documents from machine learning; 
identifying contextually similar documents to the initial set of relevant documents utilizing machine learning to automatically detect concepts within the initial set of relevant documents using a statistical analysis of word contexts; and 
supplementing the initial set of relevant documents with the contextually similar documents.
Cao teaches, in an analogous system, the received initial set including computer-suggested documents from machine learning (see Column 4: lines 38-43, and Column 6: lines 6-14; Cao teaches machine learning algorithms used to train models for identifying duplicate accounts based on the attribute values of the accounts. The model training system is a data processing system, such as a personal computer or server, that can use a machine-learning algorithm to create a model for identifying duplicate accounts. Furthermore, Cao teaches a model refinement system that classifies pairs of data in the initial training set used to generate the decision tree and then filters the initial training set to remove data records representing non-duplicate accounts that are incorrectly classified as duplicate accounts to generate a filtered training set. The model refinement system then provides the filtered training set to the model training system. Therefore, the machine-learning algorithm corresponds to the claimed ‘machine learning’ and the provided filtered training set corresponds to the ‘received set of relevant computer suggested documents’).
It would have been obvious to one of ordinary skill in the art at the time of the invention to modify the teachings of Davis with the above teachings of Cao by receiving training data for generating a coded control set of data, as taught by Davis, wherein the training data includes computer-suggested documents from machine learning, as taught by Cao. The modification would have been obvious because one of ordinary skill in the art would be motivated to use machine learning algorithms to increase the accuracy of the models trained (Cao at Column 5: 45-47).
Even though Davis teaches the concept of a human reviewer performing a hard coding correction during the editorial review, Goodwin explicitly teaches, in an analogous system, a coding correction (see [0026]; Goodwin teaches “categories may be changed or other categories may be assigned to the document by, for example, a system administrator or other user, after the document's creation” and “[a]n actions record may be maintained, step 106. The actions record may identify actions that have been performed on a document by a user, time of action, duration of action, and other criteria”. Moreover, at [0034]- “[c]ategory assigning module 302 may enable one or more categories to be assigned to a document. The categories may be, for example, assigned when the document is created. Alternatively, a system administrator and/or one or more users may be granted rights for assigning, changing, or deleting categories assigned to one or more documents”. Therefore, since Goodwin teaches that a system administrator or user can change the previous category of a document, this corresponds to the claimed hard correction comprising a change of one code to another).
It would have been obvious to one of ordinary skill in the art at the time of the invention to modify the combination of Davis and Cao with the above teachings of Goodwin by receiving training data for generating a coded control set of data wherein the training data includes computer-suggested documents from machine learning, as taught by Davis and Cao, and allowing a human reviewer to correct the coding by changing coding from a first coding to a second coding, as taught by Goodwin. The modification would have been obvious because one of ordinary skill in the art would be motivated to change the category of a previously classified document to a new one for correcting and updating the accuracy provided by documents in a training set used for automatic categorization (as suggested by Davis at Abstract) and to maintain a record of user changes to the document classifications (as suggested by Goodwin at [0026]).
Hennig teaches, in an analogous system,
identifying contextually similar documents to the initial set of relevant documents utilizing machine learning to automatically detect concepts within the initial set of relevant documents using a statistical analysis of word contexts (see [0041]; Hennig teaches “the distribution of probabilities of the corpus/topic distribution prediction may be compared with a distribution of probabilities of a document/topic distribution of an input document to determine whether the input document concerns similar topics as the document corpus. In another example, one or more documents within the document corpus having semantically similar topics may be grouped as determined by the topic model”. Moreover, at [0049]; Hennig teaches “In one example, the training component 306 may be configured to group one or more documents within the document corpus 302 having semantically similar topics as determined by the topic model 308”. Therefore, the comparison of a document with the corpus using probabilities of the corpus/topic distribution to determine if they are semantically similar corresponds to the claimed identification of contextually similar documents); and 
supplementing the initial set of relevant documents with the contextually similar documents (see [0036] and Fig. 3; Hennig teaches “an addition of a new document to the document corpus may be detected. A new document representation of the new document and new features of the new document may be processed using the topic model. In this way, the topic model may be updated based upon the processing of the new document (e.g., the topic model may be trained by updating current parameters with parameters specified during the processing of the new document representation and/or the new features)”. Moreover, Fig. 3 shows the updating of the topic model by the training component. Therefore, this updating of the topic model with new documents based on similarity corresponds to the claimed ‘supplementing the initial set of relevant documents’).
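It would have been obvious to one of ordinary skill in the art at the time of the invention to modify the combination of Davis, Cao and Goodwin with the above teachings of Hennig by receiving training data for generating a coded control set of data wherein the training data includes computer-suggested documents from machine learning and allowing a human reviewer to correct the coding by changing coding from a first coding to a second coding, as taught by Davis, Cao and Goodwin, and supplementing the initial set of relevant documents, as taught by Hennig. The modification would have been obvious because one of ordinary skill in the art would be motivated to update one or more current feature/topic parameters with specified feature/topic parameters in the training of the topic model during sequential processing of document representations and/or features (as suggested by Hennig at [0032]).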

Referring to Claim 9, Davis teaches the method according to claim 8, further comprising providing a computer-generated judgment on the coded documents with each having an explicit confidence score (see Column 4: lines 42-49; Davis teaches the status column provides information regarding confidence in coding of a document. "Okay" may be used to indicate that a document has been correctly categorized; "missing" may be used to indicate that a document with a high score has not been assigned to a topic; and "suspicious" may indicate that a document with a low score has been assigned to the topic. "Missing" and "suspicious" documents may be referred to a human for editorial review).

Referring to Claim 10, Davis teaches a method comprising: 
transmitting a first set of documents of a corpus for coding, the first set of documents being automatically coded (see Column 4: lines 42-49; Davis teaches the status column provides information regarding confidence in coding of a document. "Okay" may be used to indicate that a document has been correctly categorized; "missing" may be used to indicate that a document with a high score has not been assigned to a topic; and "suspicious" may indicate that a document with a low score has been assigned to the topic. "Missing" and "suspicious" documents may be referred to a human for editorial review. Therefore, these are automatically categorized/coded, and those under a certain score, are then transferred for human review); 
updating a set of relevant documents with the first set of documents hard coded by a human (see Column 2: lines 54-56; Davis teaches that documents verified by editorial review are collected in a verified documents set 214 and used for incremental updating of the training set 223); and 
transmitting another set of documents of the corpus based on the updated set of relevant documents (see Abstract; Davis teaches methods for incrementally updating the accuracy provided by documents in a training set used for automatic categorization. Furthermore, at Column 1: lines 49-51, Davis teaches that once the training set has been retuned, it can be used for categorization of documents. Therefore, the categorization of documents for automatic categorization corresponds to the ‘another set of documents’); 
automatically coding the another set of documents using a predictive coding system that is configured to automatically code the another set of documents using the updated set of relevant documents with the first set of documents hard coded by the human (see Abstract; Davis teaches methods for incrementally updating the accuracy provided by documents in a training set used for automatic categorization. Furthermore, at Column 1: lines 49-51, Davis teaches that once the training set has been retuned, it can be used for categorization of documents. Therefore, automatic categorization/coding is done using the automatic categorization engine being already updated);
presenting a relevant document from the another set of documents to a human reviewer (see Column 2: line 51- Column 3 line 3; Davis teaches the process of editorial review for quality control of either random documents or documents that did not meet a confidence threshold. Further, Davis teaches that this editorial review is done by a human user. Therefore, the automated coded document not meeting the threshold for forwarding/presenting it to the reviewer corresponds to the relevant document);  
allowing the human reviewer to correct the automated coding of the relevant document by performing a hard coding correction, the hard coding correction comprising changing at least a portion of the automated coding from a first coding to a second coding (see Column 2: lines 54-56: “[d]ocuments verified by editorial review are collected in a verified documents set 214 and used for incremental updating of the training set 223”. Therefore, this is interpreted as the human reviewer performing a correction in the editorial review for quality control); 
receiving the hard coding correction to at least a portion of the automated coding of the relevant document (see Column 2: line 51- Column 3 line 3; Davis teaches the process of editorial review for quality control of either random documents or documents that did not meet a confidence threshold. Further, Davis teaches that this editorial review is done by a human user, wherein the corrected classified document is flagged as being reviewed by the human); and 
updating the set of relevant documents with the relevant document having the hard coding correction (see Column 2: lines 54-56; Davis teaches that documents verified by editorial review are collected in a verified documents set 214 and used for incremental updating of the training set 223. Therefore, the coded control set is updated).

However, Davis fails to explicitly teach that the received training data includes computer-suggested documents from machine learning; and 
further fails to explicitly teach:
the hard coding correction comprising changing of the at least a portion of the automated coding from a first coding to a second coding;
identifying contextually similar documents to the initial set of relevant documents utilizing machine learning to automatically detect concepts within the initial set of relevant documents using a statistical analysis of word contexts; and 
supplementing the initial set of relevant documents with the contextually similar documents.
Cao teaches, in an analogous system, including computer-suggested documents from machine learning (see Column 4: lines 38-43, and Column 6: lines 6-14; Cao teaches machine learning algorithms used to train models for identifying duplicate accounts based on the attribute values of the accounts. The model training system is a data processing system, such as a personal computer or server, that can use a machine-learning algorithm to create a model for identifying duplicate accounts. Furthermore, Cao teaches a model refinement system that classifies pairs of data in the initial training set used to generate the decision tree and then filters the initial training set to remove data records representing non-duplicate accounts that are incorrectly classified as duplicate accounts to generate a filtered training set. The model refinement system then provides the filtered training set to the model training system. Therefore, the machine-learning algorithm corresponds to the claimed ‘machine learning’ and the provided filtered training set corresponds to the ‘computer suggested documents’).
It would have been obvious to one of ordinary skill in the art at the time of the invention to modify the teachings of Davis with the above teachings of Cao by receiving training data for generating a coded control set of data, as taught by Davis, wherein the training data includes computer-suggested documents from machine learning, as taught by Cao. The modification would have been obvious because one of ordinary skill in the art would be motivated to use machine learning algorithms to increase the accuracy of the models trained (Cao at Column 5: 45-47).
Even though Davis teaches the concept of a human reviewer performing a hard coding correction during the editorial review, Goodwin explicitly teaches, in an analogous system, the hard coding correction comprising changing of the at least a portion of the automated coding from a first coding to a second coding (see [0026]; Goodwin teaches “categories may be changed or other categories may be assigned to the document by, for example, a system administrator or other user, after the document's creation” and “[a]n actions record may be maintained, step 106. The actions record may identify actions that have been performed on a document by a user, time of action, duration of action, and other criteria”. Moreover, at [0034]- “[c]ategory assigning module 302 may enable one or more categories to be assigned to a document. The categories may be, for example, assigned when the document is created. Alternatively, a system administrator and/or one or more users may be granted rights for assigning, changing, or deleting categories assigned to one or more documents”. Therefore, since Goodwin teaches that a system administrator or user can change the previous category of a document, this corresponds to the claimed hard correction comprising a change of one code to another).
It would have been obvious to one of ordinary skill in the art at the time of the invention to modify the combination of Davis and Cao with the above teachings of Goodwin by receiving training data for generating a coded control set of data wherein the training data includes computer-suggested documents from machine learning, as taught by Davis and Cao, and allowing a human reviewer to correct the coding by changing coding from a first coding to a second coding, as taught by Goodwin. The modification would have been obvious because one of ordinary skill in the art would be motivated to change the category of a previously classified document to a new one for correction and for updating the accuracy provided by documents in a training set used for automatic categorization (as suggested by Davis at Abstract), and to maintain a record of user changes to the document classifications (as suggested by Goodwin at [0026]).
Hennig teaches, in an analogous system,
identifying contextually similar documents to the initial set of relevant documents utilizing machine learning to automatically detect concepts within the initial set of relevant documents using a statistical analysis of word contexts (see [0041]; Hennig teaches “the distribution of probabilities of the corpus/topic distribution prediction may be compared with a distribution of probabilities of a document/topic distribution of an input document to determine whether the input document concerns similar topics as the document corpus. In another example, one or more documents within the document corpus having semantically similar topics may be grouped as determined by the topic model”. Moreover, at [0049]; Hennig teaches “In one example, the training component 306 may be configured to group one or more documents within the document corpus 302 having semantically similar topics as determined by the topic model 308”. Therefore, the comparison of a document with the corpus using probabilities of the corpus/topic distribution to determine if they are semantically similar corresponds to the claimed identification of contextually similar documents); and 
supplementing the initial set of relevant documents with the contextually similar documents (see [0036] and Fig. 3; Hennig teaches “an addition of a new document to the document corpus may be detected. A new document representation of the new document and new features of the new document may be processed using the topic model. In this way, the topic model may be updated based upon the processing of the new document (e.g., the topic model may be trained by updating current parameters with parameters specified during the processing of the new document representation and/or the new features)”. Moreover, Fig. 3 shows the updating of the topic model by the training component. Therefore, this updating of the topic model with new documents based on similarity corresponds to the claimed ‘supplementing the initial set of relevant documents’).
It would have been obvious to one of ordinary skill in the art at the time of the invention to modify the combination of Davis, Cao and Goodwin with the above teachings of Hennig by receiving training data for generating a coded control set of data wherein the training data includes computer-suggested documents from machine learning and allowing a human reviewer to correct the coding by changing coding from a first coding to a second coding, as taught by Davis, Cao and Goodwin, and supplementing the initial set of relevant documents, as taught by Hennig. The modification would have been 

Referring to Claim 11, Davis teaches the method according to claim 10, further comprising: 
receiving a coding correction for at least one of the first set of coded documents (see Column 4: lines 42-49; Davis teaches the status column provides information regarding confidence in coding of a document. "Okay" may be used to indicate that a document has been correctly categorized; "missing" may be used to indicate that a document with a high score has not been assigned to a topic; and "suspicious" may indicate that a document with a low score has been assigned to the topic. "Missing" and "suspicious" documents may be referred to a human for editorial review. Therefore, already categorized/coded documents are then verified/corrected); and 
updating the set of relevant documents with the coding correction, wherein the coding correction comprises a hard coding correction (see Column 2: lines 54-56; Davis teaches that documents verified by editorial review are collected in a verified documents set 214 and used for incremental updating of the training set 223).

Referring to Claim 12, Davis teaches the method according to claim 11, further comprising determining if the at least one of the first set of documents was miscoded, prior to the step of receiving the hard coding correction for the at least one of the first set of coded documents (see Column 4: lines 42-49; Davis teaches the status column provides information regarding confidence in coding of a document. "Okay" may be used to indicate that a document has been correctly categorized; "missing" may be used to indicate that a document with a high score has not been assigned to a topic; and "suspicious" may indicate that a document with a low score has been assigned to the topic. "Missing" and "suspicious" documents may be referred to a human for editorial review; therefore, the miscoding determination occurs prior to a hard coding correction by the editorial reviewer).

Referring to Claim 13, Davis teaches the method according to claim 11, wherein the first set of documents are selected based on comparison documents in the corpus with an initial set of relevant documents (see Column 2: lines 34-53; Davis teaches generating nearest neighbor and similarity measures for categorization of further documents. Addition of documents by receiving documents from multiple feeds and selecting a portion of them to add to the training set used in production for automatic classification of incoming documents. A categorization engine 211 is used to identify nearest neighbors and calculate similarity and category scores. Therefore, since similarity measures for categorization is done, this corresponds to ‘select documents based on comparison documents’).

Referring to Claim 14, Davis teaches the method according to claim 11, further comprising: 
automatically coding a document using a coded control set of data (see Column 4: lines 42-49; Davis teaches the status column provides information regarding confidence in coding of a document. "Okay" may be used to indicate that a document has been correctly categorized; "missing" may be used to indicate that a document with a high score has not been assigned to a topic; and "suspicious" may indicate that a document with a low score has been assigned to the topic. "Missing" and "suspicious" documents may be referred to a human for editorial review; therefore, the miscoding determination occurs prior to a hard coding correction by the editorial reviewer); and 
providing the document to a reviewer if the document is incorrectly coded, prior to a step of receiving a hard coding correction to the document (see Column 4: lines 42-49; Davis teaches the status column provides information regarding confidence in coding of a document. "Okay" may be used to indicate that a document has been correctly categorized; "missing" may be used to indicate that a document with a high score has not been assigned to a topic; and "suspicious" may indicate that a document with a low score has been assigned to the topic. "Missing" and "suspicious" documents may be referred to a human for editorial review; therefore, the miscoding determination occurs prior to a hard coding correction by the editorial reviewer).

Claims 15-20 are rejected under 35 U.S.C. 103(a) as being unpatentable over Davis et al. (US Patent No. 7,089,238- hereinafter Davis) in view of Goodwin et al. (US Pub. No. 2003/0135818- hereinafter Goodwin), and further in view of Hennig et al. (US Pub. No. 2012/0101965- hereinafter Hennig).
Referring to Claim 15, Davis teaches a method of analyzing documents for e-Discovery document review, comprising: 
generating, by a document coding system, an initial control set of documents from a corpus of documents to be coded (see Column 2: lines 23-30; Davis teaches an initial set of documents, which are used for creating a training set. This corresponds to ‘initial control set’); 
generating a coding method based on the initial control set of documents and a set parameter (see Column 2: lines 23-32; Davis teaches uncoded documents 101 are loaded and registered 102 into a workfile. A user codes the documents to create a training set. The user may begin with a set of topics and add or delete topics 111 to the topic taxonomy. Individual documents can have topic assignments added or removed 112; therefore, this coded training set according to the taxonomy corresponds to the ‘initial control set and set parameter’); 
coding, based on the coding method, at least one document of the corpus of documents (see Column 2: lines 34-53; Davis teaches addition of documents by receiving documents from multiple feeds and selecting a portion of them to add to the training set used in production for automatic classification of incoming documents. A categorization engine 211 is used to identify nearest neighbors and calculate similarity and category scores. The category score is higher or lower, corresponding to a degree of confidence in assignment of a particular document to a particular category. Therefore, since the initial training data is used for training the categorization engine for automatic classification, this corresponds to ‘coding one document’), the coding comprising automated coding to the at least one document of the corpus of documents by the document coding system (see Column 2: lines 34-53; Davis teaches addition of documents by receiving documents from multiple feeds and selecting a portion of them to add to the training set used in production for automatic classification of incoming documents. A categorization engine 211 is used to identify nearest neighbors and calculate similarity and category scores. The category score is higher or lower, corresponding to a degree of confidence in assignment of a particular document to a particular category. Therefore, since the initial training data is used for training the categorization engine for automatic classification, this corresponds to ‘automated coding of the relevant documents by a predictive coding system’); 
validating the at least one coded document based on a received hard coding from a human reviewer (see Column 2: line 51- Column 3 line 3; Davis teaches the process of editorial review for quality control of either random documents or documents that did not meet a confidence threshold. Further, Davis teaches that this editorial review is done by a human user, wherein the corrected classified document is flagged as being reviewed by the human); and 
based on the validating, updating the coding method (see Column 2: lines 54-56; Davis teaches that documents verified by editorial review are collected in a verified documents set 214 and used for incremental updating of the training set 223); and
automatically coding additional documents using the document coding system that is configured to automatically code the additional documents using the at least one coded document validated by the human reviewer (see Abstract; Davis teaches methods for incrementally updating the accuracy provided by documents in a training set used for automatic categorization. Furthermore, at Column 1: lines 49-51, Davis teaches that once the training set has been retuned, it can be used for categorization of documents. Therefore, the updated training set is used for automatic categorization/coding of additional documents);
allowing the human reviewer to correct at least a portion of the automated coding of the additional documents by performing a hard coding correction, the hard coding correction comprising changing of the at least a portion of the automated coding of the additional documents from a first coding to a second coding (see Column 2: lines 54-56: “[d]ocuments verified by editorial review are collected in a verified documents set 214 and used for incremental updating of the training set 223”. Therefore, this is interpreted as the human reviewer performing a correction in the editorial review for quality control); and 
updating the coding method with the additional documents having the hard coding correction (see Column 2: lines 54-56; Davis teaches that documents verified by editorial review are collected in a verified documents set 214 and used for incremental updating of the training set 223. Therefore, the coded control set is updated).
However, Davis fails to explicitly teach:
the hard coding correction comprising changing of the at least a portion of the automated coding from a first coding to a second coding;
identifying contextually similar documents to the initial set of relevant documents utilizing machine learning to automatically detect concepts within the initial set of relevant documents using a statistical analysis of word contexts; and 
supplementing the initial set of relevant documents with the contextually similar documents.
Goodwin teaches, in an analogous system, the hard coding correction comprising changing of the at least a portion of the automated coding from a first coding to a second coding (see [0026]; Goodwin teaches “categories may be changed or other categories may be assigned to the document by, for example, a system administrator or other user, after the document's creation” and “[a]n actions record may be maintained, step 106. The actions record may identify actions that have been performed on a document by a user, time of action, duration of action, and other criteria”. Moreover, at [0034]- “[c]ategory assigning module 302 may enable one or more categories to be assigned to a document. The categories may be, for example, assigned when the document is created. Alternatively, a system administrator and/or one or more users may be granted rights for assigning, changing, or deleting categories assigned to one or more documents”. Therefore, since Goodwin teaches that a system administrator or user can change the previous category of a document, this corresponds to the claimed hard correction comprising a change of one code to another).
It would have been obvious to one of ordinary skill in the art at the time of the invention to modify the teachings of Davis with the above teachings of Goodwin by receiving training data for generating a coded control set of data, as taught by Davis, and allowing a human reviewer to correct the coding by changing coding from a first coding to a second coding, as taught by Goodwin. The modification would have been obvious because one of ordinary skill in the art would be motivated to change the category of a previous classified document to a new one for correction and updating the accuracy 
Hennig teaches, in an analogous system,
identifying contextually similar documents to the initial set of relevant documents utilizing machine learning to automatically detect concepts within the initial set of relevant documents using a statistical analysis of word contexts (see [0041]; Hennig teaches “the distribution of probabilities of the corpus/topic distribution prediction may be compared with a distribution of probabilities of a document/topic distribution of an input document to determine whether the input document concerns similar topics as the document corpus. In another example, one or more documents within the document corpus having semantically similar topics may be grouped as determined by the topic model”. Moreover, at [0049]; Hennig teaches “In one example, the training component 306 may be configured to group one or more documents within the document corpus 302 having semantically similar topics as determined by the topic model 308”. Therefore, the comparison of a document with the corpus using probabilities of the corpus/topic distribution to determine if they are semantically similar corresponds to the claimed identification of contextually similar documents); and 
supplementing the initial set of relevant documents with the contextually similar documents (see [0036] and Fig. 3; Hennig teaches “an addition of a new document to the document corpus may be detected. A new document representation of the new document and new features of the new document may be processed using the topic model. In this way, the topic model may be updated based upon the processing of the new document (e.g., the topic model may be trained by updating current parameters with parameters specified during the processing of the new document representation and/or the new features)”. Moreover, Fig. 3 shows the updating of the topic model by the training component. Therefore, this updating of the topic model with new documents based on similarity corresponds to the claimed ‘supplementing the initial set of relevant documents’).
It would have been obvious to one of ordinary skill in the art at the time of the invention to modify the combination of Davis and Goodwin with the above teachings of Hennig by receiving training data for generating a coded control set of data wherein the training data includes computer-suggested documents from machine learning and allowing a human reviewer to correct the coding by changing coding from a first coding to a second coding, as taught by Davis and Goodwin, and supplementing the initial set of relevant documents, as taught by Hennig. The modification would have been obvious because one of ordinary skill in the art would be motivated to update one or more current feature/topic parameters with specified feature/topic parameters in the training of the topic model during sequential processing of document representations and/or features (as suggested by Hennig at [0032]).

Referring to Claim 16, Davis teaches the method of claim 15, wherein generating the initial control set of documents is based on a received coding category including a hard coding (see Column 2: lines 54-56; Davis teaches that documents verified by editorial review are collected in a verified documents set 214 and used for incremental updating of the training set 223. Therefore, the training set is based on receiving the hard coding).

Referring to Claim 17, Davis teaches the method of claim 16, wherein generating a coding method based on the initial control set of documents and a set parameter further comprises: 
analyzing the initial control set of documents to generate the set parameter associated with the received coding category (see Column 2: lines 23-32; Davis teaches uncoded documents 101 are loaded and registered 102 into a workfile. A user codes the documents to create a training set. The user may begin with a set of topics and add or delete topics 111 to the topic taxonomy. Individual documents can have topic assignments added or removed 112; therefore, this coded training set according to the taxonomy corresponds to the ‘initial control set and set parameter’).

Referring to Claim 18, Davis teaches the method of claim 15, further comprising: 
coding at least one other document based on the updated coding method (see Abstract; Davis teaches methods for incrementally updating the accuracy provided by documents in a training set used for automatic categorization. Furthermore, at Column 1: lines 49-51, Davis teaches that once the training set has been retuned, it can be used for categorization of documents. Therefore, the updated training set is used for automatic categorization/coding of at least one other document); 
validating the at least one other coded document based on received hard coding (see Abstract; Davis teaches methods for incrementally updating the accuracy provided by documents in a training set used for automatic categorization. Moreover, Fig. 2, as reproduced below, shows that the system continuously (see red arrows) categorizes documents through the engine 211, documents with a low score are sent for human review 213, the training set is updated in 223 and is sent back to 203, which again incrementally trains the categorization engine to arrive at correctly coded documents 231. Therefore, the coded documents are revalidated, as the loop shows:

    [Image: media_image1.png (greyscale PNG), Fig. 2 of Davis]
); and 
based on the validating, further updating the coding method (see Abstract; Davis teaches methods for incrementally updating the accuracy provided by documents in a training set used for automatic categorization. Moreover, Fig. 2, as reproduced below, shows that the system continuously (see red arrows) categorizes documents through the engine 211, documents with a low score are sent for human review 213, the training set is updated in 223 and is sent back to 203, which again incrementally trains the categorization engine to arrive at correctly coded documents 231. Therefore, the training set is further updated, as the loop shows:

    [Image: media_image1.png (greyscale PNG), Fig. 2 of Davis]
).

Referring to Claim 19, Davis teaches the method of claim 18, wherein the at least one other document is a plurality of documents selected from the corpus of documents as input to the coding method (see Abstract; Davis teaches methods for incrementally updating the accuracy provided by documents in a training set used for automatic categorization. Moreover, Fig. 2, as reproduced below, shows that the system continuously (see red arrows) categorizes documents through the engine 211, wherein the at least one other document is selected from the corpus of documents 201-202 as input in the input queue, as the figure shows:

    [Image: media_image2.png (greyscale PNG), Fig. 2 of Davis]
).

Referring to Claim 20, Davis teaches the method of claim 15, further comprising: 
validating the initial control set of documents based on received hard coding (see Column 2: line 51- Column 3 line 3; Davis teaches the process of editorial review for quality control of either random documents or documents that did not meet a confidence threshold. Further, Davis teaches that this editorial review is done by a human user, wherein the corrected classified document is flagged as being reviewed by the human); and 
based on the validating, updating the initial control set of documents (see Column 2: lines 54-56; Davis teaches that documents verified by editorial review are collected in a verified documents set 214 and used for incremental updating of the training set 223).
Response to Arguments
The Applicant’s arguments regarding the rejection of above-mentioned claims have been fully considered.
In reference to Applicant’s arguments about:
Rejections under 35 USC 103(a).
Examiner’s response:
	Applicants’ arguments have been considered, but they are directed to the newly added limitations “…identifying contextually similar documents to the initial set of relevant documents utilizing machine learning to automatically detect concepts within the initial set of relevant documents using a statistical analysis of word contexts; and 
supplementing the initial set of relevant documents with the contextually similar documents”. These amendments necessitated new grounds of rejection; therefore, arguments are moot in view of the new grounds of rejection.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LUIS A SITIRICHE whose telephone number is (571)270-1316.  The examiner can normally be reached on M-F 9am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/LUIS A SITIRICHE/Primary Examiner, Art Unit 2126