DETAILED ACTION

Remarks
Claims 1-20 have been examined and rejected. This Office action is responsive to the amendment filed on 06/14/2022, which has been entered in the above-identified application.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claims 6, 7, 10, and 11 are objected to because of the following informalities:
Claims 6 and 7 recite ‘the displaying the plurality of candidate documents’; however, they should recite - - displaying the plurality of candidate documents - -.
Claims 10 and 11 recite ‘the displaying the plurality of candidate answers’; however, they should recite - - displaying the plurality of candidate answers - -.
Appropriate correction is required.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-9, 11, 12, and 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over Woolf (US 20210342743 A1, published 11/04/2021) in view of Piscitello et al. (US 20050240576 A1, published 10/27/2005), hereinafter Piscitello.

Regarding claim 1, Woolf teaches the claim comprising:
A computer-implemented method comprising (Woolf Figs. 1-9; abs. the present invention further provides methods and tools that not only afford machine learning models that are easily created and configured without the necessity of hard coding by the user, but also to afford the user with the ability to share their “know-how” derived from these models to collectively improve the models; [0070], the systems/tools that carry out the methods disclosed herein comprise one or more central computing devices): 
receiving a search query from a user of data of the user (Woolf Figs. 1-9; [0119], a graphical user interface affords a useful way to implement the methods of the invention, for example, as depicted in FIGS. 5 to 9; [0014], FIG. 5 illustrates an exemplary embodiment of the graphical user interface for the user-directed iterative (UDI) machine learning of the present invention; depicting the interface for user entry of user elected search criteria; [0020], the present invention allows a user to configure and train a machine learning model, for searches and predictions of certain accessible content, documents, or other materials within a database; the user may further iteratively configure and refine the machine learning model to generate the desired model outcomes; [0049], training the primary machine learning model with a reference subset based on a comparative scoring analysis to produce a first training data set; [0066], the source may be the news, earnings reports, management discussions, company profiles, people biographies, transaction records (e.g., logs and transcripts of meetings), customer service, meeting documentation, or any combination thereof; [0068], providing an interface for the user to establish the user search criteria, e.g., via request within the interface; [0072], the user elected search criteria is selected from the group consisting of keywords (e.g., including or excluding certain keywords), source content (e.g., company profiles), confidence threshold, and number of desired occurrences (e.g., number of sample paragraphs returned in a paragraph by paragraph analysis); [0073], the establishment of the user elected search criteria is used to create a primary machine learning model; [0077], the primary machine learning model is trained, or presented, with one or more sets of training data from a database (e.g., subset of a database), which can come from any source (e.g., being provided by the system/tool, by the user, or searched for), and 
described herein as a reference subset);
performing a search of the data of the user, using a machine learning model, for the search query to generate a result (Woolf Figs. 1-9; [0077], the primary machine learning model is trained, or presented, with one or more sets of training data from a database (e.g., subset of a database), which can come from any source (e.g., being provided by the system/tool, by the user, or searched for), and described herein as a reference subset; the model processes the reference subset based on a comparative scoring analysis, e.g., calculating probability accuracy of the predictions of the application of the model to a given set of data, e.g., via classification; this scoring analysis provides both positive and negative comparisons; the comparative scoring analysis is based on textual scoring, e.g., derived from a machine learning based toolkit for natural language processing (e.g., tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and co-reference resolution); [0078], the product of processing the reference subset within the model is a first training data set, e.g., along with the probability of the accuracy of the predictions based on the model; [0136], the system 400 receives 114 a first selection of sources 118 from the user device 410; the system 400 thereafter sends 120 a request to the user device 410 for a plurality of initial search criteria 128; [0137], the plurality of pre-processing parameters may comprise the data sources, specifically the first selection of sources 118; a level of specificity with which the data of the first selection of sources 118 is to be searched; a confidence threshold; a plurality of guideline keywords to filter the initial set of training data; the system 400 receives 134 from the user device 410 a first selection of the plurality of pre-processing parameters 138; the results of this search 140 comprise the first plurality of training data 142; [0141], refine 162 the MLM by i) 
iterating the step of searching 140; [0142], the system 400 may refine 162 the MLM by iterating a plurality of the MLM pre-processing function 196, that is, the MLM may iterate a plurality of steps 130, 134, 140, 150, and/or 154, to i) obtain a next plurality of processing parameters, and/or ii) search for a next set of search criteria, and/or iii) send an output 152 of the MLM to the user device 410, and iv) receive feedback from the user device 410 as a new plurality of training data 168); 
generating a confidence score for the result of the search; selecting a proper subset of the data to be provided to the user based on the confidence score (Woolf Figs. 1-9; [0072], the methods of user-directed iterative (UDI) machine learning of the present invention afford a user the ability to select, customize, or specifically individualize the search criteria based on attributes of the training data, e.g., the user-selected set of pre-processing parameters; the user elected search criteria is selected from the group consisting of keywords (e.g., including or excluding certain keywords), source content (e.g., company profiles), confidence threshold, and number of desired occurrences (e.g., number of sample paragraphs returned in a paragraph by paragraph analysis); the user selects a set of source training data and a set of pre-processing parameters (i.e., as identified by the user elected search criteria), and the pre-processing parameters are used to convert the set of training data to a set of mathematical matrices suitable for building a machine learning model, i.e., the primary machine learning model; [0077], the model processes the reference subset based on a comparative scoring analysis, e.g., calculating probability accuracy of the predictions of the application of the model to a given set of data, e.g., via classification; the comparative scoring analysis is based on textual scoring, e.g., derived from a machine learning based toolkit for natural language processing (e.g., tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and co-reference resolution); [0078], the product of processing the reference subset within the model is a first training data set, e.g., along with the probability of the accuracy of the predictions based on the model; [0137], the plurality of pre-processing parameters may comprise the data sources, specifically the first selection of sources 118; a level of specificity with which 
the data of the first selection of sources 118 is to be searched; a confidence threshold; a plurality of guideline keywords to filter the initial set of training data; and more (selecting a proper subset by filtering based on search criteria including a confidence threshold); the system 400 receives 134 from the user device 410 a first selection of the plurality of pre-processing parameters 138; the results of this search 140 comprise the first plurality of training data 142; [0139], it has been found advantageous to have the system 400 send 150 to the user device 410 confidence interval scores related to each such result comprising the first plurality of training data 142 of the MLM; [0142], the system 400 may refine 162 the MLM by iterating a plurality of the MLM pre-processing function 196, that is, the MLM may iterate a plurality of steps 130, 134, 140, 150, and/or 154; [0143], after refining 162 the MLM, the system 400 may apply 170 a new selection of pre-processing parameters 178 to modify the MLM, including but not limited to changing the confidence interval; [0016], the confidence scores are readily displayed); 
displaying the proper subset of the data, with a visual indication added to a proper subset of text of the proper subset of the data based on the confidence score, to the user via a graphical user interface; receiving an indication from the user via the graphical user interface of one or more sections of the proper subset of the data for use in a next training iteration of the machine learning model for the search query (Woolf Figs. 1-9; [0119], a graphical user interface affords a useful way to implement the methods of the invention, for example, as depicted in FIGS. 5 to 9; [0015], FIG. 6 illustrates an exemplary embodiment of the graphical user interface for the user-directed iterative (UDI) machine learning of the present invention; depicting the interface, ready for validation, of the first training data set; [0016], FIG. 7 illustrates an exemplary embodiment of the graphical user interface for the user-directed iterative (UDI) machine learning of the present invention; depicting the interface, ready for validation, of the training data sets subsequent to the first training data set (i.e., 2-“n” times); the confidence scores are readily displayed next to each YES/NO validation selector; [0138], the system 400 then sends 150 the first plurality of training data 142 to the user device 410; each of the plurality of results comprising the first plurality of training data 142 for the MLM is sent 150 to the user device 410 with a plurality of selectors, e.g., checkboxes, which the user device 410 may use to collect input on the relevance or utility of each such result; [0139], it has been found advantageous to have the system 400 send 150 to the user device 410 confidence interval scores related to each such result comprising the first plurality of training data 142 of the MLM; the system 400 receives 154 from the user device 410 a first plurality of selections 158 of the first plurality of training data 142, and advantageously, also the user's feedback on the 
relevance of each of the first plurality of training selections 158, such feedback being necessary for the MLM creation function 197; [0140], the system 400 applies 160 the first plurality of selections 158 of the first plurality of training data 142, including but not limited to the user's feedback on which results are accurate, relevant, or desired, and which are not accurate, relevant, or desired, to the first selection of the plurality of pre-processing parameters 138 to create the initial MLM; [0141], it has been found advantageous to have the system 400, at a later time, refine 162 the MLM by i) iterating the step of searching 140, in the plurality of databases 470 or other sources of data containing or comprising the first selection of sources 118 or a next selection of sources 164, for the plurality of initial search criteria 128, and/or for a next plurality of search criteria 165, such iterated search 140 resulting in a new plurality of training data 168, and then ii) predicting 166 the outcome values for the new plurality of training data 168 based on the MLM; the user later validates the outcomes values from the new plurality of training data 168, in steps described below as the MLM validation function 198; [0142], the system 400 may refine 162 the MLM by iterating a plurality of the MLM pre-processing function 196, that is, the MLM may iterate a plurality of steps 130, 134, 140, 150, and/or 154, to i) obtain a next plurality of processing parameters, and/or ii) search for a next set of search criteria, and/or iii) send an output 152 of the MLM to the user device 410, and iv) receive feedback from the user device 410 as a new plurality of training data 168);
and performing the next training iteration of the machine learning model with the one or more sections of the proper subset of the data (Woolf Figs. 1-9; [0050], validating the first training data set within a database by user direction to create a user-directed machine learning model; [0051], training the user-directed machine learning model with a second reference subset based on comparative scoring to produce a second training data set; [0052], validating the second training data set within the database by user direction to create a first user-directed iterative machine learning model; [0141], it has been found advantageous to have the system 400, at a later time, refine 162 the MLM by i) iterating the step of searching 140, in the plurality of databases 470 or other sources of data containing or comprising the first selection of sources 118 or a next selection of sources 164, for the plurality of initial search criteria 128, and/or for a next plurality of search criteria 165, such iterated search 140 resulting in a new plurality of training data 168, and then ii) predicting 166 the outcome values for the new plurality of training data 168 based on the MLM; the user later validates the outcomes values from the new plurality of training data 168, in steps described below as the MLM validation function 198; [0142], the system 400 may refine 162 the MLM by iterating a plurality of the MLM pre-processing function 196, that is, the MLM may iterate a plurality of steps 130, 134, 140, 150, and/or 154, to i) obtain a next plurality of processing parameters, and/or ii) search for a next set of search criteria, and/or iii) send an output 152 of the MLM to the user device 410, and iv) receive feedback from the user device 410 as a new plurality of training data 168; [0144], the system 400 thereafter receives 186 from the user device 410 a plurality of validation responses 188 from a user regarding the accuracy of the validation output 184; the system 400 processes 190 the 
plurality of validation responses 188, and integrates 192 the plurality of validation responses 188 to the MLM, modifying the MLM to improve accuracy and relevance of model output 152 to meet the needs or interests of that user; [0080], validating the first training data set within the database is achieved by user direction on which item members of the first training data set accurately fall within the desired results of the machine learning model, e.g., with respect to the indication of a positive or a negative result; [0083], the user-directed machine learning model is trained with a separate set of training data from a database (e.g., a second subset of the database, e.g., non-overlapping; or a separate database), which can come from any source (e.g., being provided by the system/tool, by the user, or searched for); this separate set of training data is referred to herein as a second reference subset; [0084], the user-directed machine learning model processes the second reference subset based on a comparative scoring analysis to produce a second training data set; [0087], validating the second training data set within the database is achieved by user direction on which item members of the second training data set accurately fall within the desired results of the machine learning model)
	However, Woolf fails to expressly disclose the limitation of a visual emphasis added to a proper subset of text of the proper subset of the data based on the confidence score.  In the same field of endeavor, Piscitello teaches:
with a visual emphasis added to a proper subset of text of the proper subset of the data based on the confidence score (Piscitello Figs. 1-8; [0029], links to sets of web documents returned by search engine 120 may include, in addition to text snippets that describe the web documents, a visual cue that informs the user that the web document is likely to be relevant to the user's search query; the link corresponding to a document that is determined to be "highly relevant" (i.e., a high confidence that the document is the document that the user would be most interested in viewing) to the user search query is displayed with the visual cue; [0027], ranking component 122 assists search engine 120 in returning relevant documents to the user by ranking the set of documents identified by document locator 121; this ranking may take the form of assigning a numerical value, called a relevance score, corresponding to the calculated relevance of each document identified by document locator 121; [0028], a document is to be broadly interpreted to include any machine-readable and machine-storable work product; [0030], FIG. 2 is a diagram illustrating a document 200 that includes links to web documents that may be displayed to a user at a client device 102 in response to a search query; [0033], one or more of the links 210-214 may contain a visual cue 230 corresponding to the link; visual cue 230 is a miniaturized ("thumbnail") rendering of the web page corresponding to link 210; search query 201 was "stanford"; Search engine 120 determined that the most highly ranked link for "stanford" is the link to the web site of Stanford University (stanford.edu); accordingly, search engine 120 included visual cue 230 in document 200; [0036], FIG. 
3 is a flow chart illustrating operation of search engine program 120 consistent with an aspect of the invention; search engine program 120 may begin by receiving a search query from one of users 105 (act 301); based on the search query, document locator 121 may generate a set of links to documents that are relevant to the search query (act 302); the set of links may be sorted based on a relevance metric returned for each of the documents from ranking component 122 (act 303); [0037], search engine program 120 may determine whether any of the links returned by document locator 121 are associated with "very relevant" documents (act 304); documents that are determined to be very relevant may be associated with a visual cue, such as visual cue 230 (act 305); [0038], FIG. 4 is a block diagram conceptually illustrating the determination of whether a document is very relevant by search engine program 120; [0040], a document at the top of the sorted list is more likely to be a very relevant document than a document further down on the list; [0044], relevance score for document D is significantly greater than the next highest relevance score in the returned set of documents (determined based on component 403); [0045], the parameters associated with components 401-404, or based on other parameters, can be used to determine whether a document is highly relevant, or more generally, to generate a value that measures the confidence level that the document is highly relevant; [0046], other forms of highlighting, such as a logo, contrasting textual fonts (e.g., text contrasted by size, color, or weight) that are designed to stand out, contrasting backgrounds, or textual labels may be used in place of thumbnails)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated a visual emphasis added to a proper subset of text of the proper subset of the data based on the confidence score, as suggested in Piscitello, into Woolf.  Doing so would be desirable because locating a desired portion of the information can be challenging (see Piscitello [0005]).  By including visual cue 230 with very relevant links, users may learn to associate the visual cue with links that search engine 120 is confident matches the user's intentions. As the users begin to trust visual cue 230, the visual cue allows the user to home in on the relevant search results faster as they will not necessarily need to read the corresponding snippet 220. This may thus result in a decreased "time to satisfaction" for the user and a concomitant increase in search engine satisfaction (see Piscitello [0035]).
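
For illustration only, and not as a characterization of either reference's actual implementation, the combined teaching relied upon above (filtering results by a confidence threshold, visually emphasizing high-confidence text, and collecting user validation for a next training iteration) might be sketched as follows. All names (Passage, select_subset, render_with_emphasis, collect_feedback), the threshold values, and the use of ** ** markers as a stand-in for visual emphasis are hypothetical assumptions and appear in neither Woolf nor Piscitello.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    text: str
    confidence: float  # hypothetical model score in [0, 1]


def select_subset(passages, threshold):
    """Keep only passages whose confidence meets the threshold."""
    return [p for p in passages if p.confidence >= threshold]


def render_with_emphasis(passages, emphasis_threshold):
    """Mark high-confidence passages with ** ** (assumed stand-in for a visual cue)."""
    lines = []
    for p in passages:
        text = f"**{p.text}**" if p.confidence >= emphasis_threshold else p.text
        lines.append(f"{text} ({p.confidence:.2f})")
    return lines


def collect_feedback(passages, accepted_indices):
    """Pair each displayed passage with the user's YES/NO validation."""
    return [(p.text, i in accepted_indices) for i, p in enumerate(passages)]


# Usage: filter, display, then gather labels for a next training iteration.
results = [Passage("a", 0.95), Passage("b", 0.6), Passage("c", 0.2)]
shown = select_subset(results, 0.5)
display_lines = render_with_emphasis(shown, 0.9)
labels = collect_feedback(shown, {0})
```

In this sketch the labeled pairs returned by collect_feedback would serve as the input to whatever retraining step the system uses; that step is omitted because neither reference's training procedure is reproduced here.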
	
Regarding claim 4, claim 4 contains substantially similar limitations to those found in claim 1.  Consequently, claim 4 is rejected for the same reasons.

Regarding claim 15, claim 15 contains substantially similar limitations to those found in claim 1, the only difference being that claim 15 recites a system comprising: a data storage service implemented by a first one or more electronic devices to store data for a user (Woolf Figs. 1-9; [0035], a graphical user interface, or GUI, facilitates the communication/interaction with stored data on a server by a user through the exchange of information or operation in the GUI; [0077], the primary machine learning model is trained, or presented, with one or more sets of training data from a database (e.g., subset of a database), which can come from any source (e.g., being provided by the system/tool, by the user, or searched for); [0115], the database and the database categorization, e.g., categorization information or organization format of the database, are collected and stored on a machine readable medium, e.g., a server or collection of servers; [0117], the database comprises source content); and a model management service implemented by a second one or more electronic devices, the model management service including instructions that upon execution cause the model management service to (Woolf Figs. 1-9; [0045], the present invention provides methods of model encapsulation of user-directed iterative machine learning such that machine learning models may be created, validated, modified, and applied without the user writing or editing the programming that underlies the machine learning models; [0070], the systems/tools that carry out the methods disclosed herein comprise one or more central computing devices, one or more memory units, one or more input and output channels for communication, one or more databases, one or more networks; [0092], the methods of the present invention described herein are useful as instructions stored on a machine-readable medium for execution by a processor to perform the method; [0093], the present invention provides a model aggregation tool utilizing model encapsulation of user-directed iterative (UDI) machine learning comprising a machine-readable medium having instructions stored thereon for execution by a processor to perform a method of model encapsulation of user-directed iterative machine learning).  Consequently, claim 15 is rejected for the same reasons.

Regarding claim 2, Woolf in view of Piscitello teaches all the limitations of claim 1, further comprising:
wherein the proper subset of the data is a plurality of candidate documents for the search query (Woolf Figs. 1-9; [0020], the present invention allows a user to configure and train a machine learning model, for searches and predictions of certain accessible content, documents, or other materials within a database; [0027], the user-configurable machine learning models disclosed herein may be used for searches of and predictive analytics related to any content, including but not limited to documents, transactions, network-accessible content, or other materials; [0067], the training set data and results data set identify content on a paragraph by paragraph basis, e.g., contained in a complete textual document; [0072], the user elected search criteria is selected from the group consisting of keywords (e.g., including or excluding certain keywords), source content (e.g., company profiles), confidence threshold, and number of desired occurrences (e.g., number of sample paragraphs returned in a paragraph by paragraph analysis); [0077-0078], the product of processing the reference subset within the model is a first training data set, e.g., along with the probability of the accuracy of the predictions based on the model; [0137], the plurality of pre-processing parameters may comprise the data sources, specifically the first selection of sources 118; a level of specificity with which the data of the first selection of sources 118 is to be searched; a confidence threshold; a plurality of guideline keywords to filter the initial set of training data; and more (selecting a proper subset by filtering based on search criteria including a confidence threshold); the system 400 receives 134 from the user device 410 a first selection of the plurality of pre-processing parameters 138; the results of this search 140 comprise the first plurality of training data 142; [0138], the system 400 then sends 150 the first plurality of training data 142 to the user device 410; each of the 
plurality of results comprising the first plurality of training data 142 for the MLM is sent 150 to the user device 410 with a plurality of selectors, e.g., checkboxes, which the user device 410 may use to collect input on the relevance or utility of each such result; [0142], the system 400 may refine 162 the MLM by iterating a plurality of the MLM pre-processing function 196, that is, the MLM may iterate a plurality of steps 130, 134, 140, 150, and/or 154, to i) obtain a next plurality of processing parameters, and/or ii) search for a next set of search criteria, and/or iii) send an output 152 of the MLM to the user device 410, and iv) receive feedback from the user device 410 as a new plurality of training data 168)

Regarding claims 5 and 16, claims 5 and 16 contain substantially similar limitations to those found in claim 2.  Consequently, claims 5 and 16 are rejected for the same reasons.

Regarding claim 3, Woolf in view of Piscitello teaches all the limitations of claim 1, further comprising:
wherein the proper subset of the data is a plurality of candidate answers for the search query (Woolf Figs. 1-9; [0020], the present invention allows a user to configure and train a machine learning model, for searches and predictions of certain accessible content, documents, or other materials within a database; [0027], the user-configurable machine learning models disclosed herein may be used for searches of and predictive analytics related to any content, including but not limited to documents, transactions, network-accessible content, or other materials; [0067], the training set data and results data set identify content on a paragraph by paragraph basis, e.g., contained in a complete textual document; [0072], the user elected search criteria is selected from the group consisting of keywords (e.g., including or excluding certain keywords), source content (e.g., company profiles), confidence threshold, and number of desired occurrences (e.g., number of sample paragraphs returned in a paragraph by paragraph analysis); [0077-0078], the product of processing the reference subset within the model is a first training data set, e.g., along with the probability of the accuracy of the predictions based on the model; [0137], the plurality of pre-processing parameters may comprise the data sources, specifically the first selection of sources 118; a level of specificity with which the data of the first selection of sources 118 is to be searched; a confidence threshold; a plurality of guideline keywords to filter the initial set of training data; and more (selecting a proper subset by filtering based on search criteria including a confidence threshold); the system 400 receives 134 from the user device 410 a first selection of the plurality of pre-processing parameters 138; the results of this search 140 comprise the first plurality of training data 142; [0138], the system 400 then sends 150 the first plurality of training data 142 to the user device 410; each of the 
plurality of results comprising the first plurality of training data 142 for the MLM is sent 150 to the user device 410 with a plurality of selectors, e.g., checkboxes, which the user device 410 may use to collect input on the relevance or utility of each such result; [0142], the system 400 may refine 162 the MLM by iterating a plurality of the MLM pre-processing function 196, that is, the MLM may iterate a plurality of steps 130, 134, 140, 150, and/or 154, to i) obtain a next plurality of processing parameters, and/or ii) search for a next set of search criteria, and/or iii) send an output 152 of the MLM to the user device 410, and iv) receive feedback from the user device 410 as a new plurality of training data 168)

Regarding claims 9 and 18, claims 9 and 18 contain substantially similar limitations to those found in claim 3.  Consequently, claims 9 and 18 are rejected for the same reasons.

Regarding claim 6, Woolf in view of Piscitello teaches all the limitations of claim 5, further comprising:
wherein the displaying the plurality of candidate documents comprises displaying a respective selector for each of the plurality of candidate documents to the user (Woolf Figs. 1-9; [0015-0016], FIG. 7 illustrates an exemplary embodiment of the graphical user interface for the user-directed iterative (UDI) machine learning of the present invention; depicting the interface, ready for validation, of the training data sets subsequent to the first training data set (i.e., 2-“n” times); the confidence scores are readily displayed next to each YES/NO validation selector; [0138], the system 400 then sends 150 the first plurality of training data 142 to the user device 410; each of the plurality of results comprising the first plurality of training data 142 for the MLM is sent 150 to the user device 410 with a plurality of selectors, e.g., checkboxes, which the user device 410 may use to collect input on the relevance or utility of each such result; [0139], it has been found advantageous to have the system 400 send 150 to the user device 410 confidence interval scores related to each such result comprising the first plurality of training data 142 of the MLM; the system 400 receives 154 from the user device 410 a first plurality of selections 158 of the first plurality of training data 142, and advantageously, also the user's feedback on the relevance of each of the first plurality of training selections 158, such feedback being necessary for the MLM creation function 197; [0142], the system 400 may refine 162 the MLM by iterating a plurality of the MLM pre-processing function 196, that is, the MLM may iterate a plurality of steps 130, 134, 140, 150, and/or 154, to i) obtain a next plurality of processing parameters, and/or ii) search for a next set of search criteria, and/or iii) send an output 152 of the MLM to the user device 410, and iv) receive feedback from the user device 410 as a new plurality of training data 168)
Piscitello further teaches:
a respective link for each of the plurality of candidate documents to the user (Piscitello Figs. 1-8; [0025], search engine 120 may then return a list of links pointing to the set of documents determined by document locator 121; the list of links may be sorted based on the relevance scores determined by ranking component 122; [0029], links to sets of web documents returned by search engine 120 may include, in addition to text snippets that describe the web documents, a visual cue that informs the user that the web document is likely to be relevant to the user's search query; the link corresponding to a document that is determined to be "highly relevant" (i.e., a high confidence that the document is the document that the user would be most interested in viewing) to the user search query is displayed with the visual cue; [0028], a document is to be broadly interpreted to include any machine-readable and machine-storable work product; [0030], FIG. 2 is a diagram illustrating a document 200 that includes links to web documents that may be displayed to a user at a client device 102 in response to a search query; [0032], the user may select any of links 210-214 to thereby direct the web browser to return the web document pointed-to by the links; [0033], one or more of the links 210-214 may contain a visual cue 230 corresponding to the link; visual cue 230 is a miniaturized ("thumbnail") rendering of the web page corresponding to link 210; search query 201 was "stanford." Search engine 120 determined that the most highly ranked link for "stanford" is the link to the web site of Stanford University (stanford.edu); accordingly, search engine 120 included visual cue 230 in document 200)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated a respective link for each of the plurality of candidate documents to the user as suggested in Piscitello into Woolf.  Doing so would be desirable because locating a desired portion of the information can be challenging (see Piscitello [0005]).  By including visual cue 230 with very relevant links, users may learn to associate the visual cue with links that search engine 120 is confident match the user's intentions. As the users begin to trust visual cue 230, the visual cue allows the user to home in on the relevant search results faster as they will not necessarily need to read the corresponding snippet 220. This may thus result in a decreased "time to satisfaction" for the user and a concomitant increase in search engine satisfaction (see Piscitello [0035]).  Additionally, the links enable the user to direct the web browser to return the web document pointed-to by the link (see Piscitello [0032]), thereby enabling the user to view additional information, if desired.

Regarding claim 17, claim 17 contains substantially similar limitations to those found in claim 6.  Consequently, claim 17 is rejected for the same reasons.

Regarding claim 7, Woolf in view of Piscitello teaches all the limitations of claim 5, further comprising:
wherein the displaying the plurality of candidate documents comprises displaying the search query to the user (Woolf Figs. 1-9; [0119], a graphical user interface affords a useful way to implement the methods of the invention, for example, as depicted in FIGS. 5 to 9; [0014], FIG. 5 illustrates an exemplary embodiment of the graphical user interface for the user-directed iterative (UDI) machine learning of the present invention; depicting the interface for user entry of user elected search criteria; [0015], FIG. 6 illustrates an exemplary embodiment of the graphical user interface for the user-directed iterative (UDI) machine learning of the present invention; depicting the interface, ready for validation, of the first training data set (displaying the search query to the user); [0016], FIG. 7 illustrates an exemplary embodiment of the graphical user interface for the user-directed iterative (UDI) machine learning of the present invention; depicting the interface, ready for validation, of the training data sets subsequent to the first training data set; [0072], the user elected search criteria is selected from the group consisting of keywords (e.g., including or excluding certain keywords), source content (e.g., company profiles), confidence threshold, and number of desired occurrences (e.g., number of sample paragraphs returned in a paragraph by paragraph analysis); [0077-0078], the product of processing the reference subset within the model is a first training data set, e.g., along with the probability of the accuracy of the predictions based on the model; [0137], the plurality of pre-processing parameters may comprise the data sources, specifically the first selection of sources 118; a level of specificity with which the data of the first selection of sources 118 is to be searched; a confidence threshold; a plurality of guideline keywords to filter the initial set of training data; and more (selecting a proper subset by filtering based on search criteria 
including a confidence threshold); the system 400 receives 134 from the user device 410 a first selection of the plurality of pre-processing parameters 138; the results of this search 140 comprise the first plurality of training data 142; [0138], the system 400 then sends 150 the first plurality of training data 142 to the user device 410; each of the plurality of results comprising the first plurality of training data 142 for the MLM is sent 150 to the user device 410 with a plurality of selectors, e.g., checkboxes, which the user device 410 may use to collect input on the relevance or utility of each such result; [0142], the system 400 may refine 162 the MLM by iterating a plurality of the MLM pre-processing function 196, that is, the MLM may iterate a plurality of steps 130, 134, 140, 150, and/or 154, to i) obtain a next plurality of processing parameters, and/or ii) search for a next set of search criteria, and/or iii) send an output 152 of the MLM to the user device 410, and iv) receive feedback from the user device 410 as a new plurality of training data 168)

Regarding claim 8, Woolf in view of Piscitello teaches all the limitations of claim 5, further comprising:
wherein the indication from the user of the one or more sections is whether a respective interface element for each document of the plurality of candidate documents is selected by the user (Woolf Figs. 1-9; [0119], a graphical user interface affords a useful way to implement the methods of the invention, for example, as depicted in FIGS. 5 to 9; [0015-0016], FIG. 7 illustrates an exemplary embodiment of the graphical user interface for the user-directed iterative (UDI) machine learning of the present invention; depicting the interface, ready for validation, of the training data sets subsequent to the first training data set (i.e., 2-“n” times); the confidence scores are readily displayed next to each YES/NO validation selector; [0138], the system 400 then sends 150 the first plurality of training data 142 to the user device 410; each of the plurality of results comprising the first plurality of training data 142 for the MLM is sent 150 to the user device 410 with a plurality of selectors, e.g., checkboxes, which the user device 410 may use to collect input on the relevance or utility of each such result; [0139], it has been found advantageous to have the system 400 send 150 to the user device 410 confidence interval scores related to each such result comprising the first plurality of training data 142 of the MLM; the system 400 receives 154 from the user device 410 a first plurality of selections 158 of the first plurality of training data 142, and advantageously, also the user's feedback on the relevance of each of the first plurality of training selections 158, such feedback being necessary for the MLM creation function 197; [0140], the system 400 applies 160 the first plurality of selections 158 of the first plurality of training data 142, including but not limited to the user's feedback on which results are accurate, relevant, or desired, and which are not accurate, relevant, or desired, to the first selection of the plurality of pre-processing parameters 138 
to create the initial MLM; [0141], it has been found advantageous to have the system 400, at a later time, refine 162 the MLM by i) iterating the step of searching 140, in the plurality of databases 470 or other sources of data containing or comprising the first selection of sources 118 or a next selection of sources 164, for the plurality of initial search criteria 128, and/or for a next plurality of search criteria 165, such iterated search 140 resulting in a new plurality of training data 168, and then ii) predicting 166 the outcome values for the new plurality of training data 168 based on the MLM; the user later validates the outcomes values from the new plurality of training data 168, in steps described below as the MLM validation function 198; [0142], the system 400 may refine 162 the MLM by iterating a plurality of the MLM pre-processing function 196, that is, the MLM may iterate a plurality of steps 130, 134, 140, 150, and/or 154, to i) obtain a next plurality of processing parameters, and/or ii) search for a next set of search criteria, and/or iii) send an output 152 of the MLM to the user device 410, and iv) receive feedback from the user device 410 as a new plurality of training data 168)

Regarding claim 11, Woolf in view of Piscitello teaches all the limitations of claim 9, further comprising:
wherein the displaying the plurality of candidate answers comprises displaying the search query to the user (Woolf Figs. 1-9; [0119], a graphical user interface affords a useful way to implement the methods of the invention, for example, as depicted in FIGS. 5 to 9; [0014], FIG. 5 illustrates an exemplary embodiment of the graphical user interface for the user-directed iterative (UDI) machine learning of the present invention; depicting the interface for user entry of user elected search criteria; [0015], FIG. 6 illustrates an exemplary embodiment of the graphical user interface for the user-directed iterative (UDI) machine learning of the present invention; depicting the interface, ready for validation, of the first training data set (displaying the search query to the user); [0016], FIG. 7 illustrates an exemplary embodiment of the graphical user interface for the user-directed iterative (UDI) machine learning of the present invention; depicting the interface, ready for validation, of the training data sets subsequent to the first training data set; [0072], the user elected search criteria is selected from the group consisting of keywords (e.g., including or excluding certain keywords), source content (e.g., company profiles), confidence threshold, and number of desired occurrences (e.g., number of sample paragraphs returned in a paragraph by paragraph analysis); [0077-0078], the product of processing the reference subset within the model is a first training data set, e.g., along with the probability of the accuracy of the predictions based on the model; [0137], the plurality of pre-processing parameters may comprise the data sources, specifically the first selection of sources 118; a level of specificity with which the data of the first selection of sources 118 is to be searched; a confidence threshold; a plurality of guideline keywords to filter the initial set of training data; and more (selecting a proper subset by filtering based on search criteria including 
a confidence threshold); the system 400 receives 134 from the user device 410 a first selection of the plurality of pre-processing parameters 138; the results of this search 140 comprise the first plurality of training data 142; [0138], the system 400 then sends 150 the first plurality of training data 142 to the user device 410; each of the plurality of results comprising the first plurality of training data 142 for the MLM is sent 150 to the user device 410 with a plurality of selectors, e.g., checkboxes, which the user device 410 may use to collect input on the relevance or utility of each such result; [0142], the system 400 may refine 162 the MLM by iterating a plurality of the MLM pre-processing function 196, that is, the MLM may iterate a plurality of steps 130, 134, 140, 150, and/or 154, to i) obtain a next plurality of processing parameters, and/or ii) search for a next set of search criteria, and/or iii) send an output 152 of the MLM to the user device 410, and iv) receive feedback from the user device 410 as a new plurality of training data 168)

Regarding claim 12, Woolf in view of Piscitello teaches all the limitations of claim 9, further comprising:
wherein the indication from the user of the one or more sections is whether a respective interface element for each answer of the plurality of candidate answers is selected by the user (Woolf Figs. 1-9; [0119], a graphical user interface affords a useful way to implement the methods of the invention, for example, as depicted in FIGS. 5 to 9; [0015-0016], FIG. 7 illustrates an exemplary embodiment of the graphical user interface for the user-directed iterative (UDI) machine learning of the present invention; depicting the interface, ready for validation, of the training data sets subsequent to the first training data set (i.e., 2-“n” times); the confidence scores are readily displayed next to each YES/NO validation selector; [0138], the system 400 then sends 150 the first plurality of training data 142 to the user device 410; each of the plurality of results comprising the first plurality of training data 142 for the MLM is sent 150 to the user device 410 with a plurality of selectors, e.g., checkboxes, which the user device 410 may use to collect input on the relevance or utility of each such result; [0139], it has been found advantageous to have the system 400 send 150 to the user device 410 confidence interval scores related to each such result comprising the first plurality of training data 142 of the MLM; the system 400 receives 154 from the user device 410 a first plurality of selections 158 of the first plurality of training data 142, and advantageously, also the user's feedback on the relevance of each of the first plurality of training selections 158, such feedback being necessary for the MLM creation function 197; [0140], the system 400 applies 160 the first plurality of selections 158 of the first plurality of training data 142, including but not limited to the user's feedback on which results are accurate, relevant, or desired, and which are not accurate, relevant, or desired, to the first selection of the plurality of pre-processing parameters 138 to 
create the initial MLM; [0141], it has been found advantageous to have the system 400, at a later time, refine 162 the MLM by i) iterating the step of searching 140, in the plurality of databases 470 or other sources of data containing or comprising the first selection of sources 118 or a next selection of sources 164, for the plurality of initial search criteria 128, and/or for a next plurality of search criteria 165, such iterated search 140 resulting in a new plurality of training data 168, and then ii) predicting 166 the outcome values for the new plurality of training data 168 based on the MLM; the user later validates the outcomes values from the new plurality of training data 168, in steps described below as the MLM validation function 198; [0142], the system 400 may refine 162 the MLM by iterating a plurality of the MLM pre-processing function 196, that is, the MLM may iterate a plurality of steps 130, 134, 140, 150, and/or 154, to i) obtain a next plurality of processing parameters, and/or ii) search for a next set of search criteria, and/or iii) send an output 152 of the MLM to the user device 410, and iv) receive feedback from the user device 410 as a new plurality of training data 168)

Claims 10 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Woolf in view of Piscitello in further view of Rodriguez et al. (US 20130066889 A1, published 03/14/2013), hereinafter Rodriguez.

Regarding claim 10, Woolf in view of Piscitello teaches all the limitations of claim 9, further comprising:
wherein the displaying the plurality of candidate answers comprises displaying a respective passage (Woolf Figs. 1-9; [0119], a graphical user interface affords a useful way to implement the methods of the invention, for example, as depicted in FIGS. 5 to 9; [0015], FIG. 6 illustrates an exemplary embodiment of the graphical user interface for the user-directed iterative (UDI) machine learning of the present invention; depicting the interface, ready for validation, of the first training data set; [0016], FIG. 7 illustrates an exemplary embodiment of the graphical user interface for the user-directed iterative (UDI) machine learning of the present invention; depicting the interface, ready for validation, of the training data sets subsequent to the first training data set; [0031], categorization is a method of mapping the user elected search criteria to filter the categorized database; the categorization is performed on a paragraph by paragraph basis; [0072], the methods of user-directed iterative (UDI) machine learning of the present invention afford a user the ability to select, customize, or specifically individualize the search criteria based on attributes of the training data, e.g., the user-selected set of pre-processing parameters; the user elected search criteria is selected from the group consisting of keywords (e.g., including or excluding certain keywords), source content (e.g., company profiles), confidence threshold, and number of desired occurrences (e.g., number of sample paragraphs returned in a paragraph by paragraph analysis); [0142], the system 400 may refine 162 the MLM by iterating a plurality of the MLM pre-processing function 196, that is, the MLM may iterate a plurality of steps 130, 134, 140, 150, and/or 154, to i) obtain a next plurality of processing parameters, and/or ii) search for a next set of search criteria, and/or iii) send an output 152 of the MLM to the user device 410, and iv) receive feedback from the user device 410 as a new 
plurality of training data 168)
Piscitello further teaches:
displaying a respective passage, highlighted as a candidate answer based on the confidence score, for each of the plurality of candidate answers (Piscitello Figs. 1-8; [0029], links to sets of web documents returned by search engine 120 may include, in addition to text snippets that describe the web documents, a visual cue that informs the user that the web document is likely to be relevant to the user's search query; the link corresponding to a document that is determined to be "highly relevant" (i.e., a high confidence that the document is the document that the user would be most interested in viewing) to the user search query is displayed with the visual cue; [0027], ranking component 122 assists search engine 120 in returning relevant documents to the user by ranking the set of documents identified by document locator 121; this ranking may take the form of assigning a numerical value, called a relevance score, corresponding to the calculated relevance of each document identified by document locator 121; [0028], a document is to be broadly interpreted to include any machine-readable and machine-storable work product; [0030], FIG. 2 is a diagram illustrating a document 200 that includes links to web documents that may be displayed to a user at a client device 102 in response to a search query; [0033], one or more of the links 210-214 may contain a visual cue 230 corresponding to the link; visual cue 230 is a miniaturized ("thumbnail") rendering of the web page corresponding to link 210; search query 201 was "stanford"; search engine 120 determined that the most highly ranked link for "stanford" is the link to the web site of Stanford University (stanford.edu); accordingly, search engine 120 included visual cue 230 in document 200 (Fig. 2 appears to show specific highlighted words within the passage); [0036], FIG. 
3 is a flow chart illustrating operation of search engine program 120 consistent with an aspect of the invention; search engine program 120 may begin by receiving a search query from one of users 105 (act 301); based on the search query, document locator 121 may generate a set of links to documents that are relevant to the search query (act 302); the set of links may be sorted based on a relevance metric returned for each of the documents from ranking component 122 (act 303); [0037], search engine program 120 may determine whether any of the links returned by document locator 121 are associated with "very relevant" documents (act 304); documents that are determined to be very relevant may be associated with a visual cue, such as visual cue 230 (act 305); [0038], FIG. 4 is a block diagram conceptually illustrating the determination of whether a document is very relevant by search engine program 120; [0040], a document at the top of the sorted list is more likely to be a very relevant document than a document further down on the list; [0044], relevance score for document D is significantly greater than the next highest relevance score in the returned set of documents (determined based on component 403); [0045], the parameters associated with components 401-404, or based on other parameters, can be used to determine whether a document is highly relevant, or more generally, to generate a value that measures the confidence level that the document is highly relevant; [0046], other forms of highlighting, such as a logo, contrasting textual fonts (e.g., text contrasted by size, color, or weight) that are designed to stand out, contrasting backgrounds, or textual labels may be used in place of thumbnails)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated displaying a respective passage, highlighted as a candidate answer based on the confidence score, for each of the plurality of candidate answers as suggested in Piscitello into Woolf.  Doing so would be desirable because locating a desired portion of the information can be challenging (see Piscitello [0005]).  By including visual cue 230 with very relevant links, users may learn to associate the visual cue with links that search engine 120 is confident match the user's intentions. As the users begin to trust visual cue 230, the visual cue allows the user to home in on the relevant search results faster as they will not necessarily need to read the corresponding snippet 220. This may thus result in a decreased "time to satisfaction" for the user and a concomitant increase in search engine satisfaction (see Piscitello [0035]).
However, Woolf in view of Piscitello fails to expressly disclose with a proper subset of the respective passage highlighted as a candidate answer based on the confidence score, for each of the plurality of candidate answers.  In the same field of endeavor, Rodriguez teaches:
with a proper subset of the respective passage highlighted as a candidate answer based on the confidence score, for each of the plurality of candidate answers (Rodriguez Figs. 1-5; [0013], when a user submits a query of "When was Thanksgiving in 1990", the answer may be highlighted (or otherwise rendered visually distinguishable) within the search results and/or displayed (e.g., "Thanksgiving in 1990: Nov. 22" may be highlighted or appear in bold during display of search results); [0022], question solver module 22 (e.g., via server system 10) analyzes the query at step 54, and determines whether the query represents a fact based inquiry that may be satisfied by a fact based answer at step 56; [0023], if the query represents a fact based inquiry, the question solver module (e.g., via server system 10) determines the answer at step 58; [0024], question solver module 22 may assign a measure of confidence to an answer based on various criteria (e.g., the quantity and/or ranking of sources (e.g., number of the top sites in the results, etc.) containing the answer or equivalent forms of the answer, the empirical reliability of the sources, etc.); the threshold may be with respect to an actual quantity of sites, a percentage of sites, or any other desired criteria; [0025], if the question solver module determines an answer to the query (e.g., with a reasonable degree of confidence based on the natural language processing or commonality techniques described above) as determined at step 60, the answer is provided to results editor module 24; confidence levels or thresholds; [0027], the answer may be separately displayed and provided in the search results in bold (or other effect including different colors, fonts, sizes, flashing, surrounded by various symbols, etc.); [0029], an example of distinguishing the answer from other content within the search results is illustrated in FIG. 
5; a manner drawing attention to occurrences of the answer within the description of content according to an alternative embodiment of the present invention; by way of example, the search results for an example query, "Who was President in 1803?" are provided; the answer in this example (e.g., "Thomas Jefferson") is displayed in bold at the top and within the search results; [0032], the results editor module may use any manner of distinguishing an answer from within the search results (e.g., highlighting, setting font characteristics (e.g., type, size, or effects such as bold, italic, or underline), flashing, setting foreground or background color, changing location, any combinations thereof, etc.))
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated a proper subset of the respective passage highlighted as a candidate answer based on the confidence score, for each of the plurality of candidate answers, as suggested in Rodriguez into Woolf in view of Piscitello.  Doing so would be desirable because the invention of Rodriguez can be used to highlight any type of data representing any information from any type of source (see Rodriguez [0040]).  Within the search results of Rodriguez, the answer is highlighted for ready viewing by the user (see Rodriguez [0005]).  The highlighting of Rodriguez would improve the results of Woolf in view of Piscitello because answers within a reasonable degree of confidence (see Rodriguez [0011]) may be conspicuously provided in a manner drawing attention to occurrences of the answer within the description of content (see Rodriguez [0027]), thereby enabling the answer to be readily detected (see Rodriguez [0013]).  

Regarding claim 19, claim 19 contains substantially similar limitations to those found in claim 10.  Consequently, claim 19 is rejected for the same reasons.

Claims 13 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Woolf in view of Piscitello in further view of Dispensa et al. (US 20170329829 A1, published 11/16/2017), hereinafter Dispensa.

Regarding claim 13, Woolf in view of Piscitello teaches all the limitations of claim 9, further comprising:
wherein the displaying the proper subset of the data is in response to the confidence score (Woolf Figs. 1-9; [0016], FIG. 7 illustrates an exemplary embodiment of the graphical user interface for the user-directed iterative (UDI) machine learning of the present invention; depicting the interface, ready for validation, of the training data sets subsequent to the first training data set (i.e., 2-“n” times); the confidence scores are readily displayed next to each YES/NO validation selector; [0072], the user elected search criteria is selected from the group consisting of keywords (e.g., including or excluding certain keywords), source content (e.g., company profiles), confidence threshold, and number of desired occurrences (e.g., number of sample paragraphs returned in a paragraph by paragraph analysis); the user selects a set of source training data and a set of pre-processing parameters (i.e., as identified by the user elected search criteria), and the pre-processing parameters are used to convert the set of training data to a set of mathematical matrices suitable for building a machine learning model, i.e., the primary machine learning model; [0077], the model processes the reference subset based on a comparative scoring analysis, e.g., calculating probability accuracy of the predictions of the application of the model to a given set of data, e.g., via classification; the comparative scoring analysis is based on textual scoring, e.g., derived from a machine learning based toolkit for natural language processing (e.g., tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and co-reference resolution); [0078], the product of processing the reference subset within the model is a first training data set, e.g., along with the probability of the accuracy of the predictions based on the model; [0137], the plurality of pre-processing parameters may comprise the data sources, specifically the first selection of sources 
118; a level of specificity with which the data of the first selection of sources 118 is to be searched; a confidence threshold; a plurality of guideline keywords to filter the initial set of training data; and more (selecting a proper subset by filtering based on search criteria including a confidence threshold); the system 400 receives 134 from the user device 410 a first selection of the plurality of pre-processing parameters 138; the results of this search 140 comprise the first plurality of training data 142; [0139], it has been found advantageous to have the system 400 send 150 to the user device 410 confidence interval scores related to each such result comprising the first plurality of training data 142 of the MLM) 
However, Woolf in view of Piscitello fails to expressly disclose wherein in response to the confidence score being less than a confidence threshold with respect to a relevance to the search query.  In the same field of endeavor, Dispensa teaches:
in response to the confidence score being less than a confidence threshold with respect to a relevance to the search query (Dispensa Figs. 1-51; [0042], the unstructured data may include representations of thousands or millions of documents, and the search may identify multiple of those documents that the searching system determines to be relevant to the query; [0072], at box 512, the computing system identifies a ranking of the search results that are responsive to the query; the computing system may rank the search results or request that another system provide a ranking; the ranking may be based on a confidence score that the computing system identifies for each of the responsive results (e.g., as calculated by the confidence score generator 129); the confidence score may be calculated based on various factors; one factor may be a degree to which the result (or the resource that is identified by the result, when the result is considered a summarized version of the resource and may include a link to the resource) includes words in the query; another factor may be a degree to which the result has previously been selected by the same user account or other user accounts in response to the same or similar queries; yet another factor may be whether users selected a user interface element to indicate that the result was helpful or not helpful as a response to the query; the computing system may select a subset of the results with the best confidence scores (e.g., the top twenty results with scores that are highest on a range from 0 to 100, in this example); [0073], at box 514, the computing system provides the responsive search results for presentation; [0075], FIG. 6 shows a webpage that includes a list of results to a query of unstructured data, in which the highest-ranking result has a confidence score that exceeds a threshold; suppose that the threshold for causing the top result to appear is 75%; since the top result has a confidence score of 81%, the initial presentation of search results shows the top result as expanded; [0076], the expanded result is a picture that was pulled from a resource that can be viewed by selecting a “view source” link; a user can also indicate whether the result was helpful by selecting the “Yes” or “No” buttons next to the phrase “Was this Answer Relevant?” Selecting “Yes” can cause the computing system to weight the confidence score for the result higher; [0077], FIG. 7 shows a webpage that includes a list of results to an unstructured query, in which the highest-ranking result has a confidence score that does not exceed a threshold; as a result, all of the results are shown in collapsed format)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated in response to the confidence score being less than a confidence threshold with respect to a relevance to the search query as suggested in Dispensa into Woolf in view of Piscitello.  Doing so would be desirable because particular implementations can, in certain instances, realize one or more of the following advantages. The presentation of results that are relevant to the query can be customized based on a type of the result and a confidence score for the result. This customization can include the computing system selectively expanding certain search results (see Dispensa [0006]).  Additionally, displaying the proper subset of the data based on whether a confidence score for a result does or does not exceed a confidence threshold would enable the user to quickly perceive whether a given result is the best result.  If no results are shown to exceed the confidence threshold, the user may wish to try a different query. 
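For illustration only, the threshold behavior cited from Dispensa [0072]-[0077] can be sketched as follows. This is a minimal sketch and not code from any cited reference: the names Result and layout_results are hypothetical, and the 75% threshold and top-twenty cutoff are taken from the examples quoted above.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    confidence: float  # score on a 0-100 scale, as in the cited example


def layout_results(results, expand_threshold=75.0, top_n=20):
    """Rank results by confidence and decide which one, if any, is expanded.

    Only the highest-ranking result can be expanded, and only when its score
    exceeds the threshold; otherwise every result is shown collapsed.
    """
    ranked = sorted(results, key=lambda r: r.confidence, reverse=True)[:top_n]
    return [
        (r.title, index == 0 and r.confidence > expand_threshold)
        for index, r in enumerate(ranked)
    ]


# Top score of 81 exceeds 75, so the first result is expanded (cf. FIG. 6).
above = layout_results([Result("B", 60.0), Result("A", 81.0)])
# Top score of 70 does not exceed 75, so all results are collapsed (cf. FIG. 7).
below = layout_results([Result("B", 60.0), Result("A", 70.0)])
```

Under these assumptions, a display in which no result is expanded signals to the user that no result exceeded the confidence threshold.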

Regarding claim 20, claim 20 contains substantially similar limitations to those found in claim 13.  Consequently, claim 20 is rejected for the same reasons.

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Woolf in view of Piscitello in further view of Clark et al. (US 20160085857 A1, published 03/24/2016), hereinafter Clark.

Regarding claim 14, Woolf in view of Piscitello teaches all the limitations of claim 9, further comprising:
wherein the displaying of the proper subset of the data is in response to a difference between a first confidence score for a first section of the proper subset of the data with respect to its relevance to the search query and a second confidence score for a second section of the proper subset of the data with respect to a relevance to the search query (Woolf Figs. 1-9; [0016], FIG. 7 illustrates an exemplary embodiment of the graphical user interface for the user-directed iterative (UDI) machine learning of the present invention; depicting the interface, ready for validation, of the training data sets subsequent to the first training data set (i.e., 2-“n” times); the confidence scores are readily displayed next to each YES/NO validation selector; [0072], the user elected search criteria is selected from the group consisting of keywords (e.g., including or excluding certain keywords), source content (e.g., company profiles), confidence threshold, and number of desired occurrences (e.g., number of sample paragraphs returned in a paragraph by paragraph analysis); the user selects a set of source training data and a set of pre-processing parameters (i.e., as identified by the user elected search criteria), and the pre-processing parameters are used to convert the set of training data to a set of mathematical matrices suitable for building a machine learning model, i.e., the primary machine learning model; [0077], the model processes the reference subset based on a comparative scoring analysis, e.g., calculating probability accuracy of the predictions of the application of the model to a given set of data, e.g., via classification; the comparative scoring analysis is based on textual scoring, e.g., derived from a machine learning based toolkit for natural language processing (e.g., tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and co-reference resolution); [0078], the product of processing the reference subset within the model is a first training data set, e.g., along with the probability of the accuracy of the predictions based on the model; [0137], the plurality of pre-processing parameters may comprise the data sources, specifically the first selection of sources 118; a level of specificity with which the data of the first selection of sources 118 is to be searched; a confidence threshold; a plurality of guideline keywords to filter the initial set of training data; and more (selecting a proper subset by filtering based on search criteria including a confidence threshold); the system 400 receives 134 from the user device 410 a first selection of the plurality of pre-processing parameters 138; the results of this search 140 comprise the first plurality of training data 142; [0139], it has been found advantageous to have the system 400 send 150 to the user device 410 confidence interval scores related to each such result comprising the first plurality of training data 142 of the MLM)
However, Woolf in view of Piscitello fails to expressly disclose in response to exceeding a confidence difference threshold for a difference between a first confidence score for a first section of the proper subset of the data with respect to its relevance to the search query and a second confidence score for a second section of the proper subset of the data with respect to a relevance to the search query.  In the same field of endeavor, Clark teaches:
in response to exceeding a confidence difference threshold for a difference between a first confidence score for a first section of the proper subset of the data with respect to its relevance to the search query and a second confidence score for a second section of the proper subset of the data with respect to a relevance to the search query (Clark Figs. 1-6; [0003], the plurality of confidence values represents confidence of answers to a query submitted to an answering system; [0011], an answer confidence value may be indicated using any kind of indicator that represents the level of confidence in an answer; [0012], a QA system allows a user to submit a query for answering; the QA system generally returns a number of possible answers that are associated with answer confidence values; grouping answers into buckets makes the returned answers easier to display and interpret; buckets contain a group of answers and are typically associated with one or more threshold values and a descriptive label; the QA system determines which answers to associate with which buckets by comparing the answer confidence values to bucket thresholds; [0045], FIG. 4 depicts a flow diagram illustrating example operations for determining bucket thresholds for a set of answer confidence values based on the size of gaps between answer confidence values; [0048], at block 403, the next smallest answer confidence value is subtracted from the answer confidence value to determine a gap; the gap for the selected answer confidence value is the difference between the answer confidence value and the next smallest answer confidence value; [0050], at block 405, the standard deviation of the gaps is determined; [0052], at block 407, it is determined whether the selected gap is greater than or equal to the standard deviation; if the selected gap is greater than or equal to the standard deviation, control then flows to block 408; [0053], at block 408, the gap is identified to be a cliff; the cliff is a gap for an answer confidence value that is greater than or equal to the standard deviation of all the gaps; [0056], at block 411, the answer confidence values with the largest cliffs are selected to be the bucket thresholds; the number of answer confidence values selected is equal to one less than the number of buckets; [0058], rates of change among the answer confidence values can also be used for determining dynamic thresholds (see Fig. 5 and [0059-0064]); [0069], once the answer confidence values are sorted into buckets, a QA system associates the answers with the buckets based on their associated answer confidence value; the answers are then presented via the QA system; the answers may be presented according to their bucket groupings; in the FIG. 1 example, the answers associated with the answer confidence values 0.15, 0.08, and 0.07 may be presented according to a “not recommended” classification; the “not recommended” classification answers may be presented near the bottom of a user display, in red font, or along with some indication that the answers have low confidence values; the answers associated with the answer confidence values 0.98, 0.94, 0.89, and 0.88 may be presented according to a “preferred” classification; the “preferred” classification answers may be presented near the top of a user display, in green font, or along with some indication that the answers have high confidence values; the “for consideration” answers may be presented in the middle of a user display, in yellow font, or along with some indication that the answers do not have high confidence values but still may be helpful; [0070], the answering system may be a system that hosts a database of predetermined answers and that provides relevant answers in response to specific queries)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated in response to exceeding a confidence difference threshold for a difference between a first confidence score for a first section of the proper subset of the data with respect to its relevance to the search query and a second confidence score for a second section of the proper subset of the data with respect to a relevance to the search query as suggested in Clark into Woolf in view of Piscitello.  Doing so would be desirable because returning the answers and answer confidence values alone may overwhelm a user or lead to misinterpretations of the quality of a returned answer. Grouping answers into buckets makes the returned answers easier to display and interpret.  But using static bucket thresholds alone disregards the relative value of a set of answers. For instance, all answer confidence values may fall into a single bucket when static bucket thresholds are used. A single bucket of answers does not indicate relative confidence with respect to other answers in the bucket. With dynamic bucket thresholds, a QA system can determine bucket thresholds based on the answer confidence values. Since the dynamic bucket thresholds are based on answer confidence values, the QA system can create bucket thresholds that capture the relative confidence of the answers. In addition, using both static and dynamic bucket thresholds allows a QA system to present answers in a manner that captures relative confidence within a framework of a broadly accepted standard of confidence (see Clark [0012]).  
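For illustration only, the gap-based threshold determination cited from Clark [0045]-[0056] can be sketched as follows. This is a minimal sketch and not code from Clark: the function name dynamic_bucket_thresholds is hypothetical, the use of the population standard deviation is an assumption (the cited passages do not state which form is used), and the two middle values 0.55 and 0.50 are hypothetical stand-ins because the cited passages do not list the “for consideration” values.

```python
from statistics import pstdev


def dynamic_bucket_thresholds(confidences, num_buckets=3):
    """Pick bucket thresholds at the largest gaps ("cliffs") between scores."""
    values = sorted(confidences, reverse=True)
    # Gap for each value: the value minus the next smallest value (block 403).
    gaps = [(values[i] - values[i + 1], values[i]) for i in range(len(values) - 1)]
    if not gaps:
        return []
    # Standard deviation of all gaps (block 405); population form is assumed.
    sigma = pstdev([gap for gap, _ in gaps])
    # A gap at least as large as the standard deviation is a cliff (blocks 407-408).
    cliffs = sorted(((gap, value) for gap, value in gaps if gap >= sigma), reverse=True)
    # Keep the values with the largest cliffs, one fewer than the bucket count (block 411).
    return sorted((value for _, value in cliffs[: num_buckets - 1]), reverse=True)


# 0.55 and 0.50 are hypothetical; the remaining values come from the FIG. 1 example.
scores = [0.98, 0.94, 0.89, 0.88, 0.55, 0.50, 0.15, 0.08, 0.07]
thresholds = dynamic_bucket_thresholds(scores)
```

Under these assumptions the two largest gaps fall below 0.88 and below 0.50, so those values serve as the lower bounds of the top two buckets and the remaining scores fall into the lowest bucket.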

Response to Arguments
The Examiner acknowledges the Applicant’s amendments to claims 1, 4, 10, 13-15, 19, and 20.  The corrections to claims 10, 13, 14, 19, and 20 have been approved and the rejections of claims 10, 13, 14, 19, and 20 under 35 U.S.C. 112(b) are respectfully withdrawn.  Claims 6, 7, 10, and 11 remain objected to for lack of antecedent basis.
Regarding independent claim 1, the Applicant alleges that Woolf, as described in the previous Office action, does not teach amended claim 1.  Examiner has therefore rejected independent claim 1 under 35 U.S.C. § 103 as unpatentable over Woolf in view of Piscitello.
Applicant further alleges Woolf fails to disclose “performing a search of the data of the user, using a machine learning model, for the search query to generate a result.”  Paragraph [0137] of Woolf appears to discuss “pre-processing function 196” including “search 400” where “The results of this search 140 comprise the first plurality of training data 142.” and paragraph [0140] of Woolf appears to discuss “The system 400 thereafter carries out a [machine learning model] creation function 197,” and “The system 400 applies 160 the first plurality of selections 158 of the first plurality of training data 142, including but not limited to the user's feedback on which results are accurate, relevant, or desired, and which are not accurate, relevant, or desired, to the first selection of the plurality of pre-processing parameters 138 to create the initial MLM”  (see remarks p. 10).  Examiner respectfully disagrees.
Woolf discloses providing an interface for the user to establish user search criteria, e.g., via request within the interface ([0068]). The user elected search criteria is selected from the group consisting of keywords (e.g., including or excluding certain keywords), source content (e.g., company profiles), confidence threshold, and number of desired occurrences ([0072]).  The establishment of the user elected search criteria is used to create a primary machine learning model.  The user elected search criteria is established by the act of requesting search criteria entry from a user, e.g., defining pre-processing parameters ([0073]).  The primary machine learning model is trained, or presented, with one or more sets of training data from a database.  The model processes the reference subset based on a comparative scoring analysis, e.g., calculating probability accuracy of the predictions of the application of the model ([0077]).  The system 400 receives 114 a first selection of sources 118 from the user device 410.  The system 400 thereafter sends 120 a request to the user device 410 for a plurality of initial search criteria 128 ([0136]).  The system 400 receives 134 from the user device 410 a first selection of the plurality of pre-processing parameters 138. Later, the system 400 searches 140 the plurality of databases ([0137]).  The machine learning model is refined by iterating step 140 ([0141]).  The MLM may iterate a plurality of steps 130, 134, 140, 150, and/or 154, to i) obtain a next plurality of processing parameters, and/or ii) search for a next set of search criteria, and/or iii) send an output 152 of the MLM to the user device 410 ([0142] and Figs. 5-9).  As disclosed by Woolf, the machine learning model is used to perform a search of the data of the user for the search query to generate a result.
Applicant further alleges Woolf fails to disclose (ii) "selecting a proper subset of the data to be provided to the user based on the confidence score" as alleged. E.g., paragraph [0139] of Woolf appears to discuss "In certain embodiments, it has been found advantageous to have the system 400 send 150 to the user device 410 confidence interval scores related to each such result comprising the first plurality of training data 142 of the MLM."  The Office action references FIG. 6 of Woolf, but the Detailed Description thereof does not appear to discuss FIG. 6 and the text in FIG. 6 of the publication is substantially illegible (see remarks pp. 10-11).
Examiner respectfully disagrees.  Woolf discloses that the user elected search criteria include confidence thresholds ([0072]).  The plurality of pre-processing parameters may comprise a confidence threshold (see [0137] and Figs. 5-7, ‘Threshold %’ entry box).  The system 400 receives 134 from the user device 410 a first selection of the plurality of pre-processing parameters 138.  The system 400 searches 140 the plurality of databases 470 or other locations, including but not limited to sources accessible via the network 420, containing or comprising the first selection of sources 118 for the plurality of initial search criteria 128 ([0137]).  The system sends 150 to the user confidence interval scores related to each result ([0139]).  The MLM may iterate a plurality of steps 130, 134, 140, 150, and/or 154, to i) obtain a next plurality of processing parameters, and/or ii) search for a next set of search criteria, and/or iii) send an output 152 of the MLM to the user device 410 ([0142] and Figs. 5-9).  After refining 162 the MLM, the system 400 may apply 170 a new selection of pre-processing parameters 178 to modify the MLM, including but not limited to changing the confidence interval ([0143]).  Confidence scores are displayed next to each result in Fig. 7 ([0016]).  As disclosed by Woolf, a subset of search results is generated and provided to the user based on the confidence score.  Piscitello is cited to clarify selecting a proper subset of the data to be provided to the user based on the confidence score and visually emphasizing based on the confidence score.  
Similar arguments have been presented for claims 4 and 15 and thus, Applicant’s arguments are not persuasive for the same reasons.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Liao (US 10585927 B1, published 03/10/2020) see col. 12 [line 63] – col. 13 [line 26] and Figs. 1-8.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHN T REPSHER III whose telephone number is (571)272-7487. The examiner can normally be reached Monday - Friday, 8AM-5PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Welch, can be reached at (571) 272-7212. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JOHN T REPSHER III/            Primary Examiner, Art Unit 2143