DETAILED ACTION
This Final Office Action is responsive to Applicant’s Amendment filed on 04 May 2020 in which claims 1 and 13-14 were amended.
Claims 1-25 are currently pending and under examination, of which claims 1 and 13-14 are independent claims. No claims are currently in condition for allowance.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
As required by M.P.E.P. 609(c), the applicant’s submission of the Information Disclosure Statement dated 05/04/2021 is acknowledged by the examiner, and the cited references have been considered in the examination of the claims now pending. As required by M.P.E.P. 609 C(2), a copy of the PTOL-1449 initialed and dated by the examiner is attached to the instant office action.

Response to Arguments
Examiner thanks Applicant’s Attorney Ryan McCormick for the interview of 05/27/2021 discussing the functionality of the claimed subject matter. While no specific agreement was reached, the discussion took the present amendment into consideration and was conducive to mutual understanding in the interest of prosecution.
Applicant’s amendments to independent claims 1 and 13-14 are given further consideration with respect to matters carried forward as they pertain to Double Patenting, 101 Eligibility, 112 Indefiniteness, and 103 Obviousness, as follows. Detailed rejections are updated in the body of the action.
With regard to prior art as applicable to the rejection under 35 USC 103, applicant notes the amendments as distinguishing from the reference Christos. Respectfully, examiner disagrees. The recitation of claimed elements to include a first and a second subset of values is taught by Christos as “D(x,θ)”. Illustratively, Figs 4-5 graph the subspaces/subsets for convex optimization, lending the term “convex subset” or “convex subspace”. In the context of the reference as a whole, Christos offers a regression estimator that considers a Gaussian distribution with expected quantization error, conveying a regressive quantile estimator, see [P.562 RtCol]. Examiner provides the reference of Brownlee to assist with background terminology for the Gaussian distribution. Additionally, the training of Christos provides an iterative framework for training on tuples/pairs and further generates a synthetic set [P.567 RtCol]. Accordingly, examiner is not persuaded by the arguments and maintains the rejection under 35 USC 103.
With regard to indefinite subject matter as applicable to the rejection under 35 USC 112(b), applicant notes the amendments as addressing the issue. Examiner is not persuaded for the following reasons. Initially, the remarks note [P.10 of 17 ¶2] “Applicants submit that these features of the variable element are clearly inherent”. Inherency, while not typically applied to 112, is noted by MPEP 2163.00(b) with regard to written description: “Inherency, however, may not be established by probabilities”. The claimed subject matter particularly relates to probabilistic techniques, namely variance in training of a neural network. However, the very nature of this rejection is directed to resolving the clarity of this variance, since the specification discloses no exact calculation or update function in particular. Rather, the variance is set forth with regard to a variable of a query. A query is simply a question set forth by a user, i.e., user-defined. Therefore, a variance of something which is user-defined could itself remain user-defined under the broadest reasonable interpretation. While any particular computed value may be smaller than or different from another value, or from a “potential value” (which could potentially not be a value at all), the criticality of range (MPEP 2144.05) only appears to convey known functions for a distribution such as mean, standard deviation, variance, etc. The remarks continue [P.10 of 17 ¶3] that a PHOSITA would “understand, with the appropriate degree of certainty, what a variable element is and how to determine a variance with respect to such a variable element”. Respectfully, a variable by definition carries uncertainty, as the root of the word is to vary. The very fact that this goes to the central concept of the claimed invention suggests that it is not so readily understood with certainty. Simply, the variance lacks clarity in the context of the application as a whole and is only further detailed by elements which are said to be inherent.
Accordingly, examiner is not persuaded and maintains rejection under 35 USC 112(b).
With regard to eligibility of the abstract idea as applicable to the rejection under 35 USC 101, applicant’s amendments to independent claims 1 and 13-14 and the remarks dated 05/04/2021 are considered. Of particular traversal, applicant notes a practical application as an improvement to the functioning of a computer by way of reducing bias. Specifically, the remarks state [P.8 of 17 ¶2] “accounting for potential bias in variance determination provides for the implementation of variance determination processes with lower bias than subset-based variance determination”. However, from the amendments it is clear that the technique is subset-based, i.e., “first subset of potential values… second subset of potential values”. Not only is the technique explicitly set forth as subset-based, it is merely hypothetical/potential. To the point, the variable of a query might be none other than a preference of a color palette. The additional recitation of subsets further does not integrate the exception into a practical application or amount to significantly more because, as applicant points out, the subsets are inherent in variance and thus considered part of the abstract idea. Accordingly, examiner is not persuaded and maintains the rejection under 35 USC 101 as being directed to an abstract idea without significantly more.
The rejection over Double Patenting is updated to reflect present claim status.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1 and 14 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-19 of copending Application No. 16/717,251. Although the claims at issue are not identical, they are not patentably distinct from each other because both applications are directed to neural network training of query based on corpus. Claims are re-drafted but arrive at the same functionality. This is a Provisional Nonstatutory Double Patenting rejection because the patentably indistinct claims have not in fact been patented. See the following comparison table: 
Instant Application: 15/858,936
Claim 1.
A method for generating training sets for training neural networks, comprising:
receiving a plurality of query pairs, wherein each of the plurality of query pairs includes a query and a real result previously determined for the query;
determining at least one variable element of each query in the plurality of received query pairs;
determining a variance for the at least determined variable element of each query in the plurality of received query pairs; and
generating a training set based on the determined variable element, the determined variance, and the previously determined real result.
Amendment:
wherein the at least one variable element indicates a variable and a first subset of potential values for the variable, and wherein the variance includes at least one second subset of potential values for the variable, wherein each of the at least one second subset of potential values for the variable is different from the first subset of potential values for the variable

Copending Application: 16/717,251
Claim 1.
A method for generating training sets for training neural networks, comprising:
determining a segmentation based on a column from a columnar database table;
generating a group-by query based on the segmentation;
generating a plurality of reduced queries based on the group-by query;
executing the group-by query on a table of a database to obtain a result table, wherein the result table includes a plurality of results, wherein each result corresponds to a respective reduced query of the plurality of reduced queries; and
generating a plurality of training query pairs by pairing each reduced query with its corresponding reduced result.
Claim 7.
The method of claim 6, wherein generating each training query further comprises:
determining a variable element of a query of the set of queries; and
determining a variance of the variable element, wherein the training query is generated based on the determined variable element and the determined variance.
Claim 8.
The method of claim 1, further comprising: training a neural network at least partially using the plurality of training query pairs.
Claim 9.
The method of claim 8, wherein the neural network is further trained when a predicted result generated by the neural network differs from a real result generated based on a dataset above a threshold, further comprising:
generating an updated predicted result based on the predicted result and the real result, wherein the updated predicted result is utilized as a training input to the neural network.
Claim 4.
determining a distribution of values

Instant Application: 15/858,936
Claim 14.
A system for generating approximations of query results, comprising:
a processing circuitry; and
a memory, the memory containing instructions that, when executed by the processing circuitry, configure the system to:
receive a plurality of query pairs, wherein each of the plurality of query pairs includes a query and a real result previously determined for the query;
determine at least one variable element of each query in the plurality of received query pairs;
determine a variance for the at least determined variable element of each query in the plurality of received query pairs; and
generate a training set based on the determined variable element, the determined variance, and the previously determined real result.
Amendment:
wherein the at least one variable element indicates a variable and a first subset of potential values for the variable, and wherein the variance includes at least one second subset of potential values for the variable, wherein each of the at least one second subset of potential values for the variable is different from the first subset of potential values for the variable

Copending Application: 16/717,251
Claim 11.
A system for generating training sets for training neural networks, comprising:
a processing circuitry; and
a memory, the memory containing instructions that, when executed by the processing circuitry, configure the system to:
determine a segmentation based on a column from a columnar database table;
generate a group-by query based on the segmentation;
generate a plurality of reduced queries based on the group-by query;
execute the group-by query on a table of a database to obtain a result table, wherein the result table includes a plurality of results, wherein each result corresponds to a respective reduced query of the plurality of reduced queries; and
generate a plurality of training query pairs by pairing each reduced query with its corresponding reduced result.
Claim 17.
The system of claim 16, wherein the system is further configured to:
determine a variable element of a query of the set of queries; and
determine a variance of the variable element, wherein the training query is generated based on the determined variable element and the determined variance.
Claim 18.
The system of claim 11, wherein the system is further configured to:
train a neural network at least partially using the plurality of training query pairs.
Claim 19.
The system of claim 18, wherein the neural network is further trained when a predicted result generated by the neural network differs from a real result generated based on a dataset above a threshold, wherein the system is further configured to:
generate an updated predicted result based on the predicted result and the real result, wherein the updated predicted result is utilized as a training input to the neural network.
Claim 14.
determining a distribution of values

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-25 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. In determining whether the claims are subject matter eligible, the Examiner applies the MPEP as updated to include matter disclosed by 2019 USPTO Patent Eligibility Guidelines. (2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50, Jan. 7, 2019.)
Step 1: Is the claim to a process, machine, manufacture, or composition of matter? Yes—all claims fit into one of the four statutory categories: claims 1-12 are process/method, claim 13 is an article of manufacture/CRM, claims 14-25 are a machine/system.
Step 2A, prong one: Does the claim recite an abstract idea, law of nature or natural phenomenon? Yes—the claimed limitations are directed to an abstract idea as generating an updated training set with determined variance and variable for use in information retrieval (QA). Specifically, all independent claims recite:
receiving a plurality of query pairs, wherein each of the plurality of query pairs includes a query and a real result previously determined for the query;
determining at least one variable element of each query in the plurality of received query pairs, wherein the at least one variable element indicates a variable and a first subset of potential values for the variable;
determining a variance for the at least determined variable element of each query in the plurality of received query pairs, wherein the variance includes at least one second subset of potential values for the variable, wherein each of the at least one second subset of potential values for the variable is different from the first subset of potential values for the variable; and
generating a training set based on the determined variable element, the determined variance, and the previously determined real result.
The limitations, as drafted, recite a process that, under its broadest reasonable interpretation, covers performance of mathematical calculation (determine variance & variable) to update training. When a claim encompasses functionality including mathematical calculation, e.g., calculating a variance, it falls within the mathematical concepts grouping of abstract ideas.
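For illustration only, the four recited steps may be read as the short procedure below. This is a minimal sketch of one possible reading of the claim language, not the applicant's disclosed implementation. The names `QueryPair`, `variable_element`, `variance_subsets`, and `generate_training_set`, the example variable `region_id`, and the choice to form each second subset by swapping a single value are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class QueryPair:
    """A query and the real result previously determined for it (hypothetical shape)."""
    variable: str               # the variable the query constrains
    values: FrozenSet[int]      # first subset of potential values named by the query
    real_result: float          # result previously obtained for the query


def variable_element(pair: QueryPair) -> Tuple[str, FrozenSet[int]]:
    """Step 2: the variable together with its first subset of potential values."""
    return pair.variable, pair.values


def variance_subsets(first: FrozenSet[int], domain: FrozenSet[int]) -> List[FrozenSet[int]]:
    """Step 3 (assumed reading): build second subsets by swapping one value of the
    first subset for one unused value of the domain, so each differs from the first."""
    subsets = []
    for drop in sorted(first):
        for add in sorted(domain - first):
            subsets.append((first - {drop}) | {add})
    return subsets


def generate_training_set(
    pairs: List[QueryPair], domain: FrozenSet[int]
) -> List[Tuple[str, FrozenSet[int], FrozenSet[int], float]]:
    """Step 4: one record per (variable element, second subset, real result)."""
    training = []
    for pair in pairs:
        name, first = variable_element(pair)
        for second in variance_subsets(first, domain):
            training.append((name, first, second, pair.real_result))
    return training


# Step 1: a single received query pair with hypothetical values.
pairs = [QueryPair("region_id", frozenset({1, 2}), 150.0)]
records = generate_training_set(pairs, frozenset({1, 2, 3}))
```

Under these assumptions the sketch involves only set manipulation and bookkeeping over values already present in the query pairs.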
Step 2A, prong two: Does the claim recite additional elements that integrate the judicial exception into a practical application? No—the judicial exception is not integrated into a practical application. Although the claims comprise functionality which is to be used for training a neural network that processes query and result pairs, this amounts to mere instructions to apply an exception. As set forth in MPEP 2106.05(f), “claim limitations that attempt to cover any solution to an identified problem with no restriction on how the result is accomplished and no description of the mechanism for accomplishing the result do not integrate a judicial exception in to a practical application or provide significantly more”. The claim, when viewed as a whole or as an ordered combination, does not amount to more than the idea of a solution and fails to provide technical detail as to how one arrives at the solution. The generality of the claim further does not reveal any particular transformation or provide evidence of improving the functioning of a computer. While applicant may arguendo point to specification portions relating to “advantages to speeding up the process the process are clear” [0005] and “require less computational resources” [0071], such statements are pervasive in the computer arts and do not find strong association with the claimed functionalities. In fact, the additional calculations of computed variance would likely increase the computational burden, in direct contrast to the disclosure.
Finally, examiner notes the tone of the specification in particular [0008] “This summary is not an extensive overview of all contemplated embodiments, and is intended neither to identify key or critical elements of all embodiments nor to delineate the scope of any or all aspects” and [0075] “it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future” both of which statements underscore the abstract nature of the disclosure.
Step 2B: Does the claim recite additional elements that amount to significantly more than the judicial exception? No—the additional elements as identified do not amount to significantly more as noted previously. The claim as a whole merely describes how to “apply” the concept of a mathematical calculation to neural network training. There is nothing in the claim which amounts to significantly more (i.e., an inventive concept) than the abstract idea.
For the reasons above, claim 1 is rejected as being directed to non-patentable subject matter under §101. This rejection applies equally to independent claims 13-14, which recite a computer-readable medium and a system, respectively, as well as to dependent claims 2-12 and 15-25. While the system and computer readable medium include additional elements of “processing circuitry” and “memory” to perform the method of claim 1, these elements amount to generic computer components. The claim thus recites computing components only at a high level of generality such that it amounts to no more than mere instructions to apply the exception using generic computer components.
Taken alone, their additional elements do not amount to significantly more than the above-identified judicial exception (the abstract idea). Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements taken individually. There is no indication that the combination of elements improves the functioning of a computer or improves any other technology. Their collective functions merely provide conventional computer implementation.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 1-25 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention. In particular, independent claims 1, 13, and 14 recite “determining a variance for the at least determined variable element”. While computing a variance is understood, here it depends upon a “variable element” whose meaning is not clear. In reviewing the specification, said variable element is not specifically defined but rather given by way of an exemplary embodiment, i.e., “sales” at paragraph [0057]. That is, the “variable element” appears to be conveyed as a categorical label. How one computes the variance of a label (as opposed to a variable, such as a dependent or independent variable) is not readily apparent to one of ordinary skill in the art. The examiner interprets “variable element” as any element upon which variance is computed, likely intended to mean a dependent variable. There does not appear to be any technical detail assisting the reader as to how one arrives at the result. Therefore, it cannot be determined with certainty what the variance is computed with respect to. Accordingly, claims 1, 13, and 14 are held indefinite under 35 USC 112(b). Remaining dependent claims fail to cure the deficiency and inherit the rejection as being indefinite.
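The distinction drawn above between a numeric variable and a categorical label can be illustrated with a minimal sketch. It assumes that “variance” carries its ordinary statistical meaning (here, population variance). The helper name `variance_or_none` and the sample values are hypothetical and are not taken from the specification.

```python
from numbers import Real
from statistics import pvariance
from typing import Optional, Sequence


def variance_or_none(values: Sequence[object]) -> Optional[float]:
    """Return the population variance when every value is a real number;
    return None otherwise, since a variance of labels is not defined."""
    if not values:
        return None
    # bool is a subclass of int, so exclude it explicitly.
    if not all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
        return None
    return float(pvariance(values))


numeric_values = [10.0, 12.0, 14.0]      # a numeric variable: variance is defined
label_values = ["sales", "returns"]      # categorical labels: nothing to compute

numeric_variance = variance_or_none(numeric_values)
label_variance = variance_or_none(label_values)
```

In this sketch the numeric input yields a value while the label input yields none, which is the gap the rejection identifies when the only example given for the “variable element” is a label such as “sales”.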

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-6 and 13-19 are rejected under 35 U.S.C. 103 as being unpatentable over:  
Anagnostopoulos and Triantafillou, “Efficient Scalable Accurate Regression Queries in In-DBMS Analytics”, hereinafter Christos, in view of
Wang et al., “IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models”, hereinafter Wang.
With respect to claim 1, Christos teaches:
A method for generating training sets for training neural networks {Christos [P.561 Sect.C ¶1] “methodology” for illustrated system Fig 2 and detailed algorithms to [P.570 Last¶] “train novel models which predict future query results… via multiple local linear regression models” and [P.567 Last¶] “generate training files”}, comprising: 
receiving a plurality of query pairs {Christos [P.562 RtCol ¶3] “continuous stream {(q1,y1), …, (qt,yt)} of pairs (query, answer) through the interactions between the users and the system”}, wherein each of the plurality of query pairs includes a query and a real result previously determined for the query {Christos [P.562 RtCol ¶2] “result y of a query q”. Further, [P.570 Sect.VII] “previously executed queries are exploited to train novel models which predict future query results” again noted at [P.560 RtCol ¶1]};


determining at least one variable element of each query in the plurality of received query pairs {Christos [P.562 Sect.A ¶1-4] “dependent variable y… y can be decomposed into (i) a conditional expectation function E[y|x,θ], hereinafter referred to as regression function, which is explained by x and θ” where [P.562 Sect.A RtCol ¶2] “Each query provides information to locally learn the dependency between u and x” and [P.562 Sect.A RtCol] “The result y of a query q refers to the best regression estimator”} wherein the at least one variable element indicates a variable and a first subset of potential values for the variable; {Christos [P.562 Sect.A ¶3] “value for y over the subspace D(x,θ)” i.e., [P.561 Sect.C ¶1] “Queries are exploited to partition the queried data space into subspaces”}


determining a variance for the at least determined variable element of each query in the plurality of received query pairs {Christos [P.561 Sect.II ¶2] “variance Var(ϵ) = σ² > 0” each query detailed [P.567-68 PgBrk] “For each query, θ ~ N(μ_θ, σ_θ²) is generated from Gaussian distribution with mean μ_θ, variance σ_θ²”; [P.567 LeftCol ¶3] “variance of the dependent variable”} wherein the variance includes at least one second subset of potential values for the variable, wherein each of the at least one second subset of potential values for the variable is different from the first subset of potential values for the variable {Christos [P.561 Sect.C] “different subspaces” where [P.563 ¶2] “The total number K of such query subspaces depends of the desired approximation (goodness of fit)”}; and
generating a training set based on the determined variable element, the determined variance, and the previously determined real result {Christos [P.565 Sect.V] “training (query-response) pairs” [P.567 Last¶] “We generate training files T… respectively, of pairs (q,y) over the R1 and R2 (Figure 2)” as described [P.567 RtCol] “R2 synthetic dataset” and Figure 2 caption notes “Our model learns from past queries T and predicts future query results” using iterative repeat loop of training Alg 1. Additionally, training comprises effort noted [P.570 LeftCol] “training set size |T|” size of training wrt distribution, see Figs 6 and 13-14. The result is [P.562 RtCol] “best regression estimator” with “Expected Quantization Error (EQE)” i.e., regressive quantile estimator}.  
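The Gaussian generation of θ and the (q, y) training pairs cited above can be pictured with the following sketch. It is not the procedure of Christos. It assumes a one-dimensional data space, a query q = (x, θ) read as a center and a radius, and an answer y equal to the mean of the dependent values falling inside that radius. The function name `make_training_pairs`, the parameter names, and the toy data are hypothetical.

```python
import random
from statistics import fmean
from typing import List, Tuple

Point = Tuple[float, float]                      # (x_i, u_i): input and dependent value
TrainingPair = Tuple[Tuple[float, float], float]  # ((x, theta), y)


def make_training_pairs(
    data: List[Point],
    n_queries: int,
    mu_theta: float,
    sigma_theta: float,
    seed: int = 0,
) -> List[TrainingPair]:
    """Draw query radii from a Gaussian and pair each query with its answer."""
    rng = random.Random(seed)
    pairs: List[TrainingPair] = []
    for _ in range(n_queries):
        x = rng.uniform(0.0, 1.0)                       # query center (assumed uniform)
        theta = abs(rng.gauss(mu_theta, sigma_theta))   # radius drawn from N(mu, sigma^2)
        # Data subspace selected by the query: points within theta of x.
        subspace = [u for (xi, u) in data if abs(xi - x) <= theta]
        if subspace:                                    # skip queries that select nothing
            pairs.append(((x, theta), fmean(subspace)))
    return pairs


# Toy data on a grid over [0, 1] with u = 2x + 1 (hypothetical).
data = [(i / 100, 2 * (i / 100) + 1) for i in range(101)]
training_pairs = make_training_pairs(data, n_queries=50, mu_theta=0.1, sigma_theta=0.02)
```

Each emitted pair has the (query, answer) shape that the reference describes as the input to training.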


However, Christos does not expressly state that the trained predictive model is of the model class “neural network”, which is found throughout the art, for example in the work of Wang. Wang teaches neural networks, particularly generative adversarial networks, for QA/IR (question answering, commonly termed information retrieval). One having ordinary skill in the art would have considered it obvious prior to the effective filing date to implement the method and system of Christos utilizing neural networks as disclosed by Wang as an obvious variant in choosing from a finite number of known predictive model classes to train with a reasonable expectation of success. Further benefit would arise because of the explicit use of a bias term (Wang [Sect3.1-3.2]) and/or “in order to reduce variance” (Wang [¶below Alg.1]).
Finally, examiner notes reference Zuccon et al., “Query Variations and their Effect on Comparing Information Retrieval System”, which is not relied upon in rejection of any claimed limitations, but solely noted for convenience of applicant to reflect preferred embodiment of the instant specification [0057], see Zuccon [P.693 Sect.3] mean variance analysis wrt finance.

With respect to claim 2, the combination of Christos and Wang teaches the method of claim 1, further comprising:
determining at least one predicate for each of the plurality of query pairs, wherein a predicate is an expression used to determine if the query will return any one of: a true result and a false result {Christos [P.562 ¶3] “boolean indicator A(q,q’) ϵ {TRUE, FALSE}” per equation (9) [P.565 RtCol]}.  

With respect to claim 3, the combination of Christos and Wang teaches the method of claim 1, wherein
the previously determined real result is determined by querying at least one data set using the respective query {Christos [P.570 Sect.VII ¶1] “previously executed queries are exploited to train novel models which predict future query results” with illustrated databases Fig 2. See also [P.565 Sect.V ¶1]; [P.560 RtCol ¶1]. Examiner notes “the respective query” as respective in pairs, i.e., (q,y)}.  

With respect to claim 4, the combination of Christos and Wang teaches the method of claim 3, further comprising:
determining if the generated training set is representative of the at least one data set {Christos [P.567 ¶1] “Goodness-of-fit describes how well a model fits a set of observations, which were provided in the model’s training phase” details performance evaluation. See also [P.568 ¶2] “We train our model with T and evaluate and compare it with the ground truths”}.  

With respect to claim 5, the combination of Christos and Wang teaches the method of claim 4, wherein determining if the generated training set is representative of the data set {claim 4} further comprises at least one of: 
determining if the query pairs are directed to all portions of the data set and determining if the query pairs are directed to a number of portions of the data set above a predetermined threshold {Christos [P.569 ColBrk] “We examine the number of training pairs, |T|, our method requires to reach the termination threshold”; [P.564 RtCol ¶2] “ρ represents a threshold”}.  
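The threshold test recited in claim 5 can be pictured with the following sketch. It is an assumed reading for illustration only: the data set is taken to be a one-dimensional range split into equal-width portions, and each query is taken to be an interval over that range. The names `covered_portions` and `is_representative` are hypothetical, and neither the claim nor Christos specifies this partitioning.

```python
from typing import Iterable, Set, Tuple


def covered_portions(
    queries: Iterable[Tuple[float, float]], n_portions: int, lo: float, hi: float
) -> Set[int]:
    """Indices of the equal-width portions of [lo, hi] touched by at least one query."""
    width = (hi - lo) / n_portions
    covered: Set[int] = set()
    for q_lo, q_hi in queries:
        first = max(0, int((q_lo - lo) // width))
        last = min(n_portions - 1, int((q_hi - lo) // width))
        covered.update(range(first, last + 1))
    return covered


def is_representative(
    queries: Iterable[Tuple[float, float]],
    n_portions: int,
    lo: float,
    hi: float,
    threshold: int,
) -> bool:
    """True when the number of portions reached by the queries is above the threshold."""
    return len(covered_portions(queries, n_portions, lo, hi)) > threshold


# Two hypothetical queries over a data range of [0, 100] split into 4 portions.
queries = [(0.0, 24.0), (60.0, 70.0)]
portions = covered_portions(queries, n_portions=4, lo=0.0, hi=100.0)
```

With these inputs the queries reach portions 0 and 2, so the training set counts as representative for a threshold of 1 but not for a threshold of 2.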

With respect to claim 6, the combination of Christos and Wang teaches the method of claim 1, wherein
the variance of the query is configured to accommodate for potential bias within a determined real result {Christos [P.561 Sect.II ¶2] “b0 is the intercept” is potential bias accommodated to variance through regression function}.  

With respect to claim 13, Christos teaches:
A non-transitory computer readable medium having stored thereon instructions for causing a processing circuitry to perform a process {Christos [P.567 RtCol ¶2] “server with 2x Intel Xeon E5645, RAM 96 GB, HD: Seagate Constellation 1TB, 32MB cache”, [P.569 ¶3] “implemented over Matlab (with all data in memory)” beneficially [P.567 ¶1] “yielding up to 6 orders of magnitude faster query execution”}, the process comprising: 
The remainder of this claim is rejected for the same rationale as claim 1.

With respect to claim 14, Christos teaches:
A system for generating approximations of query results {Christos [P.560 RtCol ¶2] “In Fig. 2 we show the system context within which our contributions unfold” illustrated query response environment}, comprising: 
a processing circuitry; and a memory, the memory containing instructions that, when executed by the processing circuitry {Christos [P.567 RtCol ¶2] “server with 2x Intel Xeon E5645, RAM 96 GB, HD: Seagate Constellation 1TB, 32MB cache”, [P.569 ¶3] “implemented over Matlab” beneficially [P.567 ¶1] “yielding up to 6 orders of magnitude faster query execution”}, configure the system to: 
The remainder of this claim is rejected for the same rationale as claim 1.

Claims 15-19 are rejected for the same rationale as claims 2-6, respectively.

Claims 7-8 and 20-21 are rejected under 35 U.S.C. 103 as being unpatentable over Christos and Wang in view of Tsatsin et al., US PG Pub No 20170357896A1, hereinafter Tsatsin.
With respect to claim 7, the combination of Christos and Wang teaches the method of claim 1. Tsatsin teaches further comprising:
providing the generated training sets to at least one neural network of a plurality of neural networks {Tsatsin Figs 1-2, 13 neural networks illustrated, comprises [0046] “generating training data for training neural networks”}.  
	Tsatsin is directed to training predictive models for query processing thus being analogous. A person having ordinary skill in the art would have considered it obvious prior to the effective filing date to provide training data disclosed by Christos to neural networks as disclosed by Tsatsin “because neural networks are often trained with regularizers that, for example penalize and/or adjust for larger weights… These regularizers are added to prevent overfitting” (Tsatsin [0092]) and/or because “more comprehensive training data allows for the creation of a better (e.g., more accurate or realistic) model, because the model is only as ‘smart’ as the data that was used for training” (Tsatsin [0126]).

With respect to claim 8, the combination of Christos, Wang, and Tsatsin teaches the method of claim 7, wherein
the training sets are vectorized to a matrix representation configured to be fed to input neurons of the neural network {Tsatsin [0115] “embedding (e.g., a vector or matrix representation)” to train [0044] “input nodes (neurons) of the neural network” as [0153] “training data to be eventually fed into the neural network or model”}.  A PHOSITA would be motivated to utilize the matrix representation embedding of Tsatsin in combination with Christos “In order to achieve scalable and fast search performance indexing” (Tsatsin [0076]).

Claims 20-21 are rejected for the same rationale as claims 7-8, respectively.

Claims 9 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Christos and Wang in view of Melucci, Massimo, “Impact of Query Sample Selection Bias on Information Retrieval System Ranking”, hereinafter Melucci.
With respect to claim 9, the combination of Christos and Wang teaches the method of claim 1, further comprising:
continuously generating a plurality of test queries; and ceasing to generate test queries when the plurality of generated test queries is equal to a representative sample size of the data set {Christos [P.567 Last¶] “We generate training files T and different testing files V (for predictions) of various sizes” wherein V is test query and Fig 8 illustrates RMSE v. testing pairs (|V| size), over continuous query stream [P.562 RtCol ¶3] as well as stopping conditions for algorithm at [P.564 Last2¶ - P.565 ¶2]}. 
However, Christos does not clearly link the stop condition to the test queries. This deficiency is cured by Melucci, which discloses [P.344 RtCol ¶2] “set of test queries is designed” so as to [P.344 LeftCol ¶2-3] “provide useful guidelines for the most effective query sample size” in avoiding “sample selection bias”. See also [P.349 Last¶] “minimum sample size that guarantees probability of error below a given threshold”.
Melucci is directed to training of predictive models for query processing with variance thus being analogous. A person having ordinary skill in the art would have considered it obvious prior to the effective filing date to implement the test query technique of Melucci in combination with Christos in order to address query sample selection bias (Melucci [P.344]). Finally, examiner notes that the instant specification is silent as to the key terms of the claim; i.e., “test queries”, “sample size”, and “ceasing” are afforded no discussion.

Claim 22 is rejected for the same rationale as claim 9.

Claims 10-11 and 23-24 are rejected under 35 U.S.C. 103 as being unpatentable over Christos and Wang in view of Corvinelli et al., US Patent No 10,706,354B2, hereinafter Corvinelli.
With respect to claim 10, the combination of Christos and Wang teaches the method of claim 1. Corvinelli teaches wherein 
a first subset of query pairs corresponds to a first column of the data set {Corvinelli [Col7 Lines58-67] “first column in a database may be referred to as COL1… index with a key including a first column” describing automatic query processing, see also NN iterative algorithm [Col15 Lines5-18] and Figs 2-3}.  
	Corvinelli is directed to training of predictive models for query processing thus being analogous. A person having ordinary skill in the art would have considered it obvious prior to the effective filing date to address the “query vectorial space” of Christos [P.562 ¶1] according to the column-wise indexing as disclosed by Corvinelli because “An index may be a data structure that may improve the speed of data retrieval operations” (Corvinelli [Col6 Lines39-52]).

With respect to claim 11, the combination of Christos, Wang, and Corvinelli teaches the method of claim 10, wherein 
a second subset of query pairs corresponds to the first column and a second column of the data set {Corvinelli [Col7 Lines58-67] “COL2… delimit the end ranges of an index with a key including a first column and a second column (COL1, COL2)”}, and a first plurality of test queries is generated based on the first subset of query pairs, and a second plurality of test queries is generated based on the second subset {Corvinelli [Col11 Line24 – Col12 Line7] details column data distribution where “upper bound neural network 208 may have been trained by executing test queries… lower bound neural network 210 may have been trained by executing test queries with various predicates to determine known resulting cardinalities for the training data” emphasis test queries where columns distributed to separate NN}.  

Claims 23-24 are rejected for the same rationale as claims 10-11, respectively.

Claims 12 and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Christos, Wang, and Corvinelli in view of Yang et al., “A Variance Maximization Criterion for Active Learning”, hereinafter Yang.
With respect to claim 12, the combination of Christos, Wang, and Corvinelli teaches the method of claim 11. Yang teaches wherein 
the second plurality of test queries is generated first, and the first plurality of test queries is generated in response to the size of the second plurality of test queries being less than the sample size for the variance of the first column {Yang Figs 1 and 2 illustrate row by column matrix operations whereby combined variances are computed as detailed Table 3 and equation of Sect3.3 for Algo 1. Furthermore, [Sect5.3] “We use L2 regularized logistic regression” teaches correlation between variables as corresponding functionality}.  
Yang is directed to training predictive models for query processing thus being analogous. A person having ordinary skill in the art would have considered it obvious prior to the effective filing date to process the test queries disclosed by Corvinelli according to the variance computation of Yang as applying a known technique to a known method to yield predictable results and/or because “More important than the extension to the batch setting and the computational speed is that we at all have a criterion that can give us good active learning performance” (Yang [Last¶]) such that “By fusing these variances, MVAL is able to select the instances which are both informative and representative” (Yang [Abstract]). Finally, examiner again notes lack of support for any key terminology of the claim in reviewing the instant specification.

Claim 25 is rejected for the same rationale as claim 12.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Chase P Hinckley whose telephone number is (571)272-7935.  The examiner can normally be reached on M-F 9:00 - 5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda M. Huang, can be reached at 571-270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/CHASE P. HINCKLEY/Examiner, Art Unit 2124
/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124