DETAILED ACTION

Claims 1-20 are pending in this Office action.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Claim Rejections - 35 USC § 102
4.	The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.


5.	Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Yingyu CHEN (US-20140358890-A1).
	As per claim 1, CHEN teaches “a computer-implemented method for selective preprocessing of data distributed across a plurality of data sources, the method comprising”:

“for each question from the set of questions, generating an execution plan, the execution plan comprising instructions for processing data stored in one or more data sources,” ([0042]-[0043], [0045], [0047], [0049]-[0050]);
“identifying a set of fields from the plurality of data sources, the set of fields storing data processed by execution plans of the set of questions,” ([0042]-[0043], [0045], [0047], [0049]-[0050]);
“for each of the set of fields, determining a quality score for the field, the quality score indicative of a quality of data stored in the field,” ([0042]-[0043], [0045], [0047], [0049]-[0050]);
“identifying a field of a data source from the set of fields having a quality score indicating that a quality of data stored in the field is below a threshold level,” ([0042]-[0043], [0045], [0047], [0049]-[0050]);
“sending, to the data source storing the identified field, a request to update data stored in the field to improve the quality of data stored in the field information describing the identified field,” ([0042]-[0043], [0045], [0047], [0049]-[0050]);
“receiving, from a client device, a new question,” ([0042]-[0043], [0045], [0047], [0049]-[0050]);

“processing the execution plan to generate results for the new question,” ([0042]-[0043], [0045], [0047], [0049]-[0050]); and
“sending the generated results to the client device,” ([0042]-[0043], [0045], [0047], [0049]-[0050]).
	As per claim 2, CHEN further shows “wherein the identified field is updated by transforming data of the identified field such that the quality of the identified field after transformation is better than the quality of the identified field before transformation,” ([0034]).
	As per claim 3, CHEN further shows “wherein determining quality score of a field comprises”:
“receiving at least a portion of the data of the field based on sampling of data of the field; and analyzing the quality of the received portion of the data of the field,” ([0049]-[0050]).
	As per claim 4, CHEN further shows “wherein a quality score of a field is inversely related to a percentage of nulls in the field,” ([0076]).

	As per claim 6, CHEN further shows “wherein a quality score of a field is indicative of a degree of compliance of the field with particular regulations,” ([0076]-[0078]).
	As per claim 7, CHEN further shows “wherein a quality score of a field is indicative of a degree of compliance of the field with privacy regulations,” ([0076]-[0078]).
	As per claim 8, CHEN further shows “wherein a quality score of a field is indicative of a degree of compliance with a privacy regulation for the field, wherein preprocessing of a field comprises”:
“determining a lineage of the data stored in the field, the lineage identifying one or more sources of data, wherein the data stored in the field is obtained by processing the data from the one or more sources of data,” ([0080], [0117]-[0118]);
“for each of the one or more sources of data in the lineage of the data stored in the field, determining a degree of compliance with the privacy regulation for the source of the data,” ([0080], [0117]-[0118]);
“responsive to determining that the degree of compliance with the privacy regulation for a particular source of the data is below a threshold, regenerating the data stored in the field using a different source of data instead of the particular source of data,” ([0080], [0117]-[0118]).

“determining a plurality of execution plans for the field; for each of the plurality of execution plans, determining a quality score indicative of the quality of fields processed by the execution plan; and selecting a particular execution plan having the quality score indicating the highest quality of the data processed by the execution plan,” ([0045], [0047]). 
	As per claim 10, CHEN teaches “a non-transitory computer-readable storage medium storing computer-executable instructions for executing on a computer processor, the instructions when executed by the computer processor cause the computer processor to perform steps comprising”:
“receiving a set of questions, each question from the set of questions requesting data stored in one or more data sources from a plurality of data sources,” ([0042]-[0043], [0045], [0047], [0049]-[0050]);
“for each question from the set of questions, generating an execution plan, the execution plan comprising instructions for processing data stored in one or more data sources,” ([0042]-[0043], [0045], [0047], [0049]-[0050]);
“identifying a set of fields from the plurality of data sources, the set of fields storing data processed by execution plans of the set of questions,” ([0042]-[0043], [0045], [0047], [0049]-[0050]);

“identifying a field of a data source from the set of fields having a quality score indicating that a quality of data stored in the field is below a threshold level,” ([0042]-[0043], [0045], [0047], [0049]-[0050]);
“sending, to the data source storing the identified field, a request to update data stored in the field to improve the quality of data stored in the field information describing the identified field,” ([0042]-[0043], [0045], [0047], [0049]-[0050]);
“receiving, from a client device, a new question,” ([0042]-[0043], [0045], [0047], [0049]-[0050]);
“generating an execution plan for the new question, the execution plan processing data stored in one or more fields including the identified field,” ([0042]-[0043], [0045], [0047], [0049]-[0050]);
“processing the execution plan to generate results for the new question,” ([0042]-[0043], [0045], [0047], [0049]-[0050]); and
“sending the generated results to the client device,” ([0042]-[0043], [0045], [0047], [0049]-[0050]).
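	As per claim 11, CHEN further shows “wherein the identified field is updated by transforming data of the identified field such that the quality of the identified field after transformation is better than the quality of the identified field before transformation,” ([0034]).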
	As per claim 12, CHEN further shows “wherein instructions for determining quality score of a field cause the computer processor to perform steps comprising”:
“receiving at least a portion of the data of the field based on sampling of data of the field; and analyzing the quality of the received portion of the data of the field,” ([0049]-[0050]).
	As per claim 13, CHEN further shows “wherein a quality score of a field is inversely related to a percentage of nulls in the field,” ([0076]).
	As per claim 14, CHEN further shows “wherein a quality score of a field is inversely related to a percentage of values in the field with data format errors,” ([0050], [0076]-[0078]).
	As per claim 15, CHEN further shows “wherein a quality score of a field is indicative of a degree of compliance of the field with privacy regulations,” ([0076]-[0078]).
	As per claim 16, CHEN further shows “wherein a quality score of a field is indicative of a degree of compliance with a privacy regulation for the field, wherein instructions for preprocessing of a field cause the computer processor to perform steps comprising”:

“for each of the one or more sources of data in the lineage of the data stored in the field, determining a degree of compliance with the privacy regulation for the source of the data,” ([0080], [0117]-[0118]);
“responsive to determining that the degree of compliance with the privacy regulation for a particular source of the data is below a threshold, regenerating the data stored in the field using a different source of data instead of the particular source of data,” ([0080], [0117]-[0118]).
	As per claim 17, CHEN further shows “wherein instructions for determining the execution plan for a question cause the computer processor to perform steps comprising”:
“determining a plurality of execution plans for the field; for each of the plurality of execution plans, determining a quality score indicative of the quality of fields processed by the execution plan; and selecting a particular execution plan having the quality score indicating the highest quality of the data processed by the execution plan,” ([0045], [0047]). 
	As per claim 18, CHEN teaches “a computer-implemented system comprising”:
“a computer processor,” (fig. 1); and 

“identifying a set of fields from the plurality of data sources, the set of fields storing data processed by execution plans of the set of questions; for each of the set of fields, determining a quality score for the field, the quality score indicative of a quality of data stored in the field; identifying a field of a data source from the set of fields having a quality score indicating that a quality of data stored in the field is below a threshold level; sending, to the data source storing the identified field, a request to update data stored in the field to improve the quality of data stored in the field information describing the identified field; receiving, from a client device, a new question,” ([0042]-[0043], [0045], [0047], [0049]-[0050]); 
“generating an execution plan for the new question, the execution plan processing data stored in one or more fields including the identified field; processing the execution plan to generate results for the new question; and sending the generated results to the client device,” ([0042]-[0043], [0045], [0047], [0049]-[0050]).

“determining a lineage of the data stored in the field, the lineage identifying one or more sources of data, wherein the data stored in the field is obtained by processing the data from the one or more sources of data,” ([0080], [0117]-[0118]);
“for each of the one or more sources of data in the lineage of the data stored in the field, determining a degree of compliance with the privacy regulation for the source of the data,” ([0080], [0117]-[0118]);
“responsive to determining that the degree of compliance with the privacy regulation for a particular source of the data is below a threshold, regenerating the data stored in the field using a different source of data instead of the particular source of data,” ([0080], [0117]-[0118]).
	As per claim 20, CHEN further shows “wherein instructions for determining the execution plan for a question cause the computer processor to perform steps comprising”:
“determining a plurality of execution plans for the field,” ([0045], [0047]);
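“for each of the plurality of execution plans, determining a quality score indicative of the quality of fields processed by the execution plan; and selecting a particular execution plan having the quality score indicating the highest quality of the data processed by the execution plan,” ([0045], [0047]).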



Conclusion

6.	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

Contact Information
7.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to KIM T NGUYEN whose telephone number is (571)270-1757. The examiner can normally be reached on Mon-Thurs 6-4:30pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alford Kindred, can be reached on (571)272-4037. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

Sept. 09, 2021
/KIM T NGUYEN/Primary Examiner, Art Unit 2153