DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-20 of U.S. Patent No. 11,157,479 (US Appl. No. 16/378,155).
Although the claims at issue are not identical, they are not patentably distinct from each other because both are directed to a similar invention with similar limitations, as demonstrated in the table below.
Claims 1-13 of the instant application recite limitations similar to those of claims 14-20; hence claims 1-13 are used as representative in the table below.
Similarly, claims 1-8 of U.S. Patent No. 11,157,479 recite limitations similar to those of claims 9-20; hence claims 1-8 are used as representative in the table below.

Instant Application:
1. A method, comprising:
identifying an input table comprising a plurality of entries having associated values;
generating a perturbed version of the input table by removing one or more entries from the input table;
identifying a plurality of training tables, the plurality of training tables including a set of reference tables presumed to have clean data;
3. The method of claim 2, wherein determining the first probability that the input table is drawn from the plurality of training tables includes selectively comparing entries of the input table and entries from the subset of training tables, and
wherein determining the second probability that the perturbed version of the input table is drawn from the plurality of training tables includes selectively comparing entries of the perturbed version of the input table and entries from the subset of training tables.
comparing a distribution of values from the input table to distributions of values from the plurality of training tables to determine a first probability that the input table is drawn from the plurality of training tables;
comparing a distribution of values from the perturbed version of the input table to distributions of values from the plurality of training tables to determine a second probability that the perturbed version of the input table is drawn from the plurality of training tables;
determining that the removed one or more entries from the input table contains one or more errors based on a comparison of the first probability and the second probability; and
providing, via a graphical user interface of a client device, an indication of the error in conjunction with a display of the one or more entries within the input table.

U.S. Patent No. 11,157,479:
1. A method, comprising:
receiving an input table comprising a plurality of entries, wherein each entry of the plurality of entries comprises an associated value;
removing one or more entries from the plurality of entries to generate a modified input table;
determining a first metric of similarity based on a comparison of a first distribution of values from the input table and distributions of values from a plurality of training tables, the plurality of training tables including a set of reference tables presumed to have clean data;
determining a second metric of similarity based on a comparison of a second distribution of values from the modified input table and distributions of values from the plurality of training tables:
determining a first probability that the input table is drawn from a plurality of training tables based on the first metric of similarity;
determining a second probability that the modified input table is drawn from the plurality of training tables based on the second metric of similarity;
determining that the one or more entries removed from the input table contain an error based on a comparison of the first probability and the second probability; and
providing, via a graphical user interface of a client device, an indication of the error in conjunction with a display of the one or more entries.
8. The method of claim 1, further comprising:
tagging the one or more entries of the input table; and
providing an indication of the tagging via the graphical user interface of the client device in conjunction with a presentation of the input table.

Instant Application:
2. The method of claim 1, wherein identifying the plurality of training tables includes identifying a subset of training tables from a collection of training tables based on the subset of training tables having one or more shared features with the input table.

U.S. Patent No. 11,157,479:
2. The method of claim 1, further comprising identifying the plurality of training tables by identifying a subset of training tables from a collection of training tables based on one or more shared features of the input table and the subset of training tables.

Instant Application:
4. The method of claim 2, wherein the one or more shared features include one or more of: a datatype of the plurality of entries; a number of entries from the plurality of entries; a number of rows of entries from the plurality of entries; or a value prevalence associated with values from the plurality of entries.

U.S. Patent No. 11,157,479:
3. The method of claim 2, wherein the one or more shared features comprise one or more of: a datatype of the plurality of entries; a number of entries from the plurality of entries; a number of rows of entries from the plurality of entries; or a value prevalence associated with values from the plurality of entries.

Instant Application:
5. The method of claim 1, wherein generating the perturbed version of the input table includes selectively removing the one or more entries from the plurality of entries based on the one or more entries having outlying values from other values from the plurality of values corresponding to the plurality of entries.

U.S. Patent No. 11,157,479:
4. The method of claim 1, further comprising selectively identifying the one or more entries from the plurality of entries based on outlying values for the one or more entries relative to values of additional entries from the plurality of entries.

Instant Application:
6. The method of claim 1, wherein generating the perturbed version of the input table includes:
determining a threshold quantity of entries to be removed from the input table;
selectively removing a number of entries less than or equal to the determined maximum number of entries.
7. The method of claim 6, wherein the threshold number of entries includes one or more of: a percentage of entries from the plurality of entries of the input table; or a count of entries from the plurality of entries of the input table.

U.S. Patent No. 11,157,479:
5. The method of claim 1, further comprising:
identifying a threshold perturbation value for generating the modified input table, the maximum perturbation value indicating a threshold number or a threshold percentage of entries to remove from the plurality of entries when generating the modified input table; and
selectively identifying a number of the one or more entries to remove from the plurality of entries based on the threshold perturbation value.

Instant Application:
8. The method of claim 6, wherein the threshold number of entries is based on one or more of a total number of entries within the input table, a number of rows of the input table, or a datatype of entries within one or more select columns of the input table.
9. The method of claim 1, wherein generating the perturbed version of the input table includes applying a minimization model to the input table to identify the one or more entries based on a threshold expected ratio between the first probability and the second probability.
10. The method of claim 1, wherein generating the perturbed version of the input table includes applying a likelihood ratio minimization model over a plurality of subsets of the input table trained to identify a predetermined number of numeric outliers that, when removed from the input table, results in the perturbed version of the input table predicted to minimize a ratio between the first probability and the second probability.

U.S. Patent No. 11,157,479:
6. The method of claim 1, further comprising identifying the one or more entries by applying a minimization model to the input table, wherein the minimization model identifies the one or more entries based on a threshold expected ratio between the first probability and the second probability.

Instant Application:
11. The method of claim 1, wherein generating the perturbed version of the input table includes a likelihood ratio minimization model over a plurality of subsets of the input table trained to identify a predetermined number of text-based entries based on a threshold pair-wise edit distance between the predetermined number of text-based entries that, when removed from the input table, results in the perturbed version of the input table predicted to minimize a ratio between the first probability and the second probability.
12. The method of claim 1, wherein generating the perturbed version of the input table includes applying a likelihood ratio minimization model over a plurality of subsets of the input table trained to identify a predetermined number of uniqueness violation entries based on a uniqueness ratio-functions applied to a column of values from the input table that, when removed from the input table, results in the perturbed version of the input table predicted to minimize a ratio between the first probability and the second probability.
13. The method of claim 1, wherein generating the perturbed version of the input table includes applying a likelihood ratio minimization model over a plurality of subsets of the input table trained to identify a predetermined number of functional dependency (FD) violation entries based on an FD-compliance ratio function applied to multiple columns from the input table that, when removed from the input table, results in the perturbed version of the input table predicted to minimize a ratio between the first probability and the second probability.

U.S. Patent No. 11,157,479:
7. The method of claim 1, wherein determining that the one or more entries removed from the input table contain the error comprises: calculating a ratio between the first probability and the second probability; and determining that the one or more entries contain the error based on the calculated ratio.


As demonstrated by the mappings in the table above, U.S. Patent No. 11,157,479 discloses or renders obvious all the features of the claims of the instant application.
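
For illustration only, the step common to claim 1 of both claim sets (removing entries, determining a first and a second probability, and comparing the two) might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's or patentee's implementation: the summary statistic `spread_score`, the smoothed empirical probability in `probability_drawn_from`, and the `ratio_threshold` value of 0.5 are hypothetical choices that the claim language quoted above does not specify. The graphical user interface step is omitted.

```python
from statistics import median


def spread_score(values):
    """Hypothetical summary of a column's distribution:
    largest deviation from the median, scaled by the median absolute deviation."""
    med = median(values)
    mad = median(abs(v - med) for v in values) or 1.0  # avoid division by zero
    return max(abs(v - med) for v in values) / mad


def probability_drawn_from(values, training_tables):
    """Assumed stand-in for the 'probability that the table is drawn from the
    training tables': smoothed fraction of training tables whose score is at
    least as extreme as the score of `values`."""
    score = spread_score(values)
    at_least = sum(1 for table in training_tables if spread_score(table) >= score)
    return (at_least + 1) / (len(training_tables) + 2)


def flag_removed_entries(input_table, removed_indices, training_tables,
                         ratio_threshold=0.5):
    """Remove the given entries, compute both probabilities, and compare them.
    Returns (flagged, first_probability, second_probability)."""
    perturbed = [v for i, v in enumerate(input_table) if i not in removed_indices]
    p_first = probability_drawn_from(input_table, training_tables)
    p_second = probability_drawn_from(perturbed, training_tables)
    # A small ratio means the table looks far more typical without the entries.
    return (p_first / p_second) < ratio_threshold, p_first, p_second


# Hypothetical data: four "clean" reference columns and one column with a suspect value.
training_tables = [
    [10, 11, 12, 13, 14],
    [20, 22, 21, 23, 24],
    [5, 6, 7, 8, 9],
    [100, 101, 99, 102, 98],
]
input_table = [10, 12, 11, 13, 1200]

flagged, p_first, p_second = flag_removed_entries(input_table, {4}, training_tables)
control_flagged, _, _ = flag_removed_entries(input_table, {0}, training_tables)
```

In this sketch, removing the entry at index 4 makes the column look much more like the reference columns, so that entry is flagged; removing the entry at index 0 changes little, so it is not.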

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michelle Owyang whose telephone number is (571)270-1254. The examiner can normally be reached Monday-Friday, 8am-6pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, FRED EHICHIOYA, can be reached at (571)272-4034. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MICHELLE N OWYANG/
Primary Examiner, Art Unit 2168