DETAILED ACTION
This office action is in response to Applicant’s arguments and amendments filed on January 29, 2021. The application contains claims 1-18: 
Claims 4, 10, and 16 were cancelled previously.
Claims 1, 7, and 13 are amended.
Claims 1-3, 5-9, 11-15, 17, and 18 are pending.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on January 29, 2021 has been entered.

Response to Arguments
Applicant's arguments and amendments filed on January 29, 2021 have been fully considered and the objections and rejections are updated accordingly. 

Claim Rejections - 35 USC § 112

The limitation “select[ing] a threshold range for each data quality metric […]” in claims 1, 7, and 13, which includes the newly introduced amendments, is so unclear that it cannot be properly understood despite the examiner’s best effort.
Please refer to the updated 35 USC § 112 rejections for details.

Claim Rejections - 35 USC § 103
As discussed above, the indefiniteness of the claim language hinders understanding of the claimed subject matter, which in turn makes it impossible to determine whether or not further search for new references is necessary. Applicant is advised to amend the claims, especially the independent claims, to clearly recite each and every claim element as well as the relationships among them so as to advance this application. 
For the reasons above, the 35 USC § 103 rejections of claims 1-3, 5-9, 11-15, 17, and 18 as set forth below are maintained.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-3, 5-9, 11-15, 17, and 18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claims 1, 7, and 13 each recite the limitation "wherein the data source for each data quality metric indicates one or more data source sub-populations associated with the data source" in lines 8-10, 13-15, and 10-12, respectively. There is insufficient antecedent basis for “the data source” in the claim. Prior to the recitation of the above quoted limitation, each respective claim only associates a data quality metric with each field but never each data source. Therefore, claims 1, 7, and 13 are indefinite and rejected under 35 U.S.C. 112(b).
Claims 1, 7, and 13 each recite the limitation "wherein the plurality of threshold ranges each indicate one or more sub-population values of the data source sub-populations" in lines 11-13, 16-18, and 13-15, respectively. There are two issues with this limitation: one, there is insufficient antecedent basis for “the data source sub-populations” in the claim; two, based on the claim language, a threshold range is selected for each data quality metric, and each data quality metric is associated with each field. It is unclear what the “one or more sub-population values” refers to in the recited context. Therefore, claims 1, 7, and 13 are indefinite and rejected under 35 U.S.C. 112(b).
Claims 1, 7, and 13 each recite the limitation "wherein the threshold range is selected from the plurality of threshold ranges based on a matching score determined based on a weighted number of matchings between the one or more sub-population values indicated by the plurality of threshold ranges and the corresponding values of the one or more data source sub-populations" in lines 13-16, 18-21, and 15-18, respectively. As discussed above, it is unclear what the “one or more sub-population values” refers to. In addition, there is insufficient antecedent basis for “the corresponding values” in the claim. Consequently, it is unclear what values are being matched. Furthermore, it is unclear what “a weighted number of matchings” refers to. Therefore, claims 1, 7, and 13 are indefinite and rejected under 35 U.S.C. 112(b).

Dependent claims 2, 3, 5, and 6 are also rejected for inheriting the deficiency from independent claim 1.
Dependent claims 8, 9, 11, and 12 are also rejected for inheriting the deficiency from independent claim 7.
Dependent claims 14, 15, 17, and 18 are also rejected for inheriting the deficiency from independent claim 13.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5-9, 11-15, 17, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Marrelli et al. (US 20160070725 A1), in view of Nath et al. (US 10185728 B2).

With regard to claim 1,
Marrelli teaches
a computer-implemented method of processing data records in a multi-tenant environment to ensure data quality (Abstract), comprising: 
processing a plurality of records from a plurality of data sources to provide a plurality of data quality metrics comprising a data quality metric for each field of the plurality of records based on record values in the field (process rows/records in a set of tables from data sources 1-n to provide a data quality metric for each field/data attribute based on values in the corresponding field, Fig. 2, 140 SOURCE 1-n; [0040]; [0042]-[0043]; Fig. 4 illustrates different types of data quality metrics; Fig. 5 and 6 show data quality metrics correspond to each data attribute);
comparing the data quality metric to the threshold range to determine whether the data quality metric violates the threshold (compare data quality metrics, e.g., accuracy or completeness data quality, to thresholds to determine whether data cleansing should be performed, [0091], where data cleansing should be performed indicates “the data quality metric violates the threshold”); and 
providing a data quality report for the plurality of records, wherein the data quality report indicates whether the data quality metric of each field violates the selected threshold range (Fig. 8 illustrates a data quality report that uses color codes to indicate the status of the data quality metric, i.e., data is clean or not, for each data attribute/field, where color red indicates the data quality metric violates the selected threshold range, [0096]-[0098]).
Marrelli does not explicitly teach
selecting a threshold range for each data quality metric, wherein the threshold range is selected from a plurality of threshold ranges and satisfies a specificity level corresponding to the data source; 
Nath teaches
selecting a threshold range for each data quality metric, wherein the threshold range is selected from a plurality of threshold ranges of the plurality of data sources, wherein the threshold range indicates a range of acceptable values for the data quality metric, wherein the data source for each data quality metric indicates one or more data source sub-populations associated with the data source, wherein the one or more data source sub-populations in combination identify a specific data source, wherein the plurality of threshold ranges each indicate one or more sub-population values of the data source sub-populations, wherein the threshold range is selected from the plurality of threshold ranges based on a matching score determined based on a weighted number of matchings between the one or more sub-population values indicated by the plurality of threshold ranges and the corresponding values of the one or more data source sub-populations (a “Change Threshold” setting allows a user to select the threshold for each data quality metric for a specific variable “PMT_PDUE”, where Accuracy, Completeness, etc. are data quality metrics and “PMT_PDUE” is a specific field, Fig. 5; Col. 8, lines 36-67; Col. 9, lines 1-2. Different threshold settings imply different threshold ranges, thus “wherein the threshold range is selected from a plurality of threshold ranges” is inherently taught. As discussed in detail in the 112(b) rejections, the remainder of this limitation can’t be properly understood due to indefiniteness); 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Marrelli to incorporate the teachings of Nath to provide users with the means to select a threshold range for each data quality metric of each field. Doing so would allow users to set different thresholds for different fields with respect to each data quality metric, thus affording users more granularity in customizing data quality monitoring and any data cleansing that may ensue.
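
For illustration only, and not as a characterization of Marrelli, Nath, or Applicant's disclosure, the portions of claim 1 that can be understood (a data quality metric per field, a threshold range per metric, a comparison, and a report) may be sketched as follows. All names (null_percentage, quality_report, the sample field "BAL") and the choice of a null percentage as the metric are assumptions made for this sketch. The sub-population matching score is deliberately not modeled, because that limitation is indefinite for the reasons given in the 112(b) rejections.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, Optional[float]]
Range = Tuple[float, float]  # inclusive (low, high) of acceptable metric values


def null_percentage(records: List[Record], field: str) -> float:
    """Hypothetical data quality metric: percent of records whose field is None."""
    if not records:
        return 0.0
    nulls = sum(1 for r in records if r.get(field) is None)
    return 100.0 * nulls / len(records)


def quality_report(records: List[Record],
                   fields: List[str],
                   ranges: Dict[str, Range]) -> Dict[str, dict]:
    """Compare each field's metric to its threshold range and flag violations."""
    report = {}
    for field in fields:
        metric = null_percentage(records, field)
        low, high = ranges[field]
        # A metric outside the acceptable range "violates" the threshold range.
        report[field] = {"metric": metric, "violates": not (low <= metric <= high)}
    return report


# Sample data (invented for this sketch).
records = [
    {"PMT_PDUE": 10.0, "BAL": None},
    {"PMT_PDUE": None, "BAL": None},
    {"PMT_PDUE": 5.0, "BAL": 1.0},
    {"PMT_PDUE": 7.0, "BAL": 2.0},
]
ranges = {"PMT_PDUE": (0.0, 30.0), "BAL": (0.0, 30.0)}
report = quality_report(records, ["PMT_PDUE", "BAL"], ranges)
```

In this sketch the threshold range for each field is simply looked up in a dictionary; how a range would be chosen among several candidates is left open.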

With regard to claim 2,
As discussed regarding claim 1, Marrelli and Nath teach all the limitations. 
Marrelli further teaches
the computer-implemented method of claim 1, wherein the plurality of records is received from one or more tenants in a multi-tenant environment (Fig. 2 illustrates a multi-tenant environment including source 1-n), 
Nath further teaches 
wherein the plurality of threshold ranges is specific to the data source of each tenant (the threshold discussed in claim 1 is with respect to one data source, Fig. 5; Abstract. When incorporated in the multi-tenant environment taught by Marrelli, it would have been obvious to one of ordinary skill in the art to use different threshold ranges specific to each data source because data from different sources often has different schema and conforms to different rules).

With regard to claim 3,
As discussed regarding claim 1, Marrelli and Nath teach all the limitations. 
Marrelli further teaches
the computer-implemented method of claim 1, wherein processing the plurality of records from the plurality of data sources to provide a data quality metric further comprises: 
processing record values of each field of the plurality of records to generate a key-value-count table (maintaining various record counts with a specific attribute that is compliant/non-compliant with a particular data quality dimension for any desired scope is equivalent to “generate a key-value-count table”, [0057]-[0058]); 
Nath further teaches 
converting the key-value-count table into a histogram (bottom graph corresponds to a histogram which represents the distribution of collected data quality statistics, Fig. 6; Col. 9, lines 3-22); and 
processing the histogram to calculate data quality metrics for each field (Fig. 5 illustrates data quality metrics, e.g., Accuracy and Completeness, calculated for field PMT_PDUE based on above distribution graph; Col. 8, lines 36-67; Col. 9, lines 1-2).
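
For illustration only, the three steps recited in claim 3 (key-value-count table, histogram, metric) may be sketched as below. This is one possible reading under stated assumptions and is not taken from either reference: the function names, the sample field "age", and the use of an out-of-range-or-null percentage as the metric are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Record = Dict[str, Optional[float]]


def key_value_count(records: List[Record], fields: List[str]) -> Counter:
    """Count occurrences of each (field, value) pair across the records."""
    table: Counter = Counter()
    for record in records:
        for field in fields:
            table[(field, record.get(field))] += 1
    return table


def histogram_for(table: Counter, field: str) -> Dict[Optional[float], int]:
    """Project the key-value-count table onto one field: value -> count."""
    return {value: count for (f, value), count in table.items() if f == field}


def percent_invalid(hist: Dict[Optional[float], int], low: float, high: float) -> float:
    """Percent of values that are null or fall outside [low, high]."""
    total = sum(hist.values())
    if total == 0:
        return 0.0
    invalid = sum(count for value, count in hist.items()
                  if value is None or not (low <= value <= high))
    return 100.0 * invalid / total


# Sample data (invented for this sketch).
records = [{"age": 30}, {"age": 30}, {"age": None}, {"age": 200}]
table = key_value_count(records, ["age"])
hist = histogram_for(table, "age")
metric = percent_invalid(hist, 0, 120)
```

Here the null value and the out-of-range value 200 each count once, so two of the four values are treated as invalid.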

With regard to claim 5,
As discussed regarding claim 1, Marrelli and Nath teach all the limitations. 
Marrelli further teaches
the computer-implemented method of claim 1, wherein the data quality metric comprises a percentage of invalid record values in a field (validity percentage for each data attribute is equivalent to “a percentage of invalid record values in a field”, Fig. 5).

With regard to claim 6,
As discussed regarding claim 5, Marrelli and Nath teach all the limitations. 
Marrelli further teaches
the computer-implemented method of claim 5, wherein an invalid record value comprises one of: a record value outside of a predetermined range for the field, and a null value (completeness NOT NULL is equivalent to “a null value”, Fig. 4).

With regard to claim 7,
Marrelli teaches
a computer system for processing data records in a multi-tenant environment to ensure data quality (Abstract), the computer system comprising: 
one or more computer processors (Fig. 1, 15 processor); 
one or more computer readable storage media (Fig. 1, 35 memory); 
program instructions stored on the one or more computer readable storage media for execution by at least one of the one or more computer processors, the program instructions comprising instructions to: 
process a plurality of records from a plurality of data sources to provide a plurality of data quality metrics comprising a data quality metric for each field of the plurality of records based on record values in the field (process rows/records in a set of tables from data sources 1-n to provide a data quality metric for each field/data attribute based on values in the corresponding field, Fig. 2, 140 SOURCE 1-n; [0040]; [0042]-[0043]; Fig. 4 illustrates different types of data quality metrics; Fig. 5 and 6 show data quality metrics correspond to each data attribute); 
compare the data quality metric to the threshold range to determine whether the data quality metric violates the threshold (compare data quality metrics, e.g., accuracy or completeness data quality, to thresholds to determine whether data cleansing should be performed, [0091], where data cleansing should be performed indicates “the data quality metric violates the threshold”); and 
provide a data quality report for the plurality of records, wherein the data quality report indicates whether the data quality metric of each field violates the selected threshold range (Fig. 8 illustrates a data quality report that uses color codes to indicate the status of the data quality metric, i.e., data is clean or not, for each data attribute/field, where color red indicates the data quality metric violates the selected threshold range, [0096]-[0098]).
Marrelli does not explicitly teach
select a threshold range for each data quality metric, wherein the threshold range is selected from a plurality of threshold ranges and satisfies a specificity level corresponding to the data source; 
Nath teaches
select a threshold range for each data quality metric, wherein the threshold range is selected from a plurality of threshold ranges of the plurality of data sources, wherein the threshold range indicates a range of acceptable values for the data quality metric, wherein the data source for each data quality metric indicates one or more data source sub-populations associated with the data source, wherein the one or more data source sub-populations in combination identify a specific data source, wherein the plurality of threshold ranges each indicate one or more sub-population values of the data source sub-populations, wherein the threshold range is selected from the plurality of threshold ranges based on a matching score determined based on a weighted number of matchings between the one or more sub-population values indicated by the plurality of threshold ranges and the corresponding values of the one or more data source sub-populations (a “Change Threshold” setting allows a user to select the threshold for each data quality metric for a specific variable “PMT_PDUE”, where Accuracy, Completeness, etc. are data quality metrics and “PMT_PDUE” is a specific field, Fig. 5; Col. 8, lines 36-67; Col. 9, lines 1-2. Different threshold settings imply different threshold ranges, thus “wherein the threshold range is selected from a plurality of threshold ranges” is inherently taught. As discussed in detail in the 112(b) rejections, the remainder of this limitation can’t be properly understood due to indefiniteness); 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Marrelli to incorporate the teachings of Nath to provide users with the means to select a threshold range for each data quality metric of each field. Doing so would allow users to set different thresholds for different fields with respect to each data quality metric, thus affording users more granularity in customizing data quality monitoring and any data cleansing that may ensue.

With regard to claim 8,
As discussed regarding claim 7, Marrelli and Nath teach all the limitations. 
Marrelli further teaches
the computer system of claim 7, wherein the plurality of records is received from one or more tenants in a multi-tenant environment (Fig. 2 illustrates a multi-tenant environment including source 1-n), 
Nath further teaches 
wherein the plurality of threshold ranges is specific to the data source of each tenant (the threshold discussed in claim 1 is with respect to one data source, Fig. 5; Abstract. When incorporated in the multi-tenant environment taught by Marrelli, it would have been obvious to one of ordinary skill in the art to use different threshold ranges specific to each data source because data from different sources often has different schema and conforms to different rules).

With regard to claim 9,
As discussed regarding claim 7, Marrelli and Nath teach all the limitations. 
Marrelli further teaches
the computer system of claim 7, wherein the instructions to process the plurality of records from the plurality of data sources to provide a data quality metric further comprise instructions to: 
process record values of each field of the plurality of records to generate a key-value-count table (maintaining various record counts with a specific attribute that is compliant/non-compliant with a particular data quality dimension for any desired scope is equivalent to “generate a key-value-count table”, [0057]-[0058]); 
Nath further teaches 
convert the key-value-count table into a histogram (bottom graph corresponds to a histogram which represents the distribution of collected data quality statistics, Fig. 6; Col. 9, lines 3-22); and 
process the histogram to calculate data quality metrics for each field (Fig. 5 illustrates data quality metrics, e.g., Accuracy and Completeness, calculated for field PMT_PDUE based on above distribution graph; Col. 8, lines 36-67; Col. 9, lines 1-2).

With regard to claim 11,
As discussed regarding claim 7, Marrelli and Nath teach all the limitations. 
Marrelli further teaches
the computer system of claim 7, wherein the data quality metric comprises a percentage of invalid record values in a field (validity percentage for each data attribute is equivalent to “a percentage of invalid record values in a field”, Fig. 5).

With regard to claim 12,
As discussed regarding claim 11, Marrelli and Nath teach all the limitations. 
Marrelli further teaches
the computer system of claim 11, wherein an invalid record value comprises one of: a record value outside of a predetermined range for the field, and a null value (completeness NOT NULL is equivalent to “a null value”, Fig. 4).

With regard to claim 13,
Marrelli teaches
a computer program product for processing data records in a multi-tenant environment to ensure data quality (Abstract), the computer program product comprising one or more computer readable storage media collectively having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to: 
process a plurality of records from a plurality of data sources to provide a plurality of data quality metrics comprising a data quality metric for each field of the plurality of records based on record values in the field (process rows/records in a set of tables from data sources 1-n to provide a data quality metric for each field/data attribute based on values in the corresponding field, Fig. 2, 140 SOURCE 1-n; [0040]; [0042]-[0043]; Fig. 4 illustrates different types of data quality metrics; Fig. 5 and 6 show data quality metrics correspond to each data attribute);
compare the data quality metric to the threshold range to determine whether the data quality metric violates the threshold (compare data quality metrics, e.g., accuracy or completeness data quality, to thresholds to determine whether data cleansing should be performed, [0091], where data cleansing should be performed indicates “the data quality metric violates the threshold”); and 
provide a data quality report for the plurality of records, wherein the data quality report indicates whether the data quality metric of each field violates the selected threshold range (Fig. 8 illustrates a data quality report that uses color codes to indicate the status of the data quality metric, i.e., data is clean or not, for each data attribute/field, where color red indicates the data quality metric violates the selected threshold range, [0096]-[0098]).
Marrelli does not explicitly teach
select a threshold range for each data quality metric, wherein the threshold range is selected from a plurality of threshold ranges and satisfies a specificity level corresponding to the data source; 
Nath teaches
select a threshold range for each data quality metric, wherein the threshold range is selected from a plurality of threshold ranges of the plurality of data sources, wherein the threshold range indicates a range of acceptable values for the data quality metric, wherein the data source for each data quality metric indicates one or more data source sub-populations associated with the data source, wherein the one or more data source sub-populations in combination identify a specific data source, wherein the plurality of threshold ranges each indicate one or more sub-population values of the data source sub-populations, wherein the threshold range is selected from the plurality of threshold ranges based on a matching score determined based on a weighted number of matchings between the one or more sub-population values indicated by the plurality of threshold ranges and the corresponding values of the one or more data source sub-populations (a “Change Threshold” setting allows a user to select the threshold for each data quality metric for a specific variable “PMT_PDUE”, where Accuracy, Completeness, etc. are data quality metrics and “PMT_PDUE” is a specific field, Fig. 5; Col. 8, lines 36-67; Col. 9, lines 1-2. Different threshold settings imply different threshold ranges, thus “wherein the threshold range is selected from a plurality of threshold ranges” is inherently taught. As discussed in detail in the 112(b) rejections, the remainder of this limitation can’t be properly understood due to indefiniteness); 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Marrelli to incorporate the teachings of Nath to provide users with the means to select a threshold range for each data quality metric of each field. Doing so would allow users to set different thresholds for different fields with respect to each data quality metric, thus affording users more granularity in customizing data quality monitoring and any data cleansing that may ensue.

With regard to claim 14,
As discussed regarding claim 13, Marrelli and Nath teach all the limitations. 
Marrelli further teaches
the computer program product of claim 13, wherein the plurality of records is received from one or more tenants in a multi-tenant environment (Fig. 2 illustrates a multi-tenant environment including source 1-n), 
Nath further teaches 
wherein the plurality of threshold ranges is specific to the data source of each tenant (the threshold discussed in claim 1 is with respect to one data source, Fig. 5; Abstract. When incorporated in the multi-tenant environment taught by Marrelli, it would have been obvious to one of ordinary skill in the art to use different threshold ranges specific to each data source because data from different sources often has different schema and conforms to different rules).

With regard to claim 15,
As discussed regarding claim 13, Marrelli and Nath teach all the limitations. 
Marrelli further teaches
the computer program product of claim 13, wherein the instructions to process the plurality of records from the plurality of data sources to provide a data quality metric further comprise instructions to: 
process record values of each field of the plurality of records to generate a key-value-count table (maintaining various record counts with a specific attribute that is compliant/non-compliant with a particular data quality dimension for any desired scope is equivalent to “generate a key-value-count table”, [0057]-[0058]); 
Nath further teaches 
convert the key-value-count table into a histogram (bottom graph corresponds to a histogram which represents the distribution of collected data quality statistics, Fig. 6; Col. 9, lines 3-22); and 
process the histogram to calculate data quality metrics for each field (Fig. 5 illustrates data quality metrics, e.g., Accuracy and Completeness, calculated for field PMT_PDUE based on above distribution graph; Col. 8, lines 36-67; Col. 9, lines 1-2).

With regard to claim 17,
As discussed regarding claim 13, Marrelli and Nath teach all the limitations. 
Marrelli further teaches
the computer program product of claim 13, wherein the data quality metric comprises a percentage of invalid record values in a field (validity percentage for each data attribute is equivalent to “a percentage of invalid record values in a field”, Fig. 5).

With regard to claim 18,
As discussed regarding claim 17, Marrelli and Nath teach all the limitations. 
Marrelli further teaches
the computer program product of claim 17, wherein an invalid record value comprises one of: a record value outside of a predetermined range for the field, and a null value (completeness NOT NULL is equivalent to “a null value”, Fig. 4).

Examiner’s Note
Examiner has pointed out particular passages of the prior art of record in the body of this action for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claims, other passages and figures may apply as well. Applicant is respectfully requested, in preparing the response, to consider fully the entire references as potentially teaching all or part of the claimed invention, as well as the context of each passage as taught by the prior art or disclosed by the examiner. It is noted that any citation to specific pages, columns, figures, or lines in the prior art references, and any interpretation of the references, should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one of ordinary skill in the art.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to XIAOQIN HU whose telephone number is (571)272-1792.  The examiner can normally be reached on Monday-Friday 7:00am-3:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fred Ehichioya, can be reached at (571) 272-4034. The fax number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/XIAOQIN HU/Examiner, Art Unit 2168