Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Status of Claims
This office action is in response to the communication filed on 11/18/2021, amending claims 1, 5, 8-9, 12, 14-15, 18, and 20, and cancelling claims 3-4, 11, and 17. This application was filed 05/28/2019.
Claims 1, 8-9, 14-15 and 20 have been amended, and claims 5, 7, 10, 12, 16, and 18 have been cancelled, by Examiner’s Amendment.

35 USC § 101 Analysis
While the claims may be broadly associated with the abstract idea grouping “Certain Methods of Organizing Human Activity”, in that they relate to following rules for preventing a social harm such as bias or imbalance in treatment associated with features of humans, the Examiner finds that the claims, taken in their totality, amount to more than this abstract idea. They represent improvements to data imbalance detection on a trained machine-learning (ML) model, a long-felt need as elucidated in Applicant’s specification, paragraphs 1-3, and by ML researchers1, specifically:
“automatically determining a desired distribution threshold for the feature; examining at least one of the dataset or outcome data generated by the trained ML model to calculate a distribution of the feature in the dataset or a distribution of the feature in the outcome data; …”2 and augmented intelligence facilitated by the machine learning mechanisms represented by said claims.
Thus, the claims are patent eligible.

35 USC § 103 
The closest prior art of record, Non-Patent Literature Adebayo, Silberman (US 10,861,028), and Non-Patent Literature Cabrera, is withdrawn from consideration in view of the Allowable Subject Matter.

Examiner’s Amendment
Authorization for this examiner’s amendment was given in an Examiner-Initiated Interview with Applicant Representative Azadeh Khadem on 28 February 2022.
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.

--- Claims 1, 8-9, 14-15 and 20 have been AMENDED;
Claims 5, 7, 10, 12, 16, 18 have been cancelled 
by Examiner’s Amendment as Follows ---

AMENDMENT TO CLAIMS
PROPOSED CLAIM AMENDMENTS
1. (Currently Amended) A data processing system comprising:
a processor; and
a memory in communication with the processor, the memory comprising executable instructions that, when executed by the processor cause the data processing system to perform functions of:
receiving a request to perform data imbalance detection on a trained machine-learning (ML) model;
providing, as an input, a dataset associated with the trained model to a trained feature identifier ML model for automatically identifying a feature for which data imbalance detection is to be performed, the trained feature identifier ML model being trained for automatically identifying the feature based on a content of the dataset and an objective of the trained ML model; 
receiving the feature as an output of the trained feature identifier ML model;
receiving access to the dataset;
receiving access to the trained ML model;
automatically determining a desired distribution threshold for the feature;
examining at least one of the dataset or outcome data generated by the trained ML model to calculate a distribution of the feature in the dataset or a distribution of the feature in the outcome data; 
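comparing the calculated distribution with the desired distribution threshold; [[and]]
determining that the trained ML model exhibits data imbalance when the calculated distribution is outside a range of the desired distribution threshold, and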
providing a report of the data imbalance,
wherein:
the dataset is different from an original dataset that was used to train the trained ML model, 
examining the dataset to calculate a distribution of the feature includes performing a statistical analysis on the dataset to determine a percentage distribution of the feature across one or more categories available for the feature. 

2-5. (Canceled) 

6. (Original) The data processing system of claim 1, wherein examining the outcome data generated by the trained model includes running the trained ML model with the dataset as input to generate the outcome data.

7. (Canceled)

8. (Currently Amended) The data processing system of claim 1, wherein the executable instructions when executed by the processor further cause the data processing system to provide a report identifying that the trained ML model is validated for being low-biased or unbiased, when the calculated distribution is within the range of the desired distribution threshold.

9. (Currently Amended) A method for providing data imbalance detection and validation for a trained machine-learning (ML) model, the method comprising:
receiving a request to perform data imbalance detection on the trained ML model;
     providing, as an input, a dataset associated with the trained model to a trained feature identifier ML model for automatically identifying a feature for which data imbalance detection is to be performed, the trained feature identifier ML model being trained for automatically identifying the feature based on a content of the dataset and an objective of the trained ML model; 
receiving the feature as an output of the trained feature identifier ML model;
receiving access to the dataset;
receiving access to the trained ML model;
automatically determining a desired distribution threshold for the feature;
examining at least one of the dataset or outcome data generated by the trained ML model to calculate a distribution of the feature in the dataset or a distribution of the feature in the outcome data; 
comparing the calculated distribution with the desired distribution threshold; and
determining that the trained ML model exhibits data imbalance when the calculated distribution is outside a range of the desired distribution threshold, and
providing a report of the data imbalance,
wherein:
the dataset is different from an original dataset that was used to train the trained ML model, 
examining the dataset to calculate a distribution of the feature includes performing a statistical analysis on the dataset to determine a percentage distribution of the feature across one or more categories available for the feature.  
	
10-12. (Canceled)

13. (Original) The method of claim 9, wherein examining the outcome data generated by the trained model includes running the trained ML model with the dataset as input to generate the outcome data.

14. (Currently Amended) The method of claim 9, wherein the executable instructions when executed by the processor further cause the data processing system to provide a report identifying that the trained ML model is validated for being low-biased or unbiased, when the calculated distribution is within the range of the desired distribution threshold.

15. (Currently Amended) A non-transitory computer readable medium on which are stored instructions that, when executed cause a programmable device to:
receive a request to perform data imbalance detection on a trained machine-learning (ML) model;
provide, as an input, a dataset associated with the trained model to a trained feature identifier ML model for automatically identifying a feature for which data imbalance detection is to be performed, the trained feature identifier ML model being trained for automatically identifying the feature based on a content of the dataset and an objective of the trained ML model; 
receive the feature as an output of the trained feature identifier ML model;
receive access to the dataset;
receive access to the trained ML model;
automatically determine a desired distribution threshold for the feature;
examine at least one of the dataset or outcome data generated by the trained ML model to calculate a distribution of the feature in the dataset or a distribution of the feature in the outcome data; 
compare the calculated distribution with the desired distribution threshold; and
determine that the trained ML model exhibits data imbalance when the calculated distribution is outside a range of the desired distribution threshold, and
provide a report of the data imbalance,
wherein:
the dataset is different from an original dataset that was used to train the trained ML model, 
examining the dataset to calculate a distribution of the feature includes performing a statistical analysis on the dataset to determine a percentage distribution of the feature across one or more categories available for the feature. 
 	
16-18. (Canceled) 

19. (Original) The non-transitory computer readable medium of claim 15, wherein examining the outcome data generated by the trained model includes running the trained ML model with the dataset as input to generate the outcome data.

20. (Currently Amended) The non-transitory computer readable medium of claim 15, wherein the stored instructions when executed further cause a programmable device to provide a report identifying that the trained ML model is validated for being low-biased or unbiased, when the calculated distribution is within the range of the desired distribution threshold.
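
The distribution check recited in independent claims 1, 9, and 15 can be sketched as follows. This is a minimal illustration only and is not Applicant's implementation: the function names, the equal-share rule used to derive the desired distribution threshold, and the 10-point tolerance are all assumptions, since the claims do not specify how the threshold is determined. The trained feature identifier ML model is not modeled here; the feature values are simply passed in.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Tuple


def percentage_distribution(values: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Percentage share of each observed category of the feature."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot compute a distribution over an empty dataset")
    return {category: 100.0 * count / total for category, count in counts.items()}


def equal_share_range(num_categories: int, tolerance: float) -> Tuple[float, float]:
    """Hypothetical threshold rule (an assumption, not from the claims):
    every category should hold an equal share, plus or minus `tolerance` points."""
    target = 100.0 / num_categories
    return (max(0.0, target - tolerance), min(100.0, target + tolerance))


def imbalance_report(values: Iterable[Hashable], tolerance: float = 10.0) -> dict:
    """Compare the calculated distribution with the desired range and report."""
    distribution = percentage_distribution(values)
    low, high = equal_share_range(len(distribution), tolerance)
    # Categories whose share falls outside the desired range indicate imbalance.
    outside = {c: p for c, p in distribution.items() if not low <= p <= high}
    return {
        "distribution": distribution,
        "desired_range": (low, high),
        "imbalanced": bool(outside),
        "categories_outside_range": outside,
    }


if __name__ == "__main__":
    # Hypothetical dataset with a single feature named "group".
    dataset = [{"group": "A"}] * 8 + [{"group": "B"}] * 2
    print(imbalance_report(row["group"] for row in dataset))
```

In this sketch only categories that actually occur in the data are counted, so a category that is available for the feature but absent from the dataset would need to be supplied separately. The same function could be applied to outcome data by first running the trained ML model on the dataset and passing the feature values of the resulting records.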


Allowable Subject Matter
Claims 1, 6, 8-9, 13-15 and 19-20 are allowed.

The following is an examiner’s statement of reasons for allowance:
While the closest prior art of record, Non-Patent Literature Adebayo, Silberman (US 10,861,028), and Non-Patent Literature Cabrera, disclose the use of machine learning to automatically identify features that may be imbalanced in a dataset, they do not teach:
“automatically determining a desired distribution threshold for the feature; examining at least one of the dataset or outcome data generated by the trained ML model to calculate a distribution of the feature in the dataset or a distribution of the feature in the outcome data; comparing the calculated distribution with the desired distribution threshold; [[and]] determining that the trained ML model exhibits data imbalance when the calculated distribution is outside a range of the desired distribution threshold, and providing a report of the data imbalance, wherein: the dataset is different from an original dataset that was used to train the trained ML model, examining the dataset to calculate a distribution of the feature includes performing a statistical analysis on the dataset to determine a percentage distribution of the feature across one or more categories available for the feature”, after having “identif[ied] a feature for which data imbalance detection is to be performed …”, in the context of the claim when considered as a whole. These uniquely distinct features render claim 1 allowable.
Therefore, independent claims 1, 9, and 15 are allowable, and dependent claims 6, 8, 13-14 and 19-20 are allowable based on the same rationale as the claims from which they depend.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Conclusion
The prior art made of record and NOT relied upon is considered pertinent to applicant's disclosure including information well-known to one of ordinary skill in the art:

[Image: media_image1.png]


Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL EZEWOKO whose telephone number is (571)272-7850.  The examiner can normally be reached on Monday - Thursday.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Waseem Ashraf, can be reached on (571) 270-3948.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MICHAEL I EZEWOKO/
Examiner, Art Unit 3682

1 Including, for example, Buolamwini, Intersectional Accuracy Disparities in Commercial Gender Classification, Proceedings of Machine Learning Research 81, 2018, pp. 1-15 [infra, conclusion section]. See also Selbst [infra, conclusion section].
2 Including Dreyfust [infra, conclusion section].