Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Status of Claims
This office action is in response to the communication filed on 10/12/2021, amending claims 1, 3, 5-6, 8-10, 12-15, 17-18, and 20, and cancelling claims 4 and 11. This application was filed 05/28/2019.
Claims 1, 9, 15 and 19 have been amended by Examiner’s Amendment; claims 3, 5-6, 8, 10, 12-14, 17-18, and 20 have been cancelled by Examiner’s Amendment; claims 21-23 have been added by Examiner’s Amendment.

35 USC § 101 Analysis
While the claims may be broadly associated with the abstract idea grouping “Certain Methods of Organizing Human Activity” (following rules directed to preventing a social activity such as bias or imbalance in treatment associated with features of humans), the Examiner finds that the claims, taken in their totality, amount to more than this abstract idea. They represent improvements to data imbalance detection on a trained machine-learning (ML) model, a long-felt need as specifically elucidated in Applicant’s specification, paragraphs 1-3, and by Fair ML researchers1, specifically:
“automatically identifying two or more features of the dataset for which data imbalance detection is to be performed based at least in part on a type of the ML model and one or more other parameters; performing a statistical analysis on the dataset to determine a distribution for [...] 2 3 and augmented intelligence facilitated by the machine learning mechanisms represented by said claims.
Thus, the claims are patent eligible.

35 USC § 103 
The rejections over the closest prior art of record, Non-Patent Literature Cabrera and Grouchy (US 11,062,792), are withdrawn in view of the Allowable Subject Matter identified below.




Examiner’s Amendment
Authorization for this examiner’s amendment was given in an Examiner-Initiated Interview with Applicant Representative Azadeh Khadem on 28 February 2022.
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.

--- Claims 1, 9, 15 and 19 have been AMENDED;
Claims 3, 5-6, 8, 10, 12-14, 17-18, and 20 have been cancelled;
 Claims 21-23 have been added
by Examiner’s Amendment as Follows ---

AMENDMENT TO CLAIMS
PROPOSED CLAIM AMENDMENTS
1. (Currently Amended) A data processing system comprising: 
a processor; and
a memory in communication with the processor, the memory comprising executable instructions that, when executed by the processor, cause the data processing system to perform functions of:
	receiving a request via a selectable user interface element of a user interface screen associated with a data imbalance detection system to perform a data imbalance detection on a dataset associated with training a machine-learning (ML) model, the dataset being at least one of a training dataset used for training the ML model or an output dataset generated by the ML model;
automatically identifying two or more features of the dataset for which data imbalance detection is to be performed based at least in part on a type of the ML model and one or more other parameters;
         performing a statistical analysis on the dataset to determine a distribution for each of the two or more features across the dataset; 
         correlating the distribution of at least one of the two or more features with the distribution of another one of the two or more features;
transmitting data relating to the correlating of the distribution of at least one of the two or more features with the distribution of another one of the two or more features for visualizing the distribution of the at least one of the two or more features and the distribution of another one of the two or more features in a user interface element to help identify the data imbalance in the dataset, 
wherein:
performing the statistical analysis on the dataset to determine the distribution for each of the two or more features includes determining the distribution for each of the two or more features across one or more categories available for each of the two or more features, 
the dataset includes at least one of the training dataset, a training subset of the training dataset, a validation subset of the training dataset, and the output dataset, and
visualizing the distribution includes displaying the distribution on a visualization interface. 

2. (Original) The data processing system of claim 1, wherein the request identifies the dataset on which the data imbalance detection is to be performed.

3-6. (Canceled)

7. (Original) The data processing system of claim 1, wherein the data imbalance includes bias.

8. (Canceled)

9. (Currently Amended) A method for detecting data imbalance in a dataset associated with training a machine-learning (ML) model, the method comprising:
receiving a request via a selectable user interface element of a user interface screen associated with a data imbalance detection system to perform data imbalance detection on the dataset associated with training the ML model, the dataset being at least one of a training dataset used for training the ML model or an output dataset generated by the ML model;
        automatically identifying two or more features of the dataset for which data imbalance detection is to be performed based at least in part on a type of the ML model and one or more other parameters;
         performing a statistical analysis on the dataset to determine a distribution for each of the two or more features across the dataset; 
         correlating the distribution of at least one of the two or more features with the distribution of another one of the two or more features;
transmitting data relating to the correlating of the distribution of at least one of the two or more features with the distribution of another one of the two or more features for visualizing the distribution of the at least one of the two or more features and the distribution of another one of the two or more features in a user interface element to help identify the data imbalance in the dataset, 
wherein:
performing the statistical analysis on the dataset to determine the distribution for each of the two or more features includes determining the distribution for each of the two or more features across one or more categories available for each of the two or more features, 
the dataset includes at least one of the training dataset, a training subset of the training dataset, a validation subset of the training dataset, and the output dataset, and
visualizing the distribution includes displaying the distribution on a visualization interface. 
 	
10-14. (Canceled)

15. (Currently Amended) A non-transitory computer readable medium on which are stored instructions that, when executed cause a programmable device to: 
	receive a request via a selectable user interface element of a user interface screen associated with a data imbalance detection system to perform a data imbalance detection on a dataset associated with training a machine-learning (ML) model, the dataset being at least one of a training dataset used for training the ML model or an output dataset generated by the ML model;
     automatically identify two or more features of the dataset for which data imbalance detection is to be performed based at least in part on a type of the ML model and one or more other parameters;
         perform a statistical analysis on the dataset to determine a distribution for each of the two or more features across the dataset; 
         correlate the distribution of at least one of the two or more features with the distribution of another one of the two or more features;
	transmit data relating to the correlating of the distribution of at least one of the two or more features with the distribution of another one of the two or more features for visualizing the distribution of the at least one of the two or more features and the distribution of another one of the two or more features in a user interface element to help identify the data imbalance in the dataset, 
wherein:
performing the statistical analysis on the dataset to determine the distribution for each of the two or more features includes determining the distribution for each of the two or more features across one or more categories available for each of the two or more features, 
the dataset includes at least one of the training dataset, a training subset of the training dataset, a validation subset of the training dataset, and the output dataset, and
	visualizing the distribution includes displaying the distribution on a visualization interface. 
 	
16. (Original) The non-transitory computer readable medium of claim 15, wherein the request identifies the dataset on which the data imbalance detection is to be performed.

17. (Canceled)

18. (Canceled)

19. (Currently Amended) The non-transitory computer readable medium of claim [[18]] 15, wherein the visualization interface includes a chart.

20. (Canceled)

21. (New) The method of claim 9, wherein the request identifies the dataset on which the data imbalance detection is to be performed.

22. (New) The method of claim 9, wherein the data imbalance includes bias.

23. (New) The method of claim 9, wherein the visualization interface includes a chart.
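
For illustration only, the statistical-analysis and correlating steps recited in the independent claims can be sketched as follows. This is a minimal sketch and not the Applicant's implementation: the function names, the toy dataset, and the use of a joint (cross-tabulated) distribution as the form of "correlating" are all assumptions made for this example; the claims do not specify any particular statistic.

```python
from collections import Counter


def feature_distribution(rows, feature):
    """Return the share of rows falling in each category of one feature."""
    counts = Counter(row[feature] for row in rows)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def joint_distribution(rows, feature_a, feature_b):
    """Return the share of rows in each (category_a, category_b) pair.

    Assumption: a cross-tabulation stands in for the claimed
    "correlating" of one feature's distribution with another's.
    """
    counts = Counter((row[feature_a], row[feature_b]) for row in rows)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical toy dataset with two categorical features.
rows = [
    {"gender": "F", "age_group": "18-34"},
    {"gender": "M", "age_group": "18-34"},
    {"gender": "M", "age_group": "18-34"},
    {"gender": "M", "age_group": "35+"},
]

gender_dist = feature_distribution(rows, "gender")
joint = joint_distribution(rows, "gender", "age_group")
```

In this toy data, the per-feature distribution shows an uneven split across the hypothetical `gender` categories, and the joint distribution shows that one category pair does not occur at all, which is the kind of information a visualization interface could surface to help identify an imbalance.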

Allowable Subject Matter
Claims 1-2, 7, 9, 15-16, 19, and 21-23 are allowed.

The following is an examiner’s statement of reasons for allowance:
While the closest prior art of record, Non-Patent Literature Cabrera and Grouchy (US 11,062,792), disclose the use of machine learning to automatically identify features that may be imbalanced within a dataset, they do not teach: 
“automatically identifying two or more features of the dataset for which data imbalance detection is to be performed based at least in part on a type of the ML model and one or more other parameters; performing a statistical analysis on the dataset to determine a distribution for each of the two or more features across the dataset; correlating the distribution of at least one of [...]”
Therefore, independent claims 1, 9, and 15 are allowable, and dependent claims 2, 7, 16, 19, and 21-23 are allowable based on the same rationale as the claims from which they depend.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”


Conclusion
The prior art made of record and NOT relied upon is considered pertinent to applicant's disclosure including information well-known to one of ordinary skill in the art:

[Images media_image1.png, media_image2.png, and media_image3.png: listing of the prior art made of record]

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL EZEWOKO whose telephone number is (571)272-7850.  The examiner can normally be reached on Monday - Thursday.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MICHAEL I EZEWOKO/
Examiner, Art Unit 3682

1 Including, for example, Buolamwini, Intersectional Accuracy Disparities in Commercial Gender Classification, Proceedings of Machine Learning Research 81, 2018, pp. 1-15 [infra, conclusion section]. See also Selbst [infra, conclusion section].
2 Including Dreyfus [infra, conclusion section].
3 Including Wickham [infra, conclusion section].