DETAILED ACTION
This is the response to applicant’s amendment action regarding application number 15/815,899, filed November 17, 2017.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on November 12, 2021 has been entered. 

Response to Amendments
The amendment filed November 12, 2021 has been entered. Examiner acknowledges receipt of Amendments to Application 15/815,899, which include: Amendments to the Claims pp.2-10, and Remarks pp.11-16 (containing applicant’s amendments). 
Regarding applicant’s Remarks on p.11, examiner acknowledges that Claims 1-2, 6, 8-9, 13, 15-16, and 20 have been amended. Claims 1-20 remain pending in the application. However, examiner notes that the amended claims have introduced new claim objections.
Regarding applicant’s Remarks on p.11, examiner has acknowledged applicant’s Amendments to the Claims have resolved the indefinite issues identified in independent Claims 1, 8, and 15 (and inherited in the respective dependent claims), and therefore the respective §112(b) rejections previously set forth in the Final Office Action mailed August 11, 2021 for Claims 1-20 are withdrawn. 

Response to Arguments
Examiner acknowledges receipt of Arguments to Application 15/815,899, which include: Remarks pp.11-16 (containing applicant’s arguments). 
Regarding applicant’s Remarks on pp.12-16 addressing the rejections of Claims 1, 5, 8, 12, 15, and 19 under 35 U.S.C. 103 as being unpatentable over Chatterjee et al., U.S. Patent 10,824,959, filed 2/16/2016, in view of Fidelis et al., Discovering Comprehensible Classification Rules with a Genetic Algorithm, Proceedings of the 2000 Congress on Evolutionary Computation CEC00 (Cat.No.00TH8512), IEEE, July 16-19 2000, pp.805-810 [henceforth referred to as Fidelis]; of Claims 2-3, 6-7, 9-10, 13-14, 16-17, and 20 under 35 U.S.C. 103 as being unpatentable over Chatterjee in view of Fidelis as applied to Claims 1, 8, and 15, in further view of Castellanos et al., U.S. PGPUB 2012/0089620, published 4/12/2012 [henceforth referred to as Castellanos]; and of Claims 4, 11, and 18 under 35 U.S.C. 103 as being unpatentable over Chatterjee in view of Fidelis as applied to Claims 1, 8, and 15, in further view of Cheng et al., U.S. PGPUB 2006/0059112, published 3/16/2006 [henceforth referred to as Cheng], examiner has considered applicant’s arguments and has found them to be not persuasive. Examiner notes that applicant’s amendments to the claims necessitate further examination and re-evaluation of the amended and related original claims. The updated claim mappings according to the applicant’s amended claims are provided in the sections indicated below.
Regarding applicant’s Remarks on pp.13-14,
“Chatterjee, col. 3, ll. 34-45, discloses that "a client of a machine learning service may submit a request to generate a classification model for a specified data set or data source (a source from which various observation records may be collected by the service and used for training the requested model). Each observation record may contain one or more input variables and at least one output or "target" variable (the variable for which the model is make [sic] predictions). The target variable may typically comprise a "class" variable, which can take on one of a discrete set of values representative of respective sub-groups or classes of the observation records."
As an example of such "class" variables, Chatterjee, col. 13, ll. 1-5, discloses that "the classification problem to be solved is to predict the favorite sport of an individual, for example from a small set of sports such as the set comprising soccer, baseball, cricket and tennis." Chatterjee, col. 13, ll. 15-22, discloses that "[a] classifier may be trained using the observation records of training set 401, and may produce a predicted favorite sport for each training set record. Any desired classification algorithm may be selected in various embodiments; for the purposes of illustrating the techniques for explaining classifier predictions at a high level, the specific classification algorithm used does not matter. The predictions made for Rec0-Rec3 by the trained classifier are shown in table 411." In FIG. 4, Chatterjee discloses that Rec0-Rec3 are "baseball," "cricket," "soccer," and "soccer," respectively. Thus, the classification algorithm disclosed by Chatterjee merely produces a discrete set of values (e.g., "baseball," "cricket," etc.) as the output and does no further classification beyond that.
However, Chatterjee is silent with respect to using a machine learning model to output a set of probabilities, and further classifying instances into one of the output classes "based on the set of probabilities" output from the machine learning model. In particular, Chatterjee fails to disclose or suggest "applying ... each instance and at least one perturbation of the respective instance to a machine learning model having a function that takes each instance and each perturbation of the respective instance to obtain, from an output of the machine learning model, a set of probabilities that each feature of the respective instance belongs to each of the output classes; [and] classifying ... each instance into one of the output classes for which, based on the set of probabilities, the probability that each feature of the respective instance belongs to the respective output class is highest," as now claimed (emphasis added).”
Examiner has considered the arguments, and has found them to be not persuasive. As indicated in the Final Office Action mailed August 11, 2021, Chatterjee was used to teach “a contribution of each feature of the respective instance to at least one of the output classes”, which was recited earlier in the independent claims. Under its broadest reasonable interpretation, the term “a contribution” was interpreted to cover all possibilities where a contribution is associated with an output class (including the output class itself), and thus was not limited to a probability (or a set of probabilities). Examiner notes that applicant has amended the term “a contribution” to indicate “a set of probabilities” in amended independent Claim 1 (or “a probability” as indicated in amended independent Claims 8 and 15), and has added a new classification limitation involving the probability and the output class, such that “the probability that each feature of the respective instance belongs to the respective class is highest”, which necessitates further examination and re-evaluation of the amended and related original claims. The updated claim mappings according to the applicant’s amended claims are provided in the sections indicated below.
Regarding applicant’s Remarks on pp.14-15,
“On page 19 of the Office Action, the Examiner correctly notes that Chatterjee fails to disclose, among other things, "a genetic algorithm." Both Fidelis and Castellanos are cited for allegedly disclosing a genetic algorithm.
Fidelis, p. 807, col. 1, sec. 3.3, 5th paragraph discloses, "Each run of our GA solves a two-class classification problem, where the goal is to predict whether or not the patient has a given disease. Therefore, the GA is run at least once for each class (value of the goal attribute). . . . In the first run the GA would search for rules predicting class 1; in the second run it would search for rules predicting class 2, and so on. When the GA is searching for rules predicting a given class, all other classes are effectively merged into a large class, which can be conceptually thought of as meaning that the patient does not have the disease predicted by the rule." Fidelis is silent with respect to using a machine learning model to output a set of probabilities, and further classifying instances into one of the output classes "based on the set of probabilities" output from the machine learning model.
Castellanos, para. [0016] discloses, " ... patterns of prefixes and suffixes are used as indicators of the occurrence of an instance of the target entity ... Although the approach uses rules, partial matching of patterns is allowed. Thus, a confidence value can be computed from a fitness function, providing a measure of the degree of match. A search space given by different combinations of prefixes and suffixes may be very large, leading to a large combinatorial problem. Genetic algorithms have proven to be useful for this kind of problem, because the inherent randomization introduced in each generation allows the algorithm to explore different regions of the variable space, making it resilient to getting caught in local minima .... a genetic algorithm may be used to learn the pattern part of the rules .... " Castellanos is silent with respect to using a machine learning model to output a set of probabilities, and further classifying instances into one of the output classes "based on the set of probabilities" output from the machine learning model.
Specifically, neither Fidelis nor Castellanos discloses or suggests "applying ... each instance and at least one perturbation of the respective instance to a machine learning model having a function that takes each instance and each perturbation of the respective instance to obtain, from an output of the machine learning model, a set of probabilities that each feature of the respective instance belongs to each of the output classes; [and] classifying, by the processor-based system, each instance into one of the output classes for which, based on the set of probabilities, the probability that each feature of the respective instance belongs to the respective output class is highest," as now claimed. Therefore, Fidelis and Castellanos fail to cure the deficiencies of Chatterjee discussed above.”
Examiner has considered the above arguments, and has found them to be not persuasive. As indicated in the Final Office Action mailed August 11, 2021, the Fidelis reference was used to teach the following claim limitations recited in the independent claims: “… thereby producing, for each respective instance, a set of instance level conditions represented by a string of bits, wherein each instance level condition is represented by a bit in the string of bits, the bit representing a range of values for each feature of the respective instance having a greatest contribution to the output class; applying … each string of bits of the instance level conditions for each of the corresponding instances to a genetic algorithm to produce a set of class level rules, each class level rule representing a logical conditional statement that, when the statement holds true for the string of bits representing the set of instance level conditions of one or more instances of a particular class, predicts that the respective instances are members of the particular class; …”, which are different claim limitations than those recited in applicant’s argument shown above (where the applicant’s arguments are directed to the newly amended claim limitations present in the newly amended independent claims). Examiner further notes that the earlier recited claims from the Final Office Action have been amended to address the 112(b) indefiniteness rejections such that it necessitates further examination and re-evaluation of the amended and related original claims. The updated claim mappings according to the applicant’s amended claims are provided in the sections indicated below.
As indicated in the Final Office Action mailed August 11, 2021, the Castellanos reference was used to teach the limitation “… a fitness score … based on a harmonic mean” recited in the dependent claims, which is a different claim limitation from the ones recited by the applicant (which were recited in the earlier independent claims). Examiner notes that the applicant’s arguments shown above are directed to the newly amended claim limitations present in the newly amended independent claims, which introduce new limitations such that it necessitates further examination and re-evaluation of the amended and related original claims. The updated claim mappings according to the applicant’s amended claims are provided in the sections indicated below.
Regarding applicant’s Remarks on p.15,
“On page 50 of the Office Action, the Examiner correctly notes that Chatterjee in view of Fidelis fails to disclose entropy-based binning. Cheng is cited for allegedly disclosing entropy-based binning, but nevertheless fails to cure the deficiencies of Chatterjee, Fidelis and Castellanos discussed above.”
Examiner has considered the above arguments, and has found them to be not persuasive. 
As indicated in the Final Office Action mailed August 11, 2021, the Cheng reference was used to teach the limitation “… convert the numerical feature into a categorical feature using entropy based binning” recited in the dependent claims, which are different claim limitations from the ones inferred by the applicant’s arguments. Examiner notes that the applicant’s arguments shown above are directed to the newly amended claim limitations present in the newly amended independent claims, which introduce new limitations such that it necessitates further examination and re-evaluation of the amended and related original claims. The updated claim mappings according to the applicant’s amended claims are provided in the sections indicated below.

Claim Objections
Claims 1, 8, and 15 are objected to because of the following informalities:
Claim 1 recites the term “a machine learning model” in its preamble: “A computer-implemented method of interpreting a machine learning model, the method comprising:…”. However, Claim 1 also contains the same term “a machine learning model” in the following amended limitation: “applying … each instance and at least one perturbation of the respective instance to a machine learning model having a function that takes each instance and each perturbation of the respective instance to obtain, from an output of the machine learning model, a set of probabilities that each feature of the respective instance belongs to each of the output classes;”, where the term “a machine learning model” in the indicated amended claim limitation appears to be referencing the same machine learning model as recited in the preamble, and thus should be corrected to “the machine learning model”. Appropriate correction is required.
Similar to Claim 1, Claims 8 and 15 also recite the term “a machine learning model” in their respective preambles, and recite variations of the amended claim limitation identified in Claim 1, where the term “a machine learning model” in the indicated amended claim limitation appears to be referencing the same machine learning model as recited in the preamble, and thus should be corrected as “the machine learning model”. Appropriate correction is required.  

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
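This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.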
Claims 1, 5, 8, 12, 15, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Hodjat et al., U.S. PGPUB 2017/0293849, published 10/12/2017 [hereafter referred to as Hodjat] in view of Sepahvand et al., Generating Graphical Chain by Mutual Matching of Bayesian Network and Extracted Rules of Bayesian Network Using Genetic Algorithm, arXiv:1412.4465v1, December 15 2014, 6 pages [hereafter referred to as Sepahvand], in further view of Fidelis et al., Discovering Comprehensible Classification Rules with a Genetic Algorithm, Proceedings of the 2000 Congress on Evolutionary Computation CEC00 (Cat.No.00TH8512), IEEE, July 16-19 2000, pp.805-810 [henceforth referred to as Fidelis], in even further view of Chatterjee et al., U.S. Patent 10,824,959, filed 2/16/2016 [hereafter referred to as Chatterjee].
Regarding amended Claim 1, Hodjat teaches
(Currently amended) A computer-implemented method of interpreting a machine learning model, the method comprising: 
receiving, by a processor-based system, a set of training data and a set of output classes for classifying a plurality of instances of the set of training data, each instance representing at least one feature of the set of training data (Examiner’s note: Hodjat teaches a data mining system for evolving rulesets using an evolutionary algorithm, where the training data for the data mining system is collected from an environment generating a large amount of data over a period of time for the purposes of extracting useful knowledge and patterns, where the training portion of the system interacts with a database containing a pool of candidate individuals, and where an individual is represented as a plurality of rules, each rule entry contains a plurality of conditions, and each condition is expressed as a relationship between a feature attribute and its corresponding value (Hodjat [0005]-[0007]; Figure 8; [0051] and [0086]-[0087]; see also Figure 3 and [0055]-[0057]). Hodjat further teaches that this system … (Hodjat Figure 10, element 1014; and [0116]).);
applying, by the processor-based system, each instance and at least one perturbation of the respective instance to [[a]]the machine learning model having a function that takes each instance and each perturbation of the respective instance to obtain, from an output of the machine learning model, a set of probabilities (Examiner’s note: Under its broadest reasonable interpretation, the term “at least one perturbation of the respective instance” is interpreted as any variation of the respective instance, such as a change within a feature value of an instance, or the presence of similar instances with different output results. As indicated earlier, Hodjat Figure 8; [0051] and [0086]-[0087] teaches a pool of candidate individuals for training, where an individual is represented as a plurality of rules, each rule entry contains a plurality of conditions, and each condition is expressed as a relationship between a feature attribute and its corresponding value. Hodjat further teaches that each rule entry has a corresponding rule-level probability (RLP) that represents the probability of membership in a class, where this rule-level probability can represent an aggregated value (average, minimum, or maximum) of all condition level certainty values, where the conditions under aggregation represent different variations of the same condition (“perturbations of the respective instance”), and where this aggregation is performed by a probability aggregator present in the training/production portions of the system (Hodjat [0055]: “… The rule-level probability 310 indicates the probability that membership in the class exists when the conditions of this rule are satisfied….”; and [0061]-[0067]; in particular [0061]: “The condition-level certainty values for input data applied to a rule 306 are aggregated to determine the rule-level certainty value. In one embodiment, the certainty aggregation function can be an average of the all the condition-level certainty values. For example, if the condition-level certainty values for three conditions are 0.2, 0.4, and 0.6 respectively, the rule-level certainty value may be 0.4. In one embodiment, the rule-level certainty value will be the minimum value … In another embodiment, the rule-level certainty value will be the maximum value …” and [0076]-[0078]: “FIG. 5 is a method of operation of a probability aggregator 406 in either the training system [o]r the production system … a probability aggregator 406 determines the probability output of an individual … for a given data point … An individual or a ruleset is also received in block 402 providing one or more rules, each of the rules having one or more conditions and an indication of a rule-level probability of membership in a predetermined class.”).) …
classifying, by the processor-based system, each instance into one of the output classes (Examiner’s note: Hodjat teaches a set of rules classifying a patient’s current state based on current and past state based on a set of conditions, where the current state and past state represent different output classes, where in the example provided, a current state and past state for a rule entry based on measuring blood pressure and pulse conditions can represent a high blood pressure related event and a normal blood pressure related event, respectively (Hodjat [0062]-[0068]).) …
producing, for each respective instance, a set of instance level conditions, each representing … each feature of the respective instance in the output class where the instance is classified (Examiner’s note: As indicated earlier, Hodjat teaches a training portion of the system interacting with a database containing a pool of candidate individuals, where an individual is represented as a plurality of rules, each rule entry contains a plurality of conditions, and each condition is expressed as a relationship between a feature attribute and its corresponding value (i.e., feature/value pairs) (Hodjat Figure 8; [0097]-[0098]; see also Figure 3 and [0055]-[0057]).); 
… applying, by the processor-based system, the instance level conditions for each of the corresponding instances to a genetic algorithm to produce a set of class level rules, each class level rule representing a logical conditional statement that predicts that the respective instances are members of the particular class (Examiner’s note: As indicated earlier, Hodjat teaches a system for evolving rulesets using an evolutionary algorithm, where the training portion of the system interacts with a database containing a pool of candidate individuals, where an individual is represented as a plurality of rules, each rule entry contains a plurality of conditions, with each rule expressed as an IF-THEN relationship of conditions containing a feature attribute, its corresponding value, a threshold and a rule-level probability (RLP) that corresponds to a probability associated with an output class (Hodjat Figure 3; [0055]-[0057]; and [0062]-[0068]). Hodjat further teaches that this pool of candidate individuals is provided as input into a procreation module, where the procreation involves identifying parent individuals, performing mutation and crossover operations to create child individuals, and choosing the best ones based on a fitness estimate over multiple iterations to finally determine a set of individuals with the best fitness score at the end of the training session (Hodjat Figure 8; [0051] and Figure 6, elements 606, 116, 608; [0086]-[0087]; and [0090]-[0092]).); and
using at least a portion of the set of class level rules (Examiner’s note: In light of applicant’s specification paragraph [0018], this claim limitation is interpreted as occurring after producing the set of class level rules, in the use case where these rules are provided to a user as an explanation for further processing. As indicated earlier, Hodjat teaches a training portion of the system involving the procreation module being invoked for multiple iterations, where for each iteration, new individuals created by combination and/or mutation are placed in the pool of candidate individuals to be chosen as new parents for successive combinations and/or mutations, and undergo further fitness evaluations through the competition module (Hodjat Figure 6, elements 606, 116, 608 and [0090]-[0092]). Once the best individuals are identified, they are provided into a production portion of the system where they are used for determining a recommendation through a decision/action system, which outputs a recommendation for a human to perform an action (Hodjat [0070]).) …  
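For illustration only, the certainty aggregation quoted above from Hodjat [0061] can be sketched as follows. This is a minimal sketch and not Hodjat's implementation: the function name aggregate_certainty and the mode labels are hypothetical, and only the three aggregation options (average, minimum, maximum) and the example values 0.2, 0.4, and 0.6 are taken from the quoted passage.

```python
from statistics import fmean


def aggregate_certainty(condition_certainties, mode="average"):
    """Combine condition-level certainty values into one rule-level value.

    Hypothetical helper: the name and the `mode` labels are assumptions;
    the three options mirror the embodiments quoted from Hodjat [0061].
    """
    if not condition_certainties:
        raise ValueError("at least one condition-level certainty is required")
    if mode == "average":
        return fmean(condition_certainties)
    if mode == "minimum":
        return min(condition_certainties)
    if mode == "maximum":
        return max(condition_certainties)
    raise ValueError(f"unknown aggregation mode: {mode}")


# Example values taken from the quoted passage.
certainties = [0.2, 0.4, 0.6]
average = aggregate_certainty(certainties)             # approximately 0.4
lowest = aggregate_certainty(certainties, "minimum")   # 0.2
highest = aggregate_certainty(certainties, "maximum")  # 0.6
```

The averaged value agrees with the rule-level certainty of 0.4 given in the quoted example, up to floating-point rounding.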
While Hodjat teaches probabilities associated with an output class, Hodjat does not explicitly teach
… a set of probabilities that each feature of the respective instance belongs to each of the output classes;
classifying … for which, based on the set of probabilities, the probability that each feature of the respective instance belongs to the respective output class is highest; …
Sepahvand teaches
… a set of probabilities that each feature of the respective instance belongs to each of the output classes (Examiner’s note: Sepahvand teaches a Bayesian network modeling the conditional probabilities of variables in a rule, where each of the variables contain a plurality of classes and associated probability values associated with an output class, where the Bayesian network is used to identify the feature-associated chains in the network that have higher probabilities, that are useful for classification in order to understand existing events and predict future events (Sepahvand p.2 col.1 6th paragraph-col.2 2nd paragraph (Section III. Background, Section IV. Proposed Method) and p.2 col.2 Figure 1).);
classifying … for which, based on the set of probabilities, the probability that each feature of the respective instance belongs to the respective output class is highest (Examiner’s note: As indicated earlier, Sepahvand teaches a Bayesian network is used to identify the feature-associated chains in the network that have higher probabilities, that are useful for classification in order to understand existing events and predict future events (Sepahvand p.2 col.1 6th paragraph-col.2 2nd paragraph (Section III. Background, Section IV. Proposed Method) and p.2 col.2 Figure 1).); …
Both Hodjat and Sepahvand are analogous art since both teach generating and identifying relevant rules using genetic algorithms.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the rule-level probability taught in Hodjat and extend it through use of a Bayesian network to determine probabilities for respective features within the rule conditions, as taught in Sepahvand, as a way to determine rule conditions containing the most probable features (i.e., those with the highest probability). The motivation to combine is taught in Sepahvand, since using a Bayesian network to identify probabilities for each feature in a rule condition and identifying the most probable features (and hence the most probable paths and rules) allows the identification of those rule conditions that are the most useful for classification and determination of future events. Sepahvand further teaches that focusing on those rule conditions that contain the most probable features and applying them to a genetic algorithm leads to a more computationally efficient way to determine the optimum rules, thus making the system more efficient (Sepahvand p.1 col.2 2nd paragraph (Section I. Introduction); p.2 col.2 Section IV. Proposed Method 1st-3rd paragraphs; p.5 col.2 Section VI. Evaluation 3rd paragraph).
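The classification step recited in the limitation mapped above (selecting the output class for which the probability is highest) can be illustrated with a short sketch. The function name classify_by_highest_probability, the class labels, and the probability values are hypothetical; they are not taken from Hodjat, Sepahvand, or the application.

```python
def classify_by_highest_probability(class_probabilities):
    """Return the output class label whose probability is highest.

    `class_probabilities` maps each output class label to a probability.
    Hypothetical helper used only to illustrate the claim language.
    """
    if not class_probabilities:
        raise ValueError("no output classes supplied")
    return max(class_probabilities, key=class_probabilities.get)


# Made-up probabilities for a single instance (assumption, not from the record).
probabilities = {"class_a": 0.15, "class_b": 0.70, "class_c": 0.15}
predicted = classify_by_highest_probability(probabilities)
```

With these assumed values the instance is assigned to class_b, the class with the largest probability.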
While Hodjat in view of Sepahvand teaches the conditions in each rule entry (i.e., a set of instance level conditions) represented as feature/value pairs for the procreation module, Hodjat in view of Sepahvand does not explicitly teach
… a set of instance level conditions each representing a presence or absence of each feature …
Fidelis teaches
… a set of instance level conditions each representing a presence or absence of each feature (Examiner’s note: Fidelis teaches encoding chromosome structures representing rule conditions for use in a genetic algorithm, where each gene represents a condition with attributes, and where each gene is represented by a weight field taking values in range [0..1], indicating whether or not the corresponding attribute is present according to a limit threshold, where if the weight field is below a threshold, the smaller the probability that the condition will be present, and hence the condition is effectively removed from the rule (corresponding to an absence) (Fidelis p.806 col.1 Figure 1 and Section 3.1 Individual Encoding 3rd paragraph: “… The field weight (Wi) is a real-valued variable taking values in the range [0..1]. This variable indicated whether or not the corresponding attribute is present in the rule. … the greater the value of the threshold Limit, the smaller the probability that the corresponding condition will be present in the rule … so that conditions with a weight smaller than or equal to 0.3 were effectively removed from the rule.”).) …
Both Hodjat in view of Sepahvand and Fidelis are analogous art since both teach generating and identifying relevant rules using genetic algorithms.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the feature/value encoding representation for the evolutionary algorithm taught in Hodjat in view of Sepahvand and enhance with the encoding mechanism taught in Fidelis as a way to encode the set of instance level conditions for an evolutionary algorithm. The motivation to combine is taught in Fidelis, since this encoding method is flexible enough to support encoding of a plurality of conditions in a chromosome without having to change the length of the chromosome, which then allows the system to consistently process and perform crossover and mutations using equal length chromosomes in a consistent way, thus improving the computational efficiency of the genetic algorithm (Fidelis p.806 col.2 3rd paragraph (Section 3.1 Individual Encoding)).
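As a rough illustration of the encoding summarized above from Fidelis Section 3.1, the sketch below decodes a fixed-length chromosome into the conditions that remain present in a rule. Only the weight field in the range [0..1] and the Limit threshold of 0.3 are drawn from the quoted passage; the Gene class, its attribute, operator, and value fields, and the sample genes are assumptions made for illustration.

```python
from dataclasses import dataclass

# Threshold from the quoted passage: a weight <= 0.3 removes the condition.
LIMIT = 0.3


@dataclass(frozen=True)
class Gene:
    attribute: str  # assumed field: attribute tested by the condition
    weight: float   # real value in [0, 1] controlling presence in the rule
    operator: str   # assumed field, e.g. ">=" or "=="
    value: int      # assumed field: comparison value


def active_conditions(chromosome, limit=LIMIT):
    """Return the conditions whose weight exceeds the limit.

    The chromosome keeps the same length whether or not a condition is
    present; presence is decided only by the weight field.
    """
    return [
        f"{gene.attribute} {gene.operator} {gene.value}"
        for gene in chromosome
        if gene.weight > limit
    ]


# Hypothetical three-gene chromosome.
chromosome = [
    Gene("attr_a", 0.80, ">=", 40),
    Gene("attr_b", 0.10, "==", 1),   # weight <= 0.3, so absent from the rule
    Gene("attr_c", 0.31, ">=", 2),
]
rule = active_conditions(chromosome)
```

Because absence is signaled by the weight rather than by deleting the gene, every chromosome has the same length, which is the property the motivation statement above relies on for crossover and mutation.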
While Hodjat in view of Sepahvand, in further view of Fidelis teaches using at least a portion of the set of class level rules, Hodjat in view of Sepahvand, in further view of Fidelis does not explicitly teach
… using at least a portion of the set of class level rules to update the set of training data and retrain the machine learning model using the updated set of training data.
Chatterjee teaches
… using at least a portion of the set of class level rules to update the set of training data and retrain the machine learning model using the updated set of training data (Examiner’s note: In light of applicant’s specification paragraph [0018], this claim limitation is interpreted as occurring after producing the set of class level rules, in the use case where these rules are provided to a user as an explanation for further processing. Chatterjee teaches that the explainer producing an explanatory rule set may provide the information to a client device, where a user at the client device can trigger re-generation of additional rules if the explanatory rule set is considered unsatisfactory according to a threshold (i.e., responses to observations which generated a “no explanation is available” message). Chatterjee further teaches that this re-generation of additional rules may involve using a larger input set to re-train the exemplary machine learning model, where this larger input set includes at least some observation records (“rules”) for which no explanations were available (Chatterjee Figure 9, elements 901, 925; and col.18 lines 26-38).).
Both Hodjat in view of Sepahvand, in further view of Fidelis and Chatterjee are analogous art since they both teach extracting predictive rulesets based on machine learning techniques.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the decision/action system recommending an action taught in Hodjat in view of Sepahvand, in further view of Fidelis and enhance it to include a re-training trigger taught in Chatterjee as a way to re-generate additional rules if the provided ruleset is considered an unsatisfactory explanation or is insufficient to generate a recommendation. The motivation to combine is taught in Chatterjee, where this trigger for re-training allows the user to demand additional explanations beyond a general first-level explanation, causing the machine learning model to adjust its internal weights and internal representations to make further elaborations which identify relationships between input attributes and internal rule representations, which improves the accuracy and utility of the system using the machine learning model in terms of providing more informative explanations (Chatterjee col.15 lines 9-27 and col.15 line 43-col.16 line 23).
Regarding original Claim 5, Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee teaches
The method of claim 1, further comprising 
selecting, by the processor-based system, a subset of the set of class level rules that predict that at least a threshold percentage of the respective instances are members of a particular class (Examiner’s note: Hodjat teaches taking the individuals having the best fitness score at the end of the training session, and adding them to a production ruleset population for a production phase to process live production sequences, where the output of the production phase is a probability associated with the determined ruleset being compared against a predetermined threshold value to determine an outcome representing a future event and an associated action to take to remediate the event, where this determination of an outcome representing a future event is a prediction associated with a subset of conditions in the ruleset that identified the outcome (Hodjat [0092]: “… the individuals having the best fitness score at the end of the training session are added to the production ruleset population 122 where they are used to process live production data sequences 130….”; and [0068]-[0070]: “…The production system 112 operates according to one or more rulesets 300 from the production ruleset population 122. … the ruleset as a whole outputs a probability 126 for an event that can occur in the near future. In the case of the blood pressure monitoring application, the probability output 126 will indicate the possibility of a high blood pressure related event occurring in the near future. … The decision/action system 128 is a system that uses the probability output 126 from the rulesets together with predetermined threshold values to decide what if any action to take. .. if the ruleset predicts a probability higher than a predetermined threshold value, for example 50%, that a patient’s blood pressure will exceed the normal range in the near future, the decision/action system 128 may alert a nurse or doctor …”).).  
Regarding amended Claim 8, Hodjat teaches
(Currently amended) A computer program product including 
one or more non-transitory computer readable mediums having instructions encoded thereon that when executed by one or more computer processors cause the one or more computer processors (Examiner’s note: Hodjat teaches a computer system containing a storage subsystem, where the storage subsystem includes a memory subsystem and a file subsystem, where the memory subsystem contains computer instructions that, when executed by the processor subsystem, cause the computer system to operate or perform the training and production system functions (Hodjat Figure 10, elements 1014, 1024; and [0116] and [0121]).) to perform a process for interpreting a machine learning model, the process including 
receiving a set of training data and a set of output classes for classifying a plurality of instances of the set of training data, each instance representing at least one feature of the set of training data (Examiner’s note: Hodjat teaches a system for evolving rulesets using an evolutionary algorithm, where the training data for the data mining system is collected from an environment generating a large amount of data over a period of time for the purposes of extracting useful knowledge and patterns, where the training portion of the system interacts with a database containing a pool of candidate individuals, and where an individual is represented as a plurality of rules, each rule entry contains a plurality of conditions, and each condition is expressed as a relationship between a feature attribute and its corresponding value (Hodjat [0005]-[0007]; Figure 8; [0051] and [0086]-[0087]; see also Figure 3 and [0055]-[0057]). Hodjat further teaches that this system is implemented on a computer system containing a processor (Hodjat Figure 10, element 1014; and [0116]).); 
applying each instance and at least one perturbation of the respective instance to [[a]]the machine learning model having a function that takes each instance and each perturbation of the respective instance to obtain, from an output of the machine learning model, a probability (Examiner’s note: Under its broadest reasonable interpretation, the term “at least one perturbation of the respective instance” is interpreted as any variation of the respective instance, such as a change within a feature value of an instance, or the presence of similar instances with different output results. As indicated earlier, Hodjat Figure 8; [0051] and [0086]-[0087] teaches a pool of candidate individuals for training, where an individual is represented as a plurality of rules, each rule entry contains a plurality of conditions, and each condition expressed as a relationship between a feature attribute and its corresponding value. Hodjat further teaches each rule entry has a corresponding rule-level probability (RLP) that represents the probability of membership in a class, where this rule-level probability can represent an aggregated value (average, minimum, or maximum) of all condition level certainty values, where the conditions under aggregation represent different variations of the same condition (“perturbations of the respective instance”), and where this aggregation is performed by a probability aggregator present in the training/production parts of the system (Hodjat [0055]: “… The rule-level probability 310 indicates the probability that membership in the class exists when the conditions of this rule are satisfied….”; and [0061]-[0067]; in particular [0061]: “The condition-level certainty values for input data applied to a rule 306 are aggregated to determine the rule-level certainty value. In one embodiment, the certainty aggregation function can be an average of the all the condition-level certainty values. 
For example, if the condition-level certainty values for three conditions are 0.2, 0.4, and 0.6 respectively, the rule-level certainty value may be 0.4. In one embodiment, the rule-level certainty value will be the minimum value … In another embodiment, the rule-level certainty value will be the maximum value …” and [0077]-[0078]: “… a probability aggregator 406 determines the probability output of an individual … for a given data point … An individual or a ruleset is also received in block 402 providing one or more rules, each of the rules having one or more conditions and an indication of a rule-level probability of membership in a predetermined class.”).) …
classifying each instance into one of the output classes (Examiner’s note: Hodjat teaches a set of rules classifying a patient’s current state based on current and past state based on a set of conditions, where the current state and past state represent different output classes, where in the example provided, a current state and past state for a rule entry based on measuring blood pressure and pulse conditions can represent a high blood pressure related event and a normal blood pressure related event, respectively (Hodjat [0062]-[0068]).) …
producing, for each respective instance, a set of instance level conditions, each representing … each feature of the respective instance in the output class where the instance is classified (Examiner’s note: As indicated earlier, Hodjat teaches a training portion of the system interacting with a database containing a pool of candidate individuals, where an individual is represented as a plurality of rules, each rule entry contains a plurality of conditions, and each condition is expressed as a relationship between a feature attribute and its corresponding value (i.e., feature/value pairs) (Hodjat Figure 8; [0097]-[0098]; see also Figure 3 and [0055]-[0057]).); 
applying the instance level conditions for each of the corresponding instances to a genetic algorithm to produce a set of class level rules, each class level rule representing a logical conditional statement that predicts that the respective instances are members of the particular class (Examiner’s note: As indicated earlier, Hodjat teaches a system for evolving rulesets using an evolutionary algorithm, where the training portion of the system interacts with a database containing a pool of candidate individuals, where an individual is represented as a plurality of rules, each rule entry contains a plurality of conditions, with each rule expressed as an IF-THEN relationship of conditions containing a feature attribute, its corresponding value, a threshold and a rule-level probability (RLP) that corresponds to a probability associated with an output class (Hodjat Figure 3; [0055]-[0057]; and [0062]-[0068]). Hodjat further teaches this pool of candidate individuals is provided as input into a procreation module where the procreation involves identifying parent individuals, performing mutation and crossover operations to create child individuals, and choosing the best ones based on a fitness estimate over multiple iterations to finally determine a set of individuals with the best fitness score at the end of the training session, where this set of individuals represents a set of class level rules, and where the described process that is performed in the procreation module represents a type of evolutionary (“genetic”) algorithm (Hodjat Figure 8; [0051] and Figure 6, elements 606, 116, 608; [0086]-[0087]; and [0090]-[0092]).); and 
using at least a portion of the set of class level rules (Examiner’s note: In light of applicant’s specification paragraph [0018], this claim limitation is interpreted as occurring after producing the set of class level rules, in the use case where these rules are provided to a user as an explanation for further processing. As indicated earlier, Hodjat teaches a training portion of a system involving the procreation module being invoked for multiple iterations, where for each iteration, new individuals created by combination and/or mutation are placed in the pool of candidate individuals to be chosen as new parents for successive combinations and/or mutations, and undergo further fitness evaluations through the competition module (Hodjat Figure 6, elements 606, 116, 608 and [0090]-[0092]). Once the best individuals are identified, they are provided into a production portion of the system where they are used for determining a recommendation through a decision/action system, which outputs a recommendation for a human to perform an action (Hodjat [0070]).) …  
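For illustration only, the certainty aggregation and threshold comparison cited from Hodjat above can be sketched as follows. This is a minimal sketch and not Hodjat's implementation: the function names `rule_level_certainty` and `should_alert` are hypothetical, and how a rule-level certainty value feeds the ruleset's probability output is not modeled here. The sample values 0.2, 0.4, and 0.6 and the 50% threshold are the examples quoted from Hodjat [0061] and [0070].

```python
from statistics import mean

def rule_level_certainty(condition_certainties, mode="average"):
    """Aggregate condition-level certainty values into one rule-level value.

    The three modes mirror the embodiments quoted from Hodjat [0061]:
    average, minimum, or maximum of the condition-level values.
    """
    aggregators = {"average": mean, "minimum": min, "maximum": max}
    return aggregators[mode](condition_certainties)

def should_alert(probability, threshold=0.5):
    """Hypothetical decision step: act only when the probability output
    is higher than the predetermined threshold."""
    return probability > threshold

certainties = [0.2, 0.4, 0.6]
average_value = rule_level_certainty(certainties)             # approximately 0.4
minimum_value = rule_level_certainty(certainties, "minimum")  # 0.2
maximum_value = rule_level_certainty(certainties, "maximum")  # 0.6
```

Under these assumptions, a probability output of 0.6 would trigger an alert at the 50% threshold, while an output of exactly 0.5 would not, since the quoted passage requires a probability higher than the threshold.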
While Hodjat teaches probabilities associated with an output class, Hodjat does not explicitly teach
… a probability that each feature of the respective instance belongs to each of the output classes;
classifying … for which the probability that each feature of the respective instance belongs to the respective output class is highest; …
Sepahvand teaches
… a probability that each feature of the respective instance belongs to each of the output classes (Examiner’s note: Sepahvand teaches a Bayesian network modeling the conditional probabilities of variables in a rule, where each of the variables contain a plurality of classes and associated probability values associated with an output class, where the Bayesian network is used to identify the feature-associated chains in the network that have higher probabilities, that are useful for classification in order to understand existing events and predict future events (Sepahvand p.2 col.1 6th paragraph-col.2 2nd paragraph (Section III. Background, Section IV. Proposed Method) and p.2 col.2 Figure 1).);
classifying … for which the probability that each feature of the respective instance belongs to the respective output class is highest (Examiner’s note: As indicated earlier, Sepahvand teaches a Bayesian network is used to identify the feature-associated chains in the network that have higher probabilities, that are useful for classification in order to understand existing events and predict future events (Sepahvand p.2 col.1 6th paragraph-col.2 2nd paragraph (Section III. Background, Section IV. Proposed Method) and p.2 col.2 Figure 1).); …
Both Hodjat and Sepahvand are analogous art since both teach generating and identifying relevant rules using genetic algorithms.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the rule-level probability taught in Hodjat and extend it through use of a Bayesian network to determine probabilities for respective features within the rule conditions taught in Sepahvand as a way to determine rule conditions containing the most probable features (i.e., those with the highest probability). The motivation to combine is taught in Sepahvand, as provided in the prior art claim mapping of Claim 1 recited above.
While Hodjat in view of Sepahvand teaches the conditions in each rule entry (i.e., a set of instance level conditions) represented as feature/value pairs for the procreation module, Hodjat in view of Sepahvand does not explicitly teach 
… a set of instance level conditions each representing a presence or absence of each feature …
Fidelis teaches
… a set of instance level conditions each representing a presence or absence of each feature (Examiner’s note: Fidelis teaches encoding chromosome structures representing rule conditions for use in a genetic algorithm, where each gene represents a condition with attributes, and where each gene is represented by a weight field taking values in range [0..1], indicating whether or not the corresponding attribute is present according to a limit threshold, where if the weight field is below a threshold, the smaller the probability that the condition will be present, and hence the condition is effectively removed from the rule (corresponding to an absence) (Fidelis p.806 col.1 Figure 1 and Section 3.1 Individual Encoding 3rd paragraph: “… The field weight (Wi) is a real-valued variable taking values in the range [0..1]. This variable indicated whether or not the corresponding attribute is present in the rule. … the greater the value of the threshold Limit, the smaller the probability that the corresponding condition will be present in the rule … so that conditions with a weight smaller than or equal to 0.3 were effectively removed from the rule.”).) …
Both Hodjat in view of Sepahvand and Fidelis are analogous art since both teach predicting classification rules using machine learning algorithms.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the feature/value encoding representation for the evolutionary algorithm taught in Hodjat in view of Sepahvand and enhance with the encoding mechanism taught in Fidelis as a way to encode the set of instance level conditions for an evolutionary algorithm. The motivation to combine is taught in Fidelis, as provided in the prior art claim mapping in Claim 1 recited above.
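For illustration only, the fixed-length encoding cited from Fidelis above can be sketched as follows. This is a minimal sketch under stated assumptions, not Fidelis's implementation: the `Gene` class, the function names, and the attribute names and values in the sample chromosome are hypothetical. The only behavior taken from the quoted passage is that each gene carries a weight in the range [0..1] and that a condition whose weight is smaller than or equal to the threshold Limit (0.3 in the quoted example) is effectively removed from the rule.

```python
from dataclasses import dataclass

@dataclass
class Gene:
    weight: float    # W_i in [0, 1]; decides whether the condition is present
    attribute: str   # hypothetical attribute name
    operator: str    # hypothetical relational operator
    value: float     # hypothetical attribute value

def active_conditions(chromosome, limit=0.3):
    """Keep only genes whose weight exceeds Limit; the rest are absent."""
    return [gene for gene in chromosome if gene.weight > limit]

def rule_text(chromosome, limit=0.3):
    """Render the active conditions as the antecedent of an IF-THEN rule."""
    parts = [f"{g.attribute} {g.operator} {g.value}"
             for g in active_conditions(chromosome, limit)]
    return " AND ".join(parts)

# The chromosome always holds three genes; only the weights decide
# which conditions appear in the decoded rule.
chromosome = [
    Gene(0.9, "age", ">=", 40),
    Gene(0.3, "itching", "=", 2),   # weight <= Limit, so treated as absent
    Gene(0.5, "scaling", ">", 1),
]
decoded = rule_text(chromosome)  # "age >= 40 AND scaling > 1"
```

Because absence is expressed through the weight rather than by deleting the gene, the chromosome length stays constant, which is the property the motivation to combine relies on for consistent crossover and mutation.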
While Hodjat in view of Sepahvand, in further view of Fidelis teaches using at least a portion of the set of class level rules to update the set of training data, Hodjat in view of Sepahvand, in further view of Fidelis does not explicitly teach
… using at least a portion of the set of class level rules to adjust hyper-parameters of the machine learning model and retrain the machine learning model using the adjusted hyper-parameters.
Chatterjee teaches
… using at least a portion of the set of class level rules to adjust hyper-parameters of the machine learning model and retrain the machine learning model using the adjusted hyper-parameters (Examiner’s note: In light of applicant’s specification paragraph [0018], this claim limitation is interpreted as occurring after producing the set of class level rules, in the use case context where these rules are provided to a user as an explanation for further processing. Chatterjee teaches that the explainer producing an explanatory rule set may provide the information to a client device, where a user at the client device can trigger re-generation of additional rules if the explanatory rule set is considered unsatisfactory according to a threshold (i.e., responses to observations which generated a “no explanation is available” message). Chatterjee further teaches this re-training trigger from the explainer corresponds to adding more internal representations to the input data in an exemplary machine learning model. In the context of the exemplary machine learning model being a neural network classifier (Chatterjee col.14 line 49-col.15 line 25; col.15 lines 57-63; col.16 lines 8-12), this re-training trigger to add more internal representations to the input data is interpreted as adding more hidden layers in the exemplary neural network classifier to support the request to provide more explanation or ruleset conditions, and adjusting the neural network weights between layers to process the additional information, such that this addition of more internal representations represents changing the original dimensions of the neural network (i.e., corresponding to changing a neural network’s hyper-parameters) in order to re-train the exemplary machine learning model using the changed dimensions (Chatterjee Figure 9, elements 901, 925; col.18 lines 26-38; col.17 lines 32-50; and col.8 lines 43-65).).
Both Hodjat in view of Sepahvand, in further view of Fidelis and Chatterjee are analogous art since they both teach extracting predictive rulesets based on machine learning techniques.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the decision/action system recommending an action taught in Hodjat in view of Sepahvand, in further view of Fidelis and enhance it to include a re-training trigger taught in Chatterjee as a way to re-generate additional rules if the provided ruleset is considered an unsatisfactory explanation or is insufficient to generate a recommendation. The motivation to combine is taught in Chatterjee, as provided in the prior art claim mapping in Claim 1 recited above.
Regarding original Claim 12,
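Claim 12 recites the computer program product of claim 8, where the computer program product further comprises instructions that when executed by one or more computer processors cause the one or more computer processors to perform claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 5, and hence is rejected under similar rationale provided by Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee as indicated in Claim 5, in view of the rejections of amended Claim 8.  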
Regarding amended Claim 15,
Claim 15 recites a system, where the system comprises claim limitations that are similar in scope to the corresponding claim limitations recited in amended Claim 8, and hence is rejected under similar rationale and motivations provided by Hodjat, Sepahvand, Fidelis, and Chatterjee as indicated in amended Claim 8. In addition, as indicated earlier, Hodjat teaches a computer system for implementing the system, where the computer system contains a processor subsystem containing one or more processors, and a storage subsystem that includes a memory subsystem and a file store subsystem (corresponding to one or more storages, Hodjat Figure 10, elements 1024, 1026, 1028; and [0116]), and a processor subsystem connected to the same internal bus subsystem as the storage subsystem (Hodjat Figure 10, elements 1012, 1014, 1024; and [0116]), where the memory subsystem contains computer instructions, when executed by the processor subsystem, cause the computer system to operate or perform the training and production system functions (Hodjat Figure 10, elements 1014, 1024; and [0121]).
Regarding original Claim 19, 
Claim 19 recites the system of claim 15, where the system further comprises claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 5, and hence is rejected under similar rationale provided by Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee as indicated in Claim 5, in view of the rejections of amended Claim 15.  
Claims 2-3, 6, 9-10, 13-14, 16-17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Hodjat et al., U.S. PGPUB 2017/0293849, published 10/12/2017 [hereafter referred as Hodjat] in view of Sepahvand et al., Generating Graphical Chain by Mutual Matching of Bayesian Network and Extracted Rules of Bayesian Network Using Genetic Algorithm, arXiv:1412.4465v1, December 15 2014 [hereafter referred as Sepahvand], in further view of Fidelis et al., Discovering Comprehensible Classification Rules with a Genetic Algorithm, Proceedings of the 2000 Congress on .
Regarding Claim 2, Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee as applied to Claim 1 teaches
(Currently amended) The method of claim 1, wherein producing the set of class level rules further comprises 
calculating, by the processor-based system, at least one of 
a fitness score for each class level rule (Examiner’s note: As indicated earlier, Hodjat teaches a system for evolving rulesets using an evolutionary algorithm, where the training portion of the system interacts with a database containing a pool of candidate individuals, which are initialized with initial fitness estimates, and run through a battery of trials to test the training data, and updating the corresponding fitness estimates for each individual and ranking individuals based on their fitness score (Hodjat Figure 8; [0051] and Figure 6, elements 606, 116, 608; [0086]-[0089]).) …
… [fitness score for each class level rule] based on … precision of the respective class level rule and a coverage of the respective class level rule (Examiner’s note: Fidelis teaches a fitness score calculation composed of sensitivity and specificity indicators, with the sensitivity indicator representing recall (corresponding to the coverage of the rule, which is expressed as a ratio of true positives over the sum of true positives and false negatives), and the specificity indicator representing precision (corresponding to the precision of the rule, which is expressed as a ratio of true positives over the sum of true positives and false positives) (Fidelis p.807 Section 3.3 Fitness Function: “The fitness function evaluates the quality of each rule (individual). … Our fitness function combines two indicators commonly used in medical domains, namely the sensitivity (Se) and the specificity (Sp), defined as follows: Se = tp/(tp + fn) … Sp = tn/(tn + fp). Finally, the fitness function used by our system is defined as the product of these two indicators, ie.,: fitness =Se*Sp.”).), and 
a mutual information between the respective class rule and the predicted class; 
wherein at least one generation of the genetic algorithm is configured to produce the set of class level rules using at least one of 
the fitness score (Examiner’s note: As indicated earlier, Hodjat teaches this pool of candidate individuals are provided as input into a procreation module where the procreation involves identifying parent individuals, performing mutation and crossover operations to create child individuals, and choosing the best ones based on a fitness estimate over multiple iterations to finally determine a set of individuals with the best fitness score at the end of the training session, where this set of individuals represent a set of class level rules. Hodjat further teaches the procreation module being invoked for multiple iterations, where for each iteration, new individuals created by combination and/or mutation are placed in the pool of candidate individuals to be chosen as new parents for successive combinations and/or mutations, and undergo further fitness evaluations through the competition module (Hodjat Figure 8; [0051] and Figure 6, elements 606, 116, 608; [0086]-[0092]).) and 
the mutual information.  
	While Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee teaches a fitness score based on a precision of the respective class level rule and a coverage of the respective class level rule, Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee does not explicitly teach
… a fitness score … based on a harmonic mean …
Castellanos teaches
… a fitness score … based on a harmonic mean (Examiner’s note: Castellanos teaches using a F-measure calculation (“fitness score”) to validate extracted rules produced by a genetic algorithm, where the F-measure calculation is based on a harmonic mean corresponding to precision and recall (Castellanos [0028]; and [0043]-[0044]: “Rules learned during the training phase can be validated during a testing phase … The accuracy of each rule can be measured in terms of its “precision”, which can be defined as the number of correct extractions from all the extractions that it did. … validation may be performed using a metric termed "recall." "Recall" can be defined as the number of correct extractions done over the total number of extractions that may be performed in a validation test set. For example, if a validation test set was known to have ten expiration dates, but only five were extracted, the recall would be 5/20 or 0.5. Accordingly, an "accuracy" metric may be generated as a harmonic mean of precision and recall, herein termed an F measure. The F-measure may be calculated as: F = 2 ∙ (precision ∙ recall)/(precision + recall).”).) … 
Both Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee and Castellanos are analogous art since both teach validating rules from a genetic algorithm using fitness score metrics based on precision and recall indicators.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to substitute the fitness score equation taught in Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee with the fitness score equation containing a harmonic mean taught in Castellanos for validating the discovered rules produced by a genetic algorithm. Since Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee already teaches using a fitness score (composed of precision and recall indicators) to evaluate the accuracy of the extracted rules and rank them, a person having ordinary skill in the art would also consider using a variation of the fitness score calculation (with the same precision and recall indicators) as taught in Castellanos for performing validation and ranking in order to produce the same predictable results.
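For illustration only, the two fitness calculations discussed above can be computed side by side from the same confusion-matrix counts. This is a minimal sketch under stated assumptions: the function names and the sample counts are hypothetical and do not come from any cited reference. The formulas are those quoted above, namely Se = tp/(tp + fn), Sp = tn/(tn + fp), and fitness = Se*Sp from Fidelis, and F = 2 ∙ (precision ∙ recall)/(precision + recall) from Castellanos, with precision computed as tp/(tp + fp) and recall computed as tp/(tp + fn).

```python
def sensitivity(tp, fn):
    """Se = tp / (tp + fn), as quoted from Fidelis (also the recall)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Sp = tn / (tn + fp), as quoted from Fidelis."""
    return tn / (tn + fp)

def precision(tp, fp):
    """Precision = tp / (tp + fp)."""
    return tp / (tp + fp)

def fidelis_fitness(tp, fp, tn, fn):
    """Product of sensitivity and specificity (fitness = Se * Sp)."""
    return sensitivity(tp, fn) * specificity(tn, fp)

def f_measure(p, r):
    """Harmonic mean of precision p and recall r; 0.0 if both are zero."""
    if p + r == 0:
        return 0.0
    return 2 * p * r / (p + r)

# Hypothetical counts for one rule evaluated on 100 instances.
tp, fp, tn, fn = 40, 10, 45, 5
product_fitness = fidelis_fitness(tp, fp, tn, fn)
harmonic_fitness = f_measure(precision(tp, fp), sensitivity(tp, fn))
```

With these hypothetical counts the product form evaluates to roughly 0.727 and the harmonic-mean form to roughly 0.842, so both scores can be used to rank the same set of rules even though they weight the underlying counts differently.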
Regarding original Claim 3, Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee, in even further view of Castellanos teaches
(Original) The method of claim 2, further comprising 
sorting, by the processor-based system, the set of class level rules according to at least one of
the fitness score (Examiner’s note: As indicated earlier, Hodjat teaches a system for evolving rulesets using an evolutionary algorithm, where the training portion of the system interacts with a database containing a pool of candidate individuals, which are initialized with initial fitness estimates, and run through a battery of trials to test the training data, and updating the corresponding fitness estimates for each individual and ranking individuals based on their fitness score, where the ranking based on fitness score represents a form of sorting (Hodjat Figure 8; [0051] and Figure 6, elements 606, 116, 608; [0086]-[0089]).) and 
the mutual information corresponding to each of the class level rules.  
Regarding amended Claim 6, Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee as applied to Claim 1 teaches
(Currently amended) The method of claim 1, further comprising 
selecting, by the processor-based system, a subset of the set of class level rules by calculating at least one of 
a fitness score for each of a pair of the class level rules (Examiner’s note: As indicated earlier, Hodjat teaches a system for evolving rulesets using an evolutionary algorithm, where the training portion of the system interacts with a database containing a pool of candidate individuals, which are initialized with initial fitness estimates and run through a battery of trials to test the training data, with the corresponding fitness estimates updated for each individual and the individuals ranked based on their fitness score (Hodjat Figure 8; [0051] and Figure 6, elements 606, 116, 608; [0086]-[0089]). Sepahvand teaches fitness scores are calculated for a pair of chromosomes (Sepahvand p.3 col.1 Algorithm 1 step (d) and p.3 col.1 Fitness Function 1st paragraph). Fidelis teaches each individual is represented by chromosomes (Fidelis p.807 col.1-col.2 Section 3.3 Fitness Function: “… The fitness function evaluates the quality of each rule (individual). … Each run of our GA solves a two-class classification problem … Therefore, the GA is run at least once for each class (value of the goal attribute). … When the GA is searching for rules predicting a given class, all other classes are effectively merged into a large class … Hence, the above formulas for Se and Sp can be applied to problems with any number of classes.”).) …
… [fitness score for each of a pair of the class level rules] based on … precision of the respective class level rule and a coverage of the respective class level rule (Examiner’s note: Fidelis teaches a fitness score calculation composed of sensitivity and specificity indicators, with the sensitivity indicator representing recall (corresponding to the coverage of the rule, which is expressed as a ratio of true positives over the sum of true positives and false negatives), and the specificity indicator representing precision (corresponding to the precision of the rule, which is expressed as a ratio of true positives over the sum of true positives and false positives) (Fidelis p.807 Section 3.3 Fitness Function: “The fitness function evaluates the quality of each rule (individual). … Our fitness function combines two indicators commonly used in medical domains, namely the sensitivity (Se) and the specificity (Sp), defined as follows: Se = tp/(tp + fn) … Sp = tn/(tn + fp). Finally, the fitness function used by our system is defined as the product of these two indicators, ie.,: fitness =Se*Sp.”).), and 
a mutual information between the respective class rule and the predicted class, and
selecting the class level rule having a greatest fitness score from the pair of class level rules using at least one of the fitness score (Examiner’s note: Fidelis teaches performing a series of runs for each class and using the fitness score to select the best rule, where the best rule is selected as the rule predicting that class (Fidelis p.808 col.1 Table 1 and p.808 col.1 Section 5.1 Results for the Dermatology Data Set 1st paragraph: “Table 1 presents the final 6 rules discovered by the GA – one rule for each class. For each class, the GA was run three times … The best rule of the three runs, according to its fitness values measured on the training set, was selected as the rule predicting that class (this is the rule shown in Table 1).”).) and 
the mutual information.  
	While Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee teaches a fitness score based on a precision of the respective class level rule and a coverage of the respective class level rule, Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee does not explicitly teach
… a fitness score … based on a harmonic mean …
Castellanos teaches
… a fitness score … based on a harmonic mean (Examiner’s note: Castellanos teaches using a F-measure calculation (“fitness score”) to validate extracted rules produced by a genetic algorithm, where the F-measure calculation is based on a harmonic mean corresponding to precision and recall (Castellanos [0028]; and [0043]-[0044]: “Rules learned during the training phase can be validated during a testing phase … The accuracy of each rule can be measured in terms of its “precision”, which can be defined as the number of correct extractions from all the extractions that it did. … validation may be performed using a metric termed "recall." "Recall" can be defined as the number of correct extractions done over the total number of extractions that may be performed in a validation test set. For example, if a validation test set was known to have ten expiration dates, but only five were extracted, the recall would be 5/20 or 0.5. Accordingly, an "accuracy" metric may be generated as a harmonic mean of precision and recall, herein termed an F measure. The F-measure may be calculated as: F = 2 ∙ (precision ∙ recall)/(precision + recall).”).) … 
Both Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee and Castellanos are analogous art since both teach validating rules from a genetic algorithm using fitness score metrics based on precision and recall indicators.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to substitute the fitness score equation taught in Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee with the fitness score equation containing a harmonic mean taught in Castellanos for validating the discovered rules produced by a genetic algorithm. Since Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee already teaches using a fitness score (composed of precision and recall indicators) to evaluate the accuracy of the extracted rules and rank them, a person having ordinary skill in the art would also consider using a variation of the fitness score calculation (with the same precision and recall indicators) as taught in Castellanos for performing validation and ranking in order to produce the same predictable results.
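For illustration only, the two fitness calculations quoted above (the product of sensitivity and specificity quoted from Fidelis, and the harmonic-mean F-measure quoted from Castellanos) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the confusion-matrix counts, and the two example rules are hypothetical and are not taken from any cited reference or from the application.

```python
def rule_metrics(tp, fp, tn, fn):
    """Return (precision, recall, specificity) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity (Se)
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # Sp
    return precision, recall, specificity


def product_fitness(tp, fp, tn, fn):
    """Fitness as the product Se * Sp, following the quoted Fidelis formula."""
    _, se, sp = rule_metrics(tp, fp, tn, fn)
    return se * sp


def f_measure(tp, fp, tn, fn):
    """Harmonic mean of precision and recall, following the quoted F-measure."""
    p, r, _ = rule_metrics(tp, fp, tn, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def select_best(rules, score):
    """Return the name of the rule with the greatest score."""
    return max(rules, key=lambda name: score(**rules[name]))


# Hypothetical counts for a pair of candidate rules predicting the same class.
rules = {
    "rule_a": dict(tp=40, fp=10, tn=40, fn=10),
    "rule_b": dict(tp=25, fp=5, tn=45, fn=25),
}

best_by_product = select_best(rules, product_fitness)
best_by_f = select_best(rules, f_measure)
```

With these assumed counts, both scoring functions rank rule_a above rule_b, although the two measures are not guaranteed to agree for other inputs.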
Regarding amended Claim 9, 
Claim 9 recites the computer program product of claim 8, where the computer program product further comprises instructions that when executed by one or more computer processors cause the one or more processors to perform a process that includes claim limitations that are similar in scope to the corresponding claim limitations recited in amended Claim 2, and hence is rejected under similar rationale and motivations provided by Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee and Castellanos as indicated in amended Claim 2, in view of the rejections of amended Claim 8.  
Regarding original Claim 10, 
Claim 10 recites the computer program product of claim 9, where the computer program product further comprises instructions that when executed by one or more computer processors cause the one or more processors to perform a process that includes claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 3, and hence is rejected under similar rationale provided by Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee, in even further view of Castellanos as indicated in Claim 3, in view of the rejections of amended Claim 9.  
Regarding amended Claim 13, 
Claim 13 recites the computer program product of claim 8, where the computer program product further comprises instructions that when executed by one or more computer processors cause the one or more processors to perform a process that includes claim limitations that are similar in scope to the corresponding claim limitations recited in amended Claim 6, and hence is rejected under similar rationale and motivations provided by Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee and Castellanos as indicated in amended Claim 6, in view of the rejections of amended Claim 8.  
Regarding amended Claim 16, 
Claim 16 recites the system of claim 15, where the system further comprises claim limitations that are similar in scope to the corresponding claim limitations recited in amended Claim 2, and hence is rejected under similar rationale and motivations provided by Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee and Castellanos as indicated in amended Claim 2, in view of the rejections of amended Claim 15.  
Regarding original Claim 17, 
Claim 17 recites the system of claim 16, where the system further comprises claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 3, and hence is rejected under similar rationale provided by Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee, in even further view of Castellanos as indicated in Claim 3, in view of the rejections of amended Claim 16.  
Regarding amended Claim 20, 
Claim 20 recites the system of claim 15, where the system further comprises claim limitations that are similar in scope to the corresponding claim limitations recited in amended Claim 6, and hence is rejected under similar rationale and motivations provided by Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee and Castellanos as indicated in amended Claim 6, in view of the rejections of amended Claim 15.  
Claims 4, 11, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Hodjat et al., U.S. PGPUB 2017/0293849, published 10/12/2017 [hereafter referred to as Hodjat] in view of Sepahvand et al., Generating Graphical Chain by Mutual Matching of Bayesian Network and Extracted Rules of Bayesian Network Using Genetic Algorithm, arXiv:1412.4465v1, December 15 2014 [hereafter referred to as Sepahvand], in further view of Fidelis et al., Discovering Comprehensible Classification Rules with a Genetic Algorithm, Proceedings of the 2000 Congress on Evolutionary Computation CEC00 (Cat. No.00TH8512), IEEE, July 16-19 2000, pp.805-810 [hereafter referred to as Fidelis], in even further view of Chatterjee et al., U.S. Patent 10,824,959, filed 2/16/2016 [hereafter referred to as Chatterjee] as applied to Claims 1, 8, and 15; in even further view of Kapila et al., A Genetic Algorithm with Entropy Based Initial Bias for Automated Rule Mining, Int'l Conf. on Computer & Communication Technology (ICCCT '10), IEEE 2010, pp.491-495 [hereafter referred to as Kapila].  
Regarding original Claim 4, Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee as applied to Claim 1 teaches
(Original) The method of claim 1, 
wherein at least one of the features is a numerical feature (Examiner’s note: Hodjat teaches an example rule containing a plurality of conditions, where the attributes of the condition (e.g., pulse value at time t, blood pressure value at time t-1, blood pressure value at time t-6) correspond to numerical values (Hodjat [0062]-[0067]).) …
However, Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee does not explicitly teach
… wherein the method further comprises pre-processing, by the processor-based system, the set of training data to convert the numerical feature into a categorical feature using entropy based binning.  
	Kapila teaches
… wherein the method further comprises pre-processing, by the processor-based system, the set of training data to convert the numerical feature into a categorical feature using entropy based binning (Examiner’s note: Kapila teaches an entropy based filter approach to determine the entropy contained in a numerical attribute, using a formula that determines the entropy of an attribute based on the number of classes, a weight factor related to a partition, and the expected information required to classify an … (Kapila p.492 col.1 3rd paragraph-col.2 2nd paragraph (II.A. Population Initialization)).).  
Both Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee and Kapila are analogous art since both teach generating and identifying relevant rules using genetic algorithms.
It would have been obvious to a person having ordinary skill in the art before the effective filing date to take the numerical attribute values taught in Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee and use the entropy based filter approach taught in Kapila as a way to pre-process numerical attributes into categorical attributes. The motivation to combine is taught in Kapila, as a way to bias the initial population of rules so that it contains relevant attributes with greater probability than redundant attributes, such that the training data contains relevant information, which then improves the search performance of the genetic algorithm by reducing the search space to discover rules with more predictive accuracy, effectively producing a more computationally efficient system (Kapila p.491 col.2 2nd-4th paragraphs: “… To enhance the performance of genetic algorithms for automated rule mining, relevant attributes must be selected to reduce the search space for GA. Selection of relevant attributes enhances the efficient as well as efficacy of genetic algorithms and discovers rules with higher predictive accuracy. … To address this problem it is important to bias the initial population so that it can have relevant attributes with greater probability than the redundant attributes. … This paper proposes a genetic algorithm approach for automated rule mining employing entropy based filter approach to bias the initial population towards more relevant or informative attributes so that the GA starts with better fit rules covering relatively more training instances. … the approach is anticipated to evolve better fit rules in lesser time, thereby significantly enhancing the performance of evolutionary rule mining process.”).
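For illustration only, converting a numerical feature into a categorical feature by entropy based binning can be sketched as follows. This is a minimal sketch under stated assumptions: it finds a single cut point that minimizes the weighted class entropy of the two resulting partitions, which is one common form of entropy based binning and is not presented as the exact formula of Kapila or of the application. The function names, the pulse values, and the class labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def best_split(values, labels):
    """Return the cut point minimizing the weighted entropy of the two
    partitions, or None when no cut improves on the unsplit entropy."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_score = None, entropy(labels)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut point between equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        # Each partition's entropy is weighted by its share of the instances.
        score = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        if score < best_score:
            best_score = score
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_threshold


def to_category(value, threshold):
    """Map a numerical value onto one of two categorical bins."""
    return "low" if value <= threshold else "high"


# Hypothetical training data: a numerical pulse feature and a class label.
pulse = [62, 65, 70, 88, 92, 97]
label = ["normal", "normal", "normal", "alert", "alert", "alert"]

threshold = best_split(pulse, label)
binned = [to_category(v, threshold) for v in pulse]
```

A caller would need to handle the case where best_split returns None (no informative cut point) before calling to_category.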
Regarding Claim 11, 
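Claim 11 recites the computer program product of claim 8, where the computer program product further comprises instructions that when executed by one or more computer processors cause the one or more processors to perform a process that includes claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 4, and hence is rejected under similar rationale and motivations provided by Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee and Kapila as indicated in Claim 4, in view of the rejections of amended Claim 8.  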
Regarding Claim 18, 
Claim 18 recites the system of claim 15, where the system further comprises claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 4, and hence is rejected under similar rationale and motivations provided by Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee and Kapila as indicated in Claim 4, in view of the rejections of amended Claim 15.  
Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Hodjat et al., U.S. PGPUB 2017/0293849, published 10/12/2017 [hereafter referred to as Hodjat] in view of Sepahvand et al., Generating Graphical Chain by Mutual Matching of Bayesian Network and Extracted Rules of Bayesian Network Using Genetic Algorithm, arXiv:1412.4465v1, December 15 2014, 6 pages [hereafter referred to as Sepahvand], in further view of Fidelis et al., Discovering Comprehensible Classification Rules with a Genetic Algorithm, Proceedings of the 2000 Congress on Evolutionary Computation CEC00 (Cat. No.00TH8512), IEEE, July 16-19 2000, pp.805-810 [hereafter referred to as Fidelis], in even further view of Chatterjee et al., U.S. Patent 10,824,959, filed 2/16/2016 [hereafter referred to as Chatterjee] as applied to Claims 1 and 8; in even further view of Castellanos et al., U.S. PGPUB 2012/0089620, published 4/12/2012 [hereafter referred to as Castellanos], in even further view of Rivera, Wilson, Scalable Parallel Genetic Algorithms, Artificial Intelligence Review 16, Kluwer Academic Publishers, 2001, pp.153-168 [hereafter referred to as Rivera].
Regarding previously presented Claim 7, Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee as applied to Claim 1 teaches
(Previously presented) The method of claim 1, further comprising 
selecting, by the processor-based system, a subset of the set of class level rules by applying each of a pair of the class level rules to a … genetic algorithm (Examiner’s note: As indicated earlier, Hodjat teaches a procreation module performing an evolutionary (“genetic”) algorithm that involves … (Hodjat Figure 8; [0051] and Figure 6, elements 606, 116, 608; [0086]-[0087]; and [0090]-[0092]). Fidelis also teaches executing a genetic algorithm over multiple runs, with each run representing a search for rules representing each class, and determining a fitness function (composed of Se and Sp) for each run to determine a set of class level rules (Fidelis p.807 col.1 5th paragraph-col.2, 1st paragraph (Section 3.3 Fitness Function): “Each run of our GA solves a two-class classification problem, where the goal is to predict whether or not the patient has a given disease. Therefore, the GA is run at least once for each class (value of the goal attribute). … In the first run the GA would search for rules predicting class 1; in the second run it would search for rules predicting class 2, and so on. When the GA is searching for rules predicting a given class, all other classes are effectively merged into a large class, which can be conceptually thought of as meaning that the patient does not have the disease predicted by the rule. Hence, the above formulas for Se and Sp can be applied to problems with any number of classes”).), 
wherein producing the set of class level rules further comprises calculating at least one of 
a fitness score for each class level rule (Examiner’s note: As indicated earlier, Hodjat teaches a system for evolving rulesets using an evolutionary algorithm, where the training portion of the system interacts with a database containing a pool of candidate individuals, which are initialized with initial fitness estimates and run through a battery of trials to test the training data, with the corresponding fitness estimates updated for each individual and the individuals ranked based on their fitness score (Hodjat Figure 8; [0051] and Figure 6, elements 606, 116, 608; [0086]-[0089]). Sepahvand teaches fitness scores are calculated for a pair of chromosomes (Sepahvand p.3 col.1 Algorithm 1 step (d) and p.3 col.1 Fitness Function 1st paragraph). Fidelis teaches each individual is represented by chromosomes (Fidelis p.807 col.1-col.2 Section 3.3 Fitness Function: “… The fitness function evaluates the quality of each rule (individual). … Each run of our GA solves a two-class classification problem … Therefore, the GA is run at least once for each class (value of the goal attribute). … When the GA is searching for rules predicting a given class, all other classes are effectively merged into a large class … Hence, the above formulas for Se and Sp can be applied to problems with any number of classes.”).) …
… [fitness score for each class level rule] based on … precision of the respective class level rule and a coverage of the respective class level rule (Examiner’s note: Fidelis teaches a fitness score calculation composed of sensitivity and specificity indicators, with the sensitivity indicator representing recall (corresponding to the coverage of the rule, which is expressed as a ratio of true positives over the sum of true positives and false negatives), and the specificity indicator representing precision (corresponding to the precision of the rule, which is expressed as a ratio of true positives over the sum of true positives and false positives) (Fidelis p.807 Section 3.3 Fitness Function: “The fitness function evaluates the quality of each rule (individual). … Our fitness function combines two indicators commonly used in medical domains, namely the sensitivity (Se) and the specificity (Sp), defined as follows: Se = tp/(tp + fn) … Sp = tn/(tn + fp). Finally, the fitness function used by our system is defined as the product of these two indicators, ie.,: fitness =Se*Sp.”).), and 
a mutual information between the respective class rule and the predicted class, …
While Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee teaches a fitness score based on a precision of the respective class level rule and a coverage of the respective class level rule, Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee does not explicitly teach
… a fitness score … based on a harmonic mean …
Castellanos teaches
… a fitness score … based on a harmonic mean (Examiner’s note: Castellanos teaches using a F-measure calculation (“fitness score”) to validate extracted rules produced by a genetic algorithm, where the F-measure calculation is based on a harmonic mean corresponding to precision and recall (Castellanos [0028]; and [0043]-[0044]: “Rules learned during the training phase can be validated during a testing phase … The accuracy of each rule can be measured in terms of its “precision”, which can be defined as the number of correct extractions from all the extractions that it did. … validation may be performed using a metric termed "recall." "Recall" can be defined as the number of correct extractions done over the total number of extractions that may be performed in a validation test set. For example, if a validation test set was known to have ten expiration dates, but only five were extracted, the recall would be 5/20 or 0.5. Accordingly, an "accuracy" metric may be generated as a harmonic mean of precision and recall, herein termed an F measure. The F-measure may be calculated as: F = 2 ∙ (precision ∙ recall)/(precision + recall).”).) … 
Both Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee and Castellanos are analogous art since both teach validating rules from a genetic algorithm using fitness score metrics based on precision and recall indicators.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to substitute the fitness score equation taught in Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee with the fitness score equation containing a harmonic mean taught in Castellanos for validating the discovered rules produced by a genetic algorithm. Since Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee already teaches using a fitness score (composed of precision and recall indicators) to evaluate the accuracy of the extracted rules and rank them, a person having ordinary skill in the art would also consider using a variation of the fitness score calculation (with the same precision and recall indicators) as taught in Castellanos for performing validation and ranking in order to produce the same predictable results.
While Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee, in even further view of Castellanos teaches selection of class level rules using a genetic algorithm, Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee, in even further view of Castellanos does not explicitly teach
… applying … to a second level genetic algorithm …
… wherein the second level genetic algorithm is configured to select the subset of class level rules using at least one of the fitness score and the predicted class. 
Rivera teaches
… applying … to a second level genetic algorithm (Examiner’s note: Under its broadest reasonable interpretation, the term “second level genetic algorithm” is interpreted as a genetic algorithm divided into multiple levels, for purposes such as parallelization. Rivera teaches a global parallelization scheme where the evaluation of an individual’s fitness values is parallelized by assigning a fraction of the … (Rivera p.154 Section 2.1 Genetic algorithms 1st-2nd paragraphs; p.155 Section 2.2 Parallelization strategies bullet 1 and p.156 Figure 1). Rivera further teaches that this global parallelization scheme can be combined into a hybrid model and implemented using various libraries such as PGAPack, which allows for multiple levels of control for the genetic algorithm (Rivera p.157 bullet 4).) …
… wherein the second level genetic algorithm is configured to select the subset of class level rules using at least one of the fitness score (Examiner’s note: As indicated earlier, Rivera teaches a global parallelization scheme, where a master processor performs the genetic operators and distributes the individuals among a set of slave processors to evaluate the fitness values, where the fitness values measure the quality of each individual, thus identifying the individuals for successive reproduction by the genetic algorithm (Rivera p.154 Section 2.1 Genetic algorithms 1st-2nd paragraphs; p.155 Section 2.2 Parallelization strategies bullet 1 and p.156 Figure 1).) and the predicted class. 
Both Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee, in even further view of Castellanos and Rivera are analogous art since they both teach genetic algorithms.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the genetic algorithm taught in Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee, in even further view of Castellanos and implement the master/slave type parallelization scheme taught in Rivera as a way to parallelize the genetic algorithm into different levels. The motivation to combine is taught in Rivera, since global parallelization can preserve the behavior of the original genetic algorithm and is effective in performing complicated fitness evaluations, which results in improvements to the computational time for the genetic algorithm (Rivera p.155 Section 2.2 Parallelization strategies, bullet 1).
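For illustration only, a global (master/worker) parallelization of fitness evaluation of the kind described above can be sketched as follows. This is a minimal sketch under stated assumptions: the master distributes individuals to a pool of workers for fitness evaluation and then performs selection itself. The bit-tuple encoding, the fitness function, and all function names are hypothetical and are not taken from Rivera, PGAPack, or the application.

```python
from concurrent.futures import ThreadPoolExecutor


def fitness(individual):
    """Hypothetical fitness: count of 1-bits, standing in for a costly evaluation."""
    return sum(individual)


def evaluate_population(population, workers=4):
    """Master step: hand each individual to a worker for fitness evaluation.
    Results are returned in the same order as the population."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fitness, population))


def select_top(population, scores, k):
    """Master step: keep the k individuals with the greatest fitness."""
    ranked = sorted(zip(scores, population), key=lambda pair: pair[0], reverse=True)
    return [individual for _, individual in ranked[:k]]


# Hypothetical population of bit-string individuals.
population = [(1, 0, 1, 1), (0, 0, 0, 1), (1, 1, 1, 1), (0, 1, 0, 0)]

scores = evaluate_population(population)
parents = select_top(population, scores, k=2)
```

A thread pool is used here only to keep the sketch short; for CPU-bound fitness functions in CPython, a process pool would be the more usual choice, and the selection logic on the master would be unchanged.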
Regarding previously presented Claim 14, 
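Claim 14 recites the computer program product of claim 8, where the computer program product further comprises instructions that when executed by one or more computer processors cause the one or more processors to perform a process that includes claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 7, and hence is rejected under similar rationale and motivations provided by Hodjat in view of Sepahvand, in further view of Fidelis, in even further view of Chatterjee, Castellanos, and Rivera as indicated in Claim 7, in view of the rejections of amended Claim 8.  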

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Vivekanandan et al., An Intelligent Genetic Algorithm for Mining Classification Rules in Large Datasets, Computing and Informatics, Vol. 32, 2013, pp.1-22.
Freitas, Alex A., A Survey of Evolutionary Algorithms for Data Mining and Knowledge Discovery, In Advances in Evolutionary Computing, Theory and Applications, Volume I, Ghosh, A. and Tsutsui, S. (Eds.), Springer-Verlag Berlin Heidelberg 2003, pp.819-845.
Choubey et al., GA_RBF NN: A Classification System for Diabetes, Int. J. Biomedical Engineering and Technology, Vol.23 No.1, Inderscience Enterprises Ltd 2017, pp.71-93.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM WAI YIN KWAN whose telephone number is 303-297-4332. The examiner can normally be reached Monday-Friday 8:00am - 4:30pm PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen can be reached on 571-272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center.

/WILLIAM WAI YIN KWAN/Examiner, Art Unit 2121                                                                                                                                                                                                        



/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121