DETAILED ACTION
This is the response to applicant’s amendment action regarding application number 15/815,899, filed November 17, 2017.

Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

Response to Amendments
The amendment filed June 23, 2021 has been entered. Examiner acknowledges receipt of Amendments to Application 15/815,899, which include: Amendments to the Specification pp.2-3, Amendments to the Drawings p.4 and Appendix (1 page), Amendments to the Claims pp.5-12, Remarks pp.13-19 (containing applicant’s amendments), and Affidavit-Declaration Under 37 C.F.R §1.130(a)-AIA  (FITF), 6 pages. 
Regarding applicant’s Remarks on p.13, examiner has acknowledged applicant’s amendments to the Specification (as shown in Amendments to the Specification pp.2-3), and they have overcome each and every objection previously set forth in the Preinterview First Office Action mailed February 22, 2021. 
Regarding applicant’s Remarks on p.13, examiner has acknowledged applicant’s amendments to the Drawings (as shown in Amendments to the Drawings p.4 and Appendix), and they have overcome each and every objection previously set forth in the Preinterview First Office Action mailed February 22, 2021. 
Regarding applicant’s Remarks on p.13, examiner has acknowledged Claims 1-2, 7-9, and 14-16 have been amended (as shown in Amendments to the Claims pp.5-12). Claims 1-20 remain pending in the application. However, examiner has noted that the amended claims have changed the scope of the claims such that they necessitate re-examination and further re-evaluation of the amended and related original claims, as well as introducing claim language rendered as being indefinite, each of which will be identified and described in the relevant sections below.

Response to Arguments
Examiner acknowledges receipt of Arguments to Application 15/815,899, which include: Remarks pp.13-17 (containing applicant’s arguments). 
Regarding applicant’s Remarks on pp.13-16 for Claims 1-20 under 35 U.S.C. 101, examiner acknowledges applicant’s arguments, has considered them, and has found them to be persuasive. As such, the earlier §101 rejections previously set forth in the Preinterview First Office Action mailed February 22, 2021 for Claims 1-20 are withdrawn.
Regarding applicant’s Remarks on p.16 for Claims 1-20 under 35 U.S.C. 102(a)(1), examiner acknowledges applicant’s arguments and the received Affidavit-Declaration Under 37 C.F.R §1.130(a)-AIA (FITF), has considered them, and has found them to be persuasive. As such, the earlier §102(a)(1) rejections previously set forth in the Preinterview First Office Action mailed February 22, 2021 for Claims 1-20 are withdrawn.
Regarding applicant’s Remarks on pp.16-19 for Claims 1-20 under 35 U.S.C. 103, examiner acknowledges applicant’s arguments, has considered them, and has found them to be not persuasive. Examiner has noted that applicant has amended the claims to the extent such that the scope of the claims has changed, which necessitates further re-examination and re-evaluation of the amended and related original claims, which will be identified and described in the relevant sections below.
Regarding applicant’s Remarks on pp.17-18,
“Castellanos [0023] discloses (emphasis added), "a number of rules can be generated from the annotated training set, for example, using a genetic algorithm." Castellanos [0026] further discloses (emphasis added), "a genetic algorithm is started by creating a population of individuals from random combinations of words taken from the bag of prefixes and the bag of suffixes. ... Each individual is made of genes which in practice correspond to the features used to learn a model. In various embodiments, the individual can be composed of two parts, an n- length prefix and an n-length suffix. These can be randomly generated from the bags of prefixes and suffixes respectively." Thus, Castellanos discloses a genetic technique for discovering rules from a random combination of words in "a set of training documents." (Castellanos, [0021]). In other words, Castellanos is using a genetic algorithm to discover rules on a random set of words in the training documents. This rule discovery occurs prior to training a surrogate model ("[t]he training set is used to develop the model"), and thus the genetic rules in Castellanos describe the training data itself, not the contribution of the training data to the output of the model.
Therefore, it would not have been obvious to modify the classifier rule-mining techniques disclosed by Chatterjee using the genetic algorithm of Castellanos to arrive at the claimed subject matter at least because there is no disclosure or suggestion in the prior art that the genetic algorithm in Castellanos could be used to produce a set of class level rules representing "a logical conditional statement that, when the statement holds true for the string of bits representing the set of instance level conditions of one or more instances of a particular class, predicts that the respective instances are members of the particular class," as now claimed.
Furthermore, it would not have been obvious to modify Chatterjee in the proposed manner because Castellanos teaches away from the claimed techniques by requiring randomness in the design of the individual ("features used to learn a model") that is subject to the genetic algorithm, as opposed to "applying...each string of bits of the instance level conditions for each of the corresponding instances [based on an output of the machine learning model] to a genetic algorithm to produce a set of class level rules," as now claimed, which is not random but rather a determinative result of executing the instances against the model. For at least the foregoing reasons, Castellanos fails to cure the deficiencies of Chatterjee. ”
Examiner has considered the arguments, and has found them to be not persuasive. Chatterjee teaches a second stage explainer algorithm/rule-mining algorithm (Chatterjee Figure 1, element 160) that generates a ranked list of explanatory rules based on the output predictions performed on a training data set, with those explanatory rules being presented to a user. Castellanos teaches a genetic algorithm to perform searching to produce rules, as indicated in Castellanos paragraph [0016]: “… patterns of prefixes and suffixes are used as indicators of the occurrence of an instance of the target entity … Although the approach uses rules, partial matching of patterns is allowed. Thus, a confidence value can be computed from a fitness function, providing a measure of the degree of match. A search space given by different combinations of prefixes and suffixes may be very large, leading to a large combinatorial problem. Genetic algorithms have proven to be useful for this kind of problem, because the inherent randomization introduced in each generation allows the algorithm to explore different regions of the variable space, making it resilient to getting caught in local minima. … a genetic algorithm may be used to learn the pattern part of the rules …”. With regard to Castellanos teaching randomness within a genetic algorithm, examiner notes that neither the applicant’s specification nor the claims explicitly exclude the possibility of using randomness to perform mutations within a population of individuals. In fact, applicant’s specification in paragraph [0040] indicates that “The population may include, for example, 1200 individual with a cross-over probability of 50% and a mutation probability set in such a manner that only two bits of an individual are flipped while undergoing mutation in the genetic algorithm. This provides a reasonable trade-off between exploration and exploitation.”, where any exploration and exploitation requires introducing a degree of randomness in order to perform the mutations (i.e., the flipping of two bits of an individual) required within a genetic algorithm. In addition, Castellanos teaches generation of the rules based on a bag of words of prefixes, which under its broadest representation, can be represented as strings of characters (including ‘0’s and ‘1’s), such that the bags of words of prefixes can also be represented as strings of ‘0’s and ‘1’s. Castellanos further teaches a fitness score metric (Castellanos paragraphs [0043]-[0044]) that is meant to replace the accuracy and validation metrics used in Chatterjee to rank the generated set of explanatory rules produced by the explainer algorithm/rule-mining algorithm. Examiner notes that while applicant originally failed to claim how the input into the genetic algorithm was designed, it would still be obvious to a person having ordinary skill in the art to substitute an explainer algorithm/rule-mining algorithm with a genetic algorithm to perform rule selections, in order to produce the same predictable results of producing rules and ranking the rules according to validation and accuracy metrics, including the fitness score metric based on a harmonic mean of precision and recall. Furthermore, examiner has noted that applicant has amended the claims to the extent such that the scope of the claims has changed, which necessitates further re-examination and re-evaluation of the amended and related original claims, which will be identified and described in the relevant sections below.
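For illustration only, the two technical notions referenced in the response above (a mutation that flips exactly two bits of an individual, and a fitness score computed as the harmonic mean of precision and recall) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of the applicant, Chatterjee, or Castellanos; the function names `mutate_two_bits` and `f1_fitness` and the example bit string are hypothetical.

```python
import random


def mutate_two_bits(individual, rng):
    """Return a copy of `individual` with exactly two distinct bits flipped.

    Hypothetical helper: the positions are chosen at random, which is the
    degree of randomness the response above refers to.
    """
    child = list(individual)
    for i in rng.sample(range(len(child)), 2):
        child[i] ^= 1  # flip 0 -> 1 or 1 -> 0
    return child


def f1_fitness(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


rng = random.Random(0)  # seeded only so the sketch is repeatable
parent = [0, 1, 1, 0, 1, 0, 0, 1]  # assumed example individual
child = mutate_two_bits(parent, rng)

# Number of positions at which parent and child differ.
hamming = sum(a != b for a, b in zip(parent, child))
```

Even with a fixed number of flipped bits, the choice of which bits to flip is random, so the child differs from the parent in exactly two positions that vary from run to run unless a seed is fixed.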
Regarding applicant’s Remarks on p.18,
“Cheng is cited for allegedly disclosing entropy-based binning, but nevertheless fails to cure the deficiencies of Chatterjee and Castellanos noted above. For at least the foregoing reasons, Chatterjee, Castellanos, and Cheng, when considered alone or in any combination, fail to render any of the claims unpatentable.”
Examiner has considered the above arguments, and has found them to be not persuasive. Chatterjee teaches pre-processing of training data, including a binning process that converts numerical features into categorical features. Cheng teaches pre-processing of training data, including an entropy-binning process that converts numerical features into categorical features. It would have been obvious to a person having ordinary skill in the art to substitute the binning process of Chatterjee with the entropy-binning process of Cheng in order to produce the same predictable results, and as a way to minimize information loss, thereby improving the performance of the system by making the predictions and rule selections more accurate. Furthermore, examiner has noted that applicant has amended the claims to the extent such that the scope of the claims has changed, which necessitates further re-examination and re-evaluation of the amended and related original claims, which will be identified and described in the relevant sections below.
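For illustration only, the general idea of entropy-based binning (converting a numeric feature into a categorical one by choosing the cut point that leaves the class labels in each bin as pure as possible) can be sketched as follows. This is a generic single-split sketch and is not asserted to be the specific procedure of Cheng or Chatterjee; the function names, the bin labels "low" and "high", and the sample data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_entropy_split(values, labels):
    """Return the threshold that minimizes the weighted entropy of two bins."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_h = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut point between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        h = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if h < best_h:
            best_threshold, best_h = threshold, h
    return best_threshold


def to_bin(value, threshold):
    """Map a numeric value to a categorical bin label."""
    return "low" if value <= threshold else "high"


# Assumed sample data: small values belong to class "a", large to class "b".
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]
threshold = best_entropy_split(values, labels)
binned = [to_bin(v, threshold) for v in values]
```

In this sample the cut point falls midway between 3.0 and 10.0, since that split leaves each bin with a single class and therefore zero entropy.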

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Regarding amended Claim 1,
Claim 1 recites the limitation “the bit representing a range of values for each feature of the respective instance …” in line 14, which renders the claim as indefinite, since a bit is interpreted to either have a ‘0’ or ‘1’ value, and hence does not represent a range of values. This range 0..1 is not sufficient to represent all possible ranges of numeric values “for each feature of the respective instance”, and hence it is unclear how the applicant intends to represent numeric values with just a singular bit containing either ‘0’ or ‘1’. The specification does not provide any further clarification of how a bit can represent a range of values as indicated in this claim limitation, and as such, the specification fails to provide a definite way to quantify or measure the metes and bounds of this term, thus making it difficult for one of ordinary skill in the art to be reasonably apprised of the scope of the invention. Hence this lack of clarity in the term “the bit representing a range of values for each feature of the respective instance …” renders this claim as being indefinite. For the purposes of examination, this limitation will be interpreted as “[one or more bits within the string of bits] representing a range of values for each feature of the respective instance …”.
Claim 1 further recites the limitation “applying each string of bits of the instance level conditions …” in lines 16-17, which renders the claim as indefinite. Earlier in the claim, it was established that “for each respective instance, a set of instance level conditions represented by a string of bits”, indicating a string of bits is associated with a set of instance level conditions (a one-to-one relationship), as opposed to a many-to-one relationship indicated by the limitation “applying each string of bits of the instance level conditions …”. The specification does not provide any further clarification of how there can be more than one string of bits associated with the instance level conditions, and as such, the specification fails to provide a definite way to quantify or measure the metes and bounds of this term, thus making it difficult for one of ordinary skill in the art to be reasonably apprised of the scope of the invention. Hence this lack of clarity for the term “applying each string of bits of the instance level conditions …” renders this claim as being indefinite. For the purposes of examination, this limitation will be interpreted as “applying [a] string of bits of the instance level conditions …”.
Claim 1 further recites the limitation “when the statement holds true for the string of bits representing the set of instance level conditions of one or more instances of a particular class” in lines 19-22, which renders the claim as indefinite. Earlier in the claim, it was established that “for each respective instance, a set of instance level conditions represented by a string of bits”, indicating that each instance is associated with a set of instance level conditions (a one-to-one relationship), as opposed to a many-to-one mapping indicated by the limitation “the set of instance level conditions of one or more instances of a particular class”. The specification does not provide any further clarification of how there can be more than one instance being represented by a set of instance level conditions, and as such, the specification fails to provide a definite way to quantify or measure the metes and bounds of this term, thus making it difficult for one of ordinary skill in the art to be reasonably apprised of the scope of the invention. Hence this lack of clarity for the term “when the statement holds true for the string of bits representing the set of instance level conditions of one or more instances of a particular class” renders this claim as being indefinite. For purposes of examination, this limitation will be interpreted as “when the statement holds true for the string of bits representing the set of instance level conditions of [one instance] of a particular class”.
Regarding Claims 2-7,
Claims 2-7 are dependent claims tracing back to independent parent Claim 1, and as such inherit the same indefiniteness established in Claim 1. Hence Claims 2-7 are also rejected as being indefinite by virtue of dependency.
Regarding amended Claim 8,
Claim 8 recites the limitation “the bit representing a range of values for each feature of the respective instance …” in lines 14-15, which renders the claim as indefinite, since a bit is interpreted to either have a ‘0’ or ‘1’ value, and hence does not represent a range of values. This range 0..1 is not sufficient to represent all possible ranges of numeric values “for each feature of the respective instance”, and hence it is unclear how the applicant intends to represent numeric values with just a singular bit containing either ‘0’ or ‘1’. The specification does not provide any further clarification of how a bit can represent a range of values as indicated in this claim limitation, and as such, the specification fails to provide a definite way to quantify or measure the metes and bounds of this term, thus making it difficult for one of ordinary skill in the art to be reasonably apprised of the scope of the invention. Hence this lack of clarity in the term “the bit representing a range of values for each feature of the respective instance …” renders this claim as being indefinite. For the purposes of examination, this limitation will be interpreted as “[one or more bits within the string of bits] representing a range of values for each feature of the respective instance …”.
Claim 8 further recites the limitation “applying each string of bits of the instance level conditions …” in lines 17-18, which renders the claim as indefinite. Earlier in the claim, it was established that “for each respective instance, a set of instance level conditions represented by a string of bits”, indicating a string of bits is associated with a set of instance level conditions (a one-to-one relationship), as opposed to a many-to-one relationship indicated by the limitation “each string of bits of the instance level conditions …”. The specification does not provide any further clarification of how there can be more than one string of bits associated with the instance level conditions, and as such, the specification fails to provide a definite way to quantify or measure the metes and bounds of this term, thus making it difficult for one of ordinary skill in the art to be reasonably apprised of the scope of the invention. Hence this lack of clarity for the term “applying each string of bits of the instance level conditions …” renders this claim as being indefinite. For the purposes of examination, this limitation will be interpreted as “applying [a] string of bits of the instance level conditions …”.
Claim 8 further recites the limitation “when the statement holds true for the string of bits representing the set of instance level conditions of one or more instances of a particular class” in lines 20-23, which renders the claim as indefinite. Earlier in the claim, it was established that “for each respective instance, a set of instance level conditions represented by a string of bits”, indicating that each instance is associated with a set of instance level conditions (a one-to-one relationship), as opposed to a many-to-one mapping indicated by the limitation “the set of instance level conditions of one or more instances of a particular class”. The specification does not provide any further clarification of how there can be more than one instance being represented by a set of instance level conditions, and as such, the specification fails to provide a definite way to quantify or measure the metes and bounds of this term, thus making it difficult for one of ordinary skill in the art to be reasonably apprised of the scope of the invention. Hence this lack of clarity for the term “when the statement holds true for the string of bits representing the set of instance level conditions of one or more instances of a particular class” renders this claim as being indefinite. For purposes of examination, this limitation will be interpreted as “when the statement holds true for the string of bits representing the set of instance level conditions of [one instance] of a particular class”.
Regarding Claims 9-14,
Claims 9-14 are dependent claims tracing back to independent parent Claim 8, and as such inherit the same indefiniteness established in Claim 8. Hence Claims 9-14 are also rejected as being indefinite by virtue of dependency.
Regarding amended Claim 15,
Claim 15 recites the limitation “the bit representing a range of values for each feature of the respective instance having a greatest contribution to the output class” in lines 20-22, which renders the claim as indefinite, since a bit is interpreted to either have a ‘0’ or ‘1’ value, and hence does not represent a range of values. This range 0..1 is not sufficient to represent all possible ranges of numeric values “for each feature of the respective instance”, and hence it is unclear how the applicant intends to represent numeric values with just a singular bit containing either ‘0’ or ‘1’. The specification does not provide any further clarification of how a bit can represent a range of values as indicated in this claim limitation, and as such, the specification fails to provide a definite way to quantify or measure the metes and bounds of this term, thus making it difficult for one of ordinary skill in the art to be reasonably apprised of the scope of the invention. Hence this lack of clarity in the term “the bit representing a range of values for each feature of the respective instance having a greatest contribution to the output class” renders this claim as being indefinite. For the purposes of examination, this limitation will be interpreted as “[one or more bits within the string of bits] representing a range of values for each feature of the respective instance having a greatest contribution to the output class”.
Claim 15 further recites the limitation “apply each string of bits of the instance level conditions …” in lines 23-24, which renders the claim as indefinite. Earlier in the claim, it was established that “for each respective instance, a set of instance level conditions represented by a string of bits”, indicating a string of bits is associated with a set of instance level conditions (a one-to-one relationship), as opposed to a many-to-one relationship indicated by the limitation “each string of bits of the instance level conditions”. The specification does not provide any further clarification of how there can be more than one string of bits associated with the instance level conditions, and as such, the specification fails to provide a definite way to quantify or measure the metes and bounds of this term, thus making it difficult for one of ordinary skill in the art to be reasonably apprised of the scope of the invention. Hence this lack of clarity for the term “apply each string of bits of the instance level conditions …” renders this claim as being indefinite. For the purposes of examination, this limitation will be interpreted as “apply [a] string of bits of the instance level conditions …”.
Claim 15 further recites the limitation “when the statement holds true for the string of bits representing the set of instance level conditions of one or more instances of a particular class” in lines 27-30, which renders the claim as indefinite. Earlier in the claim, it was established that “for each respective instance, a set of instance level conditions represented by a string of bits”, indicating that each instance is associated with a set of instance level conditions (a one-to-one relationship), as opposed to a many-to-one mapping indicated by the limitation “the set of instance level conditions of one or more instances of a particular class”. The specification does not provide any further clarification of how there can be more than one instance being represented by a set of instance level conditions, and as such, the specification fails to provide a definite way to quantify or measure the metes and bounds of this term, thus making it difficult for one of ordinary skill in the art to be reasonably apprised of the scope of the invention. Hence this lack of clarity for the term “when the statement holds true for the string of bits representing the set of instance level conditions of one or more instances of a particular class” renders this claim as being indefinite. For purposes of examination, this limitation will be interpreted as “when the statement holds true for the string of bits representing the set of instance level conditions of [one instance] of a particular class”.
Regarding Claims 16-20,
Claims 16-20 are dependent claims tracing back to independent parent Claim 15, and as such inherit the same indefiniteness established in Claim 15. Hence Claims 16-20 are also rejected as being indefinite by virtue of dependency.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
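This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.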
Claims 1, 5, 8, 12, 15, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Chatterjee et al., U.S. Patent 10,824,959, filed 2/16/2016, in view of Fidelis et al., Discovering Comprehensible Classification Rules with a Genetic Algorithm, Proceedings of the 2000 Congress on Evolutionary Computation CEC00 (Cat.No.00TH8512), IEEE, July 16-19 2000, pp.805-810 [henceforth referred as Fidelis].  
Regarding Claim 1, Chatterjee teaches
(Currently amended) A computer-implemented method of interpreting a machine learning model, the method comprising: 
receiving, by a processor-based system (Chatterjee col.19 lines 11-12), data representing 
a machine learning model ([Chatterjee Figure 1, elements 130, 132: examiner’s note: Selecting a classification algorithm (corresponding to “data representing a machine learning model”) that produces a classifier (which is a machine learning model).] [Chatterjee col.3 lines 45-50: “A classification algorithm or methodology may be selected at the service in some embodiments, e.g., based on the client's preferences as indicated in the model generation request …” and Chatterjee col.6 lines 56-63: “ … the algorithm may be selected based on other factors such as some indication of the problem domain for which observation records are to be classified, the schema of the observation records, knowledge base entries indicating successful classification algorithms identified in earlier classification efforts, and the like. A trained classifier 132 may be obtained using the training data 110 and the selected algorithm.”]), 
a set of training data (Chatterjee Figure 1, element 110 and Chatterjee col.6 lines 56-63: “ … the algorithm may be selected based on other factors such as some indication of the problem domain for which observation records are to be classified, the schema of the observation records, knowledge base entries indicating successful classification algorithms identified in earlier classification efforts, and the like. A trained classifier 132 may be obtained using the training data 110 and the selected algorithm.”), and 
a set of output classes for classifying a plurality of instances of the training data, each instance representing at least one feature of the training data ([Chatterjee Figure 4, elements 401, 405, 407: examiner’s note: A set of training data records (with the records corresponding to “a plurality of instances of the training data”), with each record/instance containing at least one input variable (corresponding to “at least one feature”), and associated values for a target output variable (corresponding to “a set of output classes”).] [Chatterjee col. 3 lines 34-45: “ … a client of a machine learning service may submit a request to generate a classification model for a specified data set or data source (a source from which various observation records may be collected by the service and used for training the requested model). Each observation record may contain one or more input variables and at least one output or "target" variable (the variable for which the model is make predictions). The target variable may typically comprise a "class" variable, which can take on one of a discrete set of values representative of respective sub-groups or classes of the observation records.”]); 
applying, by the processor-based system, to obtain, based on an output of the machine learning model, a contribution of each feature of the respective instance to at least one of the output classes, thereby producing, for each respective instance, a set of instance level conditions ([Chatterjee Figure 1, element 110, 132, 145: examiner’s note: Applying training data (corresponding to “each instance”) to a classifier to obtain classifier predictions based on the feature/attribute present in each instance of training data (where the predictions correspond to Chatterjee col.6, lines 56-63: “ … the algorithm may be selected based on other factors such as some indication of the problem domain for which observation records are to be classified, the schema of the observation records, knowledge base entries indicating successful classification algorithms identified in earlier classification efforts, and the like. A trained classifier 132 may be obtained using the training data 110 and the selected algorithm.” and Chatterjee col. 3 lines 34-45: “ … a client of a machine learning service may submit a request to generate a classification model for a specified data set or data source (a source from which various observation records may be collected by the service and used for training the requested model). Each observation record may contain one or more input variables and at least one output or "target" variable (the variable for which the model is make predictions). The target variable may typically comprise a "class" variable, which can take on one of a discrete set of values representative of respective sub-groups or classes of the observation records.”).] 
[Chatterjee col.4 lines 26-35: examiner note: A rule associated with a prediction generated from an instance within a training data set, where each generated rule/prediction is a logical condition (thus corresponding to “each instance … to the machine learning model … thereby producing, for each respective instance, a set of instance level conditions”), expressed as a logical if..then statement for predicting a certain target class (“Some number of explanatory rules or assertions may be generated for the predictions already made with respect to the training data set by the classifier. A given explanatory rule or assertion may be considered as a combination of one or more attribute predicates (e.g., ranges or exact matches of input attribute values) and implied target class values. For example, one rule may be expressed as the logical equivalent of "if (input attribute A1 is in range R1) and (input attribute A2 has the value V1), then the target class is predicted to be TC1. In this rule, the constraints on A1 and A2 respectively represent two attribute predicates, and the rule indicates an explanatory relationship between the predicates and the prediction regarding the target class.”).] [Chatterjee col.7, lines 3-12: examiner’s note: Transforming training data by feature processors (corresponding to “at least a perturbation of the respective instance”) (“The training data may be transformed in any of several ways by feature processors 140-- e.g., as discussed below, categorical attribute values may be converted to a set of binary values, numeric attribute values may be binned, and so on. The transformed versions of the training data records may be combined with the classifier predictions 135 (which may have been produced for the untransformed or original training data records) to form intermediate data records 145 used as input to generate an explainer for the classifier's predictions.”).]) …
applying, by the processor-based system, … the instance level conditions for each of the corresponding instances … to produce a set of class level rules, each class level rule representing a logical conditional statement that, when the statement holds true …, predicts that the respective instances are members of the particular class ([Chatterjee col.7 lines 13-46: examiner note: Applying training set instances (representing instance level conditions) to an explainer algorithm/rule-mining algorithm to generate attribute-predicate based rules (corresponding to “applying, by the processor-based system, … instance level conditions for each of the corresponding instances … to produce a set of class level rules”) (“A number of different explainer algorithms (which may also be referred to as rule mining algorithms) may be available in library 125 in the depicted embodiment, from which a particular algorithm may be chosen by explainer selector 160 to generate attribute-predicate based rules in the depicted embodiment. Each rule of the explainer 162 may indicate some set of conditions or predicates based on attribute values in the transformed or untransformed versions of the training set observation records, and an indication of the target variable value predicted by the classifier if the set of conditions is met. … For example, the rules may initially be generated using the training data, and then evaluated relative to one another based on the differences between (a) the actual predictions made on the test data set by the trained classifier and (b) the predictions indicated in the rules.”).] [Examiner note: Under its broadest reasonable interpretation, this limitation of a method claim recites a contingent clause, so the claim language that follows the clause need not be performed: the condition precedent (“when the statement holds true”) is not required to be met, and the claimed invention can be practiced without the condition occurring. See MPEP 2111.04(II). Applicant is advised to amend the claim to positively recite the condition as being fulfilled, since no patentable weight is given to claim language following a contingent clause that does not require the condition to be fulfilled in order to practice the claimed invention. However, for the purposes of examination, this contingent clause will be treated as if the condition were fulfilled.] [Chatterjee col.4 lines 26-35: examiner note: A rule associated with a prediction generated from an instance within a training data set, where each generated rule/prediction is a logical condition, expressed as a logical if..then statement (corresponding to “a logical conditional statement”) for predicting a certain target class (corresponding to “…predicts that the respective instances are members of the particular class”) (“Some number of explanatory rules or assertions may be generated for the predictions already made with respect to the training data set by the classifier. A given explanatory rule or assertion may be considered as a combination of one or more attribute predicates (e.g., ranges or exact matches of input attribute values) and implied target class values. For example, one rule may be expressed as the logical equivalent of "if (input attribute A1 is in range R1) and (input attribute A2 has the value V1), then the target class is predicted to be TC1. In this rule, the constraints on A1 and A2 respectively represent two attribute predicates, and the rule indicates an explanatory relationship between the predicates and the prediction regarding the target class.”).]); and 
causing, by the processor-based system, a display of at least a portion of the set of class level rules to a user (Chatterjee col.11 lines 30-33: “In the embodiment shown in FIG.2, clients 264 may be able to view at least a subset of the artifacts stored in repository 220, e.g., by issuing read requests 218.” and Chatterjee col.9 lines 52-65: “The artifacts repository 220 may be used to store interim and/or final results of classifiers and explainers, values of the parameters selected, and so on. … A set of one or more programmatic interfaces 261 may be implemented at the machine learning service for interactions with clients 264 in the depicted embodiment. The interfaces may include, for example, one or more web-based consoles or web pages, … graphical user interfaces (GUIs) or the like. Using interfaces 261, clients 264 may, for example, … request to explain a prediction of a classification model.”).  
However, Chatterjee does not teach
… thereby producing, for each respective instance, a set of instance level conditions represented by a string of bits, 
wherein each instance level condition is represented by a bit in the string of bits, the bit representing a range of values for each feature of the respective instance having a greatest contribution to the output class; …
… applying, … each string of bits of the instance level conditions for each of the corresponding instances to a genetic algorithm to produce a set of class level rules. … each class level rule representing a logical conditional statement … for the string of bits representing the set of instance level conditions of one or more instances of a particular class, predicts that the respective instances are members of the particular class; … 
Fidelis teaches
… thereby producing, for each respective instance, a set of instance level conditions represented by a string of bits (Fidelis p.806 col.1 Section 3.1 Individual Encoding, 1st paragraph – p.806 col.2 3rd paragraph: examiner’s note: Encoding instance level conditions into a chromosome structure for use in a genetic algorithm, where the chromosome structure consists of a set of genes, where each gene represents an instance level condition (“A chromosome is divided into n genes, where each gene corresponds to a condition involving one attribute, and n is the number of predicting attributes in the data being mined. The genes are positional, i.e. the first gene represents the first attribute, the second gene represents the second attribute, and so on. Each i-th gene, i= 1 ... n, is subdivided into three fields: weight ($W_i$), operator ($O_i$) and value ($V_i$), as shown in figure 1. Each gene corresponds to one condition in the IF part of a rule, and the entire chromosome (individual) corresponds to the entire IF part of the rule. … The field weight ($W_i$) is a real-valued variable taking values in the range [0 .. 1]. … The field operator ($O_i$) is a variable that indicates the relational operator employed in the i-th condition. If attribute $A_i$ is categorical (nominal) this field can contain the operators "=" and "≠". If attribute $A_i$ is continuous, this field can contain the operators "≥" and "≤". The field value ($V_i$) contains one of the values belonging to the domain of attribute $A_i$. The value $V_i$ is coded into a binary string, which is properly decoded for purposes of fitness evaluation. The number of bits used to code $V_i$ is proportional to the number of values in the domain of attribute $A_i$.”).), 
wherein each instance level condition is represented by a bit in the string of bits, the bit representing a range of values for each feature of the respective instance having a greatest contribution to the output class ([Examiner note: This claim limitation is indefinite as analyzed in the above §112(b) rejection, and hence for the purposes of examination, this limitation will be interpreted as “… [one or more bits within the string of bits] representing a range of values for each feature of the respective instance …”.] [Fidelis p.806 col.1 Section 3.1 Individual Encoding, 1st paragraph – p.806 col.2 3rd paragraph: examiner’s note: Each gene within a chromosome represents an instance level condition (corresponding to “each instance level condition”) and contains a value field that represents values within a domain of attribute $A_i$ (corresponding to “a range of values for each feature of the respective instance having a greatest contribution to the output class”), where each value is coded into a binary string (corresponding to “[one or more bits within the string of bits] representing a range of values for each feature … having a greatest contribution to the output class”) (“A chromosome is divided into n genes, where each gene corresponds to a condition involving one attribute, and n is the number of predicting attributes in the data being mined. The genes are positional, i.e. the first gene represents the first attribute, the second gene represents the second attribute, and so on. Each i-th gene, i= 1 ... n, is subdivided into three fields: weight ($W_i$), operator ($O_i$) and value ($V_i$), as shown in figure 1. … The field value ($V_i$) contains one of the values belonging to the domain of attribute $A_i$. The value $V_i$ is coded into a binary string, which is properly decoded for purposes of fitness evaluation. The number of bits used to code $V_i$ is proportional to the number of values in the domain of attribute $A_i$.”).]); …
… applying, … each string of bits of the instance level conditions for each of the corresponding instances to a genetic algorithm to produce a set of class level rules. … each class level rule representing a logical conditional statement … for the string of bits representing the set of instance level conditions of one or more instances of a particular class, predicts that the respective instances are members of the particular class ([Examiner note: This claim limitation is indefinite as analyzed in the above §112(b) rejection, and hence for the purposes of examination, this limitation will be interpreted as “applying [a] string of bits of the instance level conditions …” and “ … for the string of bits representing the set of instance level conditions of [one instance] of a particular class”.] [Examiner’s note: Applying the encoded chromosome (corresponding to “each string of bits of the instance level conditions for each of the corresponding instances”) to a genetic algorithm that searches for rules predicting a given class (corresponding to “to a genetic algorithm to produce a set of class level rules”) (Fidelis p.806 col.1 Section 3 The Genetic Algorithm, 1st paragraph: “The GA used in this work was developed based in the GALOPPS 3.2 system [10]. This is a public-domain tool that incorporates several features proposed by GA researchers and is very portable. The next subsections describe several aspects of the proposed algorithm, namely individual encoding, genetic operators, and fitness function.” and Fidelis p.807 col.1, Section 3.3 Fitness Function, 5th paragraph: “Each run of our GA solves a two-class classification problem, where the goal is to predict whether or not the patient has a given disease. Therefore, the GA is run at least once for each class (value of the goal attribute). … In the first run the GA would search for rules predicting class 1; in the second run it would search for rules predicting class 2, and so on. When the GA is searching for rules predicting a given class, all other classes are effectively merged into a large class, which can be conceptually thought of as meaning that the patient does not have the disease predicted by the rule.”).]); …
Both Chatterjee and Fidelis are analogous art since both teach predicting classification rules using machine learning algorithms.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the explainer algorithm/rule-mining algorithm taught in Chatterjee and replace it with the genetic algorithm taught in Fidelis as an alternative way to produce a set of class level rules. The motivation to combine is taught in Fidelis, as genetic algorithms provide a search mechanism that is very effective in applications that have a large search space and, as a global search method, are less likely to become trapped in local maxima (which may not represent the global optimal solution), thereby making the system more robust and producing a result that is close to an optimal solution (Fidelis p.805 col.2 Section 2 An Overview of Classification and Genetic Algorithms, 3rd paragraph – p.806 col.1, 1st paragraph: “Genetic Algorithms (GAs) are a search method that has been widely used in applications where the size of the search space is very large. In essence, GAs are "search algorithms based on the mechanics of natural selection and natural genetics" [9]. GAs are inspired on the principle of survival of the fittest, where the fittest individuals are selected to produce offspring for the next generation. In the context of search, individuals are candidate solutions to a given search problem. Hence, reproduction of the fittest individuals means reproduction of the best current candidate solutions. Genetic operators such as selection, crossover and mutation generate offspring from the fittest individuals. One of the advantages of GAs over "traditional" search methods is that the former performs a kind of global search using a population of individuals, rather than performing a local, hill-climbing search. Global search methods are less likely to get trapped into local maxima, in comparison with local search methods.”).
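For illustration only, the gene structure described in the Fidelis passage cited above (Section 3.1, Individual Encoding) can be sketched in Python as follows. This sketch is not taken from Fidelis or from the application. The names `Gene`, `bits_needed`, `encode_value`, and `decode_rule` are hypothetical, the activation `threshold` on the weight field is an assumption (the quoted passage elides how the weight is used), and the bit width chosen for the value field is likewise an assumption.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Gene:
    weight: float    # W_i, a real value in [0, 1]
    operator: str    # O_i, e.g. "=" or "!=" (categorical), ">=" or "<=" (continuous)
    value_bits: str  # V_i, coded as a binary string


def bits_needed(domain_size: int) -> int:
    # Assumption: use just enough bits to index every value in the attribute's domain.
    return max(1, (domain_size - 1).bit_length())


def encode_value(index: int, domain_size: int) -> str:
    # Code the position of a value within its domain as a fixed-width bit string.
    return format(index, "0{}b".format(bits_needed(domain_size)))


def decode_rule(chromosome: List[Gene],
                attribute_names: Sequence[str],
                domains: Sequence[Sequence[object]],
                threshold: float = 0.3) -> str:
    # Genes are positional: the i-th gene constrains the i-th attribute.
    conditions = []
    for gene, name, domain in zip(chromosome, attribute_names, domains):
        # Assumption: a gene contributes a condition only if its weight exceeds the threshold.
        if gene.weight <= threshold:
            continue
        value = domain[int(gene.value_bits, 2)]  # decode V_i back to a domain value
        conditions.append(f"{name} {gene.operator} {value}")
    return "IF " + " AND ".join(conditions)


# Hypothetical attributes and domains, chosen only to exercise the sketch.
names = ["age", "sex", "itching"]
domains = [[10, 20, 30, 40], ["F", "M"], ["no", "yes"]]
chromosome = [
    Gene(0.8, ">=", encode_value(2, 4)),
    Gene(0.1, "=", encode_value(0, 2)),
    Gene(0.9, "=", encode_value(1, 2)),
]
print(decode_rule(chromosome, names, domains))  # IF age >= 30 AND itching = yes
```

In this sketch the whole chromosome decodes to the IF part of a single rule, and the concatenated `value_bits` fields play the role of the bit string discussed in the mapping above.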
Regarding Claim 5, Chatterjee in view of Fidelis teaches
(Original) The method of claim 1, further comprising 
selecting, by the processor-based system, a subset of the set of class level rules that predict that at least a threshold percentage of the respective instances are members of a particular class (Chatterjee col.16, lines 36-57: examiner’s note: Selecting a subset of the set of class level rules based on attribute comparison between a record (corresponding to “an instance level condition”) and the ranked class level rules, resulting in an exact match, or an overlap with the attributes within a range predicted by a class (corresponding to “selecting, by the processor-based system, a subset of the set of class level rules that predict that at least a threshold percentage of the respective instances are members of a particular class”) (“In response to a client query to explain prediction 808 for the post-training observation record 806, the explainer may examine the rule set 804, e.g., in order of decreasing rank, to determine whether any of the rules in its rule set is applicable or not. The operations to determine whether a given rule is applicable to a given query regarding a prediction may be referred to as matching operations herein. … If the attribute values 807 match (or overlap) with the attribute predicates 802 of at least one rule, and the prediction 808 matches the implication in that rule, such a rule may be provided to the client as an explanation for the prediction. If several rules are found, one may be selected based on its rank and/or on its generality or ease of understandability. It is noted that to identify an applicable explanatory rule in the matching operations, an exact match may not be required-for example, a rule may be expressed using a range predicate on an attribute, and the rule may be considered applicable as long as the attribute's value falls within the range, and as long as the prediction made by the classifier is the same as the prediction indicated in the rule's implication component.”).).  
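For illustration only, the matching operations described in the Chatterjee passage quoted above (col.16, lines 36-57) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not Chatterjee's implementation: the names `Rule`, `predicate_holds`, and `explain` are hypothetical, and treating a lower `rank` number as a higher-ranked rule is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple, Union

# A predicate is either an inclusive (low, high) range or an exact value to match.
Predicate = Union[Tuple[float, float], str]


@dataclass
class Rule:
    predicates: Dict[str, Predicate]  # attribute name -> predicate
    predicted_class: str              # the rule's implication
    rank: int                         # assumption: 1 is the highest rank


def predicate_holds(predicate: Predicate, value: object) -> bool:
    # A range predicate applies when the value falls within the range;
    # any other predicate requires an exact match.
    if isinstance(predicate, tuple):
        low, high = predicate
        return low <= value <= high
    return predicate == value


def explain(record: Dict[str, object],
            prediction: str,
            rules: List[Rule]) -> Optional[Rule]:
    # Examine the rule set in order of decreasing rank and return the first rule
    # whose predicates match the record and whose implication matches the prediction.
    for rule in sorted(rules, key=lambda r: r.rank):
        if rule.predicted_class != prediction:
            continue
        if all(attr in record and predicate_holds(pred, record[attr])
               for attr, pred in rule.predicates.items()):
            return rule
    return None


# Hypothetical rule set mirroring the "A1 in range R1 and A2 has value V1" example.
rules = [
    Rule({"A1": (0.0, 10.0), "A2": "V1"}, "TC1", rank=1),
    Rule({"A1": (0.0, 100.0)}, "TC1", rank=2),
]
match = explain({"A1": 5.0, "A2": "V1"}, "TC1", rules)
print(match.rank if match else None)  # 1
```

The sketch returns a single applicable rule; selecting among several applicable rules by generality or ease of understanding, which the quoted passage also mentions, is left out.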
Regarding Claim 8, Chatterjee teaches
(Currently amended) A computer program product including 
one or more non-transitory computer readable mediums having instructions encoded thereon that when executed by one or more computer processors cause the one or more computer processors to perform a process for interpreting a machine learning model (Chatterjee Figure 10, elements 9020, 9010a, 9010b, .., 9010n; Chatterjee col. 20, lines 23-39; Chatterjee col. 19, lines 17-32), the process including 
receiving data representing 
a machine learning model (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.), 
a set of training data (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.), and 
a set of output classes for classifying a plurality of instances of the training data, each instance representing at least one feature of the training data (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.); 
to obtain, based on an output of the machine learning model, a contribution of each feature of the respective instance to at least one of the output classes, thereby producing, for each respective instance, a set of instance level conditions (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.) …
to produce a set of class level rules, each class level rule representing a logical conditional statement that, when the statement holds true …, predicts that the respective instances are members of the particular class (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.); and 
causing a display of at least a portion of the set of class level rules to a user (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.).  
However, Chatterjee does not teach
… thereby producing, for each respective instance, a set of instance level conditions represented by a string of bits, 
wherein each instance level condition is represented by a bit in the string of bits, the bit representing a range of values for each feature of the respective instance having a greatest contribution to the output class; …
… applying each string of bits of the instance level conditions for each of the corresponding instances to a genetic algorithm to produce a set of class level rules. … each class level rule representing a logical conditional statement … for the string of bits representing the set of instance level conditions of one or more instances of a particular class, predicts that the respective instances are members of the particular class; … 
Fidelis teaches
… thereby producing, for each respective instance, a set of instance level conditions represented by a string of bits (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.), 
wherein each instance level condition is represented by a bit in the string of bits, the bit representing a range of values for each feature of the respective instance having a greatest contribution to the output class (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.); …
… applying each string of bits of the instance level conditions for each of the corresponding instances to a genetic algorithm to produce a set of class level rules. … each class level rule representing a logical conditional statement … for the string of bits representing the set of instance level conditions of one or more instances of a particular class, predicts that the respective instances are members of the particular class (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.); …
Both Chatterjee and Fidelis are analogous art since both teach predicting classification rules using machine learning algorithms.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the explainer algorithm/rule-mining algorithm taught in Chatterjee and replace it with the genetic algorithm taught in Fidelis as an alternative way to produce a set of class level rules. The motivation to combine is taught in Fidelis, as genetic algorithms provide a search mechanism that is very effective in applications that have a large search space and, as a global search method, are less likely to become trapped in local maxima (which may not represent the global optimal solution), thereby making the system more robust and producing a result that is close to an optimal solution (Fidelis p.805 col.2 Section 2 An Overview of Classification and Genetic Algorithms, 3rd paragraph – p.806 col.1, 1st paragraph: “Genetic Algorithms (GAs) are a search method that has been widely used in applications where the size of the search space is very large. In essence, GAs are "search algorithms based on the mechanics of natural selection and natural genetics" [9]. GAs are inspired on the principle of survival of the fittest, where the fittest individuals are selected to produce offspring for the next generation. In the context of search, individuals are candidate solutions to a given search problem. Hence, reproduction of the fittest individuals means reproduction of the best current candidate solutions. Genetic operators such as selection, crossover and mutation generate offspring from the fittest individuals. One of the advantages of GAs over "traditional" search methods is that the former performs a kind of global search using a population of individuals, rather than performing a local, hill-climbing search. Global search methods are less likely to get trapped into local maxima, in comparison with local search methods.”).
Regarding Claim 12, Chatterjee in view of Fidelis teaches
(Original) The computer program product of claim 8, wherein the process includes 
selecting a subset of the set of class level rules that predict that at least a threshold percentage of the respective instances are members of a particular class (This claim limitation is similar in scope as a corresponding claim limitation from Claim 5, and hence is rejected under similar rationale.).  
Regarding Claim 15, Chatterjee teaches
(Currently amended) A system for interpreting a machine learning model, the system comprising: 
one or more storages (Chatterjee Figure 10, element 9020; Chatterjee col. 19, lines 31-56); and 
one or more processors operatively coupled to the one or more storages (Chatterjee Figure 10, elements 9020, 9030, 9010a, 9010b, .., 9010n, Chatterjee col.19, lines 17-30; Chatterjee col.19, lines 57-61), 
the one or more processors configured to execute instructions stored in the one or more storages that when executed cause the one or more processors to carry out a process (Chatterjee Figure 10, elements 9020, 9030, 9010a, 9010b, .., 9010n; Chatterjee col.19, lines 20-22; Chatterjee col. 20, lines 23-39) including 
receive data representing 
a machine learning model (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.), 
a set of training data (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.), and 
a set of output classes for classifying a plurality of instances of the training data, each instance representing at least one feature of the training data (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.); 
apply each instance and at least one perturbation of the respective instance to the machine learning model to obtain, based on an output of the machine learning model, a contribution of each feature of the respective instance to at least one of the output classes, thereby producing, for each respective instance, a set of instance level conditions (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.) …
apply … the instance level conditions for each of the corresponding instances … to produce a set of class level rules, each class level rule representing a logical conditional statement that, when the statement holds true …, predicts that the respective instances are members of the particular class (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.); and 
cause a display of at least a portion of the set of class level rules to a user (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.).  
However, Chatterjee does not teach
… thereby producing, for each respective instance, a set of instance level conditions represented by a string of bits, 
wherein each instance level condition is represented by a bit in the string of bits, the bit representing a range of values for each feature of the respective instance having a greatest contribution to the output class; …
… apply each string of bits of the instance level conditions for each of the corresponding instances to a genetic algorithm to produce a set of class level rules. … each class level rule representing a logical conditional statement … for the string of bits representing the set of instance level conditions of one or more instances of a particular class, predicts that the respective instances are members of the particular class; … 
Fidelis teaches
… thereby producing, for each respective instance, a set of instance level conditions represented by a string of bits (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.), 
wherein each instance level condition is represented by a bit in the string of bits, the bit representing a range of values for each feature of the respective instance having a greatest contribution to the output class (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.); …
… apply each string of bits of the instance level conditions for each of the corresponding instances to a genetic algorithm to produce a set of class level rules. … each class level rule representing a logical conditional statement … for the string of bits representing the set of instance level conditions of one or more instances of a particular class, predicts that the respective instances are members of the particular class (This claim limitation is similar in scope as a corresponding claim limitation from Claim 1, and hence is rejected under similar rationale.); …
Both Chatterjee and Fidelis are analogous art since both teach predicting classification rules using machine learning algorithms.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the explainer algorithm/rule-mining algorithm taught in Chatterjee and replace it with the genetic algorithm taught in Fidelis as a way to also produce a set of class level rules. The motivation to combine is taught in Fidelis, as genetic algorithms provide a search mechanism that is very effective in applications that have a large search space, and are less likely to become trapped in local maxima (which may not represent the global optimal solution), thereby making the system more robust and producing a result that is close to an optimal solution (Fidelis p.805 col.2 Section 2 An Overview of Classification and Genetic Algorithms, 3rd paragraph – p.806 col.1, 1st paragraph: “Genetic Algorithms (GAs) are a search method that has been widely used in applications where the size of the search space is very large. In essence, GAs are "search algorithms based on the mechanics of natural selection and natural genetics" [9]. GAs are inspired on the principle of survival of the fittest, where the fittest individuals are selected to produce offspring for the next generation. In the context of search, individuals are candidate solutions to a given search problem. Hence, reproduction of the fittest individuals means reproduction of the best current candidate solutions. Genetic operators such as selection, crossover and mutation generate offspring from the fittest individuals. One of the advantages of GAs over "traditional" search methods is that the former performs a kind of global search using a population of individuals, rather than performing a local, hill-climbing search. Global search methods are less likely to get trapped into local maxima, in comparison with local search methods.”).
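For illustration only, the population-based search described in the quoted Fidelis passage (selection, crossover, and mutation applied to candidate solutions) can be sketched over bit strings as follows. This is a minimal sketch under stated assumptions: the function name `evolve`, its parameters, the tournament selection scheme, the elitism step, and the toy fitness (a count of set bits) are all hypothetical and are not taken from Chatterjee, Fidelis, or the claims.

```python
import random


def evolve(fitness, n_bits, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Toy genetic algorithm over bit strings (hypothetical, illustrative only).

    fitness: callable mapping a list of 0/1 bits to a number (higher is better).
    n_bits:  length of each bit string; must be at least 2.
    """
    rng = random.Random(seed)
    # Initial population of random candidate solutions.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def pick():
        # Binary tournament selection: the fitter of two random individuals.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: carry the current best forward unchanged
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randint(1, n_bits - 1)        # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation with probability p_mut per bit.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best


# Usage: maximize the number of set bits in a 12-bit string.
best = evolve(sum, 12)
print(best, sum(best))
```

Because the best individual is copied into each new generation, the returned fitness is never lower than that of the best individual in the initial population; the sketch makes no claim of reaching a global optimum.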
Regarding Claim 19, Chatterjee in view of Fidelis teaches
(Original) The system of claim 15, the process further comprising 
select a subset of the set of class level rules that predict that at least a threshold percentage of the respective instances are members of a particular class (This claim limitation is similar in scope as a corresponding claim limitation from Claim 5, and hence is rejected under similar rationale.).  
Claims 2-3, 6-7, 9-10, 13-14, 16-17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Chatterjee et al., U.S. Patent 10,824,959, filed 2/16/2016, in view of Fidelis et al., Discovering Comprehensible Classification Rules with a Genetic Algorithm, Proceedings of the 2000 Congress on Evolutionary Computation CEC00 (Cat.No.00TH8512), IEEE, July 16-19 2000, pp.805-810 [henceforth referred to as Fidelis] as applied to Claims 1, 8, and 15, in further view of Castellanos et al., U.S. PGPUB 2012/0089620, published 4/12/2012 [henceforth referred to as Castellanos].
Regarding Claim 2, Chatterjee in view of Fidelis as applied to Claim 1 teaches
(Currently amended) The method of claim 1, 
wherein producing the set of class level rules further comprises 
calculating, by the processor-based system, at least one of
a fitness score for each class level rule based on a … precision of the respective class rule and a coverage of the respective class rule ([Chatterjee col.7, lines 13-46: examiner’s note: Performing a ranking of the rules based on various metrics including those that represent accuracy such as precision and recall, which are elements comprising a fitness score (corresponding to “calculating, by the processor-based system, … a fitness score …”) (“… The rules may be ranked relative to one another, using various metrics such as support (e.g., the fraction of the training data which meets the attribute criteria of the rule), confidence or accuracy (e.g., the fraction of the rule's predictions which match the classifier's predictions, etc.), precision, recall, etc. in different embodiments and for different types of classification problems.”).] [Fidelis p.807 Section 3.3 Fitness Function: examiner’s note: A fitness score calculation comprising sensitivity and specificity indicators, with the sensitivity indicator representing recall (which corresponds to the coverage of the rule, expressed as a ratio of true positives over the sum of true positives and false negatives), and the specificity indicator representing precision (which corresponds to the precision of the rule, expressed as a ratio of true positives over the sum of true positives and false positives) (“The fitness function evaluates the quality of each rule (individual). … Our fitness function combines two indicators commonly used in medical domains, namely the sensitivity (Se) and the specificity (Sp), defined as follows: Se = tp/(tp + fn) … Sp = tn/(tn + fp). Finally, the fitness function used by our system is defined as the product of these two indicators, ie.,: fitness =Se*Sp.”).]) … 
a mutual information between the respective class rule and the predicted class, 
wherein the genetic algorithm is configured to generate the set of class level rules using at least one of 
the fitness score (Fidelis p.808 col.1 Table 1, and p.808 col.1 Section 5.1 Results for the Dermatology Data Set, 1st paragraph – p.808 col.2 1st-2nd paragraphs: examiner’s note: The genetic algorithm discovering a set of rules shown in Table 1, where this table containing the rules and corresponding fitness score corresponds to “wherein the genetic algorithm is configured to generate the set of class level rules using … the fitness score” (“Table 1 presents the final 6 rules discovered by the GA – one rule for each class. … For each rule in Table 1 the third column shows two values, namely the fitness of the rule  --computed by equation (3) – in the training set and in the test set, respectively. One can see that all the rules discovered from the training data generalize well for examples in the test set. In most cases the fitness in the test set is nearly equal to the fitness in the training set.  … The fitness values reported in Table 1 are useful for evaluating the performance of each rule separately.”).) and 
the mutual information.  
However, Chatterjee in view of Fidelis does not teach 
a fitness score … based on a harmonic mean … 
Castellanos teaches
a fitness score … based on a harmonic mean (Castellanos paragraphs [0043]-[0044]: examiner’s note: Using a fitness score (i.e., F-measure) calculation to validate extracted rules produced by a genetic algorithm (Castellanos paragraph [0028]), comprising of a harmonic mean of precision and recall (“Rules learned during the training phase can be validated during a testing phase … The accuracy of each rule can be measured in terms of its “precision”, which can be defined as the number of correct extractions from all the extractions that it did. … validation may be performed using a metric termed "recall." "Recall" can be defined as the number of correct extractions done over the total number of extractions that may be performed in a validation test set. For example, if a validation test set was known to have ten expiration dates, but only five were extracted, the recall would be 5/20 or 0.5. Accordingly, an "accuracy" metric may be generated as a harmonic mean of precision and recall, herein termed an F measure. The F-measure may be calculated as: F = 2 ∙ (precision ∙ recall) / (precision + recall).”).) … 
Both Chatterjee in view of Fidelis and Castellanos are analogous art since both teach validating discovered or extracted rules from a genetic algorithm using fitness score metrics based on precision and recall indicators.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to substitute the fitness score equation taught in Chatterjee in view of Fidelis with the fitness score equation containing a harmonic mean taught in Castellanos for validating the discovered rules produced by a genetic algorithm. Since Chatterjee in view of Fidelis already teaches using a fitness score (comprising precision and recall indicators) to evaluate the accuracy of the extracted rules and rank them, a person having ordinary skill in the art would also consider using a variation of the fitness score calculation (with the same precision and recall indicators) as taught in Castellanos for performing validation and ranking in order to produce the same predictable results.
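For illustration only, the two fitness calculations quoted above (the Fidelis product Se*Sp and the Castellanos F-measure) can be computed side by side from the same confusion-matrix counts. This is a minimal sketch under stated assumptions: the function names and the example counts are hypothetical and appear in neither reference, and the zero-denominator handling is a convenience added here.

```python
def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def product_fitness(tp, fp, tn, fn):
    """Product of sensitivity and specificity, per the quoted Fidelis formulas."""
    se = safe_div(tp, tp + fn)   # Se = tp / (tp + fn)
    sp = safe_div(tn, tn + fp)   # Sp = tn / (tn + fp)
    return se * sp


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall, per the quoted Castellanos formula."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return safe_div(2 * precision * recall, precision + recall)


# Hypothetical counts for a single rule evaluated on a validation set.
print(product_fitness(tp=5, fp=0, tn=5, fn=5))  # Se = 0.5, Sp = 1.0
print(f_measure(tp=5, fp=0, fn=5))              # precision = 1.0, recall = 0.5
```

With these counts the product form gives 0.5 and the harmonic-mean form gives 2/3, so the two scores are computed from overlapping indicators but are not numerically identical.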
Regarding Claim 3, Chatterjee in view of Fidelis, in further view of Castellanos teaches
(Original) The method of claim 2, further comprising 
sorting, by the processor-based system, the set of class level rules according to at least one of 
the fitness score ([Chatterjee col.7, lines 13-46: examiner’s note: Performing a ranking of the rules based on various metrics including those that represent accuracy such as precision and recall, which are elements comprising a fitness score, where a ranking is a form of sorting (according to ascending or descending order) (corresponding to “sorting, by the processor-based system, the set of class level rules according to … the fitness score”) (“… The rules may be ranked relative to one another, using various metrics such as support (e.g., the fraction of the training data which meets the attribute criteria of the rule), confidence or accuracy (e.g., the fraction of the rule's predictions which match the classifier's predictions, etc.), precision, recall, etc. in different embodiments and for different types of classification problems.”).] [Fidelis p.807 Section 3.3 Fitness Function: examiner’s note: A fitness score calculation comprising sensitivity and specificity indicators, with the sensitivity indicator representing recall (which corresponds to the coverage of the rule, expressed as a ratio of true positives over the sum of true positives and false negatives), and the specificity indicator representing precision (which corresponds to the precision of the rule, expressed as a ratio of true positives over the sum of true positives and false positives) (“The fitness function evaluates the quality of each rule (individual). … Our fitness function combines two indicators commonly used in medical domains, namely the sensitivity (Se) and the specificity (Sp), defined as follows: Se = tp/(tp + fn) … Sp = tn/(tn + fp). Finally, the fitness function used by our system is defined as the product of these two indicators, ie.,: fitness =Se*Sp.”).]) and 
the mutual information corresponding to each of the class level rules.  
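For illustration only, ranking a set of rules by a fitness score, as discussed for this claim, amounts to a sort in descending order of score. This is a minimal sketch under stated assumptions: the `ScoredRule` type, the `rank_rules` function, and the example rule text and scores are hypothetical and are not drawn from Chatterjee or Fidelis.

```python
from typing import NamedTuple


class ScoredRule(NamedTuple):
    condition: str        # hypothetical textual form of the rule's conditions
    predicted_class: str  # class predicted when the conditions hold
    fitness: float        # fitness score already computed for the rule


def rank_rules(rules):
    """Return the rules sorted from highest to lowest fitness score."""
    return sorted(rules, key=lambda rule: rule.fitness, reverse=True)


# Hypothetical rules and scores, for demonstration only.
rules = [
    ScoredRule("age > 40 AND income <= 50", "A", 0.62),
    ScoredRule("age <= 40", "B", 0.91),
    ScoredRule("income > 50", "A", 0.75),
]
for rule in rank_rules(rules):
    print(rule.fitness, rule.condition, "->", rule.predicted_class)
```

Python's `sorted` is stable, so rules with equal scores keep their original relative order.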
Regarding Claim 6, Chatterjee in view of Fidelis as applied to Claim 1 teaches
(Original) The method of claim 1, further comprising 
selecting, by the processor-based system, a subset of the set of class level rules by calculating at least one of 
a fitness score for each of a pair of the class level rules based on a … precision of the respective class rule and a coverage of the respective class rule (Fidelis p.807 col.1, Section 3.3 Fitness Function, 5th paragraph – p.807 col.2, 1st paragraph: examiner’s note: Running the genetic algorithm over multiple runs, with each run representing a search for rules representing each class (corresponding to “for each of a pair of the class level rules”), and running the fitness function (comprising of Se and Sp) for each run (corresponding to “calculating … a fitness score for each of a pair of the class level rules based on a … precision of the respective class rule and a coverage of the respective class rule”) (“Each run of our GA solves a two-class classification problem, where the goal is to predict whether or not the patient has a given disease. Therefore, the GA is run at least once for each class (value of the goal attribute). … In the first run the GA would search for rules predicting class 1; in the second run it would search for rules predicting class 2, and so on. When the GA is searching for rules predicting a given class, all other classes are effectively merged into a large class, which can be conceptually thought of as meaning that the patient does not have the disease predicted by the rule. Hence, the above formulas for Se and Sp can be applied to problems with any number of classes”).), and 
a mutual information between the respective class rule and the predicted class, and 
selecting the class level rule having a greatest fitness score from the pair of class level rules using at least one of 
the fitness score ([Chatterjee col.8 lines 20-23: examiner’s note: After performing ranking of rules based on fitness score (Chatterjee col.7, lines 13-46 ), providing the highest ranking rule to the client (corresponding to “… selecting the class level rule having the greatest fitness score …”) (“The highest ranking rule which appears to correctly explain the classifier's prediction may be provided to the client as part of an easy-to-understand explanation 172 in the depicted embodiment.”).] [Fidelis p.808 col.1 Table 1, and p.808 col.1 Section 5.1 Results for the Dermatology Data Set, 1st paragraph: examiner’s note: Performing a finite series of runs for each class (corresponding to “… from the pair of class level rules”) and using the fitness score to select the best rule of the three runs representing each class (corresponding to “selecting the class level rule having a greatest fitness score from the pair of class level rules using … the fitness score”) (“Table 1 presents the final 6 rules discovered by the GA – one rule for each class. For each class, the GA was run three times … The best rule of the three runs, according its fitness value measured on the training set, was selected as the rule predicting that class (this is the rule shown in Table 1).”).]) and 
the mutual information.  
However, Chatterjee in view of Fidelis does not teach 
a fitness score … based on a harmonic mean … 
Castellanos teaches
a fitness score … based on a harmonic mean (Castellanos paragraphs [0043]-[0044]: examiner’s note: Using a fitness score (i.e., F-measure) calculation to validate extracted rules produced by a genetic algorithm (Castellanos paragraph [0028]), comprising of a harmonic mean of precision and recall (“Rules learned during the training phase can be validated during a testing phase … The accuracy of each rule can be measured in terms of its “precision”, which can be defined as the number of correct extractions from all the extractions that it did. … validation may be performed using a metric termed "recall." "Recall" can be defined as the number of correct extractions done over the total number of extractions that may be performed in a validation test set. For example, if a validation test set was known to have ten expiration dates, but only five were extracted, the recall would be 5/20 or 0.5. Accordingly, an "accuracy" metric may be generated as a harmonic mean of precision and recall, herein termed an F measure. The F-measure may be calculated as: F = 2 ∙ (precision ∙ recall) / (precision + recall).”).) … 
Both Chatterjee in view of Fidelis and Castellanos are analogous art since both teach validating discovered or extracted rules from a genetic algorithm using fitness score metrics based on precision and recall indicators.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to substitute the fitness score equation taught in Chatterjee in view of Fidelis with the fitness score equation containing a harmonic mean taught in Castellanos for validating the discovered rules produced by a genetic algorithm. Since Chatterjee in view of Fidelis already teaches using a fitness score (comprising precision and recall indicators) to evaluate the accuracy of the extracted rules and rank them, a person having ordinary skill in the art would also consider using a variation of the fitness score calculation (with the same precision and recall indicators) as taught in Castellanos for performing validation and ranking in order to produce the same predictable results.
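For illustration only, selecting the rule having the greatest fitness score from a pair of rules, and repeating that over a set of rules to obtain a subset, can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the consecutive-pairing scheme, and the example scores are hypothetical; neither reference nor the claims specify how pairs are formed.

```python
def select_fitter(rule_a, rule_b, fitness):
    """Return whichever of the two rules has the greater fitness score."""
    return rule_a if fitness(rule_a) >= fitness(rule_b) else rule_b


def select_subset(rules, fitness):
    """Keep the fitter rule from each consecutive pair (assumed pairing scheme).

    An unpaired final rule, if any, is kept as-is.
    """
    kept = [
        select_fitter(rules[i], rules[i + 1], fitness)
        for i in range(0, len(rules) - 1, 2)
    ]
    if len(rules) % 2:
        kept.append(rules[-1])
    return kept


# Hypothetical rule identifiers and precomputed fitness scores.
scores = {"r1": 0.40, "r2": 0.85, "r3": 0.70, "r4": 0.65, "r5": 0.10}
print(select_subset(list(scores), scores.get))
```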
Regarding Claim 7, Chatterjee in view of Fidelis as applied to Claim 1 teaches
(Currently amended) The method of claim 1, further comprising 
selecting, by the processor-based system, a subset of the set of class level rules by applying each of a pair of the class level rules to a second level genetic algorithm ([Chatterjee col.7 lines 13-46: examiner’s note: Applying training set instances (representing instance level conditions) to an explainer algorithm/rule-mining algorithm to generate attribute-predicate based rules (corresponding to “selecting, by the processor-based system, a subset of the set of class level rules by applying each of a pair of the class level rules to a second level … algorithm”) (“A number of different explainer algorithms (which may also be referred to as rule mining algorithms) may be available in library 125 in the depicted embodiment, from which a particular algorithm may be chosen by explainer selector 160 to generate attribute-predicate based rules in the depicted embodiment. Each rule of the explainer 162 may indicate some set of conditions or predicates based on attribute values in the transformed or untransformed versions of the training set observation records, and an indication of the target variable value predicted by the classifier if the set of conditions is met. … For example, the rules may initially be generated using the training data, and then evaluated relative to one another based on the differences between (a) the actual predictions made on the test data set by the trained classifier and (b) the predictions indicated in the rules.”). Under its broadest reasonable interpretation, the “second level” algorithm is interpreted as a subsequent stage (Chatterjee Figure 1, elements 132 and 160) relative to the first stage of the trained classifier.] 
[Fidelis p.807 col.1, Section 3.3 Fitness Function, 5th paragraph – p.807 col.2, 1st paragraph: examiner’s note: Running the genetic algorithm (corresponding to “a second level genetic algorithm”) over multiple runs, with each run representing a search for rules representing each class (corresponding to “each of a pair of the class level rules”), and running the fitness function (comprising of Se and Sp) for each run (corresponding to “selecting, by the processor-based system, a subset of the set of class level rules by applying each of a pair of the class level rules to a second level genetic algorithm”) (“Each run of our GA solves a two-class classification problem, where the goal is to predict whether or not the patient has a given disease. Therefore, the GA is run at least once for each class (value of the goal attribute). … In the first run the GA would search for rules predicting class 1; in the second run it would search for rules predicting class 2, and so on. When the GA is searching for rules predicting a given class, all other classes are effectively merged into a large class, which can be conceptually thought of as meaning that the patient does not have the disease predicted by the rule. Hence, the above formulas for Se and Sp can be applied to problems with any number of classes”).]), 
wherein producing the set of class level rules further comprises 
calculating at least one of 
a fitness score for each of a pair of the class level rules based on a … precision of the respective class rule and a coverage of the respective class rule (Fidelis p.807 col.1, Section 3.3 Fitness Function, 5th paragraph – p.807 col.2, 1st paragraph: examiner’s note: Running the genetic algorithm over multiple runs, with each run representing a search for rules representing each class (corresponding to “for each of a pair of the class level rules”), and running the fitness function (comprising of Se and Sp)  (corresponding to “calculating … a fitness score for each of a pair of the class level rules based on a … precision of the respective class rule and a coverage of the respective class rule”) (“Each run of our GA solves a two-class classification problem, where the goal is to predict whether or not the patient has a given disease. Therefore, the GA is run at least once for each class (value of the goal attribute). … In the first run the GA would search for rules predicting class 1; in the second run it would search for rules predicting class 2, and so on. When the GA is searching for rules predicting a given class, all other classes are effectively merged into a large class, which can be conceptually thought of as meaning that the patient does not have the disease predicted by the rule. Hence, the above formulas for Se and Sp can be applied to problems with any number of classes”).), and 
a mutual information between the respective class rule and the predicted class, and 
wherein the second level genetic algorithm is configured to select the subset of class level rules using at least one of 
the fitness score ([Chatterjee col.8 lines 20-23: examiner’s note: After performing ranking of rules based on fitness score (Chatterjee col.7, lines 13-46), providing the highest ranking rule to the client (corresponding to “select … class level rules using … the fitness score”) (“The highest ranking rule which appears to correctly explain the classifier's prediction may be provided to the client as part of an easy-to-understand explanation 172 in the depicted embodiment.”).] [Fidelis p.808 col.1 Table 1, and p.808 col.1 Section 5.1 Results for the Dermatology Data Set, 1st paragraph: examiner’s note: Performing a finite series of runs for each class (corresponding to “the subset of class level rules”) and using the fitness score to select the best rule of the three runs representing each class (corresponding to “select the subset of class level rules using … the fitness score”) (“Table 1 presents the final 6 rules discovered by the GA – one rule for each class. For each class, the GA was run three times … The best rule of the three runs, according its fitness value measured on the training set, was selected as the rule predicting that class (this is the rule shown in Table 1).”).]) and 
the predicted class.  
However, Chatterjee in view of Fidelis does not teach 
a fitness score … based on a harmonic mean … 
Castellanos teaches
a fitness score … based on a harmonic mean (Castellanos paragraphs [0043]-[0044]: examiner’s note: Using a fitness score (i.e., F-measure) calculation to validate extracted rules produced by a genetic algorithm (Castellanos paragraph [0028]), comprising of a harmonic mean of precision and recall (“Rules learned during the training phase can be validated during a testing phase … The accuracy of each rule can be measured in terms of its “precision”, which can be defined as the number of correct extractions from all the extractions that it did. … validation may be performed using a metric termed "recall." "Recall" can be defined as the number of correct extractions done over the total number of extractions that may be performed in a validation test set. For example, if a validation test set was known to have ten expiration dates, but only five were extracted, the recall would be 5/20 or 0.5. Accordingly, an "accuracy" metric may be generated as a harmonic mean of precision and recall, herein termed an F measure. The F-measure may be calculated as: F = 2 ∙ (precision ∙ recall) / (precision + recall).”).) … 
Both Chatterjee in view of Fidelis and Castellanos are analogous art since both teach validating discovered or extracted rules from a genetic algorithm using fitness score metrics based on precision and recall indicators.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to substitute the fitness score equation taught in Chatterjee in view of Fidelis with the fitness score equation containing a harmonic mean taught in Castellanos for validating the discovered rules produced by a genetic algorithm. Since Chatterjee in view of Fidelis already teaches using a fitness score (comprising precision and recall indicators) to evaluate the accuracy of the extracted rules and rank them, a person having ordinary skill in the art would also consider using a variation of the fitness score calculation (with the same precision and recall indicators) as taught in Castellanos for performing validation and ranking in order to produce the same predictable results.
Regarding Claim 9, Chatterjee in view of Fidelis as applied to Claim 8 teaches
(Currently amended) The computer program product of claim 8, wherein producing the set of class level rules further comprises 
calculating at least one of 
a fitness score for each class level rule based on … a precision of the respective class level rule and a coverage of the respective class level rule (This claim limitation is similar in scope as a corresponding claim limitation from Claim 2, and hence is rejected under similar rationale.), and 
a mutual information between the respective class rule and the predicted class, 
wherein the genetic algorithm is configured to generate the set of class level rules using at least one of 
the fitness score (This claim limitation is similar in scope as a corresponding claim limitation from Claim 2, and hence is rejected under similar rationale.) and 
the mutual information.  
However, Chatterjee in view of Fidelis does not teach 
a fitness score … based on a harmonic mean … 
Castellanos teaches
a fitness score … based on a harmonic mean (This claim limitation is similar in scope as a corresponding claim limitation from Claim 2, and hence is rejected under similar rationale.) … 
Both Chatterjee in view of Fidelis and Castellanos are analogous art since both teach validating discovered or extracted rules from a genetic algorithm using fitness score metrics based on precision and recall indicators.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to substitute the fitness score equation taught in Chatterjee in view of Fidelis with the fitness score equation containing a harmonic mean taught in Castellanos for validating the discovered rules produced by a genetic algorithm. Since Chatterjee in view of Fidelis already teaches using a fitness score (comprising precision and recall indicators) to evaluate the accuracy of the extracted rules and rank them, a person having ordinary skill in the art would also consider using a variation of the fitness score calculation (with the same precision and recall indicators) as taught in Castellanos for performing validation and ranking in order to produce the same predictable results.
Regarding Claim 10, Chatterjee in view of Fidelis, in further view of Castellanos teaches
(Original) The computer program product of claim 9, wherein the process includes
sorting the set of class level rules according to at least one of 
the fitness score (This claim limitation is similar in scope as a corresponding claim limitation from Claim 3, and hence is rejected under similar rationale.) and 
the mutual information corresponding to each of the class level rules.  
Regarding Claim 13, Chatterjee in view of Fidelis teaches
The computer program product of claim 8, wherein the process includes 
selecting a subset of the set of class level rules by calculating at least one of 
a fitness score for each of a pair of the class level rules based on a … precision of the respective class rule and a coverage of the respective class rule (This claim limitation is similar in scope as a corresponding claim limitation from Claim 6, and hence is rejected under similar rationale.) … , and 
a mutual information between the respective class rule and the predicted class, and 
selecting the class level rule having a greatest fitness score from the pair of class level rules using at least one of 
the fitness score (This claim limitation is similar in scope as a corresponding claim limitation from Claim 6, and hence is rejected under similar rationale.) and 
the mutual information.  
However, Chatterjee in view of Fidelis does not teach 
a fitness score … based on a harmonic mean … 
Castellanos teaches
a fitness score … based on a harmonic mean (This claim limitation is similar in scope as a corresponding claim limitation from Claim 6, and hence is rejected under similar rationale.) … 
Both Chatterjee in view of Fidelis and Castellanos are analogous art since both teach validating discovered or extracted rules from a genetic algorithm using fitness score metrics based on precision and recall indicators.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to substitute the fitness score equation taught in Chatterjee in view of Fidelis with the fitness score equation containing a harmonic mean taught in Castellanos for validating the discovered rules produced by a genetic algorithm. Since Chatterjee in view of Fidelis already teaches using a fitness score (comprising of precision and recall indicators) to evaluate the accuracy of the extracted rules and rank them, a person having ordinary skill in the art would also consider using a variation of the fitness score calculation (with the same precision and recall indicators) as taught in Castellanos for performing validation and ranking in order to produce the same predictable results.
Regarding Claim 14, Chatterjee in view of Fidelis as applied to Claim 8 teaches
(Currently amended) The computer program product of claim 8, wherein the process includes 
selecting a subset of the set of class level rules by applying each of a pair of the class level rules to a second level genetic algorithm (This claim limitation is similar in scope as a corresponding claim limitation from Claim 7, and hence is rejected under similar rationale.), 
wherein producing the set of class level rules further comprises calculating at least one of 
a fitness score for each class level rule based on … a precision of the respective class level rule and a coverage of the respective class level rule (This claim limitation is similar in scope as a corresponding claim limitation from Claim 7, and hence is rejected under similar rationale.), and 
a mutual information between the respective class rule and the predicted class, and 
wherein the second level genetic algorithm is configured to select the subset of class level rules using at least one of 
the fitness score (This claim limitation is similar in scope as a corresponding claim limitation from Claim 7, and hence is rejected under similar rationale.) and 
the mutual information.  
However, Chatterjee in view of Fidelis does not teach 
a fitness score … based on a harmonic mean … 
Castellanos teaches
a fitness score … based on a harmonic mean (This claim limitation is similar in scope as a corresponding claim limitation from Claim 7, and hence is rejected under similar rationale.) … 
Both Chatterjee in view of Fidelis and Castellanos are analogous art since both teach validating discovered or extracted rules from a genetic algorithm using fitness score metrics based on precision and recall indicators.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to substitute the fitness score equation taught in Chatterjee in view of Fidelis with the fitness score equation containing a harmonic mean taught in Castellanos for validating the discovered rules produced by a genetic algorithm. Since Chatterjee in view of Fidelis already teaches using a fitness score (comprising of precision and recall indicators) to evaluate the accuracy of the extracted rules and rank them, a person having ordinary skill in the art would also consider using a variation of the fitness score calculation (with the same precision and recall indicators) as taught in Castellanos for performing validation and ranking in order to produce the same predictable results.
Regarding Claim 16, Chatterjee in view of Fidelis as applied to Claim 15 teaches
(Currently amended) The system of claim 15, wherein producing the set of class level rules further comprises: 
calculating at least one of 
a fitness score for each class level rule based on … a precision of the respective class level rule and a coverage of the respective class level rule (This claim limitation is similar in scope as a corresponding claim limitation from Claim 2, and hence is rejected under similar rationale.), and 
a mutual information between the respective class rule and the predicted class, 
wherein the genetic algorithm is configured to generate the set of class level rules using at least one of 
the fitness score (This claim limitation is similar in scope as a corresponding claim limitation from Claim 2, and hence is rejected under similar rationale.) and 
the mutual information.  
However, Chatterjee in view of Fidelis does not teach 
a fitness score … based on a harmonic mean … 
Castellanos teaches
a fitness score … based on a harmonic mean (This claim limitation is similar in scope as a corresponding claim limitation from Claim 2, and hence is rejected under similar rationale.) … 
Both Chatterjee in view of Fidelis and Castellanos are analogous art since both teach validating discovered or extracted rules from a genetic algorithm using fitness score metrics based on precision and recall indicators.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to substitute the fitness score equation taught in Chatterjee in view of Fidelis with the fitness score equation containing a harmonic mean taught in Castellanos for validating the discovered rules produced by a genetic algorithm. Since Chatterjee in view of Fidelis already teaches using a fitness score (comprising of precision and recall indicators) to evaluate the accuracy of the extracted rules and rank them, a person having ordinary skill in the art would also consider using a variation of the fitness score calculation (with the same precision and recall indicators) as taught in Castellanos for performing validation and ranking in order to produce the same predictable results.
Regarding Claim 17, Chatterjee in view of Fidelis, in further view of Castellanos teaches
(Original) The system of claim 16, the process further comprising: 
sort the set of class level rules according to at least one of 
the fitness score (This claim limitation is similar in scope as a corresponding claim limitation from Claim 3, and hence is rejected under similar rationale.) and 
the mutual information corresponding to each of the class level rules.  
Regarding Claim 20, Chatterjee in view of Fidelis teaches
(Original) The system of claim 15, the process further comprising: 
select a subset of the set of class level rules by calculating at least one of 
a fitness score for each of a pair of the class level rules based on … a precision of the respective class level rule and a coverage of the respective class level rule (This claim limitation is similar in scope as a corresponding claim limitation from Claim 6, and hence is rejected under similar rationale.), and 
a mutual information between the respective class rule and the predicted class, and 
select the class level rule having a greatest fitness score from the pair of class level rules using at least one of 
the fitness score (This claim limitation is similar in scope as a corresponding claim limitation from Claim 6, and hence is rejected under similar rationale.) and 
the mutual information.   
However, Chatterjee in view of Fidelis does not teach 
a fitness score … based on a harmonic mean … 
Castellanos teaches
a fitness score … based on a harmonic mean (This claim limitation is similar in scope as a corresponding claim limitation from Claim 6, and hence is rejected under similar rationale.) … 
Both Chatterjee in view of Fidelis and Castellanos are analogous art since both teach validating discovered or extracted rules from a genetic algorithm using fitness score metrics based on precision and recall indicators.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to substitute the fitness score equation taught in Chatterjee in view of Fidelis with the fitness score equation containing a harmonic mean taught in Castellanos for validating the discovered rules produced by a genetic algorithm. Since Chatterjee in view of Fidelis already teaches using a fitness score (comprising of precision and recall indicators) to evaluate the accuracy of the extracted rules and rank them, a person having ordinary skill in the art would also consider using a variation of the fitness score calculation (with the same precision and recall indicators) as taught in Castellanos for performing validation and ranking in order to produce the same predictable results.
Claims 4, 11, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Chatterjee et al., U.S. Patent 10,824,959, filed 2/16/2016, in view of Fidelis et al., Discovering Comprehensible Classification Rules with a Genetic Algorithm, Proceedings of the 2000 Congress on Evolutionary Computation CEC00 (Cat.No.00TH8512), IEEE, July 16-19 2000, pp.805-810 [henceforth referred to as Fidelis] as applied to Claims 1, 8, and 15, in further view of Cheng et al., U.S. PGPUB 2006/0059112, published 3/16/2006 [henceforth referred to as Cheng].  
Regarding original Claim 4, Chatterjee in view of Fidelis as applied to Claim 1 teaches
(Original) The method of claim 1, 
wherein at least one of the features is a numerical feature (Chatterjee Figure 3, elements 302, 351, 352, 372: examiner’s note: Pre-processing data in the training data set, including features/attributes containing numeric values and categorical values, through processes such as a binning process or binarization process (Chatterjee col.10, lines 5-16: “In various embodiments, the raw data records of a data source may be pre-processed (e.g., at input record handlers 260 and/or at feature processors 262) at various stages of the classifier generation and explainer generation procedures e.g., before the classification algorithm is applied, or after the classifier is created but before the rule mining phase for the explainer is begun. Such pre-processing may include binarization of categorical attributes, binning of numeric attributes, and the like. Other types of pre-processing may also be implemented in various embodiments, such as cleansing data by removing incomplete or corrupt data records, normalizing attribute values, and so on.”).), and
wherein the method further comprises pre-processing, by the processor-based system, the set of training data to convert the numerical feature into a categorical feature (Chatterjee Figure 3, elements 302, 351, 352, 372: examiner’s note: Performing pre-processing including binning of numerical features/attributes, where each bin corresponds to a particular range of values (corresponding to “pre-processing, by the processor-based system, the set of training data to convert the numerical feature into a categorical feature”) (Chatterjee col.12, lines 15-50: “Before rule mining is begun with respect to the predictions made for Tattrl by the classifier selected for the data set to which observation records 302 belong, the categorical and/or numeric attribute values of the training data set may be transformed. Two example feature transformers 330 are shown: one which performs binarization 351 on the categorical variable Cattrl, and another which performs binning 352 on the numeric attribute Nattrl. … With respect to numeric attributes such as Nattrl, for which relative ordering of values is possible, a binning transformation (which preserves some ordering information, although at a coarser granularity than the raw data values) may be used. In the depicted example scenario, in which Nattrl can take on values in the range 0 to 100, four bins or buckets are created. Values less than 25 are assigned to bin 0 (e.g., mapped to the transformed value 0), values in the range 25-50 are assigned to bin 1, values in the range 50-75 are assigned to bin 2, and the remaining values greater than 75 are assigned to bin 3.”).) …  
However, Chatterjee in view of Fidelis does not teach
… convert the numerical feature into a categorical feature using entropy based binning.  
Cheng teaches
… convert the numerical feature into a categorical feature using entropy based binning (Cheng paragraph [0059]: examiner’s note: Performing the conversion of numerical features into discrete (categorical) features before model learning (corresponding to “pre-processing”), where entropy binning is one of the algorithms used to perform the conversion of the numerical feature into a categorical feature (“The exemplary BN learning algorithm requires discrete (categorical) data. For numerical features, discretization is performed before model learning. The discretization procedure can be based on domain knowledge or some discretization algorithms. Entropy binning is one of such algorithms that minimize the information loss between the feature and the target variable.”).).  
Both Chatterjee in view of Fidelis and Cheng are analogous art since both perform pre-processing of training data prior to use in a machine learning predictive model.
It would have been obvious to a person having ordinary skill in the art before the effective filing date to substitute the binning algorithm taught in Chatterjee in view of Fidelis with the entropy binning algorithm taught in Cheng as a way to perform pre-processing of data containing numeric features/attributes. Since Chatterjee in view of Fidelis already teaches pre-processing using binning to convert numerical features/attributes into categorical features/attributes, a person having ordinary skill in the art would also consider using entropy-based binning as taught in Cheng in order to produce the same predictable results. Furthermore, the motivation to combine is also taught in Cheng, since entropy-based binning minimizes information loss during the conversion process, resulting in producing rules that more accurately reflect the attributes in the training data, improving the performance of the system in terms of providing more accurate rules to explain the machine learning model (Cheng paragraph [0059]: “The exemplary BN learning algorithm requires discrete (categorical) data. For numerical features, discretization is performed before model learning. The discretization procedure can be based on domain knowledge or some discretization algorithms. Entropy binning is one of such algorithms that minimize the information loss between the feature and the target variable.”).
Regarding Claim 11, Chatterjee in view of Fidelis as applied to Claim 8 teaches
(Original) The computer program product of claim 8, 
wherein at least one of the features is a numerical feature (This claim limitation is similar in scope as a corresponding claim limitation from Claim 4, and hence is rejected under similar rationale.), and 
wherein the process includes pre-processing the set of training data to convert the numerical feature into a categorical feature (This claim limitation is similar in scope as a corresponding claim limitation from Claim 4, and hence is rejected under similar rationale.) …  
However, Chatterjee in view of Fidelis does not teach
… convert the numerical feature into a categorical feature using entropy based binning.  
Cheng teaches
… convert the numerical feature into a categorical feature using entropy based binning (This claim limitation is similar in scope as a corresponding claim limitation from Claim 4, and hence is rejected under similar rationale.).  
Both Chatterjee in view of Fidelis and Cheng are analogous art since both perform pre-processing of training data prior to use in a machine learning predictive model.
It would have been obvious to a person having ordinary skill in the art before the effective filing date to substitute the binning algorithm taught in Chatterjee in view of Fidelis with the entropy binning algorithm taught in Cheng as a way to perform pre-processing of data containing numeric features/attributes. Since Chatterjee in view of Fidelis already teaches pre-processing using binning to convert numerical features/attributes into categorical features/attributes, a person having ordinary skill in the art would also consider using entropy-based binning as taught in Cheng in order to produce the same predictable results. Furthermore, the motivation to combine is also taught in Cheng, since entropy-based binning minimizes information loss during the conversion process, resulting in producing rules that more accurately reflect the attributes in the training data, improving the performance of the system in terms of providing more accurate rules to explain the machine learning model (Cheng paragraph [0059]: “The exemplary BN learning algorithm requires discrete (categorical) data. For numerical features, discretization is performed before model learning. The discretization procedure can be based on domain knowledge or some discretization algorithms. Entropy binning is one of such algorithms that minimize the information loss between the feature and the target variable.”).
Regarding Claim 18, Chatterjee in view of Fidelis as applied to Claim 15 teaches
(Original) The system of claim 15, 
wherein at least one of the features is a numerical feature (This claim limitation is similar in scope as a corresponding claim limitation from Claim 4, and hence is rejected under similar rationale.), 
the process further comprising: pre-process the set of training data to convert the numerical feature into a categorical feature (This claim limitation is similar in scope as a corresponding claim limitation from Claim 4, and hence is rejected under similar rationale.) …  
However, Chatterjee in view of Fidelis does not teach
… convert the numerical feature into a categorical feature using entropy based binning.  
Cheng teaches
… convert the numerical feature into a categorical feature using entropy based binning (This claim limitation is similar in scope as a corresponding claim limitation from Claim 4, and hence is rejected under similar rationale.).  
Both Chatterjee in view of Fidelis and Cheng are analogous art since both perform pre-processing of training data prior to use in a machine learning predictive model.
It would have been obvious to a person having ordinary skill in the art before the effective filing date to substitute the binning algorithm taught in Chatterjee in view of Fidelis with the entropy binning algorithm taught in Cheng as a way to perform pre-processing of data containing numeric features/attributes. Since Chatterjee in view of Fidelis already teaches pre-processing using binning to convert numerical features/attributes into categorical features/attributes, a person having ordinary skill in the art would also consider using entropy-based binning as taught in Cheng in order to produce the same predictable results. Furthermore, the motivation to combine is also taught in Cheng, since entropy-based binning minimizes information loss during the conversion process, resulting in producing rules that more accurately reflect the attributes in the training data, improving the performance of the system in terms of providing more accurate rules to explain the machine learning model (Cheng paragraph [0059]: “The exemplary BN learning algorithm requires discrete (categorical) data. For numerical features, discretization is performed before model learning. The discretization procedure can be based on domain knowledge or some discretization algorithms. Entropy binning is one of such algorithms that minimize the information loss between the feature and the target variable.”).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM WAI YIN KWAN whose telephone number is 303-297-4332.  The examiner can normally be reached on Monday-Friday 8:00am - 4:30pm PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen can be reached on 571-272-3768.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/WILLIAM WAI YIN KWAN/Examiner, Art Unit 2121

/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121