DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is responsive to the original application filed on 7/23/2018 and the Remarks and Amendments filed on 5/23/2022.  Acknowledgement is made with respect to a claim of foreign priority to British Application GB1810736.7 filed on 6/29/2018.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-18 and 21-22 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Claim 1 recites the limitation “remove at least one of the internal nodes or the edge nodes from the at least one decision tree through comparing values of the modules with the validation data to create a pruned decision tree” (emphasis added).  There is insufficient antecedent basis for this limitation in the claim.  For examination purposes, the limitation will be interpreted to read “remove at least one of the internal nodes or the edge nodes from the at least one decision tree through comparing values of the modules with [[the]] validation data to create a pruned decision tree”. Dependent claims 2-10 and 22 depend on indefinite claim 1, and are also rejected under 35 USC § 112(b) based on this dependency.  Appropriate correction is required.

Claim 11 recites the limitation “removing at least one of the internal nodes or the edge nodes from the at least one decision tree through comparing values of the modules with the training examples data to create a pruned decision tree” (emphasis added).  There is insufficient antecedent basis for this limitation in the claim.  For examination purposes, the limitation will be interpreted to read “removing at least one of the internal nodes or the edge nodes from the at least one decision tree through comparing values of the modules with [[the]] training examples to create a pruned decision tree”. Dependent claims 12-16 depend on indefinite claim 11, and are also rejected under 35 USC § 112(b) based on this dependency.  Appropriate correction is required.

Claim 17 recites the limitation “remove at least one of the internal nodes or the edge nodes from the at least one decision tree through comparing values of the modules with the validation data to create a pruned decision tree” (emphasis added).  There is insufficient antecedent basis for this limitation in the claim.  For examination purposes, the limitation will be interpreted to read “remove at least one of the internal nodes or the edge nodes from the at least one decision tree through comparing values of the modules with [[the]] validation data to create a pruned decision tree”. Dependent claims 18 and 21 depend on indefinite claim 17, and are also rejected under 35 USC § 112(b) based on this dependency.  Appropriate correction is required.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-18 and 21-22 are rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.  The analysis of the claims will follow the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50 (“2019 PEG”).

When considering subject matter eligibility under 35 U.S.C. 101, it must be determined whether the claim is directed to one of the four statutory categories of invention, i.e., process, machine, manufacture, or composition of matter (Step 1). If the claim does fall within one of the statutory categories, the second step in the analysis is to determine whether the claim is directed to a judicial exception (Step 2A). The Step 2A analysis is broken into two prongs. In the first prong (Step 2A, Prong 1), it is determined whether or not the claims recite a judicial exception (e.g., mathematical concepts, mental processes, certain methods of organizing human activity). If it is determined in Step 2A, Prong 1 that the claims recite a judicial exception, the analysis proceeds to the second prong (Step 2A, Prong 2), where it is determined whether or not the claims integrate the judicial exception into a practical application. If it is determined in Step 2A, Prong 2 that the claims do not integrate the judicial exception into a practical application, the analysis proceeds to determining whether the claim is a patent-eligible application of the exception (Step 2B). If an abstract idea is present in the claim, any element or combination of elements in the claim must be sufficient to ensure that the claim integrates the judicial exception into a practical application, or else amounts to significantly more than the abstract idea itself.
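
The sequence of inquiries described above may be summarized, as an illustrative sketch only, by the following Python function. The function name, argument names, and returned strings are hypothetical labels chosen for this illustration and do not appear in the 2019 PEG or the MPEP; each Boolean argument stands for the determination made at the corresponding step.

```python
def eligibility_outcome(statutory_category, recites_exception,
                        integrates_practical_application, significantly_more):
    """Walk the Step 1 / Step 2A / Step 2B inquiries in order.

    All four arguments are hypothetical Boolean stand-ins for the
    determination made at each step of the analysis.
    """
    # Step 1: is the claim to a process, machine, manufacture,
    # or composition of matter?
    if not statutory_category:
        return "ineligible (Step 1)"
    # Step 2A, Prong 1: does the claim recite a judicial exception?
    if not recites_exception:
        return "eligible (Step 2A, Prong 1)"
    # Step 2A, Prong 2: is the exception integrated into a practical application?
    if integrates_practical_application:
        return "eligible (Step 2A, Prong 2)"
    # Step 2B: do the additional elements amount to significantly more?
    if significantly_more:
        return "eligible (Step 2B)"
    return "ineligible (Step 2B)"
```

Under this sketch, the findings set out for claim 1 below correspond to `eligibility_outcome(True, True, False, False)`, which returns "ineligible (Step 2B)".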

Claim 1
Step 1:  The claim recites a predictor; therefore, it is directed to the statutory category of a manufacture.
Step 2A Prong 1:  The claim recites, inter alia:
generate the at least one decision tree through computing the differentiable operations to create the internal nodes and the edge nodes: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mathematical concept of generating a decision tree through mathematical computations/operations.
remove at least one of the internal nodes or the edge nodes from the at least one decision tree through comparing values of the modules with the validation data to create a pruned decision tree:  Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of removing nodes or edges of a decision tree based on a comparison, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.  For example, one can mentally, with the assistance of pen and paper, remove a node and edge of a decision tree that is redundant or unnecessary, thus pruning the decision tree.
apply the pruned decision tree to the image to predict an identity of an object in the image:  Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of applying a decision tree to an image to predict an identity of an object in the image, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.  For example, one can mentally, with the assistance of pen and paper, look at an image and identify objects based on a decision to "identify all round objects".
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Specifically, the additional elements consist of “memory storing instructions for generating and storing at least one decision tree comprising a plurality of nodes connected by edges, the nodes comprising a root node, internal nodes and leaf nodes, wherein at least the internal nodes have assigned modules comprising parameterized, differentiable operations” and “a processor”.  The additional elements of “a memory”, “a processor”, and “modules” are generic computer components recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)).  The additional element of “memory storing instructions for generating and storing at least one decision tree comprising a plurality of nodes connected by edges, the nodes comprising a root node, internal nodes and leaf nodes” is insignificant extra-solution activity that does not amount to an inventive concept (see MPEP § 2106.05(g); “mere data gathering”).  Last, the additional element of “wherein at least the internal nodes have assigned modules comprising parameterized, differentiable operations” amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). Thus, the additional elements do not provide any meaningful limits on the execution of the abstract idea. Even when viewed in combination, these additional elements do not integrate the abstract idea into a practical application and the claim is thus directed to the abstract idea.
Step 2B:  The claim does not contain significantly more than the judicial exception.  The additional elements of “a memory”, “a processor”, and “modules” are generic computer components recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)).  The additional element of “memory storing instructions for generating and storing at least one decision tree comprising a plurality of nodes connected by edges, the nodes comprising a root node, internal nodes and leaf nodes” is insignificant extra-solution activity that does not amount to an inventive concept, and is a well-understood, routine, conventional activity (see MPEP §2106.05 (g); “mere data gathering”; and MPEP §2106.05(d); “Storing and retrieving information in memory”).  Last, the additional element of “wherein at least the internal nodes have assigned modules comprising parameterized, differentiable operations” amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (see MPEP §2106.05 (h)). Nothing in the claim provides significantly more than that abstract idea.  As such, the claim is ineligible.
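
For reference only, the general notion of removing nodes from a decision tree by comparison against held-out data can be sketched with conventional reduced-error pruning. This is a swapped-in textbook technique offered as an assumption for illustration; it is not asserted to be the method of the claims or of the specification. The dictionary keys (`feature`, `threshold`, `left`, `right`, `majority`, `label`) and function names are hypothetical.

```python
def predict(node, x):
    # Leaves carry a "label"; internal nodes route on one feature and a threshold.
    while "label" not in node:
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]


def accuracy(tree, data):
    # Fraction of (example, outcome) pairs the tree predicts correctly.
    return sum(predict(tree, x) == y for x, y in data) / len(data)


def prune(tree, node, validation):
    """Bottom-up reduced-error pruning (illustrative assumption).

    An internal node is collapsed into a leaf carrying its stored
    "majority" label unless doing so lowers validation accuracy.
    """
    if "label" in node:
        return
    prune(tree, node["left"], validation)
    prune(tree, node["right"], validation)
    before = accuracy(tree, validation)
    saved = dict(node)            # shallow copy so the node can be restored
    node.clear()
    node["label"] = saved["majority"]
    if accuracy(tree, validation) < before:
        node.clear()              # pruning hurt accuracy: undo it
        node.update(saved)
```

Called as `prune(tree, tree, validation)`, the sketch modifies the tree in place and keeps any subtree whose removal would reduce accuracy on the validation pairs.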

Claim 2
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “wherein the assigned modules along the path form a neural network, and wherein the example x is any of: the image or an image feature map derived from the image”. These limitations merely place restrictions on the type of data used in the analysis and the technological environment in which the judicial exception is performed, and do not negate the mental nature of the underlying process.
Step 2A Prong 2, Step 2B:  This claim recites the additional elements of “wherein the assigned modules along the path form a neural network, and wherein the example x is any of: the image or an image feature map derived from the image”, which are field of use limitations (see MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55 and footnote 32). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 3
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “wherein the assigned modules which are assigned to internal nodes of the at least one decision tree are routers configured to compute a binary decision in a stochastic manner according to characteristics of the processed example”. Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of computing a binary decision, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper, or a mathematical concept.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “wherein the example x is any of: the image or an image feature map derived from the image”, which is a field of use limitation (see MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55 and footnote 32). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 4
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “computing the binary decision according to samples from a probability distribution with a mean corresponding to a current input to the decision tree”. Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of computing a binary decision, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “a processor”, which is a generic computer component recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.
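
For context, a binary decision computed from a sample of a probability distribution whose mean depends on the current input can be sketched as a Bernoulli draw. The sigmoid form of the mean, the weights, and all function names below are assumptions made for illustration and are not taken from the claims or the specification.

```python
import math
import random


def routing_probability(x, weights, bias):
    # Hypothetical mean of the distribution: a sigmoid of a linear
    # function of the current input x (an assumed parameterization).
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def stochastic_route(mean, rng):
    # One Bernoulli sample: "left" with probability `mean`, else "right".
    return "left" if rng.random() < mean else "right"


rng = random.Random(0)  # seeded so repeated runs give the same draws
decision = stochastic_route(routing_probability([0.2, -0.1], [1.0, 2.0], 0.0), rng)
```

The arithmetic in each call is a weighted sum, one exponential, and one comparison against a random number.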

Claim 5
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “operate on transformed input data received at the solver and to output an estimate of a conditional distribution expressing the probability of the outcome y given the example x”. Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of operating on an input to output an estimate, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “solvers”, which are generic computer components recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 6
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “compute a non-linear function of an example or a processed example reaching the edge from a parent node”. Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mathematical concept of computing a non-linear function.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “transformers”, which are generic computer components recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 7
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “wherein at least one of the transformers is a single convolutional layer of a neural network followed by a rectified linear unit”. This limitation merely places restrictions on the type of data used in the analysis and the technological environment in which the judicial exception is performed, and does not negate the mental nature of the underlying process.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “wherein at least one of the transformers is a single convolutional layer of a neural network followed by a rectified linear unit”, which is a field of use limitation (see MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55 and footnote 32). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.
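
For context, a single convolutional layer followed by a rectified linear unit can be sketched in a few lines. The sketch below assumes a single-channel image stored as nested lists, one kernel, no padding, and a stride of one; the function names are hypothetical and nothing here is drawn from the specification.

```python
def conv2d_valid(image, kernel):
    # Slide the kernel over the image (no padding, stride 1) and take the
    # sum of element-wise products at each position.
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    # Rectified linear unit: negative responses are clamped to zero.
    return [[max(0.0, v) for v in row] for row in feature_map]


def conv_relu(image, kernel):
    # Hypothetical single-layer transform: convolution, then ReLU.
    return relu(conv2d_valid(image, kernel))
```

For example, `conv_relu([[1, 2], [3, 4]], [[1, -1]])` produces a 2-by-1 feature map of zeros, because both responses (1 - 2 and 3 - 4) are negative before the rectification.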

Claim 8
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “generating [the decision tree] using a growing process which is dependent on a set of training data used to train the predictor”. Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of generating a decision tree using a growing process, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim recites the additional elements of “wherein the training data comprises any of: the image, image feature map derived from the image, video, audio signal, text segment, phonemes from a speech recognition pre-processing system, skeletal data produced by a system which estimates skeletal positions of humans or animals from images, sensor data, data derived from sensor data”, which are field of use limitations (see MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55 and footnote 32). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 9
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “wherein the outcome is a class label and the image is a voxel of a medical image, and wherein the predictor is used for medical image analysis”. These limitations merely place restrictions on the type of data used in the analysis and the technological environment in which the judicial exception is performed, and do not negate the mental nature of the underlying process.
Step 2A Prong 2, Step 2B:  This claim recites the additional elements of “wherein the outcome is a class label and the image is a voxel of a medical image, and wherein the predictor is used for medical image analysis”, which are field of use limitations (see MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55 and footnote 32). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 10
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “compute a non-linear function which acts to filter the medical image”. Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mathematical concept of computing a non-linear function.
Step 2A Prong 2, Step 2B:  This claim recites the additional elements of “wherein the assigned modules which are assigned to edges of the decision tree are transformers” and “where a plurality of different transformers are used”, which are field of use limitations (see MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55 and footnote 32). Further, the “transformers” are generic computer components recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)).  Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 11
Step 1:  The claim recites a method; therefore, it is directed to the statutory category of a process.
Step 2A Prong 1:  The claim recites, inter alia:
generating the at least one decision tree through computing the differentiable operations to create the internal nodes and the edge nodes: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mathematical concept of generating a decision tree through mathematical computations/operations.
removing at least one of the internal nodes or the edge nodes from the at least one decision tree through comparing values of the modules with the training examples data to create a pruned decision tree:  Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of removing nodes or edges of a decision tree based on a comparison, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.  For example, one can mentally, with the assistance of pen and paper, remove a node and edge of a decision tree that is redundant or unnecessary, thus pruning the decision tree.
applying the pruned decision tree to the image to predict an identity of an object in the image:  Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of applying a decision tree to an image to predict an identity of an object in the image, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.  For example, one can mentally, with the assistance of pen and paper, look at an image and identify objects based on a decision to "identify all round objects".
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Specifically, the additional elements consist of “storing in a memory a plurality of training examples comprising examples x for which outcomes y are known”.  The additional element of “a memory” is a generic computer component recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)).  The additional element of “storing in a memory a plurality of training examples comprising examples x for which outcomes y are known” is insignificant extra-solution activity that does not amount to an inventive concept (see MPEP § 2106.05(g); “mere data gathering”).  Thus, the additional elements do not provide any meaningful limits on the execution of the abstract idea. Even when viewed in combination, these additional elements do not integrate the abstract idea into a practical application and the claim is thus directed to the abstract idea.
Step 2B:  The claim does not contain significantly more than the judicial exception.  The additional element of “a memory” is a generic computer component recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)).  The additional element of “storing in a memory a plurality of training examples comprising examples x for which outcomes y are known” is insignificant extra-solution activity that does not amount to an inventive concept, and is a well-understood, routine, conventional activity (see MPEP § 2106.05(g); “mere data gathering”; and MPEP § 2106.05(d); “Storing and retrieving information in memory”).  Nothing in the claim provides significantly more than that abstract idea.  As such, the claim is ineligible.

Claim 12
Step 1:  A process, as above.
Step 2A Prong 1:  The claim recites “constructing a first model by simulating splitting of an internal node by adding a router module, and constructing a second model by simulating increasing the depth of an incoming edge of the internal node by adding a transformer module”. Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental processes of constructing two models through simulations, which are observations or evaluations capable of being practically performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 13
Step 1:  A process, as above.
Step 2A Prong 1:  The claim recites “fixing the parameters of the decision tree in the first and second models, except for the parameters of modules added in the simulation, and computing a local optimization using the training data to adjust the non-fixed parameters”. Under its broadest reasonable interpretation in light of the specification, these limitations encompass the mental processes of fixing parameters in models and computing a local optimization, which are observations or evaluations capable of being practically performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 14
Step 1:  A process, as above.
Step 2A Prong 1:  The claim recites “making the decision by assessing the performance of: the first model, the second model, and the at least one decision tree before any changes, using the training examples and selecting according to a most accurate one of these options”. Under its broadest reasonable interpretation in light of the specification, these limitations encompass the mental processes of assessing performances of the models and selecting an option, which are observations or evaluations capable of being practically performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.
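
For context, the assessment and selection recited in claim 14 amounts to scoring three candidates on the training examples and keeping the best. A minimal sketch follows; the candidate names, the use of plain classification accuracy as the performance measure, and the representation of each model as a Python callable are assumptions made for illustration only.

```python
def accuracy(model, examples):
    # Fraction of (example x, outcome y) pairs the model predicts correctly.
    return sum(model(x) == y for x, y in examples) / len(examples)


def select_most_accurate(candidates, examples):
    # `candidates` maps a hypothetical option name to a callable model.
    # On ties, max() keeps the first option in insertion order.
    return max(candidates, key=lambda name: accuracy(candidates[name], examples))


training_examples = [(0, 0), (1, 1), (2, 1), (3, 1)]
options = {
    "unchanged tree": lambda x: 0,            # stand-in for the tree before any changes
    "first model": lambda x: int(x > 0),      # stand-in for the simulated split
    "second model": lambda x: 1,              # stand-in for the simulated deeper edge
}
chosen = select_most_accurate(options, training_examples)
```

With these stand-in models the accuracies are 0.25, 1.0, and 0.75 respectively, so the sketch selects "first model".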

Claim 15
Step 1:  A process, as above.
Step 2A Prong 1:  The claim recites “refining the decision tree by computing a global optimization over parameters of the modules using the training examples”. Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of refining a decision tree, which is an observation or evaluation capable of being practically performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “wherein the training examples comprise any of: the image, image feature map derived from the image”, which is a field of use limitation (see MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55 and footnote 32). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 16
Step 1:  A process, as above.
Step 2A Prong 1:  The claim recites “wherein the global optimization jointly optimizes a hierarchical grouping of data to paths on the at least one decision tree and neural networks associated with those paths”. Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of optimizing a group of data paths, which is an observation or evaluation capable of being practically performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 17
Step 1:  The claim recites one or more servers; therefore, it is directed to the statutory category of a manufacture.
Step 2A Prong 1:  The claim recites, inter alia:
generate the at least one decision tree through computing the differentiable operations to create the internal nodes and the edge nodes: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mathematical concept of generating a decision tree through mathematical computations/operations.
remove at least one of the internal nodes or the edge nodes from the at least one decision tree through comparing values of the modules with the validation data to create a pruned decision tree:  Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of removing nodes or edges of a decision tree based on a comparison, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.  For example, one can mentally, with the assistance of pen and paper, remove a node and edge of a decision tree that is redundant or unnecessary, thus pruning the decision tree.
apply the pruned decision tree to the image to predict an identity of an object in a video sample:  Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of applying a decision tree to a video sample to predict an identity of an object in the video sample, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.  For example, one can mentally, with the assistance of pen and paper, look at an image and identify objects based on a decision to "identify all round objects".
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Specifically, the additional elements consist of “memory storing instructions for generating and storing at least one decision tree comprising a plurality of nodes connected by edges, the nodes comprising a root node, internal nodes and leaf nodes, wherein at least the internal nodes have assigned modules comprising parameterized, differentiable operations” and “a processor”.  The additional elements of “a memory”, “a processor”, and “modules” are generic computer components recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)).  The additional element of “memory storing instructions for generating and storing at least one decision tree comprising a plurality of nodes connected by edges, the nodes comprising a root node, internal nodes and leaf nodes” is insignificant extra-solution activity that does not amount to an inventive concept (see MPEP § 2106.05(g); “mere data gathering”).  Last, the additional element of “wherein at least the internal nodes have assigned modules comprising parameterized, differentiable operations” amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). Thus, the additional elements do not provide any meaningful limits on the execution of the abstract idea. Even when viewed in combination, these additional elements do not integrate the abstract idea into a practical application and the claim is thus directed to the abstract idea.
Step 2B:  The claim does not contain significantly more than the judicial exception.  The additional elements of “a memory”, “a processor”, and “modules” are generic computer components recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)).  The additional element of “memory storing instructions for generating and storing at least one decision tree comprising a plurality of nodes connected by edges, the nodes comprising a root node, internal nodes and leaf nodes” is insignificant extra-solution activity that does not amount to an inventive concept, and is a well-understood, routine, conventional activity (see MPEP §2106.05 (g); “mere data gathering”; and MPEP §2106.05(d); “Storing and retrieving information in memory”).  Last, the additional element of “wherein at least the internal nodes have assigned modules comprising parameterized, differentiable operations” amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (see MPEP §2106.05 (h)). Nothing in the claim provides significantly more than that abstract idea.  As such, the claim is ineligible.

Claim 18
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “wherein the image comprises medical image data”. This limitation merely places restrictions on the type of data used in the analysis and the technological environment in which the judicial exception is performed, and does not negate the mental nature of the underlying process.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “wherein the image comprises medical image data”, which is a field of use limitation under MPEP § 2106.05(h); MPEP 2106.04(d); 2019 Guidance, 84 FR 50 at 55.  See 2019 Guidance, 84 FR 50, footnote 32. Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 21
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “tagging the objects in the video”. Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of tagging or labelling objects in a video, which is an observation or evaluation capable of being practically performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.


Claim 22
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “recognizing human-made objects from natural objects in the image”. Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of recognizing human-made versus natural objects in an image, which is an observation or evaluation capable of being practically performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-7 and 11-17 are rejected under 35 U.S.C. § 103 as being obvious over Xiao (Xiao, “NDT: Neural Decision Tree Towards Fully Functioned Neural Graph”, Dec. 16, 2017, arXiv:1712.05934v1, pp. 1-8, hereinafter “Xiao”) in view of Brabec et al. (US 20190102337 A1, hereinafter “Brabec”).

	Regarding claim 1, Xiao discloses [a] predictor for predicting an outcome y given an example x, comprising an image, for a data set for usage in image classification, the predictor comprising: (Abstract; “we propose the neural decision tree (NDT), which takes simplified neural networks as decision function in each branch and employs complex neural networks to generate the output in each leaf”, which discloses a predictor in the form of a neural decision tree that inherently processes received inputs or examples x to predict an outcome y; and Page 3, Figure 2;  the figure discloses the neural tree structure that takes an input example x to predict an outcome y in the form of a target output; and Page 1, Column 2; “With this proposed principle from the seminal work, we attempt to tackle image classification”, which discloses image classification using image data; and Page 5, § 5.2; “With this proposed principle from the seminal work, we attempt to tackle image classification”)
memory (Page 5, Experiment; the experiment section inherently uses a memory that stores inputs that are used in the experiment) storing instructions for generating and storing at least one decision tree comprising a plurality of nodes connected by edges, the nodes comprising a root node, internal nodes and leaf nodes; (Abstract; “neural decision tree”; and Page 3, Figure 2;  the figure discloses the plurality of nodes connected by edges indicated by arrows, the nodes comprising a root node (upper-most condition network in the figure), internal nodes (lower condition networks in the figure), and leaf nodes (target network))
wherein at least the internal nodes have assigned modules comprising parameterized, differentiable operations; (Page 3, Figure 2;  the figure discloses, under a broadest reasonable interpretation of the claim language, wherein each individual node has an assigned module in the form of a respective condition network, and, at each of the internal nodes or condition networks, the module computes a binary outcome in that the condition network splits according to >0 or <=0 to select a child node (one of the condition networks below a parent condition network) of the internal node. Note that the parameterized, differentiable operations, under a BRI, are computed at the condition network operation in the figure, which is akin to the “Router” operations as disclosed in paragraph [0023] of the present application; and Page 2, Column 2; “we employ a simplified neural network as condition network, which is usually a one- or two-layer multi-perceptions with the non-linear function of tanh”, which discloses that each node comprises parameterized differentiable operations in the form of a tanh operation)
generate the at least one decision tree through computing the differentiable operations to create the internal nodes and the edge nodes (Page 4, Algorithm 1; the algorithm discloses, under a broadest reasonable interpretation of the claim language, generating the at least one decision tree (disclosed as “TREECONSTRUCTION” at line 1 of the algorithm) through computing the differentiable operations to create the internal nodes and edges; and Page 4, §3; the section provides further details on how the decision tree is generated through computing the differentiable operations to create the nodes and edges of the decision tree. See “Mathematically, the component of neurons are continuous functions, such as matrix multiply, hyperbolic tangent (tanh), convolution layer, etc, which could be implemented as mathematical operations”; and Page 3, Figure 2; the figure discloses the creation of internal and edge nodes).
apply the [[pruned]] decision tree to the image to predict an identity of an object in the image (Page 5, §5.2; “The MNIST dataset (Lecun et al., 1998) is a classic benchmark dataset, which consists of handwritten digit images, 28 x 28 pixels in size, organized into 10 classes (0 to 9) with 60,000 training and 10,000 test samples”, which discloses that the decision tree or random forest that is developed in the Xiao reference is used to predict the identity of an object in an image or a handwritten number; and Page 6, Figure 3; the figure discloses applying a decision tree to predict an identity of an object or number in an image or handwritten digit image.  Note further that the table in the figure discloses the predictions; and Page 6, Column 2; “Firstly, we could clearly draw the conclusion from Figure 3, that each leaf node needs to predict less categories, which justifies our assumption. For example, in the bottom figure, the node “A” only needs to predict the category “1”, which is a single classification, and the node “H” only needs to predict the categories “0,3,5,8” which is a four classification”).
Xiao fails to explicitly disclose but Brabec discloses remove at least one of the internal nodes or the edge nodes from the at least one decision tree through comparing values of the modules with the validation data to create a pruned decision tree ([0070]; “According to various embodiments, the device may optionally also perform a model complexity pruning operation on the model that results from the decision refinement phase”, the pruning operation is the removing of internal or edge nodes from the decision tree; and [0071]; “FIG. 5 illustrates an example of pruning leaves from a random decision forest, in some embodiments. As shown, assume that a given decision tree 500 includes nodes 1-7 shown, which may be the result of the decision refinement phase described above”, the pruning of leaves being the pruning of nodes; and [0072]; “In turn, parent node 3 may itself be considered for consolidation (e.g., by comparing its prediction with its sibling node 2)”, which discloses that the parent node may also be pruned, and the parent node may be broadly interpreted as an internal or edge node; and [0066]; “Note that in each iteration of the stochastic iterative retraining, the model is also trained on a different random subset S.sub.f. In theory, this may cause some of the objects, which were correctly classified in the previous iteration by chance, to be misclassified in the current iteration and added to D.sub.important set”, which discloses that, during the model retraining or refinement stage where the decision tree is pruned, a validation or training set/subset is used to perform the pruning operation, thus resulting in the pruned decision tree; and [0064]; the paragraph and the accompanying pseudocode disclose the decision tree refinement that uses training or validation data to do the refinement to create a pruned or refined decision tree; and Figure 5)
apply the pruned decision tree to the image to predict [[an identity of an object in the image]] ([0074]; “Stated simply, pruning reduces the size of the tree, but does not affect the predictions of the tree. In doing so, the resources needed to store and execute the classifier may be reduced significantly, without any loss of performance”, which discloses that the pruned decision tree is then applied to make subsequent predictions; and [0083]; “At step 625, as detailed above, the device may adjust the prediction labels of individual leaves of the random decision forest of the retrained malware classifier, based in part on decision changes in the forest that result from assessing the entire training dataset with the classifier. More specifically, for each leaf in each tree, the device may compute the histogram of objects from the training dataset which end up in this leaf. The device may then determine the final prediction of the leaf from the histogram (e.g., via soft voting) of class distributions in each leaf or, alternatively, the predicted class is the one with the highest object count, in various embodiment”, which discloses, under a BRI, that the pruned or retrained decision tree is then applied to make predictions)
Xiao and Brabec are analogous art because both are concerned with decision trees and machine learning.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in decision trees and machine learning to combine the pruning and predicting of Brabec with the predictor of Xiao to yield the predictable result of removing at least one of the internal nodes or the edge nodes from the at least one decision tree through comparing values of the modules with the validation data to create a pruned decision tree; and applying the pruned decision tree to the image to predict. The motivation for doing so would be to adjust prediction labels of individual leaves of a random decision forest of a retrained malware classifier based in part on decision changes in the forest that result from assessing the entire training dataset with the classifier (Brabec; [0011]).
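
For illustration only, and not as a characterization of the actual implementation of either reference, the combination discussed above can be sketched as follows. The sketch assumes a binary tree whose internal nodes route on the sign of a tanh score (in the manner of the condition networks cited from Xiao) and a pruning pass that consolidates sibling leaves carrying the same prediction (in the manner of the consolidation cited from Brabec at [0072]). All names (Leaf, Internal, route, predict, prune, count_nodes) and all weights are hypothetical.

```python
import math


class Leaf:
    """Leaf node holding a predicted class label (hypothetical structure)."""

    def __init__(self, label):
        self.label = label


class Internal:
    """Internal node with a one-layer tanh condition (assumed form)."""

    def __init__(self, weights, bias, left, right):
        self.weights = weights
        self.bias = bias
        self.left = left    # taken when the tanh score is <= 0
        self.right = right  # taken when the tanh score is > 0


def route(node, x):
    """Select a child of an internal node from the sign of tanh(w.x + b)."""
    score = math.tanh(sum(w * xi for w, xi in zip(node.weights, x)) + node.bias)
    return node.right if score > 0 else node.left


def predict(node, x):
    """Follow routing decisions from the root until a leaf is reached."""
    while isinstance(node, Internal):
        node = route(node, x)
    return node.label


def prune(node):
    """Bottom-up pass: replace an internal node by a single leaf when both
    of its children are leaves with the same label."""
    if isinstance(node, Leaf):
        return node
    node.left = prune(node.left)
    node.right = prune(node.right)
    both_leaves = isinstance(node.left, Leaf) and isinstance(node.right, Leaf)
    if both_leaves and node.left.label == node.right.label:
        return Leaf(node.left.label)
    return node


def count_nodes(node):
    """Count all nodes (internal and leaf) in the tree."""
    if isinstance(node, Leaf):
        return 1
    return 1 + count_nodes(node.left) + count_nodes(node.right)


# Hypothetical tree: the right subtree has two leaves with the same label.
tree = Internal([1.0, 0.0], 0.0,
                Leaf("0"),
                Internal([0.0, 1.0], 0.0, Leaf("1"), Leaf("1")))
samples = [[-1.0, 0.5], [1.0, -2.0], [2.0, 3.0]]

size_before = count_nodes(tree)
before = [predict(tree, x) for x in samples]
pruned = prune(tree)
size_after = count_nodes(pruned)
after = [predict(pruned, x) for x in samples]
```

In this sketch the pruned tree has fewer nodes while returning the same predictions on the sample inputs, which is consistent with the statement quoted from Brabec at [0074] that pruning reduces the size of the tree without affecting its predictions.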


Regarding claim 2, the rejection of claim 1 is incorporated and Xiao further discloses wherein the assigned modules along the path form a neural network, and (Page 2, Column 2; “we employ a simplified neural network as condition network, which is usually a one- or two-layer multi-perceptions”; and Page 3, Figure 2)
wherein the example x is any of: the image or an image feature map derived from the image (Page 1, Column 2; “With this proposed principle from the seminal work, we attempt to tackle image classification”, which discloses wherein the example is an image; and Page 5, §5.2; the section discloses using input images).

Regarding claim 3, the rejection of claim 1 is incorporated and Xiao further discloses wherein the assigned modules which are assigned to internal nodes of the at least one decision tree are routers configured to compute a binary decision in a stochastic manner according to characteristics of the processed example, and (Page 3, Figure 2; the figure discloses wherein the assigned modules are assigned to internal nodes or condition networks that compute a binary decision (>=0 or <0) in a stochastic manner)
wherein the example x is any of: the image or an image feature map derived from the image (Page 1, Column 2; “With this proposed principle from the seminal work, we attempt to tackle image classification”, which discloses wherein the example is an image; and Page 5, §5.2; the section discloses using input images).

Regarding claim 4, the rejections of claims 1 and 3 are incorporated and Xiao further discloses wherein at least one of the routers comprises a processor for computing the binary decision according to samples from a probability distribution with a mean corresponding to a current input to the decision tree (Page 5, §5; the experiments section discloses the inherent processor used in the experiment for computing the binary decision according to samples from a probability distribution with a mean corresponding to a current input into the decision tree as a multitude of test samples are used in the study corresponding to the current input).
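
As an illustrative sketch only, the claimed limitation of a binary decision computed according to samples from a probability distribution whose mean corresponds to the current input may be understood as a Bernoulli draw. The sketch below assumes, purely for illustration, that the mean is a sigmoid of a linear score of the input; the function names, weights, and the choice of a sigmoid are hypothetical and are not taken from the claims, the specification, or Xiao.

```python
import math
import random


def router_mean(weights, bias, x):
    """Mean of the Bernoulli distribution for input x (assumed sigmoid form)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def stochastic_route(weights, bias, x, rng):
    """Sample a binary decision: 1 (right child) with probability equal to
    the input-dependent mean, otherwise 0 (left child)."""
    p_right = router_mean(weights, bias, x)
    return 1 if rng.random() < p_right else 0


rng = random.Random(0)           # seeded generator for repeatable draws
x = [0.5, -1.0]                  # hypothetical input features
weights, bias = [2.0, 1.0], 0.0  # hypothetical router parameters

p = router_mean(weights, bias, x)  # linear score is 0, so the mean is 0.5
draws = [stochastic_route(weights, bias, x, rng) for _ in range(1000)]
frequency = sum(draws) / len(draws)
```

Over many draws, the observed frequency of the right-child decision approaches the input-dependent mean, which distinguishes a stochastic router of this kind from a deterministic threshold on the same score.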

Regarding claim 5, the rejection of claim 1 is incorporated and Xiao further discloses wherein the assigned modules which are assigned to leaf nodes of the at least one decision tree are solvers configured to operate on transformed input data received at the solver and to output an estimate of a conditional distribution expressing the probability of the outcome y given the example x, (Page 3, Column 2; “To finally predict the category of each sample, we apply a complex network as the target network, which often is a stacked convolution one for image or an LSTM for sentence”; and Figure 2; “Target network”).
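
For illustration only, a leaf-node solver that outputs an estimate of a conditional distribution over outcomes y given an example x can be sketched as a linear layer followed by a softmax. This form is an assumption made for the sketch; it is not asserted to be the target network of Xiao or the solver of the present application, and the names and parameter values are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to one."""
    largest = max(logits)  # subtracted for numerical stability
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def solver(weight_rows, biases, features):
    """Estimate p(y | x) from transformed input features at a leaf
    (assumed linear-softmax form)."""
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weight_rows, biases)]
    return softmax(logits)


# Hypothetical three-class solver applied to two transformed features.
probs = solver([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
               [0.0, 0.0, 0.0],
               [2.0, 1.0])
predicted_class = probs.index(max(probs))
```

The output is a probability for each class rather than a single label; a predicted category can then be taken as the class with the highest probability.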

Regarding claim 6, the rejection of claim 1 is incorporated and Xiao further discloses wherein the assigned modules which are assigned to edges of the at least one decision tree are transformers, each transformer configured to compute a non-linear function of an example or a processed example reaching the edge from a parent node (Page 2, Column 2; “To exactly pre-classify each sample, we employ a simplified neural network as condition network, which is usually a one- or two-layer multi-perceptions with the non-linear function of tanh”).

Regarding claim 7, the rejections of claims 1 and 6 are incorporated and Xiao further discloses wherein at least one of the transformers is a single convolutional layer of a neural network followed by a rectified linear unit (Page 5, Column 2; “CNN-based architecture LeNet-5 with dropout and ReLUs, classic linear classifier SVM with RBF kernel”; and Page 2, Column 2; “a one- or two-layer multi-perceptions with the non-linear function of tanh. This layer is only applied in the inner nodes of decision tree”).

Regarding claim 11, Xiao discloses [a] computer-implemented method of training a predictor to predict an outcome y given an example x for a data set for usage in image classification, the method comprising: (Abstract; “we propose the neural decision tree (NDT), which takes simplified neural networks as decision function in each branch and employs complex neural networks to generate the output in each leaf”, which discloses a predictor in the form of a neural decision tree that inherently processes received inputs or examples x to predict an outcome y; and Page 3, Figure 2;  the figure discloses the training of a neural tree structure that takes an input example x to predict an outcome y in the form of a target output, and this is inherently done on a computer; and Page 1, Column 2; “With this proposed principle from the seminal work, we attempt to tackle image classification”, which discloses image classification using image data; and Page 5, § 5.2; “With this proposed principle from the seminal work, we attempt to tackle image classification”)
storing in a memory a plurality of training examples comprising examples x for which outcomes y are known; (Page 3, Figure 2; the figure discloses the input example (input) x for which an outcome y (target outcome) is known during the training phase of building the decision tree; and Page 5, §5.2;  the section discloses the use of training examples used in the experiment that inherently uses and stores information in memory, where the training examples have known outcomes for a given input in creating the decision tree)
generate the at least one decision tree through computing the differentiable operations to create the internal nodes and the edge nodes (Page 4, Algorithm 1; the algorithm discloses, under a broadest reasonable interpretation of the claim language, generating the at least one decision tree (disclosed as “TREECONSTRUCTION” at line 1 of the algorithm) through computing the differentiable operations to create the internal nodes and edges; and Page 4, §3; the section provides further details on how the decision tree is generated through computing the differentiable operations to create the nodes and edges of the decision tree. See “Mathematically, the component of neurons are continuous functions, such as matrix multiply, hyperbolic tangent (tanh), convolution layer, etc, which could be implemented as mathematical operations”; and Page 3, Figure 2; the figure discloses the creation of internal and edge nodes).
applying the [[pruned]] decision tree to the image to predict an identity of an object in the image (Page 5, §5.2; “The MNIST dataset (Lecun et al., 1998) is a classic benchmark dataset, which consists of handwritten digit images, 28 x 28 pixels in size, organized into 10 classes (0 to 9) with 60,000 training and 10,000 test samples”, which discloses that the decision tree or random forest that is developed in the Xiao reference is used to predict the identity of an object in an image or a handwritten number; and Page 6, Figure 3; the figure discloses applying a decision tree to predict an identity of an object or number in an image or handwritten digit image.  Note further that the table in the figure discloses the predictions; and Page 6, Column 2; “Firstly, we could clearly draw the conclusion from Figure 3, that each leaf node needs to predict less categories, which justifies our assumption. For example, in the bottom figure, the node “A” only needs to predict the category “1”, which is a single classification, and the node “H” only needs to predict the categories “0,3,5,8” which is a four classification”).
Xiao fails to explicitly disclose but Brabec discloses remove at least one of the internal nodes or the edge nodes from the at least one decision tree through comparing values of the modules with the training examples data to create a pruned decision tree ([0070]; “According to various embodiments, the device may optionally also perform a model complexity pruning operation on the model that results from the decision refinement phase”, the pruning operation is the removing of internal or edge nodes from the decision tree; and [0071]; “FIG. 5 illustrates an example of pruning leaves from a random decision forest, in some embodiments. As shown, assume that a given decision tree 500 includes nodes 1-7 shown, which may be the result of the decision refinement phase described above”, the pruning of leaves being the pruning of nodes; and [0072]; “In turn, parent node 3 may itself be considered for consolidation (e.g., by comparing its prediction with its sibling node 2)”, which discloses that the parent node may also be pruned, and the parent node may be broadly interpreted as an internal or edge node; and [0066]; “Note that in each iteration of the stochastic iterative retraining, the model is also trained on a different random subset S.sub.f. In theory, this may cause some of the objects, which were correctly classified in the previous iteration by chance, to be misclassified in the current iteration and added to D.sub.important set”, which discloses that, during the model retraining or refinement stage where the decision tree is pruned, training examples or a training set/subset is used to perform the pruning operation, thus resulting in the pruned decision tree; and [0064]; the paragraph and the accompanying pseudocode disclose the decision tree refinement that uses training or validation data to do the refinement to create a pruned or refined decision tree; and Figure 5)
apply the pruned decision tree to the image to predict [[an identity of an object in the image]] ([0074]; “Stated simply, pruning reduces the size of the tree, but does not affect the predictions of the tree. In doing so, the resources needed to store and execute the classifier may be reduced significantly, without any loss of performance”, which discloses that the pruned decision tree is then applied to make subsequent predictions; and [0083]; “At step 625, as detailed above, the device may adjust the prediction labels of individual leaves of the random decision forest of the retrained malware classifier, based in part on decision changes in the forest that result from assessing the entire training dataset with the classifier. More specifically, for each leaf in each tree, the device may compute the histogram of objects from the training dataset which end up in this leaf. The device may then determine the final prediction of the leaf from the histogram (e.g., via soft voting) of class distributions in each leaf or, alternatively, the predicted class is the one with the highest object count, in various embodiment”, which discloses, under a BRI, that the pruned or retrained decision tree is then applied to make predictions)
Xiao and Brabec are analogous art because both are concerned with decision trees and machine learning.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in decision trees and machine learning to combine the pruning and predicting of Brabec with the predictor of Xiao to yield the predictable result of removing at least one of the internal nodes or the edge nodes from the at least one decision tree through comparing values of the modules with the training examples data to create a pruned decision tree. The motivation for doing so would be to adjust prediction labels of individual leaves of a random decision forest of a retrained malware classifier based in part on decision changes in the forest that result from assessing the entire training dataset with the classifier (Brabec; [0011]).

Regarding claim 12, the rejection of claim 11 is incorporated and Xiao further discloses constructing a first model by simulating splitting of an internal node by adding a router module, and (Page 3, Figure 2; the figure discloses wherein the assigned modules are assigned to internal nodes or condition networks (router module) that compute a binary decision (>=0 or <0) in a stochastic manner)
constructing a second model by simulating increasing the depth of an incoming edge of the internal node by adding a transformer module, and (Page 2, Column 2; “To exactly pre-classify each sample, we employ a simplified neural network as condition network, which is usually a one- or two-layer multi-perceptions with the non-linear function of tanh”; and Page 5, Table 1;  the table shows the multiple models at a certain depth; and Page 6, Table 2;  the table shows the multiple models at an increased depth that adds further transformer modules).

Regarding claim 13, the rejection of claims 11 and 12 are incorporated and Xiao further discloses making the decision by, fixing the parameters of the decision tree in the first and second models, except for the parameters of modules added in the simulation, and computing a local optimization using the training data to adjust the non-fixed parameters (Page 5, §5.1; “Regarding the condition network, we apply a two-layer fully connected perceptions, with the hyper-parameter input-300-1 for MNIST and input-3000-1 for CIFAR. Regarding the target network, we also employ a three-layer fully connected perceptions, with the hyper-parameter input-300-100-10 for MNIST, input3000-1000-10 for CIFAR-10 and input-3000-1000-100 for CIFAR-100. 1 To train the model, we leverage AdaDelta (Zeiler, 2012) as our optimizer, with hyper-parameter as moment factor η = 0.6 and = 1 × 10−6 . We train the model until convergence, but at most 1,000 rounds. Regarding the batch size, we always choose the largest one to fully utilize the computing devices. Notably, the hyper-parameters of approximated continuous function is α = 1000”).

Regarding claim 14, the rejection of claims 11, 12, and 13 are incorporated and Xiao further discloses making the decision by assessing the performance of: the first model, the second model, and the at least one decision tree before any changes, using the training examples and selecting according to a most accurate one of these options (Page 5, §5.1; “Regarding the condition network, we apply a two-layer fully connected perceptions, with the hyper-parameter input-300-1 for MNIST and input-3000-1 for CIFAR. Regarding the target network, we also employ a three-layer fully connected perceptions, with the hyper-parameter input-300-100-10 for MNIST, input3000-1000-10 for CIFAR-10 and input-3000-1000-100 for CIFAR-100. 1 To train the model, we leverage AdaDelta (Zeiler, 2012) as our optimizer, with hyper-parameter as moment factor η = 0.6 and = 1 × 10−6 . We train the model until convergence, but at most 1,000 rounds. Regarding the batch size, we always choose the largest one to fully utilize the computing devices. Notably, the hyper-parameters of approximated continuous function is α = 1000”; and Page 5, §5.2).

Regarding claim 15, the rejection of claim 11 is incorporated and Xiao further discloses refining the decision tree by computing a global optimization over parameters of the modules using the training examples, (Page 5, §5.1; “Regarding the condition network, we apply a two-layer fully connected perceptions, with the hyper-parameter input-300-1 for MNIST and input-3000-1 for CIFAR. Regarding the target network, we also employ a three-layer fully connected perceptions, with the hyper-parameter input-300-100-10 for MNIST, input3000-1000-10 for CIFAR-10 and input-3000-1000-100 for CIFAR-100. 1 To train the model, we leverage AdaDelta (Zeiler, 2012) as our optimizer, with hyper-parameter as moment factor η = 0.6 and = 1 × 10−6 . We train the model until convergence, but at most 1,000 rounds”, convergence being the global optimization)
wherein the training examples comprise any of: the image or an image feature map derived from the image (Page 1, Column 2; “With this proposed principle from the seminal work, we attempt to tackle image classification”, which discloses wherein the training example is an image; and Page 5, §5.2;  the section discloses using input images).


Regarding claim 16, the rejection of claims 11 and 15 are incorporated and Xiao further discloses wherein the global optimization jointly optimizes a hierarchical grouping of data to paths on the at least one decision tree and neural networks associated with those paths (Page 5, §5.1; “Regarding the condition network, we apply a two-layer fully connected perceptions, with the hyper-parameter input-300-1 for MNIST and input-3000-1 for CIFAR. Regarding the target network, we also employ a three-layer fully connected perceptions, with the hyper-parameter input-300-100-10 for MNIST, input3000-1000-10 for CIFAR-10 and input-3000-1000-100 for CIFAR-100. 1 To train the model, we leverage AdaDelta (Zeiler, 2012) as our optimizer, with hyper-parameter as moment factor η = 0.6 and = 1 × 10−6 . We train the model until convergence, but at most 1,000 rounds”, convergence being the global optimization).


Regarding claim 17, Xiao discloses [o]ne or more servers configured for predicting an outcome y given an example x for a data set; (Abstract; “we propose the neural decision tree (NDT), which takes simplified neural networks as decision function in each branch and employs complex neural networks to generate the output in each leaf”, which discloses a predictor in the form of a neural decision tree that inherently processes received inputs or examples x to predict an outcome y; and Page 5, Experiment; the experiment section inherently uses servers or computers for predicting an outcome given an input example, and these servers are inherently used in the experiments section of Xiao; and Page 3, Figure 2; the figure discloses the input example (input) x for which an outcome y (target outcome) is not known)
memory (Page 5, Experiment; the experiment section inherently uses a memory that stores inputs that are used in the experiment) storing instructions for generating and storing at least one decision tree comprising a plurality of nodes connected by edges, the nodes comprising a root node, internal nodes and leaf nodes; (Abstract; “neural decision tree”; and Page 3, Figure 2;  the figure discloses the plurality of nodes connected by edges indicated by arrows, the nodes comprising a root node (upper-most condition network in the figure), internal nodes (lower condition networks in the figure), and leaf nodes (target network))
wherein at least the internal nodes have assigned modules comprising parameterized, differentiable operations; (Page 3, Figure 2;  the figure discloses, under a broadest reasonable interpretation of the claim language, wherein each individual node has an assigned module in the form of a respective condition network, and, at each of the internal nodes or condition networks, the module computes a binary outcome in that the condition network splits according to >0 or <=0 to select a child node (one of the condition networks below a parent condition network) of the internal node. Note that the parameterized, differentiable operations, under a BRI, are computed at the condition network operation in the figure, which is akin to the “Router” operations as disclosed in paragraph [0023] of the present application; and Page 2, Column 2; “we employ a simplified neural network as condition network, which is usually a one- or two-layer multi-perceptions with the non-linear function of tanh”, which discloses that each node comprises parameterized differentiable operations in the form of a tanh operation)
generate the at least one decision tree through computing the differentiable operations to create the internal nodes and the edge nodes (Page 4, Algorithm 1; the algorithm discloses, under a broadest reasonable interpretation of the claim language, generating the at least one decision tree (disclosed as “TREECONSTRUCTION” at line 1 of the algorithm) through computing the differentiable operations to create the internal nodes and edges; and Page 4, §3; the section provides further details on how the decision tree is generated through computing the differentiable operations to create the nodes and edges of the decision tree. See “Mathematically, the component of neurons are continuous functions, such as matrix multiply, hyperbolic tangent (tanh), convolution layer, etc, which could be implemented as mathematical operations”; and Page 3, Figure 2; the figure discloses the creation of internal and edge nodes).
applying the [[pruned]] decision tree to the image to predict an identity of an object in a video sample (Page 5, §5.2; “The MNIST dataset (Lecun et al., 1998) is a classic benchmark dataset, which consists of handwritten digit images, 28 x 28 pixels in size, organized into 10 classes (0 to 9) with 60,000 training and 10,000 test samples”, which discloses that the decision tree or random forest that is developed in the Xiao reference is used to predict the identity of an object in an image or a handwritten number.  Note that, under a broadest reasonable interpretation of the claim language, an image is broadly interpreted as a video sample; and Page 6, Figure 3; the figure discloses applying a decision tree to predict an identity of an object or number in an image or handwritten digit image.  Note further that the table in the figure discloses the predictions; and Page 6, Column 2; “Firstly, we could clearly draw the conclusion from Figure 3, that each leaf node needs to predict less categories, which justifies our assumption. For example, in the bottom figure, the node “A” only needs to predict the category “1”, which is a single classification, and the node “H” only needs to predict the categories “0,3,5,8” which is a four classification”).
Xiao fails to explicitly disclose but Brabec discloses remove at least one of the internal nodes or the edge nodes from the at least one decision tree through comparing values of the modules with the validation data to create a pruned decision tree ([0070]; “According to various embodiments, the device may optionally also perform a model complexity pruning operation on the model that results from the decision refinement phase”, the pruning operation is the removing of internal or edge nodes from the decision tree; and [0071]; “FIG. 5 illustrates an example of pruning leaves from a random decision forest, in some embodiments. As shown, assume that a given decision tree 500 includes nodes 1-7 shown, which may be the result of the decision refinement phase described above”; and [0072]; “In turn, parent node 3 may itself be considered for consolidation (e.g., by comparing its prediction with its sibling node 2)”, which discloses that the parent node may also be pruned, and the parent node may be broadly interpreted as an internal or edge node; and [0066]; “Note that in each iteration of the stochastic iterative retraining, the model is also trained on a different random subset S.sub.f. In theory, this may cause some of the objects, which were correctly classified in the previous iteration by chance, to be misclassified in the current iteration and added to D.sub.important set”, which discloses that, during the model retraining or refinement stage where the decision tree is pruned, training examples or a training set/subset is used to perform the pruning operation, thus resulting in the pruned decision tree; and [0064]; the paragraph and the accompanying pseudocode disclose the decision tree refinement that uses training or validation data to do the refinement to create a pruned or refined decision tree; and Figure 5)
apply the pruned decision tree to the image to predict [[an identity of an object in a video sample]] ([0074]; “Stated simply, pruning reduces the size of the tree, but does not affect the predictions of the tree. In doing so, the resources needed to store and execute the classifier may be reduced significantly, without any loss of performance”, which discloses that the pruned decision tree is then applied to make subsequent predictions; and [0083]; “At step 625, as detailed above, the device may adjust the prediction labels of individual leaves of the random decision forest of the retrained malware classifier, based in part on decision changes in the forest that result from assessing the entire training dataset with the classifier. More specifically, for each leaf in each tree, the device may compute the histogram of objects from the training dataset which end up in this leaf. The device may then determine the final prediction of the leaf from the histogram (e.g., via soft voting) of class distributions in each leaf or, alternatively, the predicted class is the one with the highest object count, in various embodiment”, which discloses, under a BRI, that the pruned or retrained decision tree is then applied to make predictions)
Xiao and Brabec are analogous art because both are concerned with decision trees and machine learning.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in decision trees and machine learning to combine the pruning and predicting of Brabec with the predictor of Xiao to yield the predictable result of removing at least one of the internal nodes or the edge nodes from the at least one decision tree through comparing values of the modules with the validation data to create a pruned decision tree; and applying the pruned decision tree to the image to predict an identity of an object in a video sample. The motivation for doing so would be to adjust prediction labels of individual leaves of a random decision forest of a retrained malware classifier based in part on decision changes in the forest that result from assessing the entire training dataset with the classifier (Brabec; [0011]).
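For illustration only, the sibling-comparison pruning quoted above from Brabec ([0071]-[0072], [0074]) can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Brabec, Xiao, or the present application: the names Node, predict, prune, count_nodes, and build_example_tree are hypothetical, each internal node is assumed to route on a single feature threshold, and the sketch compares only the predictions of sibling leaves (it does not model any comparison against validation data). Two sibling leaves that predict the same label are consolidated into their parent, and the check repeats upward, so the tree shrinks while its predictions stay the same.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    # Leaf: `label` is set and both children are None.
    # Internal node: routes on x[feature] > threshold (hypothetical routing rule).
    label: Optional[int] = None
    feature: int = 0
    threshold: float = 0.0
    left: Optional["Node"] = None    # taken when x[feature] <= threshold
    right: Optional["Node"] = None   # taken when x[feature] > threshold

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def predict(node: Node, x: Sequence[float]) -> Optional[int]:
    # Walk from the root to a leaf and return that leaf's label.
    while not node.is_leaf():
        node = node.right if x[node.feature] > node.threshold else node.left
    return node.label


def prune(node: Node) -> Node:
    # Bottom-up consolidation: prune the children first, then merge them into
    # the parent if both are leaves that predict the same label.
    if node.is_leaf():
        return node
    node.left = prune(node.left)
    node.right = prune(node.right)
    if (node.left.is_leaf() and node.right.is_leaf()
            and node.left.label == node.right.label):
        return Node(label=node.left.label)
    return node


def count_nodes(node: Optional[Node]) -> int:
    if node is None:
        return 0
    return 1 + count_nodes(node.left) + count_nodes(node.right)


def build_example_tree() -> Node:
    # Seven-node tree: the right subtree's two leaves agree, the left's do not.
    left = Node(feature=1, threshold=0.0, left=Node(label=1), right=Node(label=0))
    right = Node(feature=1, threshold=0.0, left=Node(label=1), right=Node(label=1))
    return Node(feature=0, threshold=0.0, left=left, right=right)
```

In this sketch the seven-node example tree is reduced to five nodes, because only the right subtree has sibling leaves with matching predictions; the left subtree is kept because its leaves disagree.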


Claim 8 is rejected under 35 U.S.C. § 103 as being obvious over Xiao in view of Brabec and further in view of Bulo et al. (Bulo et al., “Neural Decision Forests for Semantic Image Labelling”, 2014,  Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1-8, hereinafter “Bulo”).

Regarding claim 8, the rejection of claim 1 is incorporated and Xiao further discloses wherein the training data comprises any of: the image, image feature map derived from the image, video, audio signal, text segment, phonemes from a speech recognition pre-processing system, skeletal data produced by a system which estimates skeletal positions of humans or animals from images, sensor data, data derived from sensor data (Page 1, Column 2; “With this proposed principle from the seminal work, we attempt to tackle image classification”, which discloses wherein the example is an image; and Page 5, §5.2;  the section discloses using input images).
Xiao fails to explicitly disclose but Bulo further discloses using a growing process which is dependent on a set of training data used to train the predictor (Page 3, Column 2; “The standard approach to training a random decision tree of a RF consists in a recursive procedure that starts from the root and iteratively builds the tree by splitting the actual terminal node”, the building of the tree is a decision to add another module to the incoming edge of the current node, adding another node to the current node, or terminate growing for the current node)
Xiao, Brabec, and Bulo are analogous art because all are concerned with decision tree structures.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in decision tree structures to combine the growing of decision trees as taught by Bulo with the method of Xiao and Brabec to yield the predictable result of using a growing process which is dependent on a set of training data used to train the predictor. The motivation for doing so would be to select the best split function and the best predictions for the children nodes of a decision tree structure (Bulo; Page 4, Column 1).


Claims 9, 10, and 18 are rejected under 35 U.S.C. § 103 as being obvious over Xiao in view of Brabec and further in view of Georgescu et al. (US 20160174902 A1, hereinafter “Georgescu”).

Regarding claim 9, the rejection of claim 1 is incorporated and Xiao further discloses wherein the outcome is a class label (Page 3, Figure 3; the figure discloses wherein the outcome is a class label or target output for the decision tree; and Page 3, Column 2; “Li,j is the adhoc label vector of i-th sample, where the true label position is 1 and otherwise 0”).
Xiao fails to explicitly disclose but Georgescu discloses the image is a voxel of a medical image, and wherein the predictor is used for medical image analysis ([0061]; “The first deep neural network operates directly on the voxels of the medical image, and not on handcrafted features extracted from the medical image. The first deep neural network inputs image patches centered at voxels of the medical image and calculates a number of position candidates in the medical image based on the input image patches” (emphasis added), which discloses that the example or input is a voxel or voxels of a medical image used for medical analysis; and [0107]).
Xiao, Brabec, and Georgescu are analogous art because all are concerned with machine learning.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in machine learning to combine the medical image analysis of Georgescu with the predictor of Xiao and Brabec to yield the predictable result of wherein the outcome is a class label and the example is a voxel of a medical image, and wherein the predictor is used for medical image analysis. The motivation for doing so would be to provide for anatomical object detection in medical image data using deep neural networks (Georgescu; [0002]).

Regarding claim 10, the rejections of claims 1 and 9 are incorporated and Xiao fails to explicitly disclose but Brabec discloses wherein the assigned modules which are assigned to edges of the decision tree are transformers . . . where a plurality of different transformers are used ([0043]; “More specifically, as would be appreciated by one skilled in the art, nodes in decision trees 404 may correspond to different decisions/conditions that can be applied to input 402. Probabilities can then be assigned, based on the results of these decisions/conditions”, the nodes being the assigned modules with conditions applied to incoming data, and these nodes are broadly interpreted to be transformers).
The motivation to combine Xiao and Brabec is the same as discussed above with respect to claim 1.
Xiao fails to explicitly disclose but Georgescu discloses compute a non-linear function which acts to filter the medical image ([0090]; “The bias of this neuron is then added to this linear combination, and the resulting value is transformed by a non-linear mapping to obtain the activation value”, which discloses the use of a non-linear (activation) function used in a medical image analysis; and [0124]; “To achieve significant speed-up and save memory footprint, S needs to be reduced as much as possible. However, the present inventors have determined that, with a small S (e.g., 32), it is more difficult to approximate 3D filters than 2D filters. Non-linear functions g() are exploited in neural networks to bound the response to a certain range (e.g., [0, 1] using the sigmoid function)”; and [0072]; “The learned weights shown in FIG. 8 can be treated as filters for extracting high-level image features”, the images being medical images).
The motivation to combine Xiao, Brabec, and Georgescu is the same as discussed above with respect to claim 9.

Regarding claim 18, the rejection of claim 17 is incorporated and Xiao fails to explicitly disclose but Georgescu discloses the image comprise medical image data ([0061]; “The first deep neural network operates directly on the voxels of the medical image, and not on handcrafted features extracted from the medical image. The first deep neural network inputs image patches centered at voxels of the medical image and calculates a number of position candidates in the medical image based on the input image patches” (emphasis added), which discloses that the example or input is a voxel or voxels of a medical image used for medical analysis; and [0107]).
The motivation to combine Xiao, Brabec, and Georgescu is the same as discussed above with respect to claim 9.

Claim 21 is rejected under 35 U.S.C. § 103 as being obvious over Xiao in view of Brabec and further in view of Neves et al. (US 20200012943 A1, hereinafter “Neves”).

Regarding claim 21, the rejection of claim 18 is incorporated and Xiao fails to explicitly disclose but Neves discloses wherein processing the video comprises tagging the objects in the video ([0023]; “FIG. 6 is a schematic overview of an implementation for a smart tagging utility that is able to automatically tag an object similar to a user-selected object within the same frame or different frames of video or other image data”; and [0026]; “Conversely, current techniques to tag data rely on a human labeling each object of interest in each frame (e.g., frames in a video stream)”).
Xiao, Brabec, and Neves are analogous art because all are concerned with machine learning.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in machine learning and image processing to combine the video tagging of Neves with the servers and steps of Xiao and Brabec to yield the predictable result of wherein processing the video comprises tagging the objects in the video. The motivation for doing so would be to automatically tag an object similar to a user-selected object within the same frame or different frames of video (Neves; [0023]).

Claim 22 is rejected under 35 U.S.C. § 103 as being obvious over Xiao in view of Brabec and further in view of Ferstl et al. (US 10691943 B1, hereinafter “Ferstl”).

Regarding claim 22, the rejection of claim 1 is incorporated and Xiao fails to explicitly disclose but Ferstl discloses wherein processing the image comprises recognizing human-made objects from natural objects in the image (Column 4, Lines 12-17; “Based on the visual image 150-1, a plurality of candidate detections 160-1, 160-2, 160-3, 160-4, 160-5, e.g., colors, textures, outlines or other aspects of the actor 10-1, the artificial structure 10-2, and the natural structures 10-3, 10-4, 10-5, corresponding to portions of the visual image 150-1 that might depict a human may be identified”, which discloses processing an image to recognize human-made or artificial structures as well as natural objects or structures within an image; and Figure 1D).
Xiao, Brabec, and Ferstl are analogous art because all are concerned with machine learning.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in machine learning to combine the object detection of Ferstl with the predictor and steps of Xiao and Brabec to yield the predictable result of wherein processing the image comprises recognizing human-made objects from natural objects in the image. The motivation for doing so would be to detect artificial and natural structures in an image (Ferstl; Column 4, Lines 15-17).

Response to Arguments

Applicant's arguments and amendments, filed on 5/23/2022, with respect to the 35 USC § 112(f) interpretation of claims, 11, and 17 and their dependents have been fully considered and are persuasive.  The 35 USC § 112(f) interpretation of claims, 11, and 17 and their dependents is withdrawn.

Applicant's arguments and amendments, filed on 5/23/2022, with respect to the 35 USC § 101 rejection of claims 1-20 have been fully considered and are not persuasive.

Beginning on page 11 of the remarks, Applicant argues that “a person cannot reasonably generate decision trees wit the internal and edge nodes, prune those decision trees by removing an internal or edge node, and also apply the pruned decision trees to predict an identity of an object in an image or identify an object in a video sample”.  Examiner respectfully disagrees.  Applicant has not provided any arguments or evidence from the claim language or specification that demonstrates why the claimed features of amended claim 1 cannot reasonably and practically be performed in the human mind with the assistance of pen and paper. Rather, Applicant has made a generalized assertion that the generating, pruning, and applying of a decision tree to make predictions cannot “be done by a human”.  Applicant further argues “how is the person able to mentally apply a pruned version of the decision tree to an image or video sample to identify objects?”.  The limitation “apply the pruned decision tree to the image to predict an identity of an object in the image” can be reasonably and practically performed in the human mind with the assistance of pen and paper.  For example, one can mentally, with the assistance of pen and paper, look at an image and identify objects in the image as circles based on a decision to “identify all round objects as circles” as indicated in instructions contained in a flow chart or decision tree, and this flow chart or decision tree may be pruned/compressed/reduced based on redundancy.  Again, no evidence was provided from the specification or claim language that indicates that these specific claim limitations cannot be practically performed in the human mind with the assistance of pen and paper.
On page 11, last paragraphs of the remarks, Applicant further argues that the rejected claims recite a practical application of the alleged abstract idea.  Specifically, Applicant argues “independent claims 1, 11, and 17-as well as their progeny-recite features for generating decision trees, pruning those decision trees, and using the pruned decision trees to identify objects in images (claims 1 and 11) or videos (claim 17). Obviously, identifying objects in images and videos is a practical application. And pruning decision trees to do so saves valuable processing and memory resources because fewer nodes need to be used to identify the object”.  Examiner respectfully believes that the Applicant’s argument is misplaced.  Step 2A, Prong 2 and Step 2B of the 101-eligibility analysis require one to look at additional elements beyond the identified abstract ideas to see if the additional elements integrate the identified abstract ideas into a practical application or provide significantly more than the abstract idea.  Looking at claim 1, the additional elements of the claim consist of “memory storing instructions for generating and storing at least one decision tree comprising a plurality of nodes connected by edges, the nodes comprising a root node, internal nodes and leaf nodes, wherein at least the internal nodes have assigned modules comprising parameterized, differentiable operations” and “a processor”.
As discussed in the 101 rejection above, these additional elements amount to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (see MPEP §2106.05 (h)), are generic computer components recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)), or are insignificant extra-solution activity that does not amount to an inventive concept and is a well-understood, routine, conventional activity (see MPEP §2106.05 (g); “mere data gathering”; and MPEP §2106.05(d); “Storing and retrieving information in memory”).  The abstract ideas themselves cannot integrate the claims into a practical application or provide significantly more than the abstract ideas; rather, only additional elements in the claim language beyond the abstract ideas may do this in a 101 inquiry.  Applicant seems to suggest that the limitations of claim 1 that are identified as abstract ideas integrate the claims into a practical application or provide “significantly more” in order to make the claim eligible in view of 101.  This is an incorrect 101 analysis.

	The examiner correctly applied the 101 analysis in view of the 2019 Patent Eligibility Guidance.  The claims recite abstract ideas and additional elements beyond the identified abstract ideas that do not integrate the identified abstract ideas into a practical application or provide significantly more than the abstract idea.  The Office is not explicitly required to provide a “comparison of the rejected claims against claims of other applications that have previously been found to be patent ineligible abstract ideas”, but, rather, to follow the 2019 Patent Eligibility Guidance.  Applicant has failed to identify any additional elements beyond the abstract ideas that integrate the identified abstract ideas into a practical application or provide significantly more than the abstract idea, and has failed to point to any specific claim language or paragraphs in the specification that provide evidence of eligibility in view of 101.  As such, Applicant’s arguments are not persuasive, and the 35 USC § 101 rejection of claims 1-18 and 21-22 STANDS.

	Applicant's arguments and amendments, filed on 5/23/2022, with respect to the 35 USC § 103 rejection of claims 1-20 have been fully considered and are moot because the arguments do not apply to any of the references being used in the current rejection to reject independent claims 1, 11, and 17.  Xiao and Brabec are now being used to render claims 1, 11, and 17 obvious under 35 USC § 103.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Brent Hoover whose telephone number is (303)297-4403. The examiner can normally be reached Monday - Friday 9-5 MST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar, can be reached on 571-270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BRENT JOHNSTON HOOVER/Examiner, Art Unit 2127