Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Examiner’s Note
Applicant is strongly requested to identify in the Remarks the supporting paragraph(s) of the specification, with a clear explanation, for each limitation of any amended or new claim(s), so that the Examiner can interpret the claims clearly and definitely.

Priority
Acknowledgment is made of applicant's claim for the present application filed on 03/12/2019.

Response to Arguments
Applicant's arguments filed on 07/11/2022 have been fully considered but they are not persuasive.
In Remarks, pp. 9-10, Applicant contends: 
Paragraph [0028], of the specification as filed provides the necessary definition of the term, stating: The iterative: forest creation, data appending, OOB prediction and OOB accuracy calculation, continues until the current layer OOB accuracy does not vary significantly from that of the preceding layer. In an embodiment, variation in the OOB accuracy of more than 0.005% is considered significant improvement. 
As significant change is defined as change of more than 0.005% between layers, the term and the claims are definite under the language of 35 USC §112. The rejections should be reconsidered and withdrawn.

Examiner’s response:
The examiner understands the applicant’s assertion “As significant change is defined as change of more than 0.005% between layers, the term and the claims are definite under the language of 35 USC §112”. However, paragraph [0028] states only that “In an embodiment, variation in the OOB accuracy of more than 0.005% is considered significant improvement.” In other words, the 0.005% threshold is an example that holds for one embodiment; the specification does not state that the threshold holds for all embodiments. Paragraph [0047] likewise limits the threshold to an embodiment: “If there is significant (In an embodiment, improvement of >0.005% constitutes significant improvement) improvement in the OOB accuracy, the method returns to step 230 and another layer /forest is grown and added to the model.”

Therefore, the applicant’s arguments are not convincing.
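
For illustration only, the stopping criterion quoted from paragraphs [0028] and [0047] can be sketched as follows. This is a minimal sketch, not the applicant's implementation: the function names are hypothetical, and the sketch assumes that "0.005%" means 0.005 percentage points of OOB accuracy and that only an increase counts as improvement. The specification as quoted does not settle either point.

```python
def improved_significantly(prev_acc, curr_acc, threshold_pct=0.005):
    """Return True when OOB accuracy rose by more than threshold_pct points.

    Accuracies are fractions in [0, 1]. Treating the threshold as percentage
    points is an assumption made for this sketch.
    """
    return (curr_acc - prev_acc) * 100.0 > threshold_pct


def layers_grown(oob_accuracies, threshold_pct=0.005):
    """Count the layers grown before the stopping check fails.

    oob_accuracies holds the OOB accuracy that each successive layer would
    reach. The layer whose improvement is not significant is still counted,
    because it has to be grown before its accuracy can be checked.
    """
    grown = 1  # the first layer is always grown
    for prev, curr in zip(oob_accuracies, oob_accuracies[1:]):
        grown += 1
        if not improved_significantly(prev, curr, threshold_pct):
            break
    return grown
```

Under these assumptions the threshold is a tunable parameter; a different embodiment could pass a different value of threshold_pct and stop at a different layer.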

In Remarks, pp. 11-13, Applicant contends: 
The Office Action, on page 6, posits that Zhang teaches the limitation of “appending the OOB predictions to the data set” through disclosure of the use of OOB error to adjust the weights of training samples. Applicant submits that adjusting training sample weights according to OOB error does not teach or suggest the expansion of the data set by appending OOB predictions to the training sample set, as set forth in the claims. Zhang modifies the data set by altering existing sample weights. The invention alters the data set by adding additional samples. To clarify, per Zhang, a training sample set of n samples would still have n samples after the sample weights have been adjusted using the OOB error. Per the claimed invention, an original training set of n samples will be expanded to now include n+OOB samples, where OOB represents the number of OOB predictions determined by the method. The addition of Ironside fails to cure this deficiency.

Examiner’s response:
The relevant claim limitation(s) appear(s) to be 
appending the Out-of-bag predictions to the data set by the one or more computer processors;

As noted in the rejections, Zhang teaches 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “The predicted class label of an OOB sample is determined by the weighted voting of trees’ predictions corresponding to the OOB sample, then the OOB error for several trees is obtained to update the weights of training samples. … 6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., steps 6-8 may read on “appending”.)

In other words, Zhang teaches a Cascaded Random Forest (CRF) method that improves classification performance by combining two different enhancements into the Random Forest (RF) algorithm and that uses out-of-bag (OOB) errors to update the sample weights. Zhang teaches appending predictions to the training samples to adjust the weights of the training samples (i.e. “appending the Out-of-bag predictions to the data set”, cf. “weighted voting of trees’ predictions corresponding to the OOB sample, then the OOB error for several trees is obtained to update the weights of training samples” and “3) S subsets of training samples are selected by probability weight distribution”). Accordingly, Zhang teaches the limitation, since the predictions are added to the training samples.

The examiner understands the applicant’s assertion “The invention alters the data set by adding additional samples. … Per the claimed invention, an original training set of n samples will be expanded to now include n+OOB samples, where OOB represents the number OOB predictions determined by the method”. However, the assertion is not clearly reflected in the claim. 

Therefore, the applicant’s arguments are not convincing.
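
For illustration only, the two operations contrasted in the Remarks can be sketched as follows. This is a minimal sketch that takes no position on how the claim language should be read. The weight-update rule is a simplified, hypothetical stand-in and is not Zhang's formula; the function names and the way an OOB prediction is encoded as a record are likewise assumptions.

```python
def reweight_samples(weights, misclassified, boost=2.0):
    """Scale up the weights of misclassified samples, then renormalize.

    Hypothetical update rule for illustration; the number of weights,
    and therefore the number of samples, does not change.
    """
    raised = [w * boost if bad else w for w, bad in zip(weights, misclassified)]
    total = sum(raised)
    return [w / total for w in raised]


def append_predictions(data_set, oob_predictions):
    """Return a new data set with the OOB predictions added as extra records."""
    return list(data_set) + list(oob_predictions)


data_set = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
weights = [1 / 3, 1 / 3, 1 / 3]

# Reweighting: three samples in, three weights out.
new_weights = reweight_samples(weights, [False, True, False])

# Appending: three records plus two hypothetical prediction records.
expanded = append_predictions(data_set, [["class_a"], ["class_b"]])

print(len(data_set), len(new_weights), len(expanded))
```

The sketch shows only the difference in record count that the Remarks describe (n versus n plus the number of OOB predictions).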

Claim Objections
Claim(s) 23 is/are objected to because of the following informalities: “with the appended OOB predictions” may need to read “with the appended first OOB predictions” or similar language. Appropriate correction is suggested.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim(s) 4, 10, 16 is/are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
The term “significantly” in claim 4 is a relative term which renders the claim indefinite. The term “significantly” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. In addition, claims 10, 16 are rejected for the same reason.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-3, 5-9, 11-15, 17-18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. (Cascaded Random Forest for Hyperspectral Image Classification) in view of Ironside (US 2019/0213685 A1).
 
Regarding claim 1
Zhang teaches 
A computer implemented method for developing and training models for analyzing data, the method comprising: 

constructing a model by: 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “3) S subsets of training samples are selected by probability weight distribution wi(t); 4) select S feature subsets corresponding to S training subsets, each feature subset is obtained by randomly selecting half number of the m features from B1 and B2, respectively; 5) train S decision trees on S training subsets corresponding to their feature subsets”;)

growing, by one or more computer processors, a random forest of decision trees from a data set; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “3) S subsets of training samples are selected by probability weight distribution wi(t); 4) select S feature subsets corresponding to S training subsets, each feature subset is obtained by randomly selecting half number of the m features from B1 and B2, respectively; 5) train S decision trees on S training subsets corresponding to their feature subsets” [sec IV, p. 1091] “Table VIII shows the model construction time of different methods. All experiments were carried out on MATLAB 2015b platform with 3.4 GHz Intel i7-6700 CPU and 8 GB RAM.”; e.g., “training” may read on “growing”.)

determining, by the one or more computer processors, Out-of-bag (OOB) predictions for the random forest; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “To achieve this goal, we introduce OOB error instead of training error to update the weights of samples. For one decision tree, there are approximately 1/e samples belonging to OOB, and if we use the OOB error corresponding to those OOB samples to update the weights of training samples directly, this will update the OOB samples only but not cover all training samples. So we train several decision trees at each iteration, and the OOB samples belonging to different trees are merged together, in other words, any one sample in merged OOB sample set should belong to one of the decision trees at least. The predicted class label of an OOB sample is determined by the weighted voting of trees’ predictions corresponding to the OOB sample, then the OOB error for several trees is obtained to update the weights of training samples.”; e.g., “weighted voting of trees’ predictions corresponding to the OOB sample” may read on “Out-of-bag (OOB) predictions”.)
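
For illustration only, determining an OOB prediction by weighted voting can be sketched as follows. This is a minimal sketch of the general technique under stated assumptions, not Zhang's algorithm: the function name, the data layout, and the tie-breaking behavior are hypothetical, and the tree weights are taken as given rather than derived from any OOB error formula.

```python
from collections import defaultdict


def oob_predictions(tree_preds, in_bag, tree_weights=None):
    """Predict each sample's label using only trees that did not train on it.

    tree_preds[t][i] : label that tree t predicts for sample i
    in_bag[t]        : set of sample indices in tree t's bootstrap sample
    tree_weights[t]  : voting weight of tree t (defaults to 1.0 for every tree)

    Returns a dict mapping sample index to the label with the largest total
    weight. Samples that are in-bag for every tree get no entry.
    """
    n_trees = len(tree_preds)
    n_samples = len(tree_preds[0])
    if tree_weights is None:
        tree_weights = [1.0] * n_trees
    result = {}
    for i in range(n_samples):
        votes = defaultdict(float)
        for t in range(n_trees):
            if i not in in_bag[t]:  # tree t never saw sample i during training
                votes[tree_preds[t][i]] += tree_weights[t]
        if votes:
            result[i] = max(votes, key=votes.get)
    return result
```

A sample contributes an OOB prediction only if at least one tree left it out of its bootstrap sample, which is why merging the OOB sets of several trees increases coverage of the training set.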

appending the Out-of-bag predictions to the data set by the one or more computer processors; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “The predicted class label of an OOB sample is determined by the weighted voting of trees’ predictions corresponding to the OOB sample, then the OOB error for several trees is obtained to update the weights of training samples. … 6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., steps 6-8 may read on “appending”.)

growing, by the one or more computer processors, an additional random forest using the data set with the appended OOB predictions; and 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., “9) repeat step 3 to step 7 T times” may read on “additional random forest”.)

combining an output of the additional random forest, by the one or more computer processors, with a combiner.
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., “obtain the weight of S decision trees βt” may read on “combining an output of the additional random forest, by the one or more computer processors, with a combiner”.)

In the alternative, Ironside can also be interpreted to teach the following limitation:
Ironside teaches
combining an output of the additional random forest, by the one or more computer processors, with a combiner.
(Ironside, [pars 61-94] “Gradient boosting is a machine learning technique for regression and classification problems which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees. It builds the model in a stage-wise fashion like other boosting methods do, and it generalizes them by allowing optimization of an arbitrary differentiable loss function. For example, gradient boosting combines weak learners into a single strong learner in an iterative fashion. Gradient boosting tends to aggressively exploit any opportunity to improve predictive accuracy, to the detriment of clarity of interpretation (or, indeed, the feasibility of any interpretation whatsoever).”;)

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the random forest system of Zhang with the combiner of Ironside. 
Doing so would allow the combined system to aggressively exploit any opportunity to improve the predictive accuracy of a prediction model.
(Ironside, [pars 61-94] “Gradient boosting tends to aggressively exploit any opportunity to improve predictive accuracy, to the detriment of clarity of interpretation (or, indeed, the feasibility of any interpretation whatsoever).”)
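
For illustration only, the stage-wise combination of weak learners described in the quoted passage can be sketched as follows. This is a minimal sketch of gradient boosting for squared-error regression on a single feature, using one-split stumps as the weak learners. It is not Ironside's implementation; all function names and parameter values are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals by least squares."""
    best = None
    for thr in sorted(set(xs))[:-1]:  # a split at the maximum leaves the right side empty
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        l_mean = sum(left) / len(left)
        r_mean = sum(right) / len(right)
        err = sum((r - l_mean) ** 2 for r in left) + sum((r - r_mean) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, thr, l_mean, r_mean)
    if best is None:  # every x is identical, so no split is possible
        mean = sum(residuals) / len(residuals)
        return xs[0], mean, mean
    return best[1:]


def boost(xs, ys, n_rounds=10, learning_rate=0.5):
    """Build an additive model: the mean of ys plus a sequence of scaled stumps."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is the residual y - prediction.
        residuals = [y - p for y, p in zip(ys, preds)]
        thr, l_val, r_val = fit_stump(xs, residuals)
        stumps.append((thr, l_val, r_val))
        preds = [p + learning_rate * (l_val if x <= thr else r_val)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the base value and every stump's scaled contribution for input x."""
    base, learning_rate, stumps = model
    return base + sum(learning_rate * (l if x <= thr else r) for thr, l, r in stumps)


def mse(model, xs, ys):
    """Mean squared error of the model on the given points."""
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
mean_only = boost(xs, ys, n_rounds=0)
boosted = boost(xs, ys, n_rounds=10)
print(mse(mean_only, xs, ys), mse(boosted, xs, ys))
```

Each round fits a new weak learner to what the current ensemble still gets wrong, which is the iterative combination of weak learners into a single stronger learner that the quoted passage describes.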

Regarding claim 2
The combination of Zhang, Ironside teaches claim 1.

Zhang further teaches 
each random forest is grown from data selected from a group consisting of: numeric, text, audio, video, image data location, speech, music, entertainment, healthcare, financial information, vehicle, logistics, and sales data.
(Zhang, [figs 2-3] [sec IV] “To evaluate the performance of the proposed CRF, three public benchmark hyperspectral image datasets are used in the experiments. … 1) Indian Pines hyperspectral data [51], which was obtained by Airborne Visible Infrared Imaging Spectrometer (AVIRIS) with 224 spectral bands in the wavelength range from 0.4 to 2.5 μm. 2) Pavia University hyperspectral data [52], which is an urban image obtained by Reflective Optics Spectrographic Imaging System (ROSIS-03) from city Pavia, south of Italy. … 3) Kennedy Space Center (KSC) data [54], which was acquired by the AVIRIS sensor over KSC, Florida, on March 23, 1996.”; The examiner notes that this claim recites a Markush-type group and the member “image data location” is elected.)

Regarding claim 3
The combination of Zhang, Ironside teaches claim 1.

Zhang further teaches
growing the random forest of decision trees using the data set comprises (see the rejections of claim 1)

growing decision trees using a bootstrapped sample, taken with replacement from the data set, to grow each tree.
(Zhang, [figs 2-3] [sec IV] “Random Forests is an ensemble algorithm, which uses Classification and Regression Tree (CART) [32] for constructing base classifiers. And each tree is independent of other trees in the forest, in other words, RF can be implemented in parallel [15]. In the training steps, all trees are trained on different training set by bootstrap sampling with randomly selected features, and we can describe the steps of training for each tree as follows: 1) a set of samples are selected from training set by bootstrap sampling (random sampling with replacement); 2) randomly select m features from M dimensional feature space by RSM; and 3) compute the best split feature and grow a tree with maximum depth recursively.”;)
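
For illustration only, bootstrap sampling with replacement and the resulting out-of-bag set can be sketched as follows. This is a minimal sketch of the general technique; the function name and the sample size are hypothetical. For large n, roughly a fraction 1/e (about 37%) of the samples are expected to fall outside any one bootstrap sample, which is consistent with the "approximately 1/e" figure quoted from Zhang in the rejection of claim 1.

```python
import random


def bootstrap_sample(n_samples, rng):
    """Draw n_samples indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 10_000
in_bag, oob = bootstrap_sample(n, rng)
oob_fraction = len(oob) / n
print(f"out-of-bag fraction: {oob_fraction:.3f}")
```

Each tree in a forest would draw its own bootstrap sample in this way, so each tree has a different out-of-bag set.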

Regarding claim 5
The combination of Zhang, Ironside teaches claim 1.

Ironside further teaches
the combiner comprises a structure selected from a group consisting of: a random forest, and a gradient boosting structure.
(Ironside, [pars 61-94] “Gradient boosting is a machine learning technique for regression and classification problems which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees. It builds the model in a stage-wise fashion like other boosting methods do, and it generalizes them by allowing optimization of an arbitrary differentiable loss function. For example, gradient boosting combines weak learners into a single strong learner in an iterative fashion. Gradient boosting tends to aggressively exploit any opportunity to improve predictive accuracy, to the detriment of clarity of interpretation (or, indeed, the feasibility of any interpretation whatsoever).”; The examiner notes that this claim recites a Markush-type group and the member “gradient boosting structure” is elected.)

Zhang is combinable with Ironside for the same rationale as set forth above with respect to claim 1.

Regarding claim 6
The combination of Zhang, Ironside teaches claim 1.

Zhang further teaches 
the model comprises a sequence of layers, each layer comprising a single random forest.
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “3) S subsets of training samples are selected by probability weight distribution wi(t); 4) select S feature subsets corresponding to S training subsets, each feature subset is obtained by randomly selecting half number of the m features from B1 and B2, respectively; 5) train S decision trees on S training subsets corresponding to their feature subsets”; e.g., each round may read on “each layer”. In addition, e.g., Cascaded Random Forests along with fig 1 may read on “sequence of layers”.)

Regarding claim 7
The claim is a product claim corresponding to the method claim 1, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of the method claim. 
Note that Zhang teaches one or more computer readable storage devices (Zhang, [sec IV, p. 1091] “Table VIII shows the model construction time of different methods. All experiments were carried out on MATLAB 2015b platform with 3.4 GHz Intel i7-6700 CPU and 8 GB RAM.”).

Zhang further teaches
providing a data set;
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “3) S subsets of training samples are selected by probability weight distribution wi(t); 4) select S feature subsets corresponding to S training subsets, each feature subset is obtained by randomly selecting half number of the m features from B1 and B2, respectively; 5) train S decision trees on S training subsets corresponding to their feature subsets” [sec IV] “To evaluate the performance of the proposed CRF, three public benchmark hyperspectral image datasets are used in the experiments. … 1) Indian Pines hyperspectral data [51], which was obtained by Airborne Visible Infrared Imaging Spectrometer (AVIRIS) with 224 spectral bands in the wavelength range from 0.4 to 2.5 μm. 2) Pavia University hyperspectral data [52], which is an urban image obtained by Reflective Optics Spectrographic Imaging System (ROSIS-03) from city Pavia, south of Italy. … 3) Kennedy Space Center (KSC) data [54], which was acquired by the AVIRIS sensor over KSC, Florida, on March 23, 1996.”;)

Regarding claim 8
The combination of Zhang, Ironside teaches claim 7.
The claim is a product claim corresponding to the method claim 2, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of the method claim. 

Regarding claim 9
The combination of Zhang, Ironside teaches claim 7.
The claim is a product claim corresponding to the method claim 3, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of the method claim. 

Regarding claim 11
The combination of Zhang, Ironside teaches claim 7.
The claim is a product claim corresponding to the method claim 5, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of the method claim. 

Regarding claim 12
The combination of Zhang, Ironside teaches claim 7.
The claim is a product claim corresponding to the method claim 6, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of the method claim. 

Regarding claim 13
The claim is a system claim corresponding to the method claim 1, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of the method claim. 
Note that Zhang teaches one or more processors and one or more computer readable storage devices (Zhang, [sec IV, p. 1091] “Table VIII shows the model construction time of different methods. All experiments were carried out on MATLAB 2015b platform with 3.4 GHz Intel i7-6700 CPU and 8 GB RAM.”).

Zhang further teaches
providing a data set;
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “3) S subsets of training samples are selected by probability weight distribution wi(t); 4) select S feature subsets corresponding to S training subsets, each feature subset is obtained by randomly selecting half number of the m features from B1 and B2, respectively; 5) train S decision trees on S training subsets corresponding to their feature subsets” [sec IV] “To evaluate the performance of the proposed CRF, three public benchmark hyperspectral image datasets are used in the experiments. … 1) Indian Pines hyperspectral data [51], which was obtained by Airborne Visible Infrared Imaging Spectrometer (AVIRIS) with 224 spectral bands in the wavelength range from 0.4 to 2.5 μm. 2) Pavia University hyperspectral data [52], which is an urban image obtained by Reflective Optics Spectrographic Imaging System (ROSIS-03) from city Pavia, south of Italy. … 3) Kennedy Space Center (KSC) data [54], which was acquired by the AVIRIS sensor over KSC, Florida, on March 23, 1996.”;)

Regarding claim 14
The combination of Zhang, Ironside teaches claim 13.
The claim is a system claim corresponding to the method claim 2, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of the method claim. 

Regarding claim 15
The combination of Zhang, Ironside teaches claim 13.
The claim is a system claim corresponding to the method claim 3, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of the method claim. 

Regarding claim 17
The combination of Zhang, Ironside teaches claim 13.
The claim is a system claim corresponding to the method claim 5, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of the method claim. 

Regarding claim 18
The combination of Zhang, Ironside teaches claim 13.
The claim is a system claim corresponding to the method claim 6, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of the method claim. 

Claim(s) 4, 10, 16, 19-25 is/are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. (Cascaded Random Forest for Hyperspectral Image Classification) in view of Ironside (US 2019/0213685 A1), further in view of Aonpong et al. (Combining a Random Forest Algorithm and a Level Set Method for Land Cover Mapping).

(Note: Hereinafter, if a limitation has brackets (i.e., [·]) around claim language, the bracketed language has not yet been taught by the current prior art reference but will be taught by another prior art reference afterwards.)

Regarding claim 4
The combination of Zhang, Ironside teaches claim 1.

Zhang further teaches 
determining an OOB accuracy for each random forest and 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [Algorithm 1] “calculate the OOB error of the trained tree OEts: [equation image: media_image1.png], where i∈OOBts” [sec III] “OEts is the OOB error rate of the sth decision tree in the tth iteration and C is the number of classes. After the end of each iteration, the weight of S decision trees, βt, is computed by [equation image: media_image2.png] (3) where Et is the error rate of S decision trees in the tth iteration … 6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., “OOB error” along with [equation image: media_image1.png] may read on “OOB accuracy” since the OOB error indicates instances which are classified correctly or incorrectly by the class label prediction.)

adding random forests [until] the OOB accuracy does [not] significantly improve.
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [Algorithm 1] [sec III] “In this paper, our proposed CRF method achieves the above two aspects at the same time. In particular, HRSM used for feature selection can improve the strength of decision trees and increase the diversity between each two of the random forests. Besides, minimization of the OOB error in the procedure of Boosting iteration can increase the strength of decision trees iteratively. The flow of CRF is illustrated in Fig. 1. … 3) S subsets of training samples are selected by probability weight distribution wi(t); 4) select S feature subsets corresponding to S training subsets, each feature subset is obtained by randomly selecting half number of the m features from B1 and B2, respectively; 5) train S decision trees on S training subsets corresponding to their feature subsets”; e.g., fig 1 and Algorithm 1 may read on “adding random forests [until] the OOB accuracy does [not] significantly improve” since forests grow for the T iterations.)

However, the combination of Zhang, Ironside does not appear to distinctly disclose:
determining an OOB accuracy for each random forest and adding random forests [until] the OOB accuracy does [not] significantly improve.

(Note: Hereinafter, if a limitation has one or more underlines, the underlined claim language is taught by the current prior art reference, while the non-underlined claim language has already been taught by one or more previous prior art references.)

Aonpong teaches
determining an OOB accuracy for each random forest and adding random forests until the OOB accuracy does not significantly improve.
(Aonpong, [sec II] “Because a RF is composed of a lot of individual decision trees (DTs), a RF algorithm is required to determine the suitable number of DTs. If forest is too small, the number of decisions will be too small to cope with decision errors. However, if there are too many trees, it will take too much time to train. The out-of-bagging method [3] allows us to find the proper size of a RF where, at least one-third of the training set, is used to estimate the classification accuracy of the results of the current RF. If the accuracy does not sufficient increases where more DTs are added, the RF stops adding more trees into the forest and the training process is terminated. Here, the accuracy is estimated as [equation image: media_image3.png], (4) when |OOB| is the number of out-of-bag data. 𝐼𝑀,𝑁 can be computed from [equation image: media_image4.png], (5) when 𝑟𝑓𝑀,𝑁 is the classification result of the random forest at position 𝑀, 𝑁 and 𝑜𝑜𝑏𝑀,𝑁 represent answer of 𝑜𝑜𝑏 in position 𝑀, 𝑁.”; Note that Zhang teaches “determining an OOB accuracy for each random forest and adding random forests [until] the OOB accuracy does [not] significantly improve”.)

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the random forest system of Zhang, Ironside with the random forest addition based on the OOB accuracy of Aonpong. 
Doing so would make it possible to find a proper random forest size that copes with decision errors without taking too much time to train a prediction model.
(Aonpong, [sec II] “Because a RF is composed of a lot of individual decision trees (DTs), a RF algorithm is required to determine the suitable number of DTs. If forest is too small, the number of decisions will be too small to cope with decision errors. However, if there are too many trees, it will take too much time to train. The out-of-bagging method [3] allows us to find the proper size of a RF where, at least one-third of the training set, is used to estimate the classification accuracy of the results of the current RF”)
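
For illustration only, an OOB accuracy estimate and a rule that stops adding trees can be sketched as follows. This is a minimal sketch under stated assumptions, not Aonpong's implementation. The equations cited from Aonpong are not reproduced in this action, so the sketch assumes the accuracy is the fraction of out-of-bag items whose forest result matches the reference answer. The function names and the min_gain parameter are hypothetical.

```python
def oob_accuracy(rf_labels, oob_labels):
    """Fraction of out-of-bag items whose forest label equals the reference label."""
    if not oob_labels or len(rf_labels) != len(oob_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(1 for rf, ref in zip(rf_labels, oob_labels) if rf == ref)
    return matches / len(oob_labels)


def forest_size_at_stop(accuracy_by_size, min_gain):
    """Return the last forest size whose accuracy gain was at least min_gain.

    accuracy_by_size is a list of (n_trees, oob_accuracy) pairs in increasing
    order of n_trees. Growth stops at the first size whose gain over the
    previously accepted size is below min_gain.
    """
    best_size, best_acc = accuracy_by_size[0]
    for size, acc in accuracy_by_size[1:]:
        if acc - best_acc < min_gain:
            break
        best_size, best_acc = size, acc
    return best_size
```

What counts as a sufficient increase is left to the min_gain parameter in this sketch; the quoted passage does not give a numeric value.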

Regarding claim 10
The combination of Zhang, Ironside teaches claim 7.
The claim is a product claim corresponding to the method claim 4, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of the method claim. 

Regarding claim 16
The combination of Zhang, Ironside teaches claim 13.
The claim is a system claim corresponding to the method claim 4, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of the method claim. 

Regarding claim 19
Zhang teaches

A computer implemented method for developing and training models for analyzing data, the method comprising: 
constructing a model by: 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “3) S subsets of training samples are selected by probability weight distribution wi(t); 4) select S feature subsets corresponding to S training subsets, each feature subset is obtained by randomly selecting half number of the m features from B1 and B2, respectively; 5) train S decision trees on S training subsets corresponding to their feature subsets”;)

receiving, by one or more computer processors, a training data set; 
(Zhang, [fig 1] “Training samples” [sec III] “3) S subsets of training samples are selected by probability weight distribution wi(t); 4) select S feature subsets corresponding to S training subsets, each feature subset is obtained by randomly selecting half number of the m features from B1 and B2, respectively; 5) train S decision trees on S training subsets corresponding to their feature subsets” [sec IV, p. 1091] “Table VIII shows the model construction time of different methods. All experiments were carried out on MATLAB 2015b platform with 3.4 GHz Intel i7-6700 CPU and 8 GB RAM.”; e.g., “training samples” may read on “training data set”.)

growing, by the one or more computer processors, a random forest of decision trees from the training data set; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “3) S subsets of training samples are selected by probability weight distribution wi(t); 4) select S feature subsets corresponding to S training subsets, each feature subset is obtained by randomly selecting half number of the m features from B1 and B2, respectively; 5) train S decision trees on S training subsets corresponding to their feature subsets”; e.g., “training” may read on “growing”.)

determining, by the one or more computer processors, Out-of-bag (OOB) predictions for the random forest; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “To achieve this goal, we introduce OOB error instead of training error to update the weights of samples. For one decision tree, there are approximately 1/e samples belonging to OOB, and if we use the OOB error corresponding to those OOB samples to update the weights of training samples directly, this will update the OOB samples only but not cover all training samples. So we train several decision trees at each iteration, and the OOB samples belonging to different trees are merged together, in other words, any one sample in merged OOB sample set should belong to one of the decision trees at least. The predicted class label of an OOB sample is determined by the weighted voting of trees’ predictions corresponding to the OOB sample, then the OOB error for several trees is obtained to update the weights of training samples.”; e.g., “weighted voting of trees’ predictions corresponding to the OOB sample” may read on “Out-of-bag (OOB) predictions”.)
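
For context on the quoted 1/e figure, the following is a minimal sketch, not taken from Zhang, of how out-of-bag samples arise from bootstrap sampling and how an OOB prediction can be formed by voting only among trees that did not see a sample. The function names and the toy setup are hypothetical, and plain majority voting is used here in place of Zhang's weighted voting.

```python
import random
from collections import Counter

def bootstrap_indices(n, rng):
    """Draw n indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n) for _ in range(n)]

def oob_fraction(n, rng):
    """Fraction of the n samples that a single bootstrap draw leaves out."""
    in_bag = set(bootstrap_indices(n, rng))
    return (n - len(in_bag)) / n

def oob_predictions(n, bags, tree_preds):
    """Majority vote per sample, using only trees whose bag excludes it.

    bags[t] is the set of in-bag indices of tree t; tree_preds[t][i] is the
    label tree t predicts for sample i. Samples that are in every bag get None.
    """
    result = []
    for i in range(n):
        votes = [tree_preds[t][i] for t in range(len(bags)) if i not in bags[t]]
        result.append(Counter(votes).most_common(1)[0][0] if votes else None)
    return result

rng = random.Random(0)
avg = sum(oob_fraction(1000, rng) for _ in range(200)) / 200
print(round(avg, 2))  # close to 1/e, about 0.37
```

The average left-out fraction approaches (1 - 1/n)^n, which tends to 1/e as n grows, consistent with the "approximately 1/e samples belonging to OOB" statement in the quoted passage.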

appending the Out-of-bag predictions to the data set by the one or more computer processors; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “The predicted class label of an OOB sample is determined by the weighted voting of trees’ predictions corresponding to the OOB sample, then the OOB error for several trees is obtained to update the weights of training samples. … 6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., steps 6-8 may read on “appending”.)
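
To make the claim language concrete, the sketch below shows one literal reading of "appending the Out-of-bag predictions to the data set": each sample's OOB prediction becomes an extra column of the data set. This is an illustrative assumption about the recited step, not code from the application or from Zhang; the function name and the placeholder value -1 for samples without an OOB prediction are hypothetical.

```python
def append_oob_predictions(rows, oob_preds, missing=-1):
    """Return new rows with each sample's OOB prediction added as a last column."""
    if len(rows) != len(oob_preds):
        raise ValueError("one OOB prediction is needed per row")
    return [list(row) + [missing if p is None else p]
            for row, p in zip(rows, oob_preds)]

features = [[5.1, 3.5], [6.2, 2.9], [4.7, 3.2]]
augmented = append_oob_predictions(features, [0, 1, None])
print(augmented)  # [[5.1, 3.5, 0], [6.2, 2.9, 1], [4.7, 3.2, -1]]
```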

determining, by the one or more computer processors, a first OOB accuracy for the random forest; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [Algorithm 1] “calculate the OOB error of the trained tree OEts: 
[equation image: media_image1.png]
, where i∈OOBts” [sec III] “OEts is the OOB error rate of the sth decision tree in the tth iteration and C is the number of classes. After the end of each iteration, the weight of S decision trees, βt, is computed by 
[equation image: media_image2.png]
 (3) where Et is the error rate of S decision trees in the tth iteration … 6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., “OOB error” along with 
[equation image: media_image1.png]
 may read on “OOB accuracy” since the OOB error indicates instances which are classified correctly or incorrectly by the class label prediction.)
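
The relationship relied on above can be sketched numerically. Assuming the OOB error rate is the fraction of OOB samples whose predicted label differs from the true label (an assumed, unweighted definition; Zhang's own formula is the one cited from Algorithm 1 and may differ), the OOB accuracy is its complement. The function names are hypothetical.

```python
def oob_error_rate(y_true, y_oob):
    """Unweighted error over samples that have an OOB prediction (not None)."""
    pairs = [(t, p) for t, p in zip(y_true, y_oob) if p is not None]
    if not pairs:
        raise ValueError("no sample has an OOB prediction")
    return sum(t != p for t, p in pairs) / len(pairs)

def oob_accuracy(y_true, y_oob):
    """Complement of the error rate on the same OOB samples."""
    return 1.0 - oob_error_rate(y_true, y_oob)

y_true = [0, 1, 1, 0, 1]
y_oob = [0, 1, 0, None, 1]
print(oob_error_rate(y_true, y_oob))  # 0.25
print(oob_accuracy(y_true, y_oob))    # 0.75
```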

growing, by the one or more computer processors, an additional random forest using the training data set with the appended OOB predictions; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “3) S subsets of training samples are selected by probability weight distribution wi(t); 4) select S feature subsets corresponding to S training subsets, each feature subset is obtained by randomly selecting half number of the m features from B1 and B2, respectively; 5) train S decision trees on S training subsets corresponding to their feature subsets; 6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., “9) repeat step 3 to step 7 T times” may read on “additional random forest”.)

determining, by the one or more computer processors, a second OOB accuracy for the additional random forest; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [Algorithm 1] “calculate the OOB error of the trained tree OEts: 
[equation image: media_image1.png]
, where i∈OOBts” [sec III] “OEts is the OOB error rate of the sth decision tree in the tth iteration and C is the number of classes. After the end of each iteration, the weight of S decision trees, βt, is computed by 
[equation image: media_image2.png]
 (3) where Et is the error rate of S decision trees in the tth iteration … 6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., “OOB error” along with 
[equation image: media_image1.png]
 may read on “OOB accuracy” since the OOB error indicates instances which are classified correctly or incorrectly by the class label prediction.)

[comparing], by the one or more computer processors, the first OB accuracy of the random forest and the second OOB accuracy of the additional random forest; and 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [Algorithm 1] “calculate the OOB error of the trained tree OEts: 
[equation image: media_image1.png]
, where i∈OOBts” [sec III] “OEts is the OOB error rate of the sth decision tree in the tth iteration and C is the number of classes. After the end of each iteration, the weight of S decision trees, βt, is computed by 
[equation image: media_image2.png]
 (3) where Et is the error rate of S decision trees in the tth iteration … 6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., “OOB error” along with 
[equation image: media_image1.png]
 may read on “OOB accuracy” since the OOB error indicates instances which are classified correctly or incorrectly by the class label prediction.)

combining an output of the additional random forest by the one or more computer processors with a combiner.
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., “obtain the weight of S decision trees βt” may read on “combining an output of the additional random forest by one or more computer processors with a combiner”.)
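
As an illustration of what a "combiner" over forest outputs might look like, the sketch below performs weighted voting across forests. The weights stand in for per-iteration weights such as βt; how Zhang computes βt is not reproduced here, and the function name, labels, and numbers are hypothetical.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Combine one label per forest into a single label by weighted voting."""
    if len(predictions) != len(weights):
        raise ValueError("one weight is needed per forest")
    score = defaultdict(float)
    for label, w in zip(predictions, weights):
        score[label] += w
    # Highest total weight wins; ties go to the label seen first.
    return max(score, key=score.get)

print(weighted_vote(["water", "forest", "forest"], [0.9, 0.3, 0.4]))  # water
```

Here the single heavily weighted forest outvotes two lightly weighted ones, which is the practical difference between weighted and simple majority voting.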

In the alternative, Ironside can also be interpreted to teach the following limitation:
Ironside teaches
combining an output of the additional random forest by the one or more computer processors with a combiner.
(Ironside, [pars 61-94] “Gradient boosting is a machine learning technique for regression and classification problems which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees. It builds the model in a stage-wise fashion like other boosting methods do, and it generalizes them by allowing optimization of an arbitrary differentiable loss function. For example, gradient boosting combines weak learners into a single strong learner in an iterative fashion. Gradient boosting tends to aggressively exploit any opportunity to improve predictive accuracy, to the detriment of clarity of interpretation (or, indeed, the feasibility of any interpretation whatsoever).”;)

Zhang is combinable with Ironside for the same rationale as set forth above with respect to claim 1.

However, the combination of Zhang and Ironside does not appear to distinctly disclose:
[comparing], by the one or more computer processors, the first OB accuracy of the random forest and the second OOB accuracy of the additional random forest; and

Aonpong teaches
comparing, by the one or more computer processors, the first OB accuracy of the random forest and the second OOB accuracy of the additional random forest; and
(Aonpong, [sec II] “Because a RF is composed of a lot of individual decision trees (DTs), a RF algorithm is required to determine the suitable number of DTs. If forest is too small, the number of decisions will be too small to cope with decision errors. However, if there are too many trees, it will take too much time to train. The out-of-bagging method [3] allows us to find the proper size of a RF where, at least one-third of the training set, is used to estimate the classification accuracy of the results of the current RF. If the accuracy does not sufficient increases where more DTs are added, the RF stops adding more trees into the forest and the training process is terminated. Here, the accuracy is estimated as 
[equation image: media_image3.png]
, (4) when |OOB| is the number of out-of-bag data. 𝐼𝑀,𝑁 can be computed from 
[equation image: media_image4.png]
, (5) when 𝑟𝑓𝑀,𝑁 is the classification result of the random forest at position 𝑀, 𝑁 and 𝑜𝑜𝑏𝑀,𝑁 represent answer of 𝑜𝑜𝑏 in position 𝑀, 𝑁.”; e.g., “If the accuracy does not sufficient increases where more DTs are added, the RF stops adding more trees into the forest and the training process is terminated” read(s) on “comparing”. Note that Zhang teaches “[comparing], by the one or more computer processors, the first OB accuracy of the random forest and the second OOB accuracy of the additional random forest”.)
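
The comparison described in the quoted passage can be sketched as a check on two successive OOB accuracy estimates. This is a minimal sketch under stated assumptions: the function name and the `min_gain` parameter are hypothetical, and the quoted passage does not state a numeric value for a "sufficient" increase.

```python
def should_stop(prev_accuracy, curr_accuracy, min_gain=0.0):
    """True when the newer accuracy fails to exceed the older by more than min_gain."""
    return (curr_accuracy - prev_accuracy) <= min_gain

print(should_stop(0.900, 0.905, min_gain=0.001))  # False, keep adding
print(should_stop(0.905, 0.905, min_gain=0.001))  # True, stop
```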

The combination of Zhang and Ironside is combinable with Aonpong for the same rationale as set forth above with respect to claim 4.

Regarding claim 20
The combination of Zhang, Ironside, and Aonpong teaches claim 19.
The claim is a method claim corresponding to method claim 2, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of method claim 2.

Regarding claim 21
The combination of Zhang, Ironside, and Aonpong teaches claim 19.
The claim is a method claim corresponding to method claim 3, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of method claim 3.

Regarding claim 22
The combination of Zhang, Ironside, and Aonpong teaches claim 19.
The claim is a method claim corresponding to method claim 6, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of method claim 6.

Regarding claim 23
Zhang teaches
A computer implemented method for developing and training models for analyzing data, the method comprising: 

constructing a model of sequential layers, each layer including a single random forest, by: 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “3) S subsets of training samples are selected by probability weight distribution wi(t); 4) select S feature subsets corresponding to S training subsets, each feature subset is obtained by randomly selecting half number of the m features from B1 and B2, respectively; 5) train S decision trees on S training subsets corresponding to their feature subsets”; e.g., each round may read on “each layer”. In addition, Cascaded Random Forests along with fig 1 may read on “model of sequential layers”.)

receiving, by one or more computer processors, a training data set; 
(Zhang, [fig 1] “Training samples” [sec III] “3) S subsets of training samples are selected by probability weight distribution wi(t); 4) select S feature subsets corresponding to S training subsets, each feature subset is obtained by randomly selecting half number of the m features from B1 and B2, respectively; 5) train S decision trees on S training subsets corresponding to their feature subsets” [sec IV, p. 1091] “Table VIII shows the model construction time of different methods. All experiments were carried out on MATLAB 2015b platform with 3.4 GHz Intel i7-6700 CPU and 8 GB RAM.”; e.g., “training samples” may read on “training data set”.)

receiving, by the one or more computer processors, a determined number of trees per forest and a class vector specification; 
(Zhang, [fig 1] “Training samples” [Algorithm 1] “Input: G = {X, Y} = {xi, yi}(i = 1 . . .n): training set, T: number of iterations, S: number of trees at each iteration, m: number of features for generating each tree. Output: EC: the ensemble classifier” [tables I-IV] [sec III] “3) S subsets of training samples are selected by probability weight distribution wi(t); 4) select S feature subsets corresponding to S training subsets, each feature subset is obtained by randomly selecting half number of the m features from B1 and B2, respectively; 5) train S decision trees on S training subsets corresponding to their feature subsets” [sec IV, p. 1091] “The original reference map have 10 366 pixels that belong to 16 classes. … The reference map have 42 776 pixels that belong to 9 classes. … For classification purposes, 13 classes representing the various land cover types that occur in this environment were defined.”; e.g., “13 classes representing the various land cover types that occur in this environment were defined” may read on “class vector specification”.)

growing, by the one or more computer processors, the number of determined trees for a first forest using the training data set; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “3) S subsets of training samples are selected by probability weight distribution wi(t); 4) select S feature subsets corresponding to S training subsets, each feature subset is obtained by randomly selecting half number of the m features from B1 and B2, respectively; 5) train S decision trees on S training subsets corresponding to their feature subsets”; e.g., “training” may read on “growing”.)

determining, by the one or more computer processors, first Out-of-bag (OOB) predictions for the first forest; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “To achieve this goal, we introduce OOB error instead of training error to update the weights of samples. For one decision tree, there are approximately 1/e samples belonging to OOB, and if we use the OOB error corresponding to those OOB samples to update the weights of training samples directly, this will update the OOB samples only but not cover all training samples. So we train several decision trees at each iteration, and the OOB samples belonging to different trees are merged together, in other words, any one sample in merged OOB sample set should belong to one of the decision trees at least. The predicted class label of an OOB sample is determined by the weighted voting of trees’ predictions corresponding to the OOB sample, then the OOB error for several trees is obtained to update the weights of training samples.”; e.g., “weighted voting of trees’ predictions corresponding to the OOB sample” may read on “Out-of-bag (OOB) predictions”.)

appending the first OOB predictions to the training data set by the one or more computer processors; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “The predicted class label of an OOB sample is determined by the weighted voting of trees’ predictions corresponding to the OOB sample, then the OOB error for several trees is obtained to update the weights of training samples. … 6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., steps 6-8 may read on “appending”.)

determining an OOB accuracy for the first forest by the one or more computer processors; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [Algorithm 1] “calculate the OOB error of the trained tree OEts: 
[equation image: media_image1.png]
, where i∈OOBts” [sec III] “OEts is the OOB error rate of the sth decision tree in the tth iteration and C is the number of classes. After the end of each iteration, the weight of S decision trees, βt, is computed by 
[equation image: media_image2.png]
 (3) where Et is the error rate of S decision trees in the tth iteration … 6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., “OOB error” along with 
[equation image: media_image1.png]
 may read on “OOB accuracy” since the OOB error indicates instances which are classified correctly or incorrectly by the class label prediction.)

growing the number of determined trees for an additional forest using the training data set with the appended OOB predictions by the one or more computer processors; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “3) S subsets of training samples are selected by probability weight distribution wi(t); 4) select S feature subsets corresponding to S training subsets, each feature subset is obtained by randomly selecting half number of the m features from B1 and B2, respectively; 5) train S decision trees on S training subsets corresponding to their feature subsets; 6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., “9) repeat step 3 to step 7 T times” may read on “additional forest”.)

determining an additional OOB prediction for the additional forest by the one or more computer processors; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “To achieve this goal, we introduce OOB error instead of training error to update the weights of samples. For one decision tree, there are approximately 1/e samples belonging to OOB, and if we use the OOB error corresponding to those OOB samples to update the weights of training samples directly, this will update the OOB samples only but not cover all training samples. So we train several decision trees at each iteration, and the OOB samples belonging to different trees are merged together, in other words, any one sample in merged OOB sample set should belong to one of the decision trees at least. The predicted class label of an OOB sample is determined by the weighted voting of trees’ predictions corresponding to the OOB sample, then the OOB error for several trees is obtained to update the weights of training samples.”; e.g., “weighted voting of trees’ predictions corresponding to the OOB sample” may read on “additional OOB prediction”.)

appending additional OOB predictions to the data set by the one or more computer processors; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “The predicted class label of an OOB sample is determined by the weighted voting of trees’ predictions corresponding to the OOB sample, then the OOB error for several trees is obtained to update the weights of training samples. … 6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., steps 6-8 may read on “appending”.)

determining an additional OOB accuracy for the additional forest by the one or more computer processors; 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [Algorithm 1] “calculate the OOB error of the trained tree OEts: 
[equation image: media_image1.png]
, where i∈OOBts” [sec III] “OEts is the OOB error rate of the sth decision tree in the tth iteration and C is the number of classes. After the end of each iteration, the weight of S decision trees, βt, is computed by 
[equation image: media_image2.png]
 (3) where Et is the error rate of S decision trees in the tth iteration … 6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., “OOB error” along with 
[equation image: media_image1.png]
 may read on “OOB accuracy” since the OOB error indicates instances which are classified correctly or incorrectly by the class label prediction.)

adding forests, by the one or more computer processors, [until] the additional OOB accuracy does [not] improve; and 
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [Algorithm 1] [sec III] “In this paper, our proposed CRF method achieves the above two aspects at the same time. In particular, HRSM used for feature selection can improve the strength of decision trees and increase the diversity between each two of the random forests. Besides, minimization of the OOB error in the procedure of Boosting iteration can increase the strength of decision trees iteratively. The flow of CRF is illustrated in Fig. 1. … 3) S subsets of training samples are selected by probability weight distribution wi(t); 4) select S feature subsets corresponding to S training subsets, each feature subset is obtained by randomly selecting half number of the m features from B1 and B2, respectively; 5) train S decision trees on S training subsets corresponding to their feature subsets”; e.g., fig 1 and Algorithm 1 may read on “adding forests, by one or more computer processors, [until] the additional OOB accuracy does [not] improve” since forests grow for the T iterations.)

combining an output of the additional forest by the one or more computer processors.
(Zhang, [fig 1] “Flow of CRF algorithm. (a) Training phase. (b) Classification phase.” [sec III] “6) compute the OOB error of each decision tree OEts and obtain the weight of each decision tree αts; 7) compute the OOB error of S decision trees Et and obtain the weight of S decision trees βt; 8) update the weights of samples wi(t+1) according to βt; and 9) repeat step 3 to step 7 T times.”; e.g., “obtain the weight of S decision trees βt” may read on “combining an output of the additional forest by the one or more computer processors”.)

In the alternative, Ironside can also be interpreted to teach the following limitation:
Ironside teaches
combining an output of the additional forest by the one or more computer processors.
(Ironside, [pars 61-94] “Gradient boosting is a machine learning technique for regression and classification problems which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees. It builds the model in a stage-wise fashion like other boosting methods do, and it generalizes them by allowing optimization of an arbitrary differentiable loss function. For example, gradient boosting combines weak learners into a single strong learner in an iterative fashion. Gradient boosting tends to aggressively exploit any opportunity to improve predictive accuracy, to the detriment of clarity of interpretation (or, indeed, the feasibility of any interpretation whatsoever).”;)

Zhang is combinable with Ironside for the same rationale as set forth above with respect to claim 1.

However, the combination of Zhang and Ironside does not appear to distinctly disclose:
adding forests, by the one or more computer processors, [until] the additional OOB accuracy does [not] improve; and 

Aonpong teaches
adding forests, by the one or more computer processors, until the additional OOB accuracy does not improve; and 
(Aonpong, [sec II] “Because a RF is composed of a lot of individual decision trees (DTs), a RF algorithm is required to determine the suitable number of DTs. If forest is too small, the number of decisions will be too small to cope with decision errors. However, if there are too many trees, it will take too much time to train. The out-of-bagging method [3] allows us to find the proper size of a RF where, at least one-third of the training set, is used to estimate the classification accuracy of the results of the current RF. If the accuracy does not sufficient increases where more DTs are added, the RF stops adding more trees into the forest and the training process is terminated. Here, the accuracy is estimated as 
[equation image: media_image3.png]
, (4) when |OOB| is the number of out-of-bag data. 𝐼𝑀,𝑁 can be computed from 
[equation image: media_image4.png]
, (5) when 𝑟𝑓𝑀,𝑁 is the classification result of the random forest at position 𝑀, 𝑁 and 𝑜𝑜𝑏𝑀,𝑁 represent answer of 𝑜𝑜𝑏 in position 𝑀, 𝑁.”;)
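
The stopping behaviour recited in this limitation can be sketched as a loop that keeps adding forests while the OOB accuracy improves. This is a sketch under stated assumptions: `grow_forest` is a hypothetical callback standing in for training one forest and returning its OOB accuracy, `min_gain` is a hypothetical threshold (paragraph [0028] of the specification, as quoted in the Response to Arguments, gives 0.005% for one embodiment), and `max_layers` is a safety cap not taken from any reference.

```python
def add_forests_until_plateau(grow_forest, min_gain=0.0, max_layers=50):
    """Grow forests until the newest OOB accuracy fails to beat the previous one.

    grow_forest(layer_index) must return that layer's OOB accuracy.
    Returns the list of accuracies for every forest that was grown.
    """
    accuracies = [grow_forest(0)]
    while len(accuracies) < max_layers:
        new_acc = grow_forest(len(accuracies))
        accuracies.append(new_acc)
        if new_acc - accuracies[-2] <= min_gain:
            break  # no sufficient improvement over the preceding forest
    return accuracies

# Toy stand-in: accuracies read from a fixed list instead of real training.
toy = [0.80, 0.86, 0.89, 0.89, 0.95]
history = add_forests_until_plateau(lambda k: toy[k])
print(history)  # [0.8, 0.86, 0.89, 0.89]
```

In the toy run the fourth forest does not improve on the third, so the loop stops there and the fifth value is never requested.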

The combination of Zhang and Ironside is combinable with Aonpong for the same rationale as set forth above with respect to claim 4.

Regarding claim 24
The combination of Zhang, Ironside, and Aonpong teaches claim 23.
The claim is a method claim corresponding to method claim 2, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of method claim 2.

Regarding claim 25
The combination of Zhang, Ironside, and Aonpong teaches claim 23.
The claim is a method claim corresponding to method claim 3, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of method claim 3.

Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Mishina et al. (Boosted Random Forest) teaches generating Random Forest based on boosting.
Watts et al. (MERGING RANDOM FOREST CLASSIFICATION WITH AN OBJECT-ORIENTED APPROACH FOR ANALYSIS OF AGRICULTURAL LANDS) teaches comparing OOB accuracy assessments.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SEHWAN KIM whose telephone number is (571)270-7409. The examiner can normally be reached Mon - Thu 7:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J Huntley can be reached on (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/S.K./Examiner, Art Unit 2129
10/3/2022
/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129