DETAILED ACTION
This Final Office Action is responsive to Applicant’s Remarks filed on 01/20/2022 for application 16/152,578.
Claims 1-20 are currently pending and under examination, of which claims 1, 10 and 19 are independent claims. No claims are currently in condition for allowance.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant’s arguments in view of the prior art rejection dated 01/20/2022 have been fully considered, but they are not deemed persuasive. The rejection is maintained with the following response to remarks:
Applicant traverses on the grounds that the prior art does not teach the following limitation: 
applying, by the device and for each of the split operations to be performed locally by the workers, a machine-learning based predictor to the determined data size and entropy measure of the training records to be used for the split operation, to predict a completion time for the split operation
Applicant first argues that Chen17 does not apply the model to 1) data size and 2) an entropy measure of training records to be used for a node split operation. The examiner respectfully disagrees. In Chen17, the variables of Eq. 20.5 [P.414] include 1) data x of size k and 2) entropy, which is regressed from h. This is apparent because the entropy equation h(x) of the instant application, PGP [0050], is identical to the entropy equation already noted in Chen16 [P.3] Eq. 2 (Chen, Chen16 and Chen17 share the same author), which is applied among training subsets and node splits. The examiner notes that the recited “applying…predictor” is read on weighted regression.
Applicant secondly argues that Chen17 does not predict a completion time for a node split operation. The examiner respectfully disagrees. As cited, the objective of the Chen17 [P.414] Eq. 20.5 calculation is to calculate Tik, per the header “Predict the waiting time of all treatment tasks”. The tasks are clearly illustrated as modeled over node splits of a tree per Fig 20.6 [P.413], the illustration further including a “time range”. Such modeling may entail a loss function for the node splits, such as the argmin of Chen17 [P.412] Eq. 20.2, and/or may simply follow from the training dataset being indexed per Chen16 [P.6] Tbl.2, so that one of ordinary skill could readily train models based on the indexed training records. Furthermore, the rejection is an obviousness-type rejection in which the motivation includes “improve the execution speed”, hence the Speedup Eqs. 11-12 of Chen16 [P.13]. In light of the foregoing, the balance of evidence favors maintaining the rejection under 35 U.S.C. 103.
In consideration of these points, Applicant’s arguments are not persuasive. The reasoning presented above supports the rejections of independent claims 1, 10 and 19 and their related dependent claims.
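By way of illustration only, and not as a characterization of any reference of record, the two inputs recited in the disputed limitation (a data size and an entropy measure of the training records for a split) and the application of a predictor to them may be sketched as follows. The sketch assumes the standard Shannon entropy over class labels and a simple linear predictor; the function names and the weights are hypothetical and are not drawn from Chen16, Chen17 or the instant specification.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels (assumed measure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    # sum of p * log2(1/p) over the observed classes
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def split_features(labels):
    """Return (data size, entropy) for the records awaiting a node split."""
    return len(labels), shannon_entropy(labels)


def predict_completion_time(size, entropy, weights):
    """Apply a linear predictor to the two features.

    weights = (intercept, w_size, w_entropy) are hypothetical values that
    would, under this sketch, come from a previously trained regression.
    """
    b0, b1, b2 = weights
    return b0 + b1 * size + b2 * entropy
```

For four records labeled a, a, b, b, the sketch yields a data size of 4 and an entropy of 1 bit, which would then be passed to the predictor.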

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-2, 4-11 and 13-20 are rejected under 35 U.S.C. 103 as being unpatentable over: 
Guillame-bert et al., US PG Pub No 20190251468A1, hereinafter GB, as evidenced by 
Guillame-bert et al., US Provisional 16/271,064, hereinafter GB064 and/or the NPL upon which it appears to be based which is noted as 
Guillame-bert and Teytaud, “Exact Distributed Training: Random Forest with Billions of Examples”, hereinafter GB755, arXiv:1804.06755v1, in view of
Chen et al., Chinese Patent CN105550374A, hereinafter Chen, as evidenced by
Chen et al., “A Parallel Random Forest Algorithm for Big Data in a Spark Cloud Computing Environment”, hereinafter Chen16, and further evidenced by 
Chen et al., “Parallel Data Mining and Applications in Hospital Big Data Processing”, hereinafter Chen17.
With respect to claim 1, GB teaches: 
A method {GB (Google) discloses methods and systems for distributed training Random Forest, see [Abstract]} comprising: 
distributing, by a device, sets of training records from a training dataset for a random forest-based classifier among a plurality of workers of a computing cluster, wherein each worker determines whether it can perform a node split operation locally on the random forest by comparing a number of training records at the worker to a predefined threshold {GB Figs 1-2 illustrate the overall framework of a distributed training dataset for decision tree building and splitting with a master/worker arrangement, see [0023] “distributed algorithm to train Random Forest models” and [0071] “During training, each splitter is searching for the optimal split… defined as the split with the highest split score”. By way of example, [0080] “a splitter estimates the optimal threshold”, and again at [0077]. Further, [0129] “Fig. 3A illustrates… models 120 can be trained and used locally at the user computing device”. It should further be noted that the prior art dataset scale is ~18 billion as opposed to ~30 million in the instant application}; 
GB further teaches using the Gini Index as a split score per [0071].
However, GB does not expressly disclose “entropy” or “predicted completion time”. 
Chen teaches:
determining, by the device and for each of the split operations to be performed locally by the workers, a data size and entropy measure of the training records to be used for the split operation {Chen [0020-21] “Calculate the information entropy of each feature variable in the training data subset” again at [0046-47]. This is detailed in Chen16 per [P.3-4 PgBrk] Equations 2-5 for entropy similarly used by the instant specification. Further, Chen16 [P.4] Table 1 where data size variables comprise |S| and/or [P.6 ¶3] “the size of the training dataset S is N… i is the index of each record in the training dataset S… S is split into (M-1) feature subsets” i.e., indexed training data subset. See code [P.4] Alg3.1 Lines 2-8};
applying, by the device and for each of the split operations to be performed locally by the workers, a machine learning-based predictor to the determined data size and entropy measure of the training records to be used for the split operation, to predict a completion time for the split operation {Chen17 [P.414-15 Sect20.3.3] Eqs. 20.5-20.6 header “Predict the waiting time of all treatment tasks” is task time prediction, cont’d “predicted waiting time… Sort all the treatment tasks of the current patient in ascending order by waiting time”, emphasizing temporal sorting of prediction times for a task. Tasks being introduced comprise Chen17 [P.412-13] “Step 1. Calculate the best splitting features variables and the best split point”. Applying predictors is weighted regression}; and 
coordinating, by the device, the workers of the computing cluster to perform the node split operations in parallel such that the node split operations in a given batch are grouped based on their predicted completion times {Chen16 parallel-scheduler is coordinating, described [P.7-10 Sect 4.2-4.3] implemented with Apache Spark known environment (clusters/workers). Parallelization is provided for [P.8 ¶2] “Node-splitting task” as [4.2.2] “Task-Parallel Scheduling” with LocalScheduler/ClusterScheduler e.g., “ClusterScheduler module monitors the execution situation of the computing resources and tasks the whole Spark cluster and allocates tasks to suitable workers”. See Alg4.3 (TNS) task node split includes ranking for queued TS task set (group/batch). The effect considers Speedup per [P.12 Sect5.3.3] Eq.12}.
Both GB and Chen are directed to Random Forest distributed training with splitting operations and are thus analogous. One having ordinary skill in the art would have considered it obvious, prior to the effective filing date, to combine the teachings of GB and Chen to arrive at the claimed invention. The motivation for combination is that “Parallelizing the random forest machine learning method can improve the execution speed” (Chen [0009]). Doing so may assist the reader in resolving time complexity or data attributes (GB [0081, 88, 91]) by describing scheduling operations (Chen16 [P.8 Sect4.2.2]). One may further do so by describing split scores (GB) according to known measures such as entropy (Chen), as applying known techniques to known methods to yield predictable results. Empirical evidence such as Chen’s graphed execution times or GB’s scale of data exemplifies superior performance.
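For illustration only, the coordinating limitation of claim 1 (grouping node split operations into batches based on predicted completion times) may be sketched as below. This is one possible grouping scheme under stated assumptions: it is not Chen16 Alg4.3 and not the method of the instant specification. The operation identifiers, the `predicted` mapping and the one-operation-per-worker batch size are hypothetical.

```python
def batch_by_predicted_time(predicted, num_workers):
    """Group split-operation ids into batches of similar predicted duration.

    predicted: dict mapping a hypothetical split-operation id to its
    predicted completion time in seconds.
    num_workers: number of workers; each batch holds at most one
    operation per worker (an assumption of this sketch).
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    # Sort ascending so neighbours in the ordering have similar durations.
    ordered = sorted(predicted, key=predicted.get)
    # Cut the ordering into consecutive chunks, one chunk per batch.
    return [ordered[i:i + num_workers]
            for i in range(0, len(ordered), num_workers)]
```

Because operations in a batch have similar predicted durations, workers running one batch in parallel would be expected to finish at roughly the same time, which limits idle waiting at a synchronization point.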

With respect to claim 2, the combination of GB and Chen teaches the method as in claim 1, wherein coordinating the workers of the computing cluster to perform the node split operations in parallel such that the node split operations in a given batch are grouped based on their predicted completion times comprises: 
ordering, for a given worker, the split operations to be performed locally by the worker according to their predicted completion times {Chen17 [P.416 ¶1] “predicted waiting times for all the tasks for the current patient are sorted in ascending order with a sort() function” comprises [P.414 ¶1] “queue of each task. Then, these tasks are re-sorted by the predicted waiting time”; sorting/re-sorting is ordering and is considered with respect to the predicted time of a task. Local worker processing environment is described per Chen16 [P.9 LeftCol] ranking, LocalScheduler}.

With respect to claim 4, the combination of GB and Chen teaches the method as in claim 1, wherein 
	a worker of the computing cluster waits until the node split operations performed in parallel in a given batch by the other workers are complete, before starting its next node split operation locally {Chen16 [P.9 LeftCol] “ClusterScheduler… there is a wait and synchronization restraint for these tasks” with Alg4.3 queue and describing “get available worker executora from workers”. See also [P.7 Alg4.1] Line5 “findAvailableSlaves()”}.

With respect to claim 5, the combination of GB and Chen teaches the method as in claim 1, wherein 
	the machine learning-based predictor comprises a regression model {Chen16 describing random forest RF [P.5 ¶2] “the RF is trained as a regression model”}.

With respect to claim 6, the combination of GB and Chen teaches the method as in claim 1, further comprising training, by the device, the machine learning-based predictor by: 
	measuring completion times for node split operations performed by the workers; and forming a training dataset for the predictor by associating the measured completion times for the node split operations with the data size and entropy measure of the training records used for those split operations {Chen17 [P.414-15 Sect20.3.3] Eqs. 20.5-20.6 and/or Chen16 [P.12 RtCol] Eqs.11-12 are measuring completion times of spark distributed operations for training random forest with meta decision tree h(x,θ) replete with node splitting task, entropy, and training subsets in the distributed Apache Spark environment with scheduling and speedup evaluation as well as iterative process. Additionally, GB755 “while the number of leaves increases exponentially with the depth of the trees, the computation time does not” being particularly prescient in consideration of obviousness}.
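For illustration only, the training steps recited in claim 6 (measuring completion times and associating them with data size and entropy to form a training dataset for the predictor) may be sketched as below. The sketch assumes an ordinary least-squares linear regression solved from the normal equations; that choice, and the helper names `timed` and `fit_linear`, are hypothetical and are not taken from Chen16, Chen17, GB755 or the instant specification.

```python
import time


def timed(operation, *args):
    """Run a callable and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = operation(*args)
    return result, time.perf_counter() - start


def fit_linear(rows):
    """Least-squares fit of t ~ b0 + b1*size + b2*entropy.

    rows: list of (size, entropy, measured_time) tuples.
    Returns (b0, b1, b2). Solves the 3x3 normal equations by
    Gauss-Jordan elimination with partial pivoting.
    """
    xs = [(1.0, float(s), float(h)) for s, h, _ in rows]
    ts = [float(t) for _, _, t in rows]
    # Augmented matrix [X^T X | X^T t]
    a = [[sum(x[i] * x[j] for x in xs) for j in range(3)]
         + [sum(x[i] * t for x, t in zip(xs, ts))]
         for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; cannot fit")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return tuple(a[i][3] / a[i][i] for i in range(3))
```

Under this sketch, each worker would wrap a split operation in `timed`, record the elapsed time next to that split's data size and entropy, and the collected rows would be passed to `fit_linear`.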

With respect to claim 7, the combination of GB and Chen teaches the method as in claim 1, wherein 
	the worker nodes redistribute the training records associated with a particular node of the random forest to a worker that determines that it can perform a node split operation on the node locally {Chen16 [P.7 ¶4] “To reuse the training dataset, each RDD object of the feature subset is allocated and persisted to Spark cluster via a dataAllocation() function and a persist() function” is reuse/redistribute as RDD supports iterative computation per [P.10 RtCol] or Chen17 [P.417]. Further, Chen16 [P.7 RtCol] Alg4.1 describes RDD with “findAvailableSlave”; [P.9 ¶1] LocalScheduler thread pool. Additionally, see GB [0027, 58, 66]}.

With respect to claim 8, the combination of GB and Chen teaches the method as in claim 1, further comprising: 
	aggregating, by the device, the split nodes to form the random forest {Chen16 [P.3 RtCol] aggregating is the Sigma notation of Equation 1, “Collecting k trees into a RF model”, RF being random forest. See the illustration of Fig 1, repeated in Chen17 Figs 20.3, 20.5, 20.7 and again in Chen Figs 2-3. Further, GB [0087-88]}.

With respect to claim 9, the combination of GB and Chen teaches the method as in claim 1, wherein 
	a node split operation for a particular node of the random forest seeks to divide the training records for that node into two subsets of training records based on values of one or more features in the training records {Chen16 [P.6 ¶2] “The training dataset is divided into several feature subsets” illustrated Figs 3-4 indexed training partition tables. See also GB [0124] partitioned training}.
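For illustration only, the node split operation recited in claim 9 (dividing the training records at a node into two subsets based on feature values) may be sketched as below. The record layout (a list of dictionaries), the single-feature threshold test and the function name are hypothetical and are not drawn from any reference of record.

```python
def split_records(records, feature, threshold):
    """Divide records into two subsets on the value of one feature.

    records: list of dicts mapping feature names to numeric values.
    Records with value <= threshold go left; the rest go right.
    """
    left = [r for r in records if r[feature] <= threshold]
    right = [r for r in records if r[feature] > threshold]
    return left, right
```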

With respect to claim 10, GB teaches: 
	An apparatus {GB [0013] “systems, apparatuses, non-transitory” Figs 3A, 3C}, comprising: 
	one or more network interfaces to communicate with a network {GB [0111] “Fig 3A depicts a block diagram of an example computing system… communicatively coupled over a network”}; 
a processor coupled to the one or more network interfaces and configured to execute a process {GB Fig 3A [0127] “hardware… executed by one or more processors”}; and 
a memory configured to store the process executable by the processor, the process when executed {GB Fig3A [0113] “memory 114 can store data 116 and instructions 118 which are executed by the processor”} configured to:
	The remainder of this claim is rejected for the same rationale as claim 1.

Claim 11 is rejected for the same rationale as claim 2.
Claims 13-18 are rejected for the same rationale as claims 4-9, respectively.

With respect to claim 19, GB teaches: 
	A tangible, non-transitory, computer-readable medium storing program instructions that cause a device in a network to execute a process {GB [0103] “non-transitory computer-readable storage medium” with [0127] “software… computer-executable instructions that are stored in a tangible computer-readable storage medium” Fig 3A} comprising: 
	The remainder of this claim is rejected for the same rationale as claim 1. 

Claim 20 is rejected for the same rationale as claim 5.

Claims 3 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over GB and Chen in view of: 
Axenie et al., WO2020043267A1, hereinafter Axenie (related to US20210124983A1). 
With respect to claim 3, the combination of GB and Chen teaches the method as in claim 1. Axenie teaches wherein 
	the training dataset comprises traffic flow features extracted from Hypertext Transfer Protocol (HTTP) traffic flows, and wherein the random forest-based classifier is configured to classify a given traffic flow as malicious or benign based on its features {Axenie (Huawei) discloses [P.22 Line11 - P.23 Line21] “Streaming Random Forest operator for anomaly detection provides real-time detection and scoring of anomalies… HTTP involving network intrusions are used… training data has labeled instances only for the normal data and will build a model for the class corresponding to normal behavior, and use this model to detect anomalies in the test data” illustrated Figs 8-9. See also “Spark” per [P.4 Line9]}.
	Axenie is directed to random forest modeling and is thus analogous. A person having ordinary skill in the art would have considered it obvious, prior to the effective filing date, to implement the use case of anomaly detection with random forest as disclosed by Axenie in combination with GB and Chen, the motivation being that a “key benefit is to enable low latency change detection and an efficient data stream modeling (i.e. using incremental histograms) over the input data stream, even for high rates of incoming events” (Axenie [P.8 Lines1-17]).

Claim 12 is rejected for the same rationale as claim 3.

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: 
Jiang et al., US PG Pub No 20190034834A1, “Method and Apparatus for Training Model Based on Random Forest” (Alibaba).
Ma et al., US PG Pub No 20190286486A1 (Accenture), discloses Random Forest; see Fig 10 “Build Decision Tree Splitting Threshold Considering Time Delay Tolerance”.
Wang et al., “DistForest: A Parallel Random Forest Training Framework based on Supercomputer”; see Figs 1-6, similar to Chen.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Chase P Hinckley whose telephone number is (571)272-7935. The examiner can normally be reached M-F 9:00 - 5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda M. Huang can be reached on 571-270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/CHASE P. HINCKLEY/Examiner, Art Unit 2124
/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124