Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
This action is in response to the amendments filed 09/10/2021. Claims 9-11 have been amended. Claims 1-20 are currently pending.

Response to Arguments
In light of Applicant's amendments to the specification, the objections to the abstract and to the drawings have been withdrawn.
Applicant’s arguments regarding the 112(b) rejection of claims 8 and 16 have been fully considered but they are not persuasive. Applicant argues on pages 13-14 that the claims as filed in the original specification are part of the disclosure. While this is true, the claim itself is not clear, and the specification does not clarify or further explain the limitations of claims 8 and 16. Applicant states that “each tier is selected independently according to a pre-defined probability” and that the pi term in the equation from claim 8 refers to the probability of selecting a specific tier ti, but the claim merely states that “pi is a corresponding probability to a queried tier ti” without mentioning that pi is a selection probability. Applicant points to paragraphs [0033] and [0040] to show where random selection of a tier is described, but neither these paragraphs nor any other part of the disclosure mentions that the random selection uses a probability distribution.
The Gk term (described in the claim as “an aggregation result from a last epoch”) does not specify whether responses from all tiers are aggregated or only the responses from a selected tier. It is unclear how responses for participants of a selected tier can be predicted if the Gk and Gk+1 terms are meant to include responses from all tiers for a given round or, if the step is meant to predict responses for all tiers, how the responses from a selected tier result in predictions for the other tiers.
Lastly, Applicant argues on page 14 that “for a certain round, the aggregator first computes the average of the most recent replies for the current tier, denoted as "AVG(mostRecent_replies )". Then, once the aggregator receives the replies from the current round of the queried tier, it averages them "AVG(replies)". The results are combined recited in the formula in the claim”. Neither the claims nor the specification describe how to differentiate the “most recent” replies from the rest of the replies for a selected tier during a current round, so it is unclear at what point a reply stops being a “most recent” reply or if the “AVG(replies)” term also includes the most recent replies or not. 
For these reasons, the metes and bounds of claims 8 and 16 cannot be determined and no prior art can be applied.
Applicant’s amendments and arguments regarding the 112(b) rejection of claims 9-11 have been fully considered but they are not persuasive. Applicant argues on page 15 that the 

Applicant’s amendments regarding the 101 rejection of claims 9-11 have been fully considered but they are not persuasive. Applicant has amended the claims to recite that the actions in the claims are performed by a computing device. However, adding this limitation to the claims does not overcome the 101 rejection of these claims, but instead changes the interpretation of the amended limitations to “mere instructions to apply an abstract idea using a computer” as per MPEP 2106.05(f). The analysis for the 101 rejection has been updated to include the amended limitations.
Applicant’s amendments and arguments regarding the prior art rejection have been fully considered but they are not persuasive. 
With regard to claim 1, Applicant argues on page 18 that Ouyang does not teach “applying a predicted response including collected participants’ replies and computed predictions associated with the stragglers” and that Ouyang merely determines whether node performance is weak or not. However, the “predicted response” in claim 1 is not further defined, and Ouyang pg. 76 Section III A 1 describes collecting the run times of tasks in a given node, inferring that, because a node was slow during a current run, it will be slow in a future run, and therefore deciding that tasks assigned to that node in the future will be slow. Changing the determination of slowness of a node in the future after determining that that node is currently slow (because it contains stragglers) would be updating the model by applying the predicted response of slowness, wherein the predicted response is based on the response times from the tasks from the current run.

With regard to claim 9, Applicant argues on page 20 that the prior art does not teach the feature of removing response times of the drop outs for which RTi = n_syn*Tmax and creating a histogram of the remaining response times. Applicant argues that Prakash teaches this feature, but the non-final office action (see page 19) relies upon the McColl reference to teach identifying drop outs as participants for which RTi = n_syn*Tmax. Prakash is only relied upon to teach the process of removing drop outs that have already been identified in the previous limitation. Additionally, the Prakash reference is not relied upon to teach creating a histogram of remaining response times; rather, the Martin reference is relied upon to teach creating a histogram of response times (see page 20 of the non-final office action).
The prior art rejections have been updated to include the amended limitations and to clarify the reasoning given for the limitations that were not amended.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 8, 9-11, and 16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claims 8 and 16 recite a limitation “applying a prediction step to aggregate responses from the federated learning participants of the selected tier that respond to the querying with information from the federated learning participants in non-selected tiers” and the following equation:
[Equation image: media_image1.png]
This equation is not clearly defined by the claims or the specification: it is unclear what the “corresponding probability pi” corresponds to or how it is used in the prediction step. The claims state only that pi corresponds to a queried tier ti, but do not define what the probability is for. The formula in the claims appears to aggregate responses from a current epoch, but it is unclear how this then becomes a prediction step. Additionally, neither the claims nor the specification describe how to differentiate the “most recent” replies from the rest of the replies for a selected tier during a current round.
Claim 9 recites the limitation “determining that a number of run epochs is less than a number of synchronization epochs”. It is unclear how the number of run epochs could be less than the number of synchronization epochs, given that the method appears to gather data (i.e. run) before it checks whether all the participants have responded (i.e. synchronized). The claim as written does not make clear what “a number of run epochs” or “a number of synchronization epochs” refers to. One of ordinary skill would interpret a number of epochs as a count (i.e. at epoch 4, the number of run epochs would be 4 and the number of synchronization epochs would be 0 because no synchronization has occurred yet; at epoch 5, the number of run epochs would be 5 and the number of synchronization epochs would be 1, etc.), so there could not be a situation where the number of run epochs would be less than the number of synchronization epochs. For purposes of prior art examination, Examiner is interpreting that the intention of claim 9 is to let a certain number of training runs occur before synchronizing in order to prevent pre-emptively identifying stragglers or dropouts.
Dependent claims 10-11 are rejected under 35 U.S.C. 112(b) because they fail to cure the deficiencies of their independent claim.
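The interpretation of claim 9 adopted above for purposes of prior art examination can be sketched as follows. This is a minimal illustration under stated assumptions, not a construction taken from the claims or the specification: the function name profile_participants, the response_log structure, the bin_width parameter, and the equal-size split into tiers are all hypothetical. Response times are accumulated for n_syn epochs, participants whose accumulated time equals n_syn*Tmax are treated as drop outs, and the remaining response times are binned and grouped into tiers.

```python
import math
from collections import Counter


def profile_participants(response_log, n_syn, t_max, n_tiers, bin_width):
    """Sketch of a profiling phase (all names here are hypothetical).

    response_log maps a participant id to its per-epoch response times;
    None means no reply arrived within t_max for that epoch.
    """
    # Accumulate RT_i over the first n_syn epochs, charging t_max for a
    # missing or late reply.
    total_rt = {}
    for pid, times in response_log.items():
        window = times[:n_syn]
        total_rt[pid] = sum(t_max if rt is None else min(rt, t_max) for rt in window)

    # RT_i == n_syn * t_max means the participant never replied in time.
    limit = n_syn * t_max
    dropouts = {pid for pid, rt in total_rt.items() if math.isclose(rt, limit)}
    remaining = {pid: rt for pid, rt in total_rt.items() if pid not in dropouts}

    # Histogram of the remaining accumulated response times (bin index -> count).
    histogram = Counter(int(rt // bin_width) for rt in remaining.values())

    # Assumption: tiers are formed by sorting participants by response time
    # and cutting the sorted list into groups of equal size.
    ordered = sorted(remaining, key=remaining.get)
    per_tier = math.ceil(len(ordered) / n_tiers) if ordered else 0
    tiers = []
    for start in range(0, len(ordered), per_tier or 1):
        members = ordered[start:start + per_tier]
        avg_reply = sum(remaining[p] for p in members) / (len(members) * n_syn)
        tiers.append({"members": members, "avg_reply_time": avg_reply})
    return dropouts, histogram, tiers
```

Under this sketch the synchronization count acts as a threshold on the number of completed epochs rather than as a running count, which is the reading the Examiner applies for prior art purposes.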

Claim Rejections - 35 USC § 101
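35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.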

Claims 9-11 are rejected under 35 U.S.C. 101. Claims 9-11 are directed to a method; therefore, claims 9-11 fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter). However, claims 9-11 fall within the judicial exception of an abstract idea, specifically the abstract ideas of “Mental Processes” (including observation, evaluation, and opinion) and “Mathematical Concepts (including mathematical calculations and relationships)”.
Claim 9:
Step 1: Claim 9 is directed to a method; therefore the claim does fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2A, Prong 1: Claim 9 recites the following abstract ideas:
in response to determining that a number of run epochs is less than a number of synchronization epochs (mental process directed to evaluation, determining that a number of run epochs is less than a number of synchronization epochs could be done in the mind):
updating a response time (RTi) until a maximum time (Tmax) elapses (mental process directed to evaluation; given the broadest reasonable interpretation, updating a response time could be accomplished by a person recording response times in a table with a pen and paper);
in response to determining that a number of run epochs is greater than a number of synchronization epochs (mental process directed to evaluation, determining that a number of run epochs is greater than a number of synchronization epochs could be done in the mind);

creating a histogram of remaining response times (mathematical calculation);
and assigning an average reply time to each tier of a plurality of tiers having a predetermined number of federated learning participants per tier (mental process directed to judgement, given the broadest reasonable interpretation, assigning an average reply time to a plurality of tiers could be accomplished by a person re-sorting data in a table using pen and paper).
Step 2A, Prong 2: Claim 9 recites the following additional elements:
initializing a plurality of federated learning participants in training of a federated learning model; receiving responses from at least some of the plurality of federated learning participants; removing response times of the drop outs; and a computing device. Initializing a plurality of federated learning participants, receiving responses, and removing response times of the drop outs are interpreted as receiving or transmitting data. Performing these steps with a computing device is interpreted as mere instructions to apply an abstract idea using a computer, as the claim does not describe updating a response time, identifying a federated learning participant as a drop out, removing response times, or assigning an average reply time to a tier in a way that requires a computer and excludes the ability to perform these limitations as a mental step. These elements do not integrate the abstract idea into a practical application.
Step 2B: Claim 9 recites the following additional elements:

The independent claim is not patent eligible.
Dependent claims 10-11 when analyzed as a whole are held to be patent ineligible under 35 U.S.C. 101 because the additional recited limitations fail to establish that the claims are not directed to an abstract idea, as they recite further embellishment of the judicial exception.
Claim 10:
Step 1: Claim 10 is directed to a method; therefore the claim does fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2A, Prong 1: Claim 10 recites the following abstract ideas:

Step 2A, Prong 2: Claim 10 recites the following additional elements:
 a computing device. Performing the step of creating a histogram with a computing device is interpreted as mere instructions to apply an abstract idea using a computer, as the claim does not describe creating a histogram in a way that requires a computer and excludes the ability to perform these limitations as a mathematical calculation (that could be performed in the mind, assisted by pen and paper). This element does not integrate the abstract idea into a practical application.
Step 2B: Claim 10 recites the following additional elements:
 a computing device. Performing the step of creating a histogram with a computing device is interpreted as mere instructions to apply an abstract idea using a computer, as the claim does not describe creating a histogram in a way that requires a computer and excludes the ability to perform these limitations as a mathematical calculation (that could be performed in the mind, assisted by pen and paper). This element does not amount to significantly more (see MPEP 2106.05(f)).
Claim 11:
Step 1: Claim 11 is directed to a method; therefore the claim does fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2A, Prong 1: Claim 11 recites the following abstract ideas:

Step 2A, Prong 2: Claim 11 recites the following additional elements:
 a computing device. Performing the step of updating a response time with a computing device is interpreted as mere instructions to apply an abstract idea using a computer, as the claim does not describe updating a response time in a way that requires a computer and excludes the ability to perform these limitations as a mental step (that could be performed in the mind, assisted by pen and paper). This element does not integrate the abstract idea into a practical application.
Step 2B: Claim 11 recites the following additional elements:
 a computing device. Performing the step of updating a response time with a computing device is interpreted as mere instructions to apply an abstract idea using a computer, as the claim does not describe updating a response time in a way that requires a computer and excludes the ability to perform these limitations as a mental step (that could be performed in the mind, assisted by pen and paper). This element does not amount to significantly more (see MPEP 2106.05(f)).
Viewed as a whole, these additional claim elements do not provide meaningful limitations to transform the abstract idea into a patent eligible application of the abstract idea.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4-7, 12, and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Ouyang et al (“ML-NA: A Machine Learning based Node Performance Analyzer Utilizing Straggler Statistics”, herein Ouyang) in view of Prakash et al (US 20190138934 A1, herein Prakash).
Regarding claim 1, Ouyang teaches a computer-implemented method (Ouyang pg. 73 section 1 para. 7 recites “Proposed ML-NA, a Machine Learning based Node performance Analyzer. This multi-stage framework can classify machine nodes into different categories depending on their performance through clustering” (i.e. a method to assign machine learning participants into tiers based on response time)) of communicating [in a federated learning environment], the method comprising:
monitoring a plurality of federated learning participants for one or more factors associated with stragglers (Ouyang fig. 1 and pg. 74 Section II C 1 recites “From task durations for different nodes shown in Figure 1, it is observable that some machines have a shorter average task processing time than the others, while some are either with a much longer average duration indicating a slower execution, or with a larger variation in time of processing tasks, showing an unstable performance.” Ouyang pg. 75 Section II C 2 recites “collecting all Dij from tasks assigned in each node (i.e. monitoring the execution time of each participant) to reflect the quickness or slowness derived from different node performance rather than job heterogeneity. Statistics of Dij values per node are calculated as the basic metrics to measure the node performance” (i.e. identifying factors associated with stragglers));
assigning the federated learning participants into tiers based on the monitoring of the one or more factors, each of the tiers having a designated wait time (Ouyang pg. 77, Section III B 1 recites that “the first step to label the nodes is to put the nodes with similar performance into the same group. In this scenario, clustering is the most well-known technique that can be used, and k-means is one of the simplest whilst very effective clustering algorithms” (i.e. assigning participants into groups based on their response time));
querying the federated learning participants in a selected tier (Ouyang pg. 75 Section II C 2 recites “Dij reveals the relative speed of tij compared to other tasks within Jj . A positive Dij value represents a slower execution because the duration of tij is larger than the job average, and the increment of the positive Dij indicates an aggravated straggler behavior tij exhibits. Vice versa, a negative Dij indicates a shorter response, and the smaller the negative value, the quicker tij performs. We then collect all Dij from tasks assigned in each node to reflect the quickness or slowness derived from different node performance rather than job heterogeneity. Statistics of Dij values per node are calculated as the basic metrics to measure the node performance.” Ouyang pg. 77 Section III B 2 recites “after putting the nodes with similar performance into k groups, we then need to determine which cluster represents the weakest performance group” (i.e. querying a specific set of participants from a given tier));
designating the federated learning participants that respond after a predetermined time within the designated wait time as stragglers (Ouyang pg. 75 Section II C 2 recites that “a positive Dij value represents a slower execution because the duration of tij is larger than the job average, and the increment of the positive Dij indicates an aggravated straggler behavior tij exhibits. Vice versa, a negative Dij indicates a shorter response, and the smaller the negative value, the quicker tij performs” (i.e. a node with a slower execution time is determined to be a straggler));
and updating a training of a federated learning model by applying a predicted response for the stragglers including collected participants' replies and computed predictions associated with the stragglers (Ouyang pg. 76 Section III A 1 recites “if all tasks assigned to node M1 have an average Dij of 2, we can infer that M1 is a weak performance node because most tasks assigned on M1 are stragglers in their own jobs, characterized by 2*σj slower than their own average duration Dij. And we can assume later tasks that are about to be assigned on M1 in the near future will have a possible relative speed around 2*σj times slower as well” (Examiner’s Note: assuming later tasks will be slow based on the inference of the current node speed (having collected current response times from tasks) is updating the model by applying a predicted response based on collected participant replies)).
However, Ouyang does not explicitly teach a federated learning environment.
Prakash teaches a federated learning environment (Prakash para. [0037] recites “coding mechanisms for federated learning based GD algorithms trained from decentralized data available at a plurality of edge compute nodes” (i.e. a federated learning environment)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these teachings by utilizing the methods from Ouyang in the federated learning environment from Prakash. Ouyang recites a heterogeneous machine learning model, but not specifically a federated learning environment. It would be obvious to use the federated learning environment from Prakash so as to save time and bandwidth, and increase user privacy by keeping the training data on the local device and only passing the results of the machine learning training between the local and global models.
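The Dij metric relied upon in the mapping above can be illustrated as follows. This is a minimal sketch, assuming that Dij is the task duration normalized by the mean and standard deviation of its job, which is consistent with the quoted statement that an average Dij of 2 corresponds to tasks running 2*σj slower than the job average. The exact formula in Ouyang is not quoted in this action, and the function names and the (job, node, duration) input layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def relative_speeds(tasks):
    """tasks is a list of (job_id, node_id, duration) tuples (hypothetical layout).

    Returns node_id -> list of D_ij values, where D_ij is assumed to be
    (duration - job mean) / job standard deviation.
    """
    durations_by_job = defaultdict(list)
    for job, _node, duration in tasks:
        durations_by_job[job].append(duration)
    job_stats = {job: (mean(ds), pstdev(ds)) for job, ds in durations_by_job.items()}

    per_node = defaultdict(list)
    for job, node, duration in tasks:
        mu, sigma = job_stats[job]
        if sigma > 0:  # a job whose tasks all take the same time carries no signal
            per_node[node].append((duration - mu) / sigma)
    return dict(per_node)


def node_features(per_node):
    """Average and standard deviation of D_ij per node."""
    return {node: (mean(ds), pstdev(ds)) for node, ds in per_node.items()}
```

A positive per-node average under this sketch marks a node whose tasks tend to run slower than their job siblings; a negative average marks a faster node.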

Regarding claim 4, the combination of Ouyang and Prakash teaches the method according to claim 1, wherein the selected tier for querying is selected by a randomizing procedure (Prakash para. [0035] recites that “the central server selects a random set of client compute nodes in each epoch to provide additional updates and waits for the selected client compute nodes to return their updated models. The central server averages the received models to obtain the final global model” (i.e. a tier is selected randomly)).
Regarding claim 5, the combination of Ouyang and Prakash teaches the method according to claim 1, further comprising: periodically updating the training of the federated learning model with the collected participants' replies and computed predictions of the stragglers (Ouyang pg. 75 Section II C 2 and Figure 3 give five nodes as an example. Each line in the graph represents a node, with y-axis being the Dij average for the specific month, showing how the average execution time changes each month (i.e. periodically updating the training model)).
Regarding claim 6, the combination of Ouyang and Prakash teaches the method according to claim 1, further comprising:
updating the monitoring of the federated learning participants (Ouyang pg. 75 Section II C 2 and Figure 3 show how the monitoring of participants is updated over time); 
and determining whether to reassign the federated learning participants into different tiers, based on the updated monitoring for each synchronization time period of a plurality of synchronization time periods (Ouyang pg. 74 Section 1 para. 4 recites “through classifying nodes into different categories and predicting the corresponding performance category with high accuracy, the scheduler can select suitable nodes to launch latency-sensitive tasks, avoid assigning speculative tasks onto nodes that are likely to be in their weak performance state in the near future” (i.e. reassigning participants based on monitoring data)).
Regarding claim 7, the combination of Ouyang and Prakash teaches the method according to claim 1, further comprising: dynamically rearranging the tiers based on updated monitoring of the federated learning participants (Ouyang pg. 73, the abstract recites that “by leveraging historical parallel tasks execution log data, ML-NA classifies cluster nodes into different categories and predicts their performance in the near future as a scheduling guide to improve speculation effectiveness and minimize task straggler generation” (i.e. dynamically rearranges participants into different categories based on monitoring data)).
Claim 12 is a computer readable storage medium claim and its limitation is included in claim 1. The only difference is that claim 12 requires a computer readable storage medium (Prakash para. [0076] recites “the components of the computer network 120 may be implemented in one physical node or separate physical nodes including components to read and execute instructions from a machine-readable or computer-readable medium” (e.g., a non-transitory machine-readable storage medium)). Therefore, claim 12 is rejected for the same reasons as claim 1.
Claim 17 is a computer readable storage medium claim and its limitation is included in claim 7. Claim 17 is rejected for the same reasons as claim 7.
Claim 18 is a computer readable storage medium claim and its limitation is included in claim 5. Claim 18 is rejected for the same reasons as claim 5.
Claim 19 is a computer readable storage medium claim and its limitation is included in claim 4. Claim 19 is rejected for the same reasons as claim 4.

Claims 2-3, 13-15, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Ouyang et al (“ML-NA: A Machine Learning based Node Performance Analyzer Utilizing Straggler Statistics”, herein Ouyang) in view of Prakash et al (US 20190138934 A1, herein Prakash), in further view of McColl (WO 2019086120 A1, herein McColl).
Regarding claim 2, the combination of Ouyang and Prakash teaches the computer-implemented method according to claim 1.
The combination of Ouyang and Prakash does not explicitly teach updating the training of the federated learning model with collected participants' replies and computed predictions in response to identifying whether a quorum of federated learning participants has responded to the querying; and identifying the federated learning participants that do not respond within the designated wait time as drop outs.
McColl teaches updating the training of the federated learning model with collected participants' replies and computed predictions in response to identifying whether a quorum of federated learning participants has responded to the querying (McColl pg. 16, lines 18-21 recite “in a first step 901, it is checked whether a first process/subtask is still being executed by the current computing node 201 and has not been completed by another computing node 201 yet. If this is the case, computation must stop, as there is no completed copy of that sub-task” (i.e. there is not a quorum of participants that have responded)); 
and identifying the federated learning participants that do not respond within the designated wait time as drop outs (McColl pg. 14 lines 34-36 recite “in an embodiment, the computation performed by the distributed computing system 200 has a TailLimit T. In an embodiment, any sub-task that fails to complete before T*MinTime is marked as a fault/tail, others can” (i.e. a participant that does not respond is designated as a drop out)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these teachings by utilizing the fault tolerance method from McColl with the straggler avoidance method from Ouyang in the federated learning environment from Prakash. All three of these methods are trying to improve a distributed machine learning system, but neither Prakash nor Ouyang recite a method to handle a system failure (i.e. a dropout). It would be obvious to include this capability to prevent a system failure from delaying or stopping the progress of the other distributed machines.
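The tail-marking rule quoted from McColl (a sub-task that fails to complete before T*MinTime is marked as a fault/tail) can be sketched as follows. This is an illustration only: it assumes MinTime is the shortest observed completion time among the sub-tasks, which the quoted passage does not state, and the function name and input layout are hypothetical.

```python
def mark_tails(completion_times, tail_limit):
    """Return the sub-tasks marked as fault/tail.

    completion_times maps a sub-task id to its completion time, or None if
    it never completed. tail_limit is the TailLimit T. MinTime is assumed
    here to be the shortest observed completion time.
    """
    finished = [t for t in completion_times.values() if t is not None]
    if not finished:
        return set()  # no reference time is available yet
    cutoff = tail_limit * min(finished)
    # Anything that did not complete strictly before the cutoff is a tail.
    return {
        task
        for task, t in completion_times.items()
        if t is None or t >= cutoff
    }
```

In the combination proposed above, a participant marked this way would correspond to a drop out rather than a straggler, since it is excluded instead of having its response predicted.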

Regarding claim 3, the combination of Ouyang, Prakash, and McColl teaches the method according to claim 2, wherein for each round of updating the training of the federated learning model, updating the designated wait time per tier, and the method further comprising:
determining an accuracy of the training of the federated learning model according to one or more predetermined criteria (Ouyang pg. 78 Section IV B para. 3 recites that “figure 6(a) concludes the minimal, average and maximum accuracies when predicting each month’s node performance categories utilizing different training sizes with the optimal parameter settings” (i.e. determining an accuracy of the training model)), 
and terminating an asynchronized training stage of the federated learning model when the accuracy does not increase after a predetermined number of asynchronization time periods (Prakash fig. 2 and para. [0098] recite “at operation 233, the edge compute nodes 2101 calculate an updated partial gradient, which is then provided to the master node 2112 at operation 236 for further aggregation similar to operation 227 (not shown by FIG. 2). Operations 224-236 repeat until the underlying model sufficiently converges” (Examiner’s note: convergence is well known in the art to represent the point at which the model is closest to a desired value (i.e. the accuracy will not increase further))).
Claim 13 is a computer readable storage medium claim and its limitation is included in claim 2. The only difference is that claim 13 requires a computer readable storage medium (Prakash para. [0076] recites “the components of the computer network 120 may be implemented in one physical node or separate physical nodes including components to read and execute instructions from a machine-readable or computer-readable medium” (e.g., a non-transitory machine-readable storage medium)). Therefore, claim 13 is rejected for the same reasons as claim 2.
Regarding claim 14, the combination of Ouyang, Prakash, and McColl teaches the computer readable storage medium according to claim 13, wherein the monitoring of the plurality of federated learning participants further comprises capturing behavior patterns of the federated learning participants (Ouyang pg. 73, the abstract recites that “by leveraging historical parallel tasks execution log data, ML-NA classifies cluster nodes into different categories and predicts their performance in the near future as a scheduling guide to improve speculation effectiveness and minimize task straggler generation” (i.e. monitoring behavior patterns of participants)).
Regarding claim 15, the combination of Ouyang, Prakash, and McColl teaches the computer readable storage medium according to claim 14, further comprising identifying at least one of the drop outs or predicting at least one of the stragglers based on the captured behavior patterns of the federated learning participants (Ouyang pg. 73, the abstract recites that “by leveraging historical parallel tasks execution log data, ML-NA classifies cluster nodes into different categories and predicts their performance in the near future as a scheduling guide to improve speculation effectiveness and minimize task straggler generation” (i.e. predicting a straggler based on the known behavior of participants)).
Claim 20 is a computer readable storage medium claim and its limitation is included in claim 3. Claim 20 is rejected for the same reasons as claim 3.

Claims 9-11 are rejected under 35 U.S.C. 103 as being unpatentable over Ouyang et al (“ML-NA: A Machine Learning based Node Performance Analyzer Utilizing Straggler Statistics”, herein Ouyang) in view of Prakash et al (US 20190138934 A1, herein Prakash), in further view of McColl (WO 2019086120 A1, herein McColl) and Martin et al (US 9946465 B1, herein Martin).
Regarding claim 9, Ouyang teaches a computer-implemented method (Ouyang pg. 73 section 1 para. 7 recites “Proposed ML-NA, a Machine Learning based Node performance Analyzer. This multi-stage framework can classify machine nodes into different categories depending on their performance through clustering” (i.e. a method to assign machine learning participants into tiers based on response time))  of communicating [in a federated learning environment], the method comprising:
assigning, by the computing device (Ouyang pg. 73, the abstract recites “Current Cloud clusters often consist of heterogeneous machine nodes, which can trigger performance challenges such as the task straggler problem, whereby a small subset of parallel tasks running abnormally slower than the other sibling ones” and later the abstract recites “In this paper we develop ML-NA, a Machine Learning based Node performance Analyzer. By leveraging historical parallel tasks execution log data, ML-NA classifies cluster nodes into different categories and predicts their performance in the near future as a scheduling guide to improve speculation effectiveness and minimize task straggler generation. We consider MapReduce as a representative framework to perform our analysis, and use the published OpenCloud trace as a case study to train and to evaluate our model” (i.e. the methods of Ouyang are performed by a computing device)), an average reply time to each tier of a plurality of tiers having a predetermined number of federated learning participants per tier (Ouyang pg. 76 Section III A 2 recites that “the three basic meta-features selected to build up the node performance analysis model are the average and the standard deviation of all Dij from tasks per node, as well as the normalized task number” (i.e. the average reply time is calculated for each cluster or tier)).
However, Ouyang does not explicitly teach a federated learning environment.
Prakash teaches a federated learning environment (para. [0037] recites “coding mechanisms for federated learning based GD algorithms trained from decentralized data available at a plurality of edge compute nodes” (i.e. a federated learning environment)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these teachings by utilizing the methods from Ouyang in the federated learning environment from Prakash. Ouyang recites a heterogeneous machine learning model, but not specifically a federated learning environment. Using the federated learning environment from Prakash would allow one of ordinary skill to save time and bandwidth and increase user privacy by keeping the training data on the local device and only passing the results of the machine learning training between the local and global models.
The combination of Ouyang and Prakash does not explicitly teach initializing a plurality of federated learning participants in training of a federated learning model; (a) in response to determining that a number of run epochs is less than a number of synchronization epochs (nsyn): receiving responses from at least some of the plurality of federated learning participants; and updating a response time (RTi) until a maximum time (Tmax) elapses; (b) in response to determining that a number of run epochs is greater than a number of 
McColl teaches initializing, by a computing device (Ouyang pg. 73, the abstract recites “Current Cloud clusters often consist of heterogeneous machine nodes, which can trigger performance challenges such as the task straggler problem, whereby a small subset of parallel tasks running abnormally slower than the other sibling ones” and later the abstract recites “In this paper we develop ML-NA, a Machine Learning based Node performance Analyzer. By leveraging historical parallel tasks execution log data, ML-NA classifies cluster nodes into different categories and predicts their performance in the near future as a scheduling guide to improve speculation effectiveness and minimize task straggler generation. We consider MapReduce as a representative framework to perform our analysis, and use the published OpenCloud trace as a case study to train and to evaluate our model” (i.e. the methods of Ouyang are performed by a computing device)), a plurality of federated learning participants in training of a federated learning model (McColl fig. 6 and pg. 15, lines 25-27 recite “in a first step 601, a first process/sub-task is executed by a current computing node 201 and communications are initiated” (i.e. initializing a learning participant))
(a) in response to determining that a number of run epochs is less than a number of synchronization epochs (McColl fig. 6 and pg. 15 lines 28-30 recite “in a further step 603, it is checked whether the minimum duration, i.e. MinTime has been set and/or whether the synchronization parameter has been set to "False" for the current computing node 201” (i.e. the nodes have not all synchronized yet, but they are not determined to be stragglers yet either)): 
receiving, by the computing device (Ouyang pg. 73, the abstract recites “Current Cloud clusters often consist of heterogeneous machine nodes, which can trigger performance challenges such as the task straggler problem, whereby a small subset of parallel tasks running abnormally slower than the other sibling ones” and later the abstract recites “In this paper we develop ML-NA, a Machine Learning based Node performance Analyzer. By leveraging historical parallel tasks execution log data, ML-NA classifies cluster nodes into different categories and predicts their performance in the near future as a scheduling guide to improve speculation effectiveness and minimize task straggler generation. We consider MapReduce as a representative framework to perform our analysis, and use the published OpenCloud trace as a case study to train and to evaluate our model” (i.e. the methods of Ouyang are performed by a computing device)), responses from at least some of the plurality of federated learning participants (McColl fig. 6 and pg. 15 lines 30-31 recite “in a further step 603, it is checked whether the minimum duration, i.e. MinTime has been set and/or whether the synchronization parameter has been set to "False" for the current computing node 201. If this is not the case, the current computing node 201 tries to set the minimum duration, i.e. MinTime in a further step 605” (i.e. responses are being received because the nodes have not been instructed to synchronize yet));
and updating, by the computing device (Ouyang pg. 73, the abstract recites “Current Cloud clusters often consist of heterogeneous machine nodes, which can trigger performance challenges such as the task straggler problem, whereby a small subset of parallel tasks running abnormally slower than the other sibling ones” and later the abstract recites “In this paper we develop ML-NA, a Machine Learning based Node performance Analyzer. By leveraging historical parallel tasks execution log data, ML-NA classifies cluster nodes into different categories and predicts their performance in the near future as a scheduling guide to improve speculation effectiveness and minimize task straggler generation. We consider MapReduce as a representative framework to perform our analysis, and use the published OpenCloud trace as a case study to train and to evaluate our model” (i.e. the methods of Ouyang are performed by a computing device)), a response time (RTi) until a maximum time (Tmax) elapses (McColl fig. 6 and pg. 15 lines 31-34 recite “in a further step 607, the current computing node 201 notifies other clones, i.e. computing nodes 201 executing the same process/sub-task about the completion of the process by the current computing node 201” (i.e. the nodes can update their response times because the elapsed time is not larger than the maximum allowed time yet));
(b) in response to determining that a number of run epochs is greater than a number of synchronization epochs (McColl fig. 8 and pg. 16, lines 10-12 recite “in a first step 801, it is check whether the computing round is not complete yet and whether the elapsed time is larger than the minimum duration or a multiple thereof, e.g. T*MinTime” (i.e. the number of run epochs is greater than the number of synchronization epochs)):
identifying, by the computing device (Ouyang pg. 73, the abstract recites “Current Cloud clusters often consist of heterogeneous machine nodes, which can trigger performance challenges such as the task straggler problem, whereby a small subset of parallel tasks running abnormally slower than the other sibling ones” and later the abstract recites “In this paper we develop ML-NA, a Machine Learning based Node performance Analyzer. By leveraging historical parallel tasks execution log data, ML-NA classifies cluster nodes into different categories and predicts their performance in the near future as a scheduling guide to improve speculation effectiveness and minimize task straggler generation. We consider MapReduce as a representative framework to perform our analysis, and use the published OpenCloud trace as a case study to train and to evaluate our model” (i.e. the methods of Ouyang are performed by a computing device)), a federated learning participant from the plurality of federated learning participants as a drop out for which RTi= nsyn * Tmax (McColl fig. 8 and pg. 16, lines 9-14 recite “if this is the case, the respective computing node 201 is a high-latency computing node 201 and will send Tail Limit interrupts in a step 803 to the other computing nodes 201” (i.e. the participant that has not responded is designated as a drop out));
and removing, by the computing device (Ouyang pg. 73, the abstract recites “Current Cloud clusters often consist of heterogeneous machine nodes, which can trigger performance challenges such as the task straggler problem, whereby a small subset of parallel tasks running abnormally slower than the other sibling ones” and later the abstract recites “In this paper we develop ML-NA, a Machine Learning based Node performance Analyzer. By leveraging historical parallel tasks execution log data, ML-NA classifies cluster nodes into different categories and predicts their performance in the near future as a scheduling guide to improve speculation effectiveness and minimize task straggler generation. We consider MapReduce as a representative framework to perform our analysis, and use the published OpenCloud trace as a case study to train and to evaluate our model” (i.e. the methods of Ouyang are performed by a computing device)), response times of the drop outs (Prakash para. [0274] recites “as the transient devices 1512, 1514, and 1516, leave the vicinity of the fog 1520, it may reconfigure itself to eliminate those IoT devices 1504 from the network” (Examiner’s Note: once dropouts have been identified in the previous limitation, they are removed and therefore their associated response times are removed during reconfiguration)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these teachings by utilizing the fault tolerance method from McColl with the straggler avoidance method from Ouyang in the federated learning environment from Prakash. All three of these methods are designed to improve a distributed machine learning system, but neither Prakash nor Ouyang explicitly recites a method to handle a system failure (i.e. a dropout). It would have been obvious to include this capability to prevent a system failure from delaying or stopping the progress of the other distributed machines.
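For illustration only, the profiling steps recited in claim 9, limitations (a) and (b), together with the Tmax assignment of claim 11, may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from any cited reference: the function name, the input layout, and the choice to accumulate RTi as a sum over the synchronization epochs are all hypothetical.

```python
from typing import Dict, List, Optional


def profile_participants(
    replies: List[Dict[str, Optional[float]]],
    t_max: float,
) -> Dict[str, float]:
    """Accumulate a response time RT_i per participant over n_syn epochs.

    replies[e][p] is the reply time of participant p in epoch e, or None
    when no reply arrived (hypothetical layout; every participant is
    assumed to appear in every epoch).
    """
    n_syn = len(replies)
    rt: Dict[str, float] = {}
    misses: Dict[str, int] = {}
    for epoch in replies:
        for participant, reply_time in epoch.items():
            late = reply_time is None or reply_time > t_max
            # Claim 11: a participant that did not reply in time is charged T_max.
            rt[participant] = rt.get(participant, 0.0) + (t_max if late else reply_time)
            misses[participant] = misses.get(participant, 0) + (1 if late else 0)
    # Claim 9(b): a participant that missed every epoch has RT_i = n_syn * T_max.
    # Counting misses avoids comparing floating-point sums for equality.
    # Such a participant is treated as a drop out and its response time is removed.
    return {p: t for p, t in rt.items() if misses[p] < n_syn}
```

With two synchronization epochs and Tmax of 5.0, a participant that never replies accumulates 10.0 and is removed, while a participant that replies once in 2.0 and misses once keeps a response time of 7.0.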
The combination of Ouyang, Prakash, and McColl does not explicitly teach creating a histogram of remaining response times.
Martin teaches creating, by the computing device (Ouyang pg. 73, the abstract recites “Current Cloud clusters often consist of heterogeneous machine nodes, which can trigger performance challenges such as the task straggler problem, whereby a small subset of parallel tasks running abnormally slower than the other sibling ones” and later the abstract recites “In this paper we develop ML-NA, a Machine Learning based Node performance Analyzer. By leveraging historical parallel tasks execution log data, ML-NA classifies cluster nodes into different categories and predicts their performance in the near future as a scheduling guide to improve speculation effectiveness and minimize task straggler generation. We consider MapReduce as a representative framework to perform our analysis, and use the published OpenCloud trace as a case study to train and to evaluate our model” (i.e. the methods of Ouyang are performed by a computing device)), a histogram of remaining response times (Martin col. 2, lines 54-64 recite “determining one of a plurality of l/O workload classifications for each of the plurality of data sets in accordance with said set of values of said each data set; and for each of the plurality of I/O workload classifications including more than one of the plurality of data sets, combining said more than one of the plurality of data sets into a first aggregate data set including an aggregate set of values in accordance with said set of values of each of said more than one data set and including an aggregate response time histogram in accordance with said response time histogram of each of said more than one data set” (Examiner’s Note: a histogram can be created for the remaining response times from the previous limitation)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these teachings by creating histograms for the response times in each tier using the methods from Martin along with the fault tolerance method from McColl and the straggler avoidance method from Ouyang in the federated learning environment from Prakash. Once McColl has identified drop-outs, Prakash can remove the drop outs and their response times, so that Martin can create a histogram of remaining response times to more accurately reflect the state of the federated learning system.  Martin, Ouyang, McColl, and Prakash are all built to track response times for distributed machine learning systems, so it would be obvious to include the ability to create histograms to allow one of ordinary skill to visualize the response times and determine whether a parameter 

Regarding claim 10, the combination of Ouyang, Prakash, McColl, and Martin teaches the method according to claim 9, wherein when the number of run epochs is greater than the number of synchronization epochs, the method further comprising:
creating, by the computing device (Ouyang pg. 73, the abstract recites “Current Cloud clusters often consist of heterogeneous machine nodes, which can trigger performance challenges such as the task straggler problem, whereby a small subset of parallel tasks running abnormally slower than the other sibling ones” and later the abstract recites “In this paper we develop ML-NA, a Machine Learning based Node performance Analyzer. By leveraging historical parallel tasks execution log data, ML-NA classifies cluster nodes into different categories and predicts their performance in the near future as a scheduling guide to improve speculation effectiveness and minimize task straggler generation. We consider MapReduce as a representative framework to perform our analysis, and use the published OpenCloud trace as a case study to train and to evaluate our model” (i.e. the methods of Ouyang are performed by a computing device)), a histogram of remaining response times (Martin col. 2, lines 54-64 recite “determining one of a plurality of l/O workload classifications for each of the plurality of data sets in accordance with said set of values of said each data set; and for each of the plurality of I/O workload classifications including more than one of the plurality of data sets, combining said more than one of the plurality of data sets into a first aggregate data set including an aggregate set of values in accordance with said set of values of each of said more than one data set and including an aggregate response time histogram in accordance with said response time histogram of each of said more than one data set” (i.e. a histogram of response times)); 
and dividing the histogram into the plurality of tiers including the plurality of federated learning participants (Martin col. 2, lines 46-54 recite that “the method may include collecting a plurality of data sets for a plurality of time periods, wherein each of the plurality of data sets is collected during one of the plurality of time periods and said each data set includes a set of values for a plurality of parameters characterizing I/O workload for said one time period and a response time histogram characterizing response time for said one time period” (i.e. the aggregated histogram is created from histograms that correspond to the plurality of tiers)).
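For illustration only, the histogram and tier limitations of claims 9 and 10 may be sketched as follows. This is a minimal sketch under stated assumptions: it uses equal-width histogram bins, treats each bin as one tier, and reports an average reply time per tier. Neither the claims nor the cited references specify the binning rule, and the function and variable names are hypothetical.

```python
from typing import Dict, List, Optional


def split_into_tiers(rt: Dict[str, float], num_tiers: int) -> List[List[str]]:
    """Bin participants by response time into equal-width histogram bins."""
    lo, hi = min(rt.values()), max(rt.values())
    # Fall back to a width of 1.0 when all response times are identical.
    width = (hi - lo) / num_tiers or 1.0
    tiers: List[List[str]] = [[] for _ in range(num_tiers)]
    for participant, t in sorted(rt.items(), key=lambda kv: kv[1]):
        # Clamp so the largest response time lands in the last bin.
        index = min(int((t - lo) / width), num_tiers - 1)
        tiers[index].append(participant)
    return tiers


def average_reply_times(
    rt: Dict[str, float], tiers: List[List[str]]
) -> List[Optional[float]]:
    """Average reply time per tier; None marks an empty tier."""
    return [
        sum(rt[p] for p in tier) / len(tier) if tier else None
        for tier in tiers
    ]
```

Equal-width bins can leave a tier empty. A rule that places a predetermined number of participants in each tier, as recited in claim 9, would instead split the sorted list into fixed-size groups.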
Regarding claim 11, the combination of Ouyang, Prakash, McColl, and Martin teaches the method according to claim 9, further comprising:
updating, by the computing device (Ouyang pg. 73, the abstract recites “Current Cloud clusters often consist of heterogeneous machine nodes, which can trigger performance challenges such as the task straggler problem, whereby a small subset of parallel tasks running abnormally slower than the other sibling ones” and later the abstract recites “In this paper we develop ML-NA, a Machine Learning based Node performance Analyzer. By leveraging historical parallel tasks execution log data, ML-NA classifies cluster nodes into different categories and predicts their performance in the near future as a scheduling guide to improve speculation effectiveness and minimize task straggler generation. We consider MapReduce as a representative framework to perform our analysis, and use the published OpenCloud trace as a case study to train and to evaluate our model” (i.e. the methods of Ouyang are performed by a computing device)), a response time to Tmax for the federated learning participants from which responses were not received by an aggregator when the number of run epochs is less than a number of synchronization epochs (McColl pg. 14, lines 26-28 recite that “each sub-task can have access not only to its own local data and state, but also to other information including: a copy of the minimum duration, i.e. MinTime for the computation round; its elapsed time for the round” (i.e. each participant has its own response time). Pg. 14, lines 34-36 recite that “the computation performed by the distributed computing system 200 has a TailLimit T (i.e. Tmax). In an embodiment, any sub-task that fails to complete before T*MinTime is marked as a fault/tail, others can be marked as live” (i.e. the participants that did not respond are assigned the maximum response time)).

Conclusion
Applicant's amendment necessitated the new grounds of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEAH M FEITL whose telephone number is (571)272-8350. The examiner can normally be reached on M-F 0800-1700.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B. Zhen can be reached on (571) 272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/L.M.F./ Examiner, Art Unit 2121
/Li B. Zhen/ Supervisory Patent Examiner, Art Unit 2121