DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Remarks
This action is in response to the amendment filed on 12/02/2021.
Claims 1-8 and 11-22 are pending.
Claims 1, 14, 19 and 20 are independent claims.
Claims 9 and 10 have been canceled via Examiner’s amendment.
Claims 21-22 have been added via Examiner’s amendment.
Claims 1, 11, 14, 19 and 20 are currently amended via Examiner’s amendment.
Claims 1-8 and 11-22 are allowed.

EXAMINER’S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this examiner’s amendment was given in a telephone interview with Stephen LoVerme (Reg. No. 72,363) on 12/14/2021 to place the application in condition for allowance. 

The application has been amended as follows: 
In the Claims:
	Claims 1, 11, 14, 19 and 20 have been amended via Examiner’s amendment.
	Claims 9 and 10 have been canceled via Examiner’s amendment.
	Claims 21 and 22 have been added via Examiner’s amendment.

This list of claims will replace all prior versions, and listings, of claims in the application:
 
List of the Claims: 

1.	(Currently Amended) A computer-implemented method comprising:
obtaining (i) a size of a model and (ii) a set of batch sizes for an input dataset corresponding to a job to be processed using a distributed computing system, wherein said distributed computing system comprises a plurality of nodes;
computing, based at least in part on the set of batch sizes, a plurality of node counts, each node count corresponding to a number of nodes that can be used for processing said job;
estimating, for each given one of said node counts, an execution time to process the job using the number of nodes corresponding to the given node count, wherein said estimating comprises determining (i) an average computation time for a batch of said input dataset at least in part by obtaining aggregation communication timing information for each of the node counts based on one or more communication timing measurements, wherein obtaining the aggregation communication timing information comprises calling an Allreduce function on different numbers of nodes and a buffer size equal to the model size and (ii) an average communication time, corresponding to communications between at least a portion of the plurality of nodes, for said batch of said input dataset, wherein the average communication time is based at least in part on the size of the model and the number of nodes corresponding to the given node count; and
selecting, based at least in part on said estimating, at least one of said node counts for processing the job;
	wherein the method is carried out by at least one computing device.

2.	(Original) The computer-implemented method of claim 1, wherein said estimating comprises estimating a cost to process the job using the number of nodes corresponding to the given node count.

3. 	(Previously Presented) The computer-implemented method of claim 2, wherein said estimating the cost is based at least in part on pricing information associated with using one or more of the plurality of nodes and the estimated execution time.

4. 	(Original) The computer-implemented method of claim 1, comprising:
causing information to be output to a user interface, the information comprising the estimated execution time for each of the selected node counts, and
in response to causing said information to be output, causing the job to be processed with one of said selected node counts based on user input received via the user interface.

5. 	(Original) The computer-implemented method of claim 1, wherein said selecting is based at least in part on one or more constraints provided by a user, wherein said one or more constraints comprises at least one of (i) a cost constraint and (ii) a time constraint. 

6. 	(Previously Presented) The computer-implemented method of claim 5, wherein said selecting comprises selecting an optimal node count for the one or more user constraints. 

7. 	(Original) The computer-implemented method of claim 6, comprising:
causing the job to be processed with the optimal node count.

8. 	(Previously Presented) The computer-implemented method of claim 1, wherein said determining said average computation time for said batch of said input dataset comprises:
determining an amount of time to process one or more batches of the input dataset using a single one of the plurality of nodes.

9.	(Canceled) 


10. 	(Canceled)  

11. 	(Currently Amended) The computer-implemented method of claim 1, wherein said determining said average communication time comprises:
interpolating the aggregation communication timing information for a first one of the node counts and a second one of the node counts to determine aggregation communication timing information for at least a third one of the node counts. 


a deep learning job for training a Neural Network; 
a higher order singular value decomposition job; and 
a history matching job in reservoir characterization. 

13. 	(Previously Presented) The computer-implemented method of claim 1, wherein said obtaining comprises obtaining a batch size per node, and wherein said computing the plurality of node counts is based at least in part on said batch size per node. 

	14.	(Currently Amended) A computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computing device to cause the computing device to:
obtain (i) a size of a model, and (ii) a set of batch sizes for an input dataset corresponding to a job to be processed using a distributed computing system, wherein said distributed computing system comprises a plurality of nodes;
compute, based at least in part on the set of batch sizes, a plurality of node counts, each node count corresponding to a number of nodes that can be used for processing said job;
estimate, for each given one of said node counts, an execution time to process the job using the number of nodes corresponding to the given node count, wherein said estimating comprises determining (i) an average computation time for a batch of said input dataset at least in part by obtaining aggregation communication timing information for each of the node counts based on one or more communication timing measurements, wherein obtaining the aggregation communication timing information comprises calling an Allreduce function on different numbers of nodes and a buffer size equal to the model size and (ii) an average communication time, corresponding to communications between at least a portion of the plurality of nodes, for said batch of said input dataset, wherein the average communication time is based at least in part on the size of the model and the number of nodes corresponding to the given node count; and
select, based at least in part on said estimating, at least one of said node counts for processing the job. 

15. 	(Original) The computer program product of claim 14, wherein said estimating comprises estimating a cost to process the job using the number of nodes corresponding to the given node count.

16. 	(Previously Presented) The computer program product of claim 15, wherein said estimating the cost is based at least in part on pricing information associated with using one or more of the plurality of nodes and the estimated execution time.

17. 	(Original) The computer program product of claim 14, wherein the program instructions further cause the computing device to:
cause information to be output to a user interface comprising the estimated execution time for each of the selected node counts, and
in response to causing said information to be output, cause the job to be processed with one of said selected node counts based on user input received via the user interface.



19.	(Currently Amended) A system comprising:
	a memory; and
	at least one processor operably coupled to the memory and configured for:
obtaining (i) a size of a model and (ii) a set of batch sizes for an input dataset corresponding to a job to be processed using a distributed computing system, wherein said distributed computing system comprises a plurality of nodes;
computing, based at least in part on the set of batch sizes, a plurality of node counts, each node count corresponding to a number of nodes that can be used for processing said job;
estimating, for each given one of said node counts, an execution time to process the job using the number of nodes corresponding to the given node count, wherein said estimating comprises determining (i) an average computation time for a batch of said input dataset at least in part by obtaining aggregation communication timing information for each of the node counts based on one or more communication timing measurements, wherein obtaining the aggregation communication timing information comprises calling an Allreduce function on different numbers of nodes and a buffer size equal to the model size and (ii) an average communication time, corresponding to communications between at least a portion of the plurality of nodes, for said batch of said input dataset, wherein the average communication time is based at least in part on the size of the model and the number of nodes corresponding to the given node count; and
selecting, based at least in part on said estimating, at least one of said node counts for processing the job.


20.	(Currently Amended) A computer-implemented method comprising:
determining a plurality of computing resource configurations of a distributed computing system for processing a batch based job based at least in part on (i) a size of a model and (ii) a set of batch sizes for an input dataset corresponding to the batch based job; 
determining, for each given one of said computing resource configurations, (i) an average computation time to process a batch of said input dataset using the given computing resource configuration at least in part by obtaining aggregation communication timing information for each of the node counts based on one or more communication timing measurements, wherein obtaining the aggregation communication timing information comprises calling an Allreduce function on different numbers of nodes and a buffer size equal to the model size and (ii) an average communication time, corresponding to communications between at least a portion of the plurality of nodes, for said batch of said input dataset; and
selecting the optimal computing resource configuration for processing the batch based job from among the plurality of computing resource configurations, based at least in part on (i) the average computation time and (ii) the average communication time determined for each of the computing resource configurations; 
wherein the method is carried out by at least one computing device.

21. 	(New) The computer program product of claim 18, wherein said selecting comprises selecting an optimal node count for the one or more user constraints. 

22. 	(New) The computer program product of claim 15, wherein said determining said average computation time for said batch of said input dataset comprises:
determining an amount of time to process one or more batches of the input dataset using a single one of the plurality of nodes.
	
REASONS FOR ALLOWANCE
The following is an examiner’s statement of reasons for allowance:
The cited prior art, taken alone or in combination, fails to teach, in combination with other claimed limitations, “estimating, for each given one of said node counts, an execution time to process the job using the number of nodes corresponding to the given node count, wherein said estimating comprises determining (i) an average computation time for a batch of said input dataset at least in part by obtaining aggregation communication timing information for each of the node counts based on one or more communication timing measurements, wherein obtaining the aggregation communication timing information comprises calling an Allreduce function on different numbers of nodes and a buffer size equal to the model size and (ii) an average communication time, corresponding to communications between at least a portion of the plurality of nodes, for said batch of said input dataset, wherein the average communication time is based at least in part on the size of the model and the number of nodes corresponding to the given node count; and selecting, based at least in part on said estimating, at least one of said node counts for processing the job;” as recited in the independent claims 1, 14 and 19.
The cited prior art, taken alone or in combination, also fails to teach, in combination with other claimed limitations, “determining, for each given one of said computing resource configurations, (i) an average computation time to process a batch of said input dataset using the given computing resource configuration at least in part by obtaining aggregation communication timing information for each of the node counts based on one or more communication timing measurements, wherein obtaining the aggregation communication timing information comprises calling an Allreduce function on different numbers of nodes and a buffer size equal to the model size and (ii) an average communication time, corresponding to communications between at least a portion of the plurality of nodes, for said batch of said input dataset; and selecting the optimal computing resource configuration for processing the batch based job from among the plurality of computing resource configurations, based at least in part on (i) the average computation time and (ii) the average communication time determined for each of the computing resource configurations;” as recited in the independent claim 20.

These claimed limitations are not present in the prior art of record and would not have been obvious; thus, all pending claims 1-8 and 11-22 are allowed.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
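
For illustration only, the estimating and selecting limitations quoted in the statement of reasons above can be sketched in Python as follows. This sketch is not taken from the application or the record. Every function name, the execution-time model (iterations multiplied by the sum of per-batch computation time and per-batch communication time), the linear interpolation, the pricing formula, and all numeric values are assumptions made for the example. The Allreduce timings are supplied as a plain dictionary that is assumed to have been measured beforehand with a buffer size equal to the model size; no Allreduce call is made here.

```python
import math


def candidate_node_counts(batch_sizes, batch_per_node):
    """Hypothetical helper: derive node counts from global batch sizes."""
    return sorted({b // batch_per_node for b in batch_sizes
                   if b >= batch_per_node and b % batch_per_node == 0})


def comm_time(allreduce_timings, n):
    """Per-batch communication time for n nodes (assumed model).

    allreduce_timings maps a node count to a measured time in seconds.
    Unmeasured node counts are linearly interpolated between neighbors.
    """
    if n in allreduce_timings:
        return allreduce_timings[n]
    below = [k for k in allreduce_timings if k < n]
    above = [k for k in allreduce_timings if k > n]
    if not below or not above:
        raise ValueError("node count outside the measured range")
    lo, hi = max(below), min(above)
    frac = (n - lo) / (hi - lo)
    return allreduce_timings[lo] + frac * (allreduce_timings[hi] - allreduce_timings[lo])


def estimate_execution_time(n, num_samples, batch_per_node,
                            compute_per_batch, allreduce_timings):
    """Assumed model: iterations * (computation time + communication time)."""
    iterations = math.ceil(num_samples / (n * batch_per_node))
    return iterations * (compute_per_batch + comm_time(allreduce_timings, n))


def select_node_count(counts, estimate, price_per_node_second,
                      max_time=None, max_cost=None):
    """Return (node count, time, cost) with the lowest cost that satisfies
    the optional time and cost constraints, or None if none qualifies."""
    best = None
    for n in counts:
        t = estimate(n)
        cost = t * n * price_per_node_second  # assumed pricing formula
        if max_time is not None and t > max_time:
            continue
        if max_cost is not None and cost > max_cost:
            continue
        if best is None or cost < best[2]:
            best = (n, t, cost)
    return best


# Made-up example values, for demonstration only.
timings = {2: 0.10, 8: 0.40}  # seconds per Allreduce, assumed measurements
counts = candidate_node_counts([64, 128, 256], batch_per_node=32)
estimate = lambda n: estimate_execution_time(n, 1280, 32, 0.5, timings)
print(select_node_count(counts, estimate, price_per_node_second=1.0, max_time=8.0))
```

In this sketch the time constraint excludes the smallest node count, and the remaining candidates are compared on estimated cost. A different objective, such as shortest estimated time, could be substituted without changing the estimating step.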

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Hiren Patel, whose telephone number is (571) 270-3366. The examiner can normally be reached Monday to Friday, 9:30 AM to 6:00 PM.

The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).


December 17, 2021


/HIREN P PATEL/Primary Examiner, Art Unit 2196