Detailed Office Action
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This office action is in response to the communication filed on 7/9/20.
Original claims 1-20 are pending.
Claims 7 and 14 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.


Claims 1-5, 8-12, and 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Agrawal et al. U.S. Patent Application Publication No. 2019/0171935 [hereinafter Agrawal] in view of Arunachalam et al. U.S. Patent Application No. 2019/0155620 [hereinafter Arunachalam].
As per claims 1 and 8, Agrawal discloses a computer-implemented method, comprising:
performing distributed deep learning training on a batch of training data (see par. 0066, FIG. 5, which depicts an example neural network that is being trained using training data);
modifying a communication aspect of the learner to reduce a future network communication time for a (see par. 0068, 0075-0076, where the matrices are then modified).
Agrawal is silent regarding determining a training time representing an amount of time between: a beginning batch time for a learner; and an end batch time for the learner; and determining that the learner is a communication straggler.
Arunachalam discloses an automatic network communication resource configuration for a deep learning neural network, including determining a training time representing an amount of time between: a beginning batch time for a learner and an end batch time for the learner (see par. 0054, 0068, which address the training time for a plurality of learners and its effect on the system; for example, system 600 includes a plurality of learner processing systems 610, 620, 630, 640, which are responsible for performing deep network learning; see also par. 0032, where measuring the completion time variable in seconds/batch may ensure that when the neural network workload is run across the two very different architectures, for synchronous SGD, the combined throughput performance in images/sec is given by the maximum time to complete the local batch on any given architecture (e.g., because the slowest compute node will drive)); and determining that the learner is a communication straggler (see par. 0016, 0019). Therefore, it would have been obvious to one having ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Arunachalam into the system of Agrawal because the difference in batch completion times (or service time) for different architectures may be minimized.


As per claims 2 and 9, Agrawal discloses the method of claim 1, wherein modifying the communication aspect comprises compressing the future result before sending the future result to the centralized parameter server, wherein the future result is compressed using a compression rate based on a network communication time of the communication straggler (see par. 0068).

As per claims 3 and 10, Arunachalam discloses the method of claim 1, further comprising:
determining a plurality of network communication times representing an amount of time between: a plurality of batch end times (see par. 0054, 0068); and a plurality of times when the centralized parameter server receives a plurality of results (see par. 0032); and
identifying the communication straggler based on the plurality of network communication times and a threshold network communication time (see par. 0054, 0068, which address the training time for a plurality of learners and its effect on the system; for example, system 600 includes a plurality of learner processing systems 610, 620, 630, 640, which are responsible for performing deep network learning; see also par. 0032, where measuring the completion time variable in seconds/batch may ensure that when the neural network workload is run across the two very different architectures, for synchronous SGD, the combined throughput performance in images/sec is given by the maximum time to complete the local batch on any given architecture (e.g., because the slowest compute node will drive)).
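For illustration only, the timing quantities recited in claims 1 and 3 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Agrawal, Arunachalam, or the present application: the names BatchTiming and find_communication_stragglers, the field names, and the use of a single fixed threshold are all hypothetical. Training time is taken as end batch time minus beginning batch time, and network communication time as the time the centralized parameter server receives the result minus the batch end time.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BatchTiming:
    """Hypothetical per-learner timestamps for one batch, in seconds."""
    learner_id: str
    batch_start: float      # beginning batch time for the learner
    batch_end: float        # end batch time for the learner
    server_receive: float   # time the parameter server receives the result

    @property
    def training_time(self) -> float:
        # Amount of time between the beginning and end batch times.
        return self.batch_end - self.batch_start

    @property
    def network_communication_time(self) -> float:
        # Amount of time between the batch end time and server receipt.
        return self.server_receive - self.batch_end


def find_communication_stragglers(
    timings: List[BatchTiming], threshold: float
) -> List[str]:
    """Return ids of learners whose communication time exceeds the threshold."""
    return [
        t.learner_id
        for t in timings
        if t.network_communication_time > threshold
    ]


timings = [
    BatchTiming("learner_1", batch_start=0.0, batch_end=10.0, server_receive=11.0),
    BatchTiming("learner_2", batch_start=0.0, batch_end=10.0, server_receive=15.0),
]
stragglers = find_communication_stragglers(timings, threshold=2.0)
```

In this sketch both learners have the same training time, so only the difference in network communication time distinguishes the straggler.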

As per claims 4 and 11, Agrawal discloses the method of claim 1, further comprising modifying a processing aspect of the communication straggler, wherein modifying the processing aspect comprises: determining a size of the new batch of training data; and distributing a reduced amount of communication straggler training data to a plurality of remaining learners for performing the distributed deep learning training (see par. 0075-0076, where an adjusted mini-batch of the training data may be divided into mini-batches, or subsets).

As per claims 5 and 12, Arunachalam discloses the method of claim 4, further comprising: performing the distributed deep learning training on the new batch of training data by the communication straggler (slow computing node); and performing the distributed deep learning training on the reduced amount of communication straggler training data by the plurality of remaining learners (see par. 0054, 0068).

As per claim 15, Agrawal discloses a system comprising: a computer processing circuit; and a computer-readable storage medium storing instructions, which, when executed by the computer processing circuit, are configured to cause the computer processing circuit to perform a method comprising:
identifying a plurality of learners based on whether a network communication time of the communication learner exceeds a threshold training time for the plurality of learners (see par. 0066, FIG. 5, which depicts an example neural network that is being trained using training data); and
requesting the learner to modify a communication aspect to reduce a future network communication time for the learner to send a future result to a centralized parameter server (see par. 0056, 00268, 0075-0076, where each learner sends a scale-factor in addition to the compressed sparse vector).
Agrawal is silent regarding identifying the learner as a communication straggler.
Arunachalam discloses a system identifying a learner as a communication straggler (see par. 0016, 0019). Therefore, it would have been obvious to one having ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Arunachalam into the system of Agrawal, thus enabling finding an optimal set of (e.g., matching) parameters tuned to run deep learning (DL) training or inference processes on a computing architecture having compute nodes with different compute capacities.


As per claim 16, Arunachalam discloses the system of claim 15, wherein modifying the communication aspect comprises compressing the future result before sending the future result to the centralized parameter server, wherein the future result is compressed using a compression rate based on the network communication time of the communication straggler (see par. 0068).


As per claim 17, Arunachalam discloses the system of claim 15, the method further comprising determining a plurality of network communication times representing an amount of time between: a plurality of batch end times; and a plurality of times when the centralized parameter server receives a plurality of results; and
identifying the communication straggler based on the plurality of network communication times and a threshold network communication time (see par. 0054, 0068, which address the training time for a plurality of learners and its effect on the system; for example, system 600 includes a plurality of learner processing systems 610, 620, 630, 640, which are responsible for performing deep network learning; see also par. 0032, where measuring the completion time variable in seconds/batch may ensure that when the neural network workload is run across the two very different architectures, for synchronous SGD, the combined throughput performance in images/sec is given by the maximum time to complete the local batch on any given architecture (e.g., because the slowest compute node will drive)).

As per claim 18, Agrawal discloses the system of claim 15, the method further comprising performing distributed deep learning training on a batch of training data (see par. 0066, FIG. 5, which depicts an example neural network that is being trained using training data).


As per claim 19, Arunachalam discloses the system of claim 18, the method further comprising determining a plurality of training times representing an amount of time between: a beginning batch time; and an end batch time (see par. 0054, 0068, which address the training time for a plurality of learners and its effect on the system; for example, system 600 includes a plurality of learner processing systems 610, 620, 630, 640, which are responsible for performing deep network learning; see also par. 0032, where measuring the completion time variable in seconds/batch may ensure that when the neural network workload is run across the two very different architectures, for synchronous SGD, the combined throughput performance in images/sec is given by the maximum time to complete the local batch on any given architecture (e.g., because the slowest compute node will drive)); and determining that the learner is a communication straggler (see par. 0016, 0019).

As per claim 20, Agrawal discloses the system of claim 19, wherein modifying the communication aspect comprises compressing the future result before sending at a compression rate (see par. 0068).

Claims 6 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Agrawal et al. U.S. Patent Application Publication No. 2019/0171935 [hereinafter Agrawal] in view of Arunachalam et al. U.S. Patent Application No. 2019/0155620 [hereinafter Arunachalam] and further in view of Croxford et al. U.S. Patent Application No. 2020/0184320 [hereinafter Croxford].
As per claims 6 and 13, the system of Agrawal-Arunachalam discloses the substantial features of the claimed invention as discussed above with respect to claims 1 and 8. Agrawal-Arunachalam is silent regarding modifying a processing aspect of the communication straggler, wherein modifying the processing aspect comprises increasing a frequency rate of a computational processor of the communication straggler (slow processor).
Croxford discloses a system in which modifying the processing aspect comprises increasing a frequency rate of a computational processor of the communication straggler (slow processor) (see par. 0107, 0128, 0154). Therefore, it would have been obvious to one having ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Croxford into the system of Agrawal-Arunachalam. In this way, the amount of training time may be distributed among processors by increasing the training time for the learners that are not straggling to ensure that the neural network will produce a desired output.
Hence, the determined distribution of neural network processing is dynamically adjusted based on available processing capability of the processors.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ABDULLAHI ELMI SALAD, whose telephone number is (571) 272-4009. The examiner can normally be reached 9:30 AM to 6 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Thu Nguyen, can be reached at 571-272-6967. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ABDULLAHI E SALAD/Primary Examiner, Art Unit 2452