DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
Applicant’s amendment filed 1/18/2022 has been entered. Applicant amended claims 1-3, 8-9, 13 and 15-16, added claims 21-25, and did not cancel any claims. Therefore, claims 1-25 are pending.
The objections to claims 3 and 13, set forth in the previous Office Action, are withdrawn in view of the 1/18/2022 amendments to these claims. 
The rejections of claims 1-20 under 35 U.S.C. 112(b) and 103, set forth in the previous Office Action, are withdrawn in view of the 1/18/2022 amendments to the claims. 

Allowable Subject Matter
Claims 1-25 are allowed over the prior art of record.

Reasons for Allowance
The following is an examiner's statement of reasons for allowance:

The closest art of record, non-patent literature Wen et al. ("TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning." arXiv preprint 
Wen further discloses “Scaling to Large-scale Deep Learning” by “apply[ing] TernGrad to large-scale DNNs. … we are able to train large-scale DNNs by TernGrad successfully after making some or all of the following changes: (1) decreasing dropout ratio to keep more neurons; (2) using smaller weight decay; and (3) disabling ternarizing in the last classification layer. Dropout can regularize DNNs by adding randomness, while TernGrad also introduces randomness. Thus, dropping fewer neurons helps avoid over-randomness. Similarly, as the randomness of TernGrad introduces regularization, smaller weight decay may be adopted. We suggest not to apply ternarizing to the last layer, considering that the one-hot encoding of labels generates a skew distribution of gradients and the symmetric ternary encoding {−1, 0, 1} is not optimal for such a skew distribution” [i.e., analysis of the skew pattern/distribution generated by a distribution of gradients/distributed gradient synchronization] (See, e.g., page 7, section 4.2). However, Wen was published May 22, 2017, which is not before the effective filing date of the instant application, April 28, 2017. Therefore, Wen does not constitute prior art under 35 U.S.C. 102(a)(1).

The prior art of record Roblek et al. (U.S. Patent Application Pub. No. 2017/0330586 A1, hereinafter “Roblek”) discloses “trained parameters define an optimal 
Roblek also discloses “important information may be lost during the mapping process and a hardcoded fixed-scale mapping may not provide an optimal mapping of frequency domain features for a given task. Therefore, the accuracy and performance of an audio classification system receiving the mapped frequency domain features may be reduced” [i.e., when an optimal mapping is not provided, the performance of an audio classification system is reduced; as such, the optimal mapping inherently provides a non-reduced or non-degraded/without degrading performance of the classification system] (See, e.g., paragraph 28).
The prior art of record non-patent literature Dettmers, Tim ("8-bit approximations for parallelism in deep learning." arXiv preprint arXiv:1511.04561 (2015). pp. 1-14, hereinafter “Dettmers”) discloses “In data parallelism, the model is kept constant for all GPUs while each GPU is fed with a different mini-batch. After each pass the gradients are exchanged, i.e. synchronized with each GPU” [i.e., a GPU/graphics processor used 

The prior art of record Tokui et al. (U.S. Patent Application Pub. No. 2018/0349772 A1, hereinafter “Tokui”) discloses a “method for using calculation libraries such as Caffe … and Theano (http://deeplearning.net/software/theano/). … According to these libraries, by using a dedicated Mini programming language to describe the loss function as a combination of prepared primitives [i.e., libraries include machine learning primitives] it is possible to automatically obtain a gradient function of the loss function, too. This is because a gradient of each primitive is defined, and therefore a gradient of the entire combination can be also obtained by automatic differentiation. … by using this Mini programming language, the neural network can perform learning by the gradient method by using a gradient function” [i.e., the primitives are used to analyze a pattern in a distributed gradient method/function/synchronization implemented/performed by the neural network application/function] (See, e.g., paragraphs 67-68).

The prior art of record non-patent literature Lambert et al. ("Adaptive Frequency Neural Networks for Dynamic Pulse and Metre Perception." ISMIR. Schloss Dagstuhl LZI, 2016, hereinafter “Lambert”) discloses “We have introduced this rule to ensure the AFNN retains a spread of frequencies (and thus metrical structure) across the gradient. The force is relative to natural frequency, and can be scaled through the ϵh parameter. By balancing the adaptive (ϵf) and elastic (ϵh) parameters, the oscillator frequency is able to entrain to a greater range of frequencies, whilst also returning to its natural frequency (ω0) when the stimulus is removed. Figure 4 shows the frequencies adapting over time in the AFNN under sinusoidal input” [i.e., the gradients as part of the Adaptive Frequency Neural Network/AFNN correspond to the skew characteristics associated with/observed in a gradient synchronization implemented by the AFNN/neural network application] (See, e.g., FIG. 4 and page 63, right col., paragraph 3).

However, the prior art of record does not anticipate, nor does it render obvious in any reasonable combination to one of ordinary skill in the art at the time of Applicant’s invention, the combination of recited limitations of independent claim 21.

For example, the prior art of record does not anticipate or render obvious the limitations:
“implement, using the neural network application, the distributed gradient synchronization using a tree structure such that local weight vectors start at one or more 
determine, using the machine learning primitives of the library as implemented by the neural network application, a point to apply frequency scaling in the graphics processor that does not degrade performance of the neural network application, the point determined based on analysis of the skew pattern generated by the distributed gradient synchronization implemented via the tree structure; and
determine, using the library as implemented by the neural network application, a core frequency of the frequency scaling applied at the point, wherein the library is to account for skew characteristics associated with the distributed gradient synchronization to decide the core frequency”
as recited in independent claim 21 in combination with its other limitations.
 
Thus, independent claim 21 is patently distinct over the prior art of record for at least the reasons above. 
Independent claims 1, 8 and 15 recite similar distinguishing features.
Thus, independent claims 1, 8, 15 and 21 are patently distinct over the prior art of record for at least the reasons above. 

The remaining claims are dependent claims and are therefore also patently distinct over the prior art of record for at least the reasons above. In particular, claims 2-7, 9-14, 16-20 and 22-25 each depend directly or indirectly from one of independent claims 1, 8, 15 and 21.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee. Such submissions should be clearly labeled "Comments on Statement of Reasons for Allowance."

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RANDY K BALDWIN whose telephone number is (571)270-5222. The examiner can normally be reached on Mon - Fri 9:00-6:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached on 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business 





/R.K.B./ Examiner, Art Unit 2125

/KAMRAN AFSHAR/ Supervisory Patent Examiner, Art Unit 2125