DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
Applicant’s amendment, filed 4/13/2021, has been entered. Applicant amended claims 1, 6, 8 and 17¹, and did not add or cancel any claims in the amendment. Claims 3, 10 and 19 were previously cancelled and claims 21-23 were previously added in the amendment filed 11/19/2020. Therefore, claims 1-2, 4-9, 11-18 and 20-23 are pending.
The objections to claims 1-2, 4-9, 11-18 and 20-23, set forth in the previous Office Action, have been withdrawn due to Applicant’s claim amendments.
The rejections of claims 1-2, 4-9, 11-18 and 20-23 under 35 U.S.C. 103, set forth in the previous Office Action, have been withdrawn due to Applicant’s claim amendments and remarks.

Allowable Subject Matter
Claims 1-2, 4-9, 11-18 and 20-23 are allowed over the prior art of record.

REASONS FOR ALLOWANCE
The following is an examiner's statement of reasons for allowance:

The prior art of record Chilimbi et al. (U.S. Patent Application Pub. No. 2015/0324690 A1, hereinafter “Chilimbi”) discloses “methods to train large neural network models by providing training input to model training machines” [i.e., loading training data], “the functionally described herein can be performed, at least in part, by one or more hardware logic components such as accelerators … an accelerator can represent a hybrid device … that includes a CPU embedded in an FPGA fabric”, “at least one CPU, GPU, and/or accelerator is incorporated in computing device 1100” and “data store 1112 can store training data 136. Alternately, some or all of the above-referenced data can be stored on separate memories 1114 on board one or more processing unit(s) 1102 such as … a GPU-type processor, an FPGA-type accelerator, a DSP-type accelerator, and/or another accelerator.” [i.e., method of accelerating data loading from data store 1112/storage] (see, e.g., paragraphs 9, 33, 114 and 119).
Chilimbi also discloses that “neurons (e.g., v1, v2, v3, etc.) associated with the first layer 202 receive an input 204. The first layer 202 represents the input layer.”, “neural networks may be trained by back-propagation using gradient descent”, “Activation a describes the output of each neuron i in a layer I.” [i.e., at least one first/initial layer], “training continues for multiple epochs, reprocessing the training data set”, “Model Training” and “Multi-Threaded Training”, “In stochastic gradient descent the training inputs are processed … [t]he inputs may be processed one at a time … for each input to update the model weights” [i.e., generate processed training data], “the models may be partitioned such that neurons in each of the layers are within a predetermined vertical distance to neurons in neighboring layers” and the “training context may store 
Chilimbi further discloses that “models may be trained on graphics processing units (GPUs)” [i.e., GPU configured to train layers of the model], “known embodiments, describe large-scale distributed systems comprised of tens of thousands of CPU cores for training large deep neural network” [i.e., neural network model with a plurality of layers], “the deep learning training module is further configured for asynchronously sending updates to shared parameters associated with the model” [i.e., sending training data], “This process may be repeated for each input until the entire training dataset has been processed, which constitutes a training epoch … training continues for multiple epochs, reprocessing the training data set each time,” [i.e., train remaining layers using processed training data], “controllers (NICs) or other types of transceiver devices to send and receive communications over a network”, “Processing unit(s) 612 and can represent, for example … a GPU-type processing unit”, “server(s) 706 may be in constant communication with the model training machines (e.g., Machine 1, Machine 2, etc.) receiving updates to model parameters and sending the current weight values” [i.e., sending processed data/updated parameters, current weights] and “processing unit(s) 612 may execute one or more modules and/or processes to cause the server(s) and other machines 610 to perform a variety of functions, as set forth above [i.e., including the training functions] and … each of the processing unit(s) 612 may possess its own local memory, which also may store … program data” [i.e., including storing the training data] (see, e.g., paragraphs 7-8, 10, 36 and 47-48).

The prior art of record Park et al. (U.S. Patent Application Pub. No. 2018/0315153 A1, hereinafter “Park”) discloses that “component 204 is embodied as one or more integrated circuit (IC) chip and performs various data processing processes. SOC component 204 may include, among other subcomponents, image signal processor (ISP) 206, a central processor unit (CPU) 208, a network interface 210, sensor interface 212, display controller 214, graphics processor (GPU) 220, memory controller 222, video encoder 224, storage controller 226” [i.e., a storage controller 226 of a machine/component 204 performing various operations/data processing processes and a graphics processing unit/GPU 220 that is different from the storage controller 226] (see, e.g., FIG. 2 showing storage controller 226 and paragraph 38).
Park additionally discloses that “The convolution core circuit generates a stream including values associated with multiple interleaved channels by applying multiple convolution kernels to input data” [i.e., input training data], “the spatial pooling circuit performs per-channel spatial pooling and/or normalization operations on the interleaved stream [i.e., generate processed data]. The per-channel spatial pooling and normalization operations facilitate image processing in deep learning architectures … data size is reduced without substantially sacrificing performance” and “Smaller data sizes may support faster processing for machine inferencing tasks” [i.e., size of the processed data is less than the input training data] (see, e.g., paragraphs 27 and 144).
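The size reduction attributed to Park's per-channel spatial pooling can be illustrated with a minimal sketch. It assumes non-overlapping 2 × 2 max pooling on small hypothetical channels; Park is not quoted above as specifying this window, pooling type or data, and the function names are illustrative only.

```python
def max_pool_2d(channel, window=2):
    """Non-overlapping max pooling over one channel (a list of rows)."""
    rows = len(channel) // window
    cols = len(channel[0]) // window
    return [
        [
            max(channel[r * window + i][c * window + j]
                for i in range(window) for j in range(window))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

def pool_per_channel(channels, window=2):
    """Apply the same pooling independently to every channel."""
    return [max_pool_2d(ch, window) for ch in channels]

def element_count(channels):
    """Total number of values across all channels."""
    return sum(len(row) for ch in channels for row in ch)

# Hypothetical input: two 4 x 4 channels.
channels = [
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]],
    [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]],
]
pooled = pool_per_channel(channels)
print(pooled)  # [[[6, 8], [14, 16]], [[1, 1], [1, 1]]]
print(element_count(channels), "->", element_count(pooled))  # 32 -> 8
```

Under these assumptions each channel shrinks from 16 values to 4, which is the sense in which the processed data is smaller than the input data.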

The prior art of record non-patent literature Figurnov et al. ("Perforated CNNs: Acceleration through elimination of redundant convolutions." Advances in Neural 
Figurnov further discloses that tensor U is processed to produce tensor V with 3 dimensions “to leave only a subset of rows” [i.e., a reduced size of at least one dimension] and “A convolutional layer takes as input a tensor U of size X × Y × S and outputs a tensor V of size X’ × Y’ × T, X’ = X − d + 1, Y’ = Y − d + 1”, “input tensor U of size d × d × S … The resulting matrix of size X’Y’ × T is the output tensor V” [i.e., tensor V has 3 dimensions with a reduced size of at least one of X and Y dimensions, reduced to X’ = X − d + 1 and Y’ = Y − d + 1] (see, e.g., FIG. 1 depicting tensor U processed to produce tensor V with 3 dimensions and pages 2-3).
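The dimension arithmetic relied on above, X’ = X − d + 1 and Y’ = Y − d + 1, can be checked with a short sketch. It assumes a stride of 1 and no padding, which is what the formula implies; the function names and the example sizes are hypothetical and do not come from Figurnov.

```python
def conv_output_shape(x, y, d, t):
    """Shape (X', Y', T) of tensor V for a d x d convolution with T kernels.

    Assumes stride 1 and no padding, so X' = X - d + 1 and Y' = Y - d + 1.
    The input channel count S does not affect the output shape.
    """
    if d < 1 or d > x or d > y:
        raise ValueError("kernel size d must satisfy 1 <= d <= min(X, Y)")
    return (x - d + 1, y - d + 1, t)

def flattened_output_shape(x, y, d, t):
    """Shape of the X'Y' x T matrix that is reshaped into tensor V."""
    x_out, y_out, _ = conv_output_shape(x, y, d, t)
    return (x_out * y_out, t)

# Hypothetical sizes: a 32 x 32 input, 5 x 5 kernels, 16 output channels.
shape_v = conv_output_shape(32, 32, 5, 16)
matrix_v = flattened_output_shape(32, 32, 5, 16)
print(shape_v)   # (28, 28, 16)
print(matrix_v)  # (784, 16)
```

For any d greater than 1, both spatial dimensions of V are smaller than those of U, which is the reduced size the bracketed note points to.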

The prior art of record Honkala et al. (U.S. Patent Application Pub. No. 2016/0314392 A1, hereinafter “Honkala”) discloses “the bidirectional recurrent neural network [RNN] may, in some example embodiments, include a first hidden layer having recurrent connections forward in time and a second hidden layer having recurrent connections back in time” [i.e., at least one first/initial layer], “weights are duplicated for 

However, the prior art of record does not anticipate, nor does it render obvious in any reasonable combination to one of ordinary skill in the art at the time of Applicants' invention, the combination of recited limitations of independent claim 1.

For example, the prior art of record does not anticipate or render obvious the limitations:
wherein the storage controller performs a backwards propagation during which, the storage controller issues a weight update request for receiving updated weight parameters that include one or more gradients computed by the at least one graphics processing unit during the training of the one or more remaining layers, wherein the storage controller determines whether an entire batch of gradients has been processed by the at least one graphics processing unit;

computing, by the storage controller as part of an inference operation, at least one initial layer of the trained neural network using input data; and
sending, by the storage controller, results of the computing of the at least one initial layer to at least one graphics processing unit of the machine that is different from the storage controller, the at least one graphics processing unit configured to compute one or more remaining layers of the trained neural network,
as recited in independent claim 1 in combination with the other limitations of the claim.
Independent claims 8 and 17 recite similar distinguishing features.

Thus, independent claims 1, 8 and 17 are patently distinct over the prior art of record for at least the reasons above. 

The remaining claims are dependent claims and are therefore also patently distinct over the prior art of record for at least the reasons above. In particular, claims 2 and 4-7 each depend directly from independent claim 1. Similarly, claims 9 and 11-16 each depend directly from independent claim 8. Also, claims 18 and 20-23 each depend directly from independent claim 17. As such, claims 2 and 4-7, 9 and 11-16, and 18 and 20-23 each include all of the limitations of base claims 1, 8 and 17, respectively.
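The inference-side limitations quoted above for claim 1 describe a division of work: one component computes at least one initial layer and sends the results on, and a different component computes the remaining layers. The sketch below only illustrates that data flow. It is not the claimed implementation: the storage controller and the graphics processing unit are simulated by plain functions, and the layer type, weights and input values are all hypothetical.

```python
def dense_relu(inputs, weights, biases):
    """One fully connected layer followed by ReLU."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def storage_controller_stage(input_data, initial_layer):
    """Stand-in for the storage controller: computes only the initial layer."""
    weights, biases = initial_layer
    return dense_relu(input_data, weights, biases)

def gpu_stage(intermediate, remaining_layers):
    """Stand-in for the GPU: computes the remaining layers in order."""
    activations = intermediate
    for weights, biases in remaining_layers:
        activations = dense_relu(activations, weights, biases)
    return activations

# Hypothetical trained parameters: initial layer maps 4 values to 2,
# and a single remaining layer maps 2 values to 1.
initial_layer = ([[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]], [0.0, 0.0])
remaining_layers = [([[1.0, 1.0]], [0.5])]

input_data = [3.0, 1.0, 1.0, 2.0]
intermediate = storage_controller_stage(input_data, initial_layer)  # "sent" on
result = gpu_stage(intermediate, remaining_layers)
print(intermediate)  # [2.0, 0.0]
print(result)        # [2.5]
```

In this toy configuration the value handed from the first stage to the second has two elements where the raw input had four; that reduction is a property of the hypothetical weights chosen here, not something the quoted limitations require.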


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RANDY K BALDWIN whose telephone number is (571)270-5222. The examiner can normally be reached on Mon - Fri 9:00-6:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached on 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO 





/R.K.B./Examiner, Art Unit 2125

/KAMRAN AFSHAR/Supervisory Patent Examiner, Art Unit 2125

¹ Although the status identifier of claim 4 in the amendment filed 4/13/2021 indicates that the claim is “Currently Amended”, claim 4 was not amended in the amendment filed 4/13/2021. Claim 4 was amended in the amendment filed 11/19/2020, and as such, the status identifier of claim 4 should be “Previously Presented”.