DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 14-33 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Dirac (US 2015/0379072).
For claim 14, Dirac teaches a system (Figure 1) comprising: 
a system memory (within a provider network, [0042]) to store a set of trainable machine learning parameters (features for feature processing, [0042]-[0043]) and a library (e.g., Java or Python, [0046] and [0098]) to facilitate data transmission during distributed training of a neural network ([0046] and Figure 26); 
a fabric interface (9040, Figure 33) to enable transmission and receipt of data associated with the set of trainable machine learning parameters ([0170]-[0175] and Figure 1); 
a first set of general-purpose processor (9010, Figure 33) cores (used to implement 1110 of Figure 11, details of which are shown in Figure 33, [0170]) to execute instructions provided by the library (e.g., “executeRecipe”, [0046] and [0099]), the instructions to control a data transmission library (libraries used for scheduling the execution of transformation operations, [0099]); and 
a general-purpose graphics ([0171]) processor (1175A and 1175B of Figure 11, implemented using Figure 33, [0170]-[0171]) to perform compute operations associated with a machine learning framework workflow (recipe execution), the compute operations to generate gradient data for the trainable machine learning parameters (although gradient data is not explicitly stated, it is understood to be inherently present in “iterations of training” in [0066] and [0095], see also Figure 26), wherein the first set of general-purpose processor cores are to control the data transmission library to send and receive training data (based on requests input to 1110 of Figure 11) via the fabric interface ([0050]) during the machine learning framework workflow (parallel operations, [0118], [0137], [0167]).
For claim 15, Dirac further teaches:
the general-purpose graphics processor includes the fabric interface and a local memory (9020, Figure 33) that is shared between the fabric interface and the general-purpose graphics processor (as understood by examination of the Figures and [0170]-[0171]).
For claim 16, Dirac further teaches:
the fabric interface includes a hardware module (9040) that is configurable via instructions within the library ([0174]), the hardware module to accelerate transmission of data stored in the local memory (as understood by examination of the Figures).
For claim 17, Dirac further teaches:
to perform compute operations associated with the machine learning framework workflow, the general-purpose graphics processor is to perform compute operations associated with a forward compute to generate a set of activation data ([0099]-[0100]).
For claim 18, Dirac teaches an apparatus comprising: 
a general-purpose graphics ([0171]) processor (1175A and 1175B of Figure 11, implemented using Figure 33, [0170]-[0171]) to: 
perform compute operations associated with a machine learning framework workflow (parallel operations, [0118], [0137], [0167]); and 
generate, as part of the machine learning framework workflow, gradient data (although gradient data is not explicitly stated, it is understood to be inherently present in “iterations of training” in [0066] and [0095]) for trainable machine learning parameters (features for feature processing, [0042]-[0043]) during distributed training of a neural network ([0046] and Figure 26); 
wherein a first set of general-purpose processor (9010, Figure 33) cores (used to implement 1110 of Figure 11, details of which are shown in Figure 33, [0170]) are to control a data transmission library (libraries used for scheduling the execution of transformation operations, [0099]) to send and receive training data for the machine learning framework workflow (based on requests input to 1110 of Figure 11) via a fabric interface ([0050]) during the machine learning framework workflow ([0118], [0137], [0167]).
For claim 19, Dirac further teaches:
the general-purpose graphics processor comprises the fabric interface and a local memory (9020, Figure 33) that is shared between the fabric interface and the general-purpose graphics processor (as understood by examination of the Figures and [0170]-[0171]).
For claim 20, Dirac further teaches:
the fabric interface comprises hardware circuitry (9040) that is configurable via instructions within a library (e.g., Java or Python, [0046], [0098] and [0174]), the hardware circuitry to accelerate transmission of data stored in the local memory (as understood by examination of the Figures).
For claim 21, Dirac further teaches:
the general-purpose graphics processor is further to, while performing the machine learning framework workflow, automatically exchange one or more of activation data corresponding to activations of the neural network, the gradient data with respect to the activations, and gradients corresponding to the gradient data with respect to the machine learning parameters (as understood by examination of the Figures).
For claim 22, Dirac further teaches:
performing the machine learning framework workflow comprises performing forward propagation computation to generate a set of activation data and performing a backward propagation computation to determine the gradient with respect to a set of trainable machine learning parameters (required by training and evaluation of models, [0085] and Figure 26).
For claim 23, Dirac further teaches:
the general-purpose graphics processor is further to, while performing the machine learning framework workflow, automatically exchange one or more of activation data corresponding to activations of the neural network, the gradient data with respect to the activations, and gradients corresponding to the gradient data with respect to the machine learning parameters (as understood by examination of the Figures).
For claim 24, Dirac further teaches:
performing the machine learning framework workflow comprises performing forward propagation computation to generate a set of activation data and performing a backward propagation computation to determine the gradient with respect to a set of trainable machine learning parameters (required by training and evaluation of models, [0085] and Figure 26).
For claim 25, Dirac teaches a method comprising: 
performing, by a general-purpose graphics processor (1175A and 1175B of Figure 11, implemented using Figure 33, [0170]-[0171]), compute operations associated with a machine learning framework workflow (recipe execution); and 
generating, by the general-purpose graphics processor as part of the machine learning framework workflow, gradient data for trainable machine learning parameters (although gradient data is not explicitly stated, it is understood to be inherently present in “iterations of training” in [0066] and [0095], see also Figure 26) during distributed training of a neural network ([0046] and Figure 26); 
wherein a first set of general-purpose processor (9010, Figure 33) cores (used to implement 1110 of Figure 11, details of which are shown in Figure 33, [0170]) are to control a data transmission library (libraries used for scheduling the execution of transformation operations, [0099]) to send and receive training data (based on requests input to 1110 of Figure 11) for the machine learning framework workflow via a fabric interface ([0050]) during the machine learning framework workflow (parallel operations, [0118], [0137], [0167]).
For claim 26, Dirac further teaches:
the general-purpose graphics processor comprises the fabric interface and a local memory (9020, Figure 33) that is shared between the fabric interface and the general-purpose graphics processor (as understood by examination of the Figures and [0170]-[0171]).
For claim 27, Dirac further teaches:
the fabric interface comprises hardware circuitry (9040) that is configurable via instructions within a library (e.g., Java or Python, [0046], [0098] and [0174]), the hardware circuitry to accelerate transmission of data stored in the local memory (as understood by examination of the Figures).
For claim 28, Dirac further teaches:
the general-purpose graphics processor is further to, while performing the machine learning framework workflow, automatically exchange one or more of activation data corresponding to activations of the neural network, the gradient data with respect to the activations, and gradients corresponding to the gradient data with respect to the machine learning parameters (as understood by examination of the Figures).
For claim 29, Dirac further teaches:
performing the machine learning framework workflow comprises performing forward propagation computation to generate a set of activation data and performing a backward propagation computation to determine the gradient with respect to a set of trainable machine learning parameters (required by training and evaluation of models, [0085] and Figure 26).
For claim 30, Dirac teaches a non-transitory machine-readable storage medium (9020) having stored thereon executable computer program instructions that, when executed by one or more processors (9010, Figure 33), cause the one or more processors to perform operations ([0175]) comprising: 
performing, by a general-purpose graphics ([0171]) processor of the one or more processors (within Figure 33, [0170]-[0171]), compute operations associated with a machine learning framework workflow (recipe execution); and 
generating, by the general-purpose graphics processor as part of the machine learning framework workflow, gradient data for trainable machine learning parameters during distributed training of a neural network (although gradient data is not explicitly stated, it is understood to be inherently present in “iterations of training” in [0066] and [0095], see also Figure 26); 
wherein a first set of general-purpose processor (9010, Figure 33) cores (used to implement 1110 of Figure 11, details of which are shown in Figure 33, [0170]) are to control a data transmission library (libraries used for scheduling the execution of transformation operations, [0099]) to send and receive training data (based on requests input to 1110 of Figure 11) for the machine learning framework workflow via a fabric interface ([0050]) during the machine learning framework workflow (parallel operations, [0118], [0137], [0167]).
For claim 31, Dirac further teaches:
the general-purpose graphics processor comprises the fabric interface and a local memory (9020, Figure 33) that is shared between the fabric interface and the general-purpose graphics processor (as understood by examination of the Figures and [0170]-[0171]), and wherein the fabric interface comprises hardware circuitry (9040) that is configurable via instructions within a library, the hardware circuitry to accelerate transmission of data stored in the local memory (as understood by examination of the Figures).
For claim 32, Dirac further teaches:
the general-purpose graphics processor is further to, while performing the machine learning framework workflow, automatically exchange one or more of activation data corresponding to activations of the neural network, the gradient data with respect to the activations, and gradients corresponding to the gradient data with respect to the machine learning parameters (as understood by examination of the Figures).
For claim 33, Dirac further teaches:
performing the machine learning framework workflow comprises performing forward propagation computation to generate a set of activation data and performing a backward propagation computation to determine the gradient with respect to a set of trainable machine learning parameters (required by training and evaluation of models, [0085] and Figure 26).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DANIEL CALRISSIAN PUENTES whose telephone number is (571)270-5070. The examiner can normally be reached M-F 9-6:30 (flex).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Menatoallah Yousseff, can be reached at 571-270-3684. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DANIEL C PUENTES/
Primary Examiner, Art Unit 2849