DETAILED ACTION
1.	This action is in response to the claims filed 10/23/2021. Claims 1-22 are pending and have been examined. 
Notice of Pre-AIA or AIA Status
2.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-22 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The rejections of multiple claims are grouped together because the related claims do not present additional limitations that affect the 101 rejection. For this reason, claims 1 and 8 are rejected under the same rationale. Further, claims 2, 5, 9, and 12 are rejected as a group as well. The same treatment is given to the following claim groups: [3, 6, 10 and 13], [4, 7, 11 and 14], [15, 19], [16, 20], [17, 21], and [18, 22].
	
 
Regarding Claim 1/8
Step 1 Analysis: Claim 1/8 is directed to a general purpose graphics processor, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a processor performing each of the following limitations:
define a profile for a reference inference engine…
the profile identifying a group of operations…
define one or more levels…
the one or more levels identifying…
Each of these limitations, as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. For example, but for the generic computer components language (“a general-purpose graphics processing compute block comprising a plurality of graphics processing cores”, “a shared memory communicatively coupled to the plurality of graphics processing cores”), the above limitations in the context of this claim encompass defining and identifying (mental processes). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f). “A neural network unit” only generally links the use of the judicial exception to a particular technological environment or field of use, as discussed in MPEP 2106.05(h). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a generic computer to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer. Mere instructions to apply an exception using a generic computer cannot provide an inventive concept. Further, using features of a neural network and “a neural network unit” only generally links the judicial exception to the field of neural networks. The claim is not patent eligible.

Regarding Claim 2/5/9/12
Step 1 Analysis: The rejection of Claim 1/8 is incorporated, therefore Claim 2/5/9/12 is directed to a general purpose graphics processor, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: Further, the claim recites a computer apparatus performing each of the following limitations:
apply a training set to a neural network to create a first trained model
As the rejection of claim 1/8 is incorporated, the claim recites an abstract idea.
Step 2A Prong Two Analysis:
The judicial exception is not integrated into a practical application. In particular, “apply” and “create” only generally link the use of the judicial exception to a particular technological environment or field of use, as discussed in MPEP 2106.05(h). The limitation does not specify that a first trained model is created through the action of training a neural network. Under BRI, the neural network is already trained, and applying a training set to the neural network does not necessarily mean that the neural network is being trained. If the claim were edited to specify that the neural network is the feature doing the training, the claim limitation would be indicative of an inventive concept. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “apply” and “create” only generally link the judicial exception to neural networks. The claim does not provide an inventive concept for the reasons provided here and as set forth in the rejection of claim 1/8.

Regarding Claim 3/6/10/13
Step 1 Analysis: The rejection of Claim 2/5/9/12 is incorporated, therefore Claim 3/6/10/13 is directed to a general purpose graphics processor, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: Further, the claim recites a computer apparatus performing each of the following limitations:
conform operations to the profile
operations executed by the first trained model
As the rejection of claim 1/8 is incorporated, the claim recites an abstract idea.
Step 2A Prong Two Analysis:
The judicial exception is not integrated into a practical application. In particular, “conform operations” and “execute operations” only generally link the use of the judicial exception to a particular technological environment or field of use, as discussed in MPEP 2106.05(h). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “conform operations” and “execute operations” only generally link the judicial exception to neural networks. The claim does not provide an inventive concept for the reasons provided here and as set forth in the rejection of claim 1/8.

Regarding Claim 4/7/11/14
Step 1 Analysis: The rejection of Claim 3/6/10/13 is incorporated, therefore Claim 4/7/11/14 is directed to a general purpose graphics processor, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: Further, the claim recites a computer apparatus performing each of the following limitations:
conform a first set of levels to the first set of levels
As the rejection of claim 1/8 is incorporated, the claim recites an abstract idea.
Step 2A Prong Two Analysis:
The judicial exception is not integrated into a practical application. In particular, “conform a first set of levels” only generally links the use of the judicial exception to a particular technological environment or field of use, as discussed in MPEP 2106.05(h). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element “conform a first set of levels” only generally links the judicial exception to neural network inference. The claim does not provide an inventive concept for the reasons provided here and as set forth in the rejection of claim 1/8.

Regarding Claim 15/19  
Step 1 Analysis: Claim 15/19 is directed to a general purpose graphics processor, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a processor performing each of the following limitations:
receive a file
receive a test data set
conform an inference engine to the reference inference engine
apply the inference engine to the data set
processes a workload or compute operations
Each of these limitations, as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. For example, but for the generic computer components language (“a general-purpose graphics processing compute block comprising a plurality of graphics processing cores”, “a shared memory communicatively coupled to the plurality of graphics processing cores”), the above limitations in the context of this claim encompass receiving (mental processes) and conforming, applying an inference engine to a data set, processing and computing (mathematical calculations). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea, or merely uses a computer as a tool to perform an abstract idea. The additional elements of “receive”, “conform”, “apply an inference engine”, and “process or compute” amount to mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f).
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a generic computer to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer. Mere instructions to apply an exception using a generic computer cannot provide an inventive concept. Further, using features related to neural network processing and “a neural network unit” generally links the judicial exceptions to the field of Neural Networks. The claim is not patent eligible.

Regarding Claim 16/20
Step 1 Analysis: The rejection of Claim 15/19 is incorporated, therefore Claim 16/20 is directed to a general purpose graphics processor, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: Further, the claim recites a computer apparatus performing each of the following limitations:
apply the reference inference engine to the data set
confirm that an output of the inference engine matches an output
As the rejection of claim 15/19 is incorporated, the claim recites an abstract idea.
Step 2A Prong Two Analysis:
The judicial exception is not integrated into a practical application. In particular, “apply inference engine to the data set” and “confirm that an output matches” only generally link the use of the judicial exception to a particular technological environment or field of use, as discussed in MPEP 2106.05(h). The limitation does not specify that a first trained model is created through the action of training a neural network. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “apply inference engine to the data set” and “confirm that an output matches” only generally link the judicial exception to neural networks. The claim does not provide an inventive concept for the reasons provided here and as set forth in the rejection of claim 15/19.
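For illustration only, the two limitations discussed for Claim 16/20 (applying both engines to the same data set and confirming that the outputs match) can be sketched as below. This is a minimal sketch under stated assumptions: the function name `outputs_match`, the modeling of each engine as a plain callable that returns one number per sample, and the tolerance values are all hypothetical and are not drawn from the claims or the cited references.

```python
import math

def outputs_match(engine, reference_engine, data_set, rel_tol=1e-6, abs_tol=1e-9):
    """Return True when both engines agree on every sample of data_set.

    Assumption: each engine is a callable mapping one sample to one float.
    """
    for sample in data_set:
        candidate = engine(sample)
        expected = reference_engine(sample)
        # Compare within a tolerance, since reduced-precision engines
        # rarely reproduce floating-point results bit for bit.
        if not math.isclose(candidate, expected, rel_tol=rel_tol, abs_tol=abs_tol):
            return False
    return True

# Hypothetical stand-ins for an inference engine and a reference engine.
same = outputs_match(lambda x: 2 * x, lambda x: x + x, [0.0, 1.5, -3.0])
different = outputs_match(lambda x: 2 * x, lambda x: 2 * x + 1, [1.0])
```

The comparison uses a tolerance rather than strict equality; whether the claimed "matches" requires exact equality is not stated in this action.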

Regarding Claim 17/21
Step 1 Analysis: The rejection of Claim 15/19 is incorporated, therefore Claim 17/21 is directed to a general purpose graphics processor, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: Further, the claim recites a computer apparatus performing each of the following limitations:
conform one or more operations executed by the inference engine to a profile
As the rejection of claim 15/19 is incorporated, the claim recites an abstract idea.
Step 2A Prong Two Analysis:
The judicial exception is not integrated into a practical application. In particular, “conform operations” and “execute by the inference engine” only generally link the use of the judicial exception to a particular technological environment or field of use, as discussed in MPEP 2106.05(h). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “conform operations” and “execute by the inference engine” only generally link the judicial exception to neural networks. The claim does not provide an inventive concept for the reasons provided here and as set forth in the rejection of claim 15/19.

Regarding Claim 18/22
Step 1 Analysis: The rejection of Claim 15/19 is incorporated, therefore Claim 18/22 is directed to a general purpose graphics processor, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: Further, the claim recites a computer apparatus performing each of the following limitations:
conform one or more levels to one or more levels
As the rejection of claim 15/19 is incorporated, the claim recites an abstract idea.
Step 2A Prong Two Analysis:
The judicial exception is not integrated into a practical application. In particular, “conform a set of levels” only generally links the use of the judicial exception to a particular technological environment or field of use, as discussed in MPEP 2106.05(h). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element “conform a set of levels” only generally links the judicial exception to neural network inference. The claim does not provide an inventive concept for the reasons provided here and as set forth in the rejection of claim 15/19.


Claim Rejections - 35 USC § 103
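In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.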
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-22 are rejected under 35 U.S.C. 103 as being unpatentable over Fachini et al., US 20190332441 A1 (hereinafter Fachini), in view of Yu et al., “Bridging the Gap between Neural Networks and Neuromorphic Hardware with A Neural Network Compiler” (hereinafter Yu), and further in view of Andryc et al., “FlexGrip: A Soft GPGPU for FPGAs” (hereinafter Andryc).

Regarding claim 1
Fachini teaches, A… processor, comprising: and a neural network unit comprising processing circuitry (Abstract “A Neural Network (NN) scheduler and techniques to implement features of different possible NN schedulers are disclosed” ¶0062 “in FIG. 10, computing device 1000 includes a processing element such as processor” Examiner notes the neural network scheduler used to process neural networks is a neural network unit.) and store [the information] (¶0063 “memory 1010 may be operatively and communicatively coupled to processor…¶0065 the encoded instructions may then be loaded as computer executable instructions or process steps to processor 1005 from storage device 1020, from memory” The process steps for defining the characteristics describing the reference inference engine are stored in memory by the system.)
Fachini does not explicitly teach, A general purpose graphics processor… a general-purpose graphics processing compute block comprising a plurality of graphics processing cores a shared memory communicatively coupled to the plurality of graphics processing cores; reference inference engine using the plurality of graphics processing cores; operations implemented on the plurality of graphics processing cores … define a profile for a reference inference engine, the profile identifying a group of operations which may be implemented by the reference inference engine;  define one or more levels for the reference inference engine, the one or more levels identifying a bit depth of operation for the reference inference engine; 
Yu, however, when addressing issues related to defining a computational neural network graph in terms of simple operations and precision for physical hardware, teaches, define a profile for a reference inference engine, the profile identifying a group of operations which may be implemented by the reference inference engine; (pg 3 right column ¶02-4 “our goal is to build G′ from G…We define a minimum set of operations C…that has to be satisfied to use our NN compiler… We choose this dot–core-op as the minimum requirement for hardware because it can cover most existing NN chips. Thus, our NN compiler can support most existing NN chips” The set of minimum operations that comprise a Neural Network compiled by the compiler defines a profile, where the profile identifies the Dot-core-ops operations that would be implemented by a reference inference engine, a NN hardware chip, that emulates the original neural network G before being compiled into G′) define one or more levels for the reference inference engine, the one or more levels identifying a bit depth of operation for the reference inference engine; (pg 3 Section 3 ¶04 “We also adopt CG as the programming model with a slight modification. The difference is that we regard model parameters as immutable states… a trained NN…” ¶07 “Thus, our goal is to build G′ from G” ¶09 “In addition, the I/O data precision is B bits. Formally, the dot–core-op meets the following constraints: N,M,B are fixed…. our NN compiler can support most existing NN chips” As described, Yu presents taking a trained Neural network and building a reference model G′, by compiling the original model G, that summarizes desired characteristics of the original graph to be implemented on hardware. The I/O precision corresponds to the one or more levels where the levels define the bit precision, or bit depth, of the model. The resulting model can now be implemented by the reference inference engine.)
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a method for translating a software defined graph defining the inference operation of the original graph in terms of predefined features such as bit depth (levels), operations to be conformed, and training sets for further training for optimal implementation on diverse hardware, as taught by Yu, into the disclosed invention of Fachini.
One of ordinary skill in the art would have been motivated to make this modification in order to implement a system that “decouple[s] the NN applications from the target hardware by introducing a compiler that can transform an existing trained, unrestricted NN into an equivalent network that meets the given hardware’s constraints.” (Yu Abstract)
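For illustration only, the claim 1 terminology as mapped above (a profile identifying a group of operations, and levels identifying a bit depth of operation) can be pictured as a small data structure. This is a minimal sketch under stated assumptions: the names `InferenceProfile`, `supported_ops`, and `bit_depths`, and the example values, are hypothetical and are not taken from the claims, Fachini, Yu, or Andryc.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InferenceProfile:
    """Hypothetical description of a reference inference engine."""
    supported_ops: frozenset  # the "profile": operations the engine may implement
    bit_depths: frozenset     # the "levels": bit depths of operation it permits

    def supports(self, op: str, bit_depth: int) -> bool:
        # An operation conforms only if both the operation and its
        # bit depth are permitted by the reference engine.
        return op in self.supported_ops and bit_depth in self.bit_depths

# Example values are assumptions chosen for the sketch.
reference = InferenceProfile(
    supported_ops=frozenset({"dot", "relu"}),
    bit_depths=frozenset({8, 16}),
)
```

Under this sketch, `reference.supports("dot", 8)` holds, while an operation outside the group or a bit depth outside the levels does not conform.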
Fachini/Yu does not explicitly teach, A general purpose graphics processor… a general-purpose graphics processing compute block comprising a plurality of graphics processing cores a shared memory communicatively coupled to the plurality of graphics processing cores; reference inference engine using the plurality of graphics processing cores; operations implemented on the plurality of graphics processing cores
Andryc, when addressing issues related to performing compute operations using a GPU architecture, teaches, A general purpose graphics processor… a general-purpose graphics processing compute block comprising a plurality of graphics processing cores (Abstract “This architecture supports direct CUDA compilation to a binary which is executable on the FPGA based GPGPU” Pg 2 Background Section A “GPGPUs have a many-core device architecture and possess substantial parallel processing capabilities. As shown in Fig. 1, a typical GPGPU consists of an array of multiprocessors (each with two or more processors)…with each multiprocessor consisting of multiple scalar processor (SP) cores”) a shared memory communicatively coupled to the plurality of graphics processing cores (Pg 2 Background Section A “A shared memory serves as a communication medium between different cores residing in the same SM….In addition, there is a read-only constant memory accessible by all the threads”) inference engine using the plurality of graphics processing cores;… operations implemented on the plurality of graphics processing cores (Pg 2 Background Section A “In the CUDA programming model, the host program launches a series of kernels organized as a grid of thread blocks. A thread block represents a collection of operations which can be performed in parallel” pg 6 Section B “We have evaluated five CUDA applications, bitonic sort, autocorrelation, matrix multiplication, parallel reduction and transpose” Matrix multiplication is an operation used in neural network inference. A processor that performs matrix multiplications is an inference engine.)
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a GPGPU to perform compute operations, as taught by Andryc, into the disclosed invention of Fachini/Yu.
One of ordinary skill in the art would have been motivated to make this modification in order to implement a low energy parallel multiprocessor which includes “circuitry which can automatically handle complex conditional control operations in hardware… On average, the GPGPU requires 66% less energy than the MicroBlaze processor, with the largest energy decrease of 78% for the 32-SP reduction implementation.” (Andryc pg 8)

Regarding claim 2
Fachini/Yu/Andryc teaches the system of claim 1.
Further, Yu teaches, the neural network unit to: apply a training set to a neural network to create a first trained model. (pg 2 ¶04 “We propose a transformation workflow to convert a trained NN [first trained model], expressed as a CG, into an equivalent representation of HW/SW interface through the fine-tuning method” The trained NN is the model that, expressed as a CG graph, is used to characterize the reference inference engine described in claim 1. In order to produce a trained neural network, the network must have been trained with a corresponding training set.)
For the motivation to combine Fachini/Yu/Andryc see the rejection of claim 1.

Regarding claim 3
Fachini/Yu/Andryc teaches the system of claim 2.
Fachini/Yu/Andryc teaches, conform one or more operations executed by the first trained model to the profile of the reference inference engine. (pg 6 ¶07 “In this step, we will turn Ĝ into G′. Since the core-op–like operations f̂ ∈ F̂ can be combined by core-ops in F′, we expand all operations in Ĝ into individual subgraphs consisting of operations in F′ to form the graph G′…. Take dot-like operations as an example. Dot-like operations can be represented as a fully-connected layer. As shown in Figure 3(f)” As shown in the figure, complex operations can be mapped to their component parts, so that a first trained model can be mapped to the profile of the reference hardware engine. The compiling or conforming process is shown in Figure 3.)
For the motivation to combine Fachini/Yu/Andryc see the rejection of claim 1.
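For illustration only, conforming the operations of a trained model to a profile, in the sense mapped above for claim 3, can be pictured as expanding each unsupported operation into operations the profile allows. This is a minimal sketch under stated assumptions and is not Yu's compiler: the function name `conform_ops`, the operation names, and the expansion table are hypothetical.

```python
def conform_ops(model_ops, core_ops, expansions):
    """Rewrite model_ops so that only operations in core_ops remain.

    expansions maps a complex operation to a list of simpler ones
    (a hypothetical table; real expansions depend on the hardware).
    """
    conformed = []
    for op in model_ops:
        if op in core_ops:
            conformed.append(op)  # already part of the profile
        elif op in expansions and all(sub in core_ops for sub in expansions[op]):
            conformed.extend(expansions[op])  # expand into component parts
        else:
            raise ValueError(f"operation {op!r} cannot be expressed in the profile")
    return conformed

# Hypothetical profile and expansion table.
core = {"dot", "add", "relu"}
table = {"dense": ["dot", "add"]}
result = conform_ops(["dense", "relu"], core, table)
```

With these assumed inputs, `result` is `["dot", "add", "relu"]`, and an operation with no entry in the table raises `ValueError`.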

Regarding claim 4
Fachini/Yu/Andryc teaches the system of claim 3.
Fachini/Yu/Andryc teaches, the neural network unit to: conform a first set of levels for the first trained model to the first set of levels of the reference inference engine. (pg 5 ¶02 and Figure 3 and Table 1 “Since the original model G [first trained model] usually use floating point numbers for computation and data representation… To encode a floating-point vector with a low precision vector, we employ an auto encoder to get the low-precision representation.” The original trained model, i.e., the first trained model, is mapped, or conformed, to the constraints imposed by the inference hardware as discussed in previous rejections. The first set of levels corresponds to the desired bit precision as set forth by the hardware constraints. As shown in Table 1, there are different hardware constraints for different chips, depending on which is determined as the reference inference engine. Using the constraints, the first trained model is compiled, or conformed, to the desired reference inference engine’s low precision specification.)
For the motivation to combine Fachini/Yu/Andryc see the rejection of claim 1.
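For illustration only, conforming floating-point values to a lower bit depth can be pictured with plain uniform symmetric quantization. This is a swapped-in, simpler technique and not the auto encoder approach quoted from Yu; the function names `quantize` and `dequantize` and the sample values are hypothetical.

```python
def quantize(values, bit_depth):
    """Map floats to signed integer codes that fit in bit_depth bits."""
    qmax = 2 ** (bit_depth - 1) - 1            # e.g. 127 for 8 bits
    peak = max((abs(v) for v in values), default=0.0)
    scale = peak / qmax if peak else 1.0       # float value of one integer step
    # Round to the nearest step and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate floats from integer codes."""
    return [c * scale for c in codes]

# Hypothetical weights conformed to an 8-bit level.
weights = [0.3, -1.0, 0.26]
codes, scale = quantize(weights, 8)
restored = dequantize(codes, scale)
```

Each restored value differs from its original by at most half a quantization step, which is the precision cost of moving to the lower bit depth.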

Regarding claim 5
Fachini/Yu/Andryc teaches the system of claim 2.
Further, Yu teaches, apply a training set to a neural network to create a second trained model, different from the first trained model. (Section 4.5 ¶03 “our workflow is not dependent on the concrete NN type (ANN or rate-coding SNN), it can support SNN hardware and SNN models, too. For SNN models, the training data is the firing rate of each neuron… RNN is an NN with some cycle(s). We could transform and fine-tune each operation inside an RNN as normal, and add an additional step to fine-tune the entire RNN after that.” The steps described by Yu relate to a generalized Artificial Neural Network; however, the steps are also applicable to Spiking Neural Networks and Recurrent Neural Networks. An RNN, for example, corresponds to a second trained model different from the first trained model initially described in the rejection of claim 2.)
For the motivation to combine Fachini/Yu/Andryc see the rejection of claim 1.


Regarding claim 6
Fachini/Yu/Andryc teaches the system of claim 5.
Further Yu teaches, conform one or more operations executed by the second trained model to the profile of the reference inference engine. (As stated in claim 5, an alternative model corresponding to the second trained model can be used with the system described in claim 3.)
For the motivation to combine Fachini/Yu/Andryc see the rejection of claim 1.

Regarding claim 7
Fachini/Yu/Andryc teaches the system of claim 6.
Further Yu teaches, conform a second set of levels for the second trained model to a second set of levels of the reference inference engine (As stated in claim 5, an alternative model corresponding to the second trained model can be used with the system described in claim 3. Wherein the second set of levels would be chosen based on the alternative model.)
For the motivation to combine Fachini/Yu/Andryc see the rejection of claim 1.

Regarding claims 8-14
	Fachini teaches, and a graphics processing device comprising a graphics processing compute block to process a workload including graphics or compute (¶0062 “Although not illustrated in FIG. 10, the processing elements that make up processor 1005 may also include one or more of other types of hardware processing components, such as graphics processing units (GPU) [graphics processing device]” Further, one of ordinary skill in the art would know that GPUs are routinely used in distributed neural network compute engines for job/workload based operations, the operations including graphics or compute operations.) A data processing system comprising: a general purpose processor; (¶0062 “Examples of processors include but are not limited to a central processing unit (CPU) [general purpose processor that processes data] a microprocessor…”) a memory; and a neural network unit comprising a processor (Abstract “A Neural Network (NN) scheduler and techniques to implement features of different possible NN schedulers are disclosed” ¶0062 “in FIG. 10, computing device 1000 includes a processing element such as processor” ¶0063 “memory 1010 may be operatively and communicatively coupled to processor” Examiner notes the neural network scheduler used to process neural networks is a neural network unit.)
For the motivation to combine Fachini/Yu see the rejection of claim 1.
Fachini/Yu does not explicitly teach, A general purpose graphics processor… a general-purpose graphics processing compute block comprising a plurality of graphics processing cores a unified memory communicatively coupled to the general purpose processor and the graphics processing device;
However, Andryc, when addressing issues related to performing compute operations using a GPU architecture, teaches, a unified memory communicatively coupled to the general purpose processor and the graphics processing device; (Pg 2 Background Section A “A shared memory serves as a communication medium between different cores residing in the same SM….In addition, there is a read-only constant memory accessible by all the threads”)
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a GPGPU to perform compute operations, as taught by Andryc, into the disclosed invention of Fachini/Yu.
One of ordinary skill in the art would have been motivated to make this modification in order to implement a low energy parallel multiprocessor which includes “circuitry which can automatically handle complex conditional control operations in hardware… On average, the GPGPU requires 66% less energy than the MicroBlaze processor, with the largest energy decrease of 78% for the 32-SP reduction implementation.” (Andryc pg 8)
	The remaining limitations of claims 8-14 are addressed in the rejection of claims 1-7.
	For the reasons to combine Fachini/Yu/Andryc, see the rejection of claim 1.

Regarding claim 15
Fachini teaches, A processor, comprising: and a neural network unit comprising a processor processing circuitry (Abstract “A Neural Network (NN) scheduler and techniques to implement features of different possible NN schedulers are disclosed” ¶0062 “in FIG. 10, computing device 1000 includes a processing element such as processor” Examiner notes the neural network scheduler used to process neural networks is a neural network unit.) receive an interchange file comprising a representation of a trained model and inference engine data which characterizes a reference inference engine; (¶0033 “An interoperability formatting module 120 is also shown in FIG. 1. The interoperability formatting module 120 functions, in this example, as a neutral entity which consolidates the inputs from the plurality of NN frameworks 105-1-105-N to provide operation information in an interoperable format to scheduler module 115. NNEF and ONNX are two similar open formats to represent and interchange neural networks [trained neural networks] among deep learning frameworks and inference engines [reference inference engine]…At the core, both formats are based on a collection of often used operations from which networks can be built.” NNEF is used to transfer trained networks, and describes the structure, operations, and parameters of the trained neural network.) receive a test data set for the trained model;
Fachini does not explicitly teach, A general purpose graphics processor… a general-purpose graphics processing compute block comprising a plurality of graphics processing cores a shared memory communicatively coupled to the plurality of graphics processing cores reference inference engine using the plurality of graphics processing cores; operations implemented on the plurality of graphics processing cores; conform an inference engine to the reference inference engine; and apply the inference engine to the test data set.
Yu, however, when addressing issues related to defining a computational neural network graph in terms of simple operations and precision for physical hardware, teaches, conform an inference engine to the reference inference engine; and apply the inference engine to the test data set. (pg 2 ¶04 “We propose a transformation workflow to convert a trained NN [representing an inference engine], expressed as a CG, into an equivalent representation of HW/SW interface through the fine-tuning method” The trained NN defines a non-linear function that corresponds to an inference engine expressed as a CG, which is used to characterize the reference inference engine. The transformation corresponds to conforming a trained NN to a specified reference inference engine defined by hardware specifications. Examiner notes that in this and the following rejections the inference engine is interpreted similarly to the first trained model described in claims 1-8.)
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a method for translating a software defined graph defining the inference operation of the original graph in terms of predefined features such as bit depth (levels), operations to be conformed, and training sets for further training for optimal implementation on diverse hardware as taught by Yu to the disclosed invention of Fachini.
One of ordinary skill in the art would have been motivated to make this modification in order to implement a system that “decouple[s] the NN applications from the target hardware by introducing a compiler that can transform an existing trained, unrestricted NN into an equivalent network that meets the given hardware’s constraints.” (Yu Abstract)
Fachini/Yu does not explicitly teach, A general purpose graphics processor… a general-purpose graphics processing compute block comprising a plurality of graphics processing cores a shared memory communicatively coupled to the plurality of graphics processing cores reference inference engine using the plurality of graphics processing cores; operations implemented on the plurality of graphics processing cores
Andryc, when addressing issues related to performing compute operations using a GPU architecture, teaches, A general purpose graphics processor… a general-purpose graphics processing compute block comprising a plurality of graphics processing cores (Abstract “This architecture supports direct CUDA compilation to a binary which is executable on the FPGA based GPGPU” Pg 2 Background Section A “GPGPUs have a many-core device architecture and possess substantial parallel processing capabilities. As shown in Fig. 1, a typical GPGPU consists of an array of multiprocessors (each with two or more processors)…with each multiprocessor consisting of multiple scalar processor (SP) cores”) a shared memory communicatively coupled to the plurality of graphics processing cores (Pg 2 Background Section A “A shared memory serves as a communication medium between different cores residing in the same SM….In addition, there is a read-only constant memory accessible by all the threads”) inference engine using the plurality of graphics processing cores;… operations implemented on the plurality of graphics processing cores (Pg 2 Background Section A “In the CUDA programming model, the host program launches a series of kernels organized as a grid of thread blocks. A thread block represents a collection of operations which can be performed in parallel” pg 6 Section B “We have evaluated five CUDA applications, bitonic sort, autocorrelation, matrix multiplication, parallel reduction and transpose” Matrix multiplication is an operation used in neural network inference.)
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a GPGPU to perform compute operations as taught by Andryc to the disclosed invention of Fachini/Yu.
One of ordinary skill in the art would have been motivated to make this modification in order to implement a low energy parallel multiprocessor which includes “circuitry which can automatically handle complex conditional control operations in hardware… On average, the GPGPU requires 66% less energy than the MicroBlaze processor, with the largest energy decrease of 78% for the 32-SP reduction implementation.” (Andryc pg 8)

Regarding claim 16
Fachini/Yu/Andryc teaches the processor of claim 15.
Further Yu teaches, the neural network unit to: apply the reference inference engine to the test data set; and confirm that an output of the inference engine matches an output of the reference inference engine. (pg 6 right column ¶05 “As shown in Figure 3(g)(f), we use the corresponding supervised signal from graph G [inference engine] to fine-tune [match] current subgraph of G′ [reference engine]… Thus, the transformation of current subgraph will consider the error from previous transformed subgraphs, which can avoid error accumulation.” As described previously, the inference engine is fine-tuned using an input data set to match the output produced by the reference inference engine.)
For the motivation to combine Fachini/Yu/Andryc see the rejection of claim 15.

Regarding claim 17
Fachini/Yu/Andryc teaches the processor of claim 15.
Fachini/Yu/Andryc teaches, the neural network unit to: conform one or more operations executed by the inference engine to a profile identifying a group of operations which implemented by the reference inference engine. (pg 6 ¶07 “In this step, we will turn Gˆ into G′. Since the core-op–like operations fˆ ∈ Fˆ can be combined by core-ops in F′, we expand all operations in Gˆ into individual subgraphs consisting of operations in F′ to form the graph G′…. Take dot-like operations as an example. Dot-like operations can be represented as a fully-connected layer. As shown in Figure 3(f)” As shown in the figure, groups of complex operations can be mapped to their component parts, so that an inference engine can be mapped to the profile of the reference hardware engine.)
For the motivation to combine Fachini/Yu/Andryc see the rejection of claim 15.

Regarding claim 18
Fachini/Yu/Andryc teaches the processor of claim 15.
Fachini/Yu/Andryc teaches, the neural network unit to: conform one or more levels identifying a bit depth of operation for the inference engine to one or more levels for the reference inference engine. (pg 5 ¶02 and Figure 3 and Table 1 “Since the original model G [first trained model] usually use floating point numbers for computation and data representation… To encode a floating-point vector with a low precision vector, we employ an auto encoder to get the low-precision representation.” The original trained model, i.e., the inference engine, is mapped, or conformed, to the constraints imposed by the inference hardware as discussed in previous rejections. The first set of levels corresponds to the desired bit precision as set forth by the hardware constraints. As shown in Table 1, there are different hardware constraints for different chips, depending on which chip is determined to be the reference inference engine.)
For the motivation to combine Fachini/Yu/Andryc see the rejection of claim 15.

Regarding claims 19-22
	Fachini teaches, and a graphics processing device comprising a graphics processing compute block to process a workload including graphics or compute operations; (¶0062 “Although not illustrated in FIG. 10, the processing elements that make up processor 1005 may also include one or more of other types of hardware processing components, such as graphics processing units (GPU) [graphics processing device]” Further, one of ordinary skill in the art would know that GPUs are routinely used in distributed neural network compute engines for job/workload based operations, the operations including graphics or compute operations.) A data processing system comprising: a general purpose processor; (¶0062 “Examples of processors include but are not limited to a central processing unit (CPU) [general purpose processor that processes data] a microprocessor…”) a memory; and a neural network unit comprising a processor (Abstract “A Neural Network (NN) scheduler and techniques to implement features of different possible NN schedulers are disclosed” ¶0062 “in FIG. 10, computing device 1000 includes a processing element such as processor” ¶0063 “memory 1010 may be operatively and communicatively coupled to processor” Examiner notes the neural network scheduler used to process neural networks is a neural network unit.)
	The remaining limitations of claims 19-22 are addressed in the rejection of claims 15-18.
Response to Arguments

Applicant’s arguments with respect to claim(s) 1-22 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
 
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
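A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.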
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHNATHAN R GERMICK whose telephone number is (571)272-8363. The examiner can normally be reached on Monday-Friday 7:30 am – 5:00 pm (EST).
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at telephone number (571)272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://portal.uspto.gov/external/portal. Should you have questions about access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

	
/J.R.G./Examiner, Art Unit 2122    
/ERIC NILSSON/Primary Examiner, Art Unit 2122