DETAILED ACTION
This action is in response to the claims filed 1/18/2022. Claims 1, 8, 15 and 19 are amended. Claims 1, 8, 15 and 19 are independent claims. Claims 1-22 are pending and have been examined. 
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 1/18/2022 has been entered.

Response to Arguments
	Applicant's arguments filed 01/18/2022 have been fully considered but they are not persuasive. 
	Regarding the rejection under 35 U.S.C. 101
Applicant argues that the claims amount to significantly more because the claimed subject matter is not well understood or conventional, as at least evidenced by shortcomings in the cited art.
Examiner disagrees. The art rejections have been updated in response to the amendments.
Further, applicant argues that the claims recite tangible operations that, when taken as a whole, amount to more than simply organizing and comparing data.
Examiner disagrees. The claims amount to simply defining operations and then storing the operations on a memory. The claims have been amended so that the memory is shared between the “cores” and the “host processor”; processors that share a memory are a feature of a generic computing system.
Regarding the rejection under 35 U.S.C. 103
Applicant simply states on pg 2 of the remarks that the “references, alone or in combination” do not teach the amended claims. Applicant has provided no arguments as to why this may be the case. Upon further consideration, the examiner has updated the rejection in view of the amendments.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-22 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The rejections of multiple claims are grouped together because the related claims do not present additional limitations that affect the 101 rejection. For this reason, claims 1 and 8 are rejected under the same rationale. Further, claims 2, 5, 9, and 12 are rejected as a group as well. This same treatment is given to the following claim groups: [3, 6, 10 and 13], [4, 7, 11 and 14], [15, 19], [16, 20], [17, 21], and [18, 22].
	
 
Regarding Claim 1/8
Step 1 Analysis: Claim 1/8 is directed to a general purpose graphics processor, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a processor that performs each of the following limitations:
define a profile for a reference inference engine…
the profile identifying a group of operations…
define one or more levels…
the one or more levels identifying…
These limitations, as drafted, recite a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind, including an observation, evaluation, judgment, or opinion) but for the recitation of generic computer components. For example, but for the generic computer components language (“a general-purpose graphics processing compute block comprising a plurality of graphics processing cores”, “a shared memory communicatively coupled to the plurality of graphics processing cores and to a host processor”), the above limitations in the context of this claim encompass defining and identifying (mental processes). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea, or that merely use a computer as a tool to perform an abstract idea. The additional element of “Storing…in the shared/unified memory…” amounts to mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f). “A neural network unit” only generally links the use of the judicial exception to a particular technological environment or field of use, as discussed in MPEP 2106.05(h). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a generic computer to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer. Mere instructions to apply an exception using a generic computer cannot provide an inventive concept. Further, using features of a neural network and “a neural network unit” only generally links the judicial exception to the field of neural networks. The claim is not patent eligible.

Regarding Claim 2/5/9/12
Step 1 Analysis: The rejection of Claim 1/8 is incorporated, therefore Claim 2/5/9/12 is directed to a general purpose graphics processor, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: Further, the claim recites a computer apparatus that performs each of the following limitations:
apply a training set to a neural network to create a first trained model
As the rejection of claim 1/8 is incorporated, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, “apply” and “create” only generally link the use of the judicial exception to a particular technological environment or field of use, as discussed in MPEP 2106.05(h). The limitation does not specify that through the action of training a neural network a first trained model is created. Under the broadest reasonable interpretation (BRI), the neural network is already trained, and applying a training set to the neural network does not necessarily mean that the neural network is being trained. If the claim were edited to specify that the neural network is the feature doing the training, the claim limitation would be indicative of an inventive concept. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “apply” and “create” only generally link the judicial exception to neural networks. The claim does not provide an inventive concept for the reasons provided here and as set forth in the rejection of claim 1/8.

Regarding Claim 3/6/10/13
Step 1 Analysis: The rejection of Claim 2/5/9/12 is incorporated, therefore Claim 3/6/10/13 is directed to a general purpose graphics processor, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: Further, the claim recites a computer apparatus that performs each of the following limitations:
conform operations to the profile
operations executed by the first trained model
As the rejection of claim 1/8 is incorporated, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, “conform operations” and “execute operations” only generally link the use of the judicial exception to a particular technological environment or field of use, as discussed in MPEP 2106.05(h). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “conform operations” and “execute operations” only generally link the judicial exception to neural networks. The claim does not provide an inventive concept for the reasons provided here and as set forth in the rejection of claim 1/8.

Regarding Claim 4/7/11/14
Step 1 Analysis: The rejection of Claim 3/6/10/13 is incorporated, therefore Claim 4/7/11/14 is directed to a general purpose graphics processor, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: Further, the claim recites a computer apparatus that performs each of the following limitations:
conform a first set of levels to the first set of levels
As the rejection of claim 1/8 is incorporated, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, “conform a first set of levels” only generally links the use of the judicial exception to a particular technological environment or field of use, as discussed in MPEP 2106.05(h). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element “conform a first set of levels” only generally links the judicial exception to neural network inference. The claim does not provide an inventive concept for the reasons provided here and as set forth in the rejection of claim 1/8.

Regarding Claim 15/19  
Step 1 Analysis: Claim 15/19 is directed to a general purpose graphics processor, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a processor that performs each of the following limitations:
conform an inference engine to the reference inference engine
apply the inference engine to the data set
processes a workload or compute operations
These limitations, as drafted, recite a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind, including an observation, evaluation, judgment, or opinion) but for the recitation of generic computer components. For example, but for the generic computer components language (“a general-purpose graphics processing compute block comprising a plurality of graphics processing cores”, “a shared memory communicatively coupled to the plurality of graphics processing cores and to a host processor”), the above limitations in the context of this claim encompass receiving (mental processes) and conforming, applying an inference engine to a data set, processing and computing (mathematical calculations). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea, or that merely use a computer as a tool to perform an abstract idea. The additional elements of “conform”, “apply an inference engine”, and “process or compute” amount to mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f). Further, conforming is understood to mean preparing a data structure for processing, which is analogous to a computer computation. “Inference engine” and “neural network” only generally link the use of the judicial exception to a particular technological environment or field of use, as discussed in MPEP 2106.05(h). Further, “receive…an interchange file” and “receive a test data set” amount to extra-solution activity, as discussed in MPEP 2106.05(g). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a generic computer to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer. Mere instructions to apply an exception using a generic computer cannot provide an inventive concept. Further, using features related to neural network processing and “a neural network unit” only generally links the judicial exception to the field of neural networks. Further, the additional elements of “receive…an interchange file” and “receive a test data set” amount to receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)), and are thus routine and conventional. The claim is not patent eligible.

Regarding Claim 16/20
Step 1 Analysis: The rejection of Claim 15/19 is incorporated, therefore Claim 16/20 is directed to a general purpose graphics processor, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: Further, the claim recites a computer apparatus that performs each of the following limitations:
apply the reference inference engine to the data set
confirm that an output of the inference engine matches an output
As the rejection of claim 15/19 is incorporated, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, “apply inference engine to the data set” and “confirm that an output matches” only generally link the use of the judicial exception to a particular technological environment or field of use, as discussed in MPEP 2106.05(h). The limitation does not specify that through the action of training a neural network a first trained model is created. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “apply inference engine to the data set” and “confirm that an output matches” only generally link the judicial exception to neural networks. The claim does not provide an inventive concept for the reasons provided here and as set forth in the rejection of claim 15/19.

Regarding Claim 17/21
Step 1 Analysis: The rejection of Claim 15/19 is incorporated, therefore Claim 17/21 is directed to a general purpose graphics processor, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: Further, the claim recites a computer apparatus that performs each of the following limitations:
conform one or more operations executed by the inference engine to a profile
As the rejection of claim 15/19 is incorporated, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, “conform operations” and “execute by the inference engine” only generally link the use of the judicial exception to a particular technological environment or field of use, as discussed in MPEP 2106.05(h). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “conform operations” and “execute by the inference engine” only generally link the judicial exception to neural networks. The claim does not provide an inventive concept for the reasons provided here and as set forth in the rejection of claim 15/19.

Regarding Claim 18/22
Step 1 Analysis: The rejection of Claim 15/19 is incorporated, therefore Claim 18/22 is directed to a general purpose graphics processor, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: Further, the claim recites a computer apparatus that performs each of the following limitations:
conform one or more levels to one or more levels
As the rejection of claim 15/19 is incorporated, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, “conform a set of levels” only generally links the use of the judicial exception to a particular technological environment or field of use, as discussed in MPEP 2106.05(h). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element “conform a set of levels” only generally links the judicial exception to neural network inference. The claim does not provide an inventive concept for the reasons provided here and as set forth in the rejection of claim 15/19.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-14 are rejected under 35 U.S.C. 103 as being unpatentable over Andryc et al. “FlexGrip: A Soft GPGPU for FPGAs”, hereinafter, Andryc, further in view of Yu et al. “Bridging the Gap between Neural Networks and Neuromorphic Hardware with A Neural Network Compiler”, hereinafter, Yu.

Regarding claim 1
Andryc teaches, A general purpose graphics processor comprising: a general-purpose graphics processing compute block comprising a plurality of graphics processing cores (Abstract “This architecture supports direct CUDA compilation to a binary which is executable on the FPGA based GPGPU”; pg 2 Background Section A “GPGPUs have a many-core device architecture and possess substantial parallel processing capabilities. As shown in Fig. 1, a typical GPGPU consists of an array of multiprocessors (each with two or more processors)…with each multiprocessor consisting of multiple scalar processor (SP) cores”); a shared memory communicatively coupled to the plurality of graphics processing cores and to a host processor; and store [the operations]… for the reference inference engine in the shared memory such that [information] for the reference engine are accessible to the host processor (pg 2 Background Section A “A shared memory serves as a communication medium between different cores residing in the same SM….In addition, there is a read-only constant memory accessible by all the threads”; pg 4 “The Fetch stage is the initial stage of the execution pipeline and is responsible for fetching four or eight-byte CUDA binary instructions from system memory.” The system memory is accessible to the processors to fetch the instructions. As shown in Figure 2, the system memory is coupled to the SM blocks and the Microblaze through the AXI interface.); define a profile for a reference inference engine, the profile identifying a group of operations which may be implemented by the reference inference engine using the plurality of graphics processing cores; operations implemented on the plurality of graphics processing cores by the reference inference engine (pg 2 Background Section A “In the CUDA programming model, the host program launches a series of kernels organized as a grid of thread blocks. A thread block represents a collection of operations which can be performed in parallel”; pg 6 Section B “We have evaluated five CUDA applications, bitonic sort, autocorrelation, matrix multiplication, parallel reduction and transpose”. Matrix multiplication is an operation used in neural network inference. A processor that performs matrix multiplications is an inference engine. These operations define a profile representing the operations to be performed by the GPGPU.)
Andryc does not explicitly teach, and a neural network unit comprising processing circuitry; define one or more levels for the reference inference engine, the one or more levels identifying a bit depth of operation for the reference inference engine; the profile and the one or more levels.
Yu, however, when addressing issues related to defining a computational neural network graph in terms of simple operations and precision for physical hardware, teaches, and a neural network unit comprising processing circuitry (pg 3 right column ¶02-4 “our goal is to build G′ from G…We define a minimum set of operations C…that has to be satisfied to use our NN compiler… We choose this dot–core-op as the minimum requirement for hardware because it can cover most existing NN chips. Thus, our NN compiler can support most existing NN chips”. The set of minimum operations that comprise a neural network compiled by the compiler defines a profile, where the profile identifies the dot-core-op operations that would be implemented by a reference inference engine, an NN hardware chip, that emulates the original neural network G before being compiled into G′); define one or more levels for the reference inference engine, the one or more levels identifying a bit depth of operation for the reference inference engine; the profile and the one or more levels (pg 3 Section 3 ¶04 “We also adopt CG as the programming model with a slight modification. The difference is that we regard model parameters as immutable states… a trained NN…”; ¶07 “Thus, our goal is to build G′ from G”; ¶09 “In addition, the I/O data precision is B bits. Formally, the dot–core-op meets the following constraints: N,M,B are fixed…. our NN compiler can support most existing NN chips”. As described, Yu presents taking a trained neural network and building a reference model G′, by compiling the original model G, that summarizes desired characteristics of the original graph to be implemented on hardware, corresponding to a profile of operations. The I/O precision corresponds to the one or more levels, where the levels define the bit precision, or bit depth, of the model. The resulting model can now be implemented by the reference inference engine.)
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate into the disclosed invention of Andryc a method, as taught by Yu, for translating a software-defined graph that defines the inference operation of the original graph in terms of predefined features such as bit depth (levels), operations to be conformed, and training sets for further training, for optimal implementation on diverse hardware.
One of ordinary skill in the art would have been motivated to make this modification in order to implement a system that “decouple[s] the NN applications from the target hardware by introducing a compiler that can transform an existing trained, unrestricted NN into an equivalent network that meets the given hardware’s constraints.” (Yu Abstract)
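For illustration only, the mapping relied upon in this rejection (a profile as a group of supported operations, and levels as supported bit depths) can be sketched as a small data structure. This is a minimal sketch under stated assumptions: the names `Profile`, `Level`, and `ReferenceEngineSpec`, the operation names, and the bit depths are hypothetical and are not taken from the claims, Andryc, or Yu.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Profile:
    """Group of operations a reference inference engine may implement (hypothetical)."""
    name: str
    operations: FrozenSet[str]


@dataclass(frozen=True)
class Level:
    """Bit depth of operation for the reference inference engine (hypothetical)."""
    name: str
    bit_depth: int


@dataclass(frozen=True)
class ReferenceEngineSpec:
    """Pairs one profile with the levels (bit depths) the engine supports."""
    profile: Profile
    levels: Tuple[Level, ...]

    def supports(self, operation: str, bit_depth: int) -> bool:
        """True if the operation is in the profile and some level has this bit depth."""
        return operation in self.profile.operations and any(
            lvl.bit_depth == bit_depth for lvl in self.levels
        )


# Example values are assumptions chosen only to exercise the structure.
spec = ReferenceEngineSpec(
    profile=Profile("dot-core", frozenset({"matmul", "relu"})),
    levels=(Level("low", 8), Level("high", 16)),
)
```

Under this sketch, `spec.supports("matmul", 8)` is true, while an operation outside the profile or a bit depth outside the levels is reported as unsupported.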

Regarding claim 2
Andryc/Yu teaches the system of claim 1.
Further Yu teaches, the neural network unit to: apply a training set to a neural network to create a first trained model. (pg 2 ¶04 “We propose a transformation workflow to convert a trained NN [first trained model], expressed as a CG, into an equivalent representation of HW/SW interface through the fine-tuning method”. The trained NN is the model that, expressed as a CG, is used to characterize the reference inference engine described in claim 1. In order to produce a trained neural network, the network must have been trained with a corresponding training set.)
For the motivation to combine Andryc/Yu see the rejection of claim 1.

Regarding claim 3
Andryc/Yu teaches the system of claim 2.
Further Yu teaches, conform one or more operations executed by the first trained model to the profile of the reference inference engine. (pg 6 ¶07 “In this step, we will turn Gˆ into G′. Since the core-op–like operations fˆ ∈ Fˆ can be combined by core-ops in F′, we expand all operations in Gˆ into individual subgraphs consisting of operations in F′ to form the graph G′…. Take dot-like operations as an example. Dot-like operations can be represented as a fully-connected layer. As shown in Figure 3(f)”. As shown in the figure, complex operations can be mapped to their component parts, so that a first trained model can be mapped to the profile of the reference hardware engine. The compiling, or conforming, process is shown in Figure 3.)
For the motivation to combine Andryc/Yu see the rejection of claim 1.
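For illustration only, the conforming step as characterized in this rejection (expanding a model's operations into operations that belong to the profile) can be sketched as a table-driven rewrite. This is a minimal sketch, not Yu's compiler: the `EXPANSIONS` table, the operation names, and the function `conform_operations` are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical expansion table: how each model-level operation could be
# rewritten using only operations that appear in the profile ("core-ops").
EXPANSIONS: Dict[str, List[str]] = {
    "fully_connected": ["dot", "add"],  # dot product followed by a bias add
    "conv2d": ["im2col", "dot"],        # convolution lowered to a matrix product
}


def conform_operations(model_ops: List[str], profile_ops: Set[str]) -> List[str]:
    """Rewrite model_ops so that every resulting operation is in profile_ops."""
    conformed: List[str] = []
    for op in model_ops:
        if op in profile_ops:
            # Already a profile operation; keep it unchanged.
            conformed.append(op)
            continue
        expansion = EXPANSIONS.get(op)
        if expansion is None or not all(e in profile_ops for e in expansion):
            raise ValueError(f"operation {op!r} cannot be conformed to the profile")
        conformed.extend(expansion)
    return conformed
```

An operation with no expansion into profile operations raises an error in this sketch, which corresponds to a model that cannot be conformed to the reference inference engine.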

Regarding claim 4
Andryc/Yu teaches the system of claim 3.
Further Yu teaches, the neural network unit to: conform a first set of levels for the first trained model to the first set of levels of the reference inference engine. (pg 5 ¶02 and Figure 3 and Table 1 “Since the original model G [first trained model] usually use floating point numbers for computation and data representation… To encode a floating-point vector with a low precision vector, we employ an auto encoder to get the low-precision representation.” The original trained model (the first trained model) is mapped, or conformed, to the constraints imposed by the inference hardware as discussed in previous rejections. The first set of levels corresponds to the desired bit precision as set forth by the hardware constraints. As shown in Table 1, there are different hardware constraints for different chips, depending on which is determined as the reference inference engine. Using the constraints, the first trained model is compiled, or conformed, to the desired reference inference engine’s low precision specification.)
For the motivation to combine Andryc/Yu see the rejection of claim 1.
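For illustration only, conforming floating-point values to a fixed bit depth can be sketched with uniform symmetric quantization. This is a deliberately simpler stand-in and is not the autoencoder-based encoding quoted from Yu; the function name `quantize` and the scaling scheme are assumptions.

```python
from typing import List


def quantize(values: List[float], bit_depth: int) -> List[int]:
    """Map floats to signed integer codes representable in bit_depth bits."""
    if bit_depth < 2:
        raise ValueError("bit_depth must be at least 2 for a signed representation")
    q_max = 2 ** (bit_depth - 1) - 1  # e.g. 127 for 8 bits
    peak = max((abs(v) for v in values), default=0.0)
    if peak == 0.0:
        # All-zero (or empty) input: every code is zero.
        return [0 for _ in values]
    scale = q_max / peak
    # Clamp after rounding so every code fits in the target bit depth.
    return [max(-q_max, min(q_max, round(v * scale))) for v in values]
```

A lower `bit_depth` yields fewer distinct codes, which is the sense in which a level constrains the precision of the conformed model in this sketch.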

Regarding claim 5
Andryc/Yu teaches the system of claim 2.
Further Yu teaches, apply a training set to a neural network to create a second trained model, different from the first trained model. (Section 4.5 ¶03 “our workflow is not dependent on the concrete NN type (ANN or rate-coding SNN), it can support SNN hardware and SNN models, too. For SNN models, the training data is the firing rate of each neuron… RNN is an NN with some cycle(s). We could transform and fine-tune each operation inside an RNN as normal, and add an additional step to fine-tune the entire RNN after that.” The steps described by Yu relate to a generalized artificial neural network; however, the steps are also applicable to spiking neural networks and recurrent neural networks. An RNN, for example, corresponds to a second trained model different from the first trained model initially described in the rejection of claim 2.)
For the motivation to combine Andryc/Yu see the rejection of claim 1.

Regarding claim 6
Andryc/Yu teaches the system of claim 5.
Further Yu teaches, conform one or more operations executed by the second trained model to the profile of the reference inference engine. (As stated in the rejection of claim 5, an alternative model corresponding to the second trained model can be used with the system described in the rejection of claim 3.)
For the motivation to combine Andryc/Yu see the rejection of claim 1.

Regarding claim 7
Andryc/Yu teaches the system of claim 6.
Further Yu teaches, conform a second set of levels for the second trained model to a second set of levels of the reference inference engine. (As stated in the rejection of claim 5, an alternative model corresponding to the second trained model can be used with the system described in the rejection of claim 3, wherein the second set of levels would be chosen based on the alternative model.)
For the motivation to combine Andryc/Yu see the rejection of claim 1.

Regarding claims 8-14
	Claims 8-14 are rejected in part for the reasons set forth in the rejections of claims 1-7 under Andryc/Yu.
	Andryc teaches, and a graphics processing device comprising a graphics processing compute block to process a workload including graphics or compute operations; A data processing system comprising: a general purpose processor (Abstract “This architecture supports direct CUDA compilation to a binary which is executable on the FPGA based GPGPU”; pg 2 Background Section A “GPGPUs have a many-core device architecture and possess substantial parallel processing capabilities. As shown in Fig. 1, a typical GPGPU consists of an array of multiprocessors (each with two or more processors)…with each multiprocessor consisting of multiple scalar processor (SP) cores”); a unified memory communicatively coupled to the general purpose processor and the graphics processing device; such that the [information] for the reference engine are accessible to the general purpose processor (pg 2 Background Section A “A shared memory serves as a communication medium between different cores residing in the same SM….In addition, there is a read-only constant memory accessible by all the threads”; pg 4 “The Fetch stage is the initial stage of the execution pipeline and is responsible for fetching four or eight-byte CUDA binary instructions from system memory.” The system memory is accessible to the processors to fetch the instructions. As shown in Figure 2, the system memory is unified across the SM blocks and the Microblaze through the AXI interface.)
	For the reasons to combine Andryc/Yu, see the rejection of claim 1.

Claims 15 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Andryc et al. “FlexGrip: A Soft GPGPU for FPGAs”, hereinafter, Andryc, further in view of Fachini et al., US Document ID US 20190332441 A1, hereinafter, Fachini.

Regarding claim 15
Andryc teaches, A general purpose graphics processor, comprising: 
a general-purpose graphics processing compute block comprising a plurality of graphics processing cores; (Abstract “This architecture supports direct CUDA compilation to a binary which is executable on the FPGA based GPGPU” Pg 2 Background Section A “GPGPUs have a many-core device architecture and possess substantial parallel processing capabilities. As shown in Fig. 1, a typical GPGPU consists of an array of multiprocessors (each with two or more processors)…with each multiprocessor consisting of multiple scalar processor (SP) cores”) a shared memory communicatively coupled to the plurality of graphics processing cores and to a host processor; receive, from the shared memory [data] which implements operations using the plurality of graphics processing cores; (Pg 2 Background Section A “A shared memory serves as a communication medium between different cores residing in the same SM….In addition, there is a read-only constant memory accessible by all the threads” pg 4 “The Fetch stage is the initial stage of the execution pipeline and is responsible for fetching four or eight-byte CUDA binary instructions from system memory.” The system memory is accessible to the processors to fetch the instructions. As shown in Figure 2, the system memory is coupled to the SM blocks and the Microblaze through the AXI interface.) conform an inference engine which implements operations using the plurality of graphics processing cores to the reference inference engine; (Pg 2 Background Section A “In the CUDA programming model, the host program launches a series of kernels organized as a grid of thread blocks. A thread block represents a collection of operations which can be performed in parallel” pg 6 Section B “We have evaluated five CUDA applications, bitonic sort, autocorrelation, matrix multiplication, parallel reduction and transpose” Matrix multiplication is an operation used in neural network inference. A processor that performs matrix multiplications is an inference engine.)
	Andryc does not explicitly teach, and a neural network unit comprising a processor processing circuitry; receive… an interchange file comprising a representation of a trained model and inference engine data which characterizes a reference inference engine; receive a test data set for the trained model;  apply the inference engine to the test data set
Fachini, when addressing an interoperable neural network file exchange system, teaches, and a neural network unit comprising a processor processing circuitry (Abstract “A Neural Network (NN) scheduler and techniques to implement features of different possible NN schedulers are disclosed” ¶0062 “in FIG. 10, computing device 1000 includes a processing element such as processor” Examiner notes the neural network scheduler used to process neural networks is a neural network unit.) receive… an interchange file comprising a representation of a trained model and inference engine data which characterizes a reference inference engine; (¶0033 “An interoperability formatting module 120 is also shown in FIG. 1. The interoperability formatting module 120 functions, in this example, as a neutral entity which consolidates the inputs from the plurality of NN frameworks 105-1-105-N to provide operation information in an interoperable format to scheduler module 115. NNEF and ONNX are two similar open formats to represent and interchange neural networks [trained neural networks] among deep learning frameworks and inference engines [reference inference engine]…At the core, both formats are based on a collection of often used operations from which networks can be built.” NNEF is used to transfer trained networks, and describes the structure, operations, and parameters of the trained neural network.) receive a test data set for the trained model; apply the inference engine to the test data set (¶0029 “For example, after executing a complete model, and collecting metrics through the annotations, the adaptable device may be retrained with any new data.” Test data is used to retrain a model.
¶0028-0029 “In one implementation the adaptation, or learning, could be considered “online.” That is, every time there is new information available…traditional batch training techniques or other heuristics may be used to update the scheduler's components in addition to, or instead of, these online techniques…after executing a complete model, and collecting metrics through the annotations, the adaptable device may be retrained with any new data” A neural network model can be trained and tested based on metrics and new data. Both online and batch training involve receiving test data and applying the test data to a trained model.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a system for scheduling matrix operations based on an interoperable format, where the matrix operations correspond to neural network models, as taught by Fachini, into the disclosed invention of Andryc.
One of ordinary skill in the art would have been motivated to make this modification because both Andryc and Fachini discuss scheduling matrix operations to available hardware. Fachini presents systems and methods which “optimizes and adapts processing functions for an NN model either prior to processing or for just-in-time determination of operation scheduling.” This is beneficial because “current state of the widely used NN tools and frameworks do not appear to provide any intelligent mechanism to account for various computational requirements of NNs… current NN tools may fail to perform comprehensive optimizations for the multitude of available platforms, frameworks, and hardware” (Fachini Abstract and ¶02).

Regarding claim 19
Claim 19 is rejected in part for the reasons set forth in the rejection of claim 15. 
	Further Andryc teaches, A data processing system comprising: a general purpose processor; and a graphics processing device comprising a graphics processing compute block to process a workload including graphics or compute operations; (Abstract “This architecture supports direct CUDA compilation to a binary which is executable on the FPGA based GPGPU” Pg 2 Background Section A “GPGPUs have a many-core device architecture and possess substantial parallel processing capabilities. As shown in Fig. 1, a typical GPGPU consists of an array of multiprocessors (each with two or more processors)…with each multiprocessor consisting of multiple scalar processor (SP) cores”) a unified memory communicatively coupled to the general purpose processor and the graphics processing device (Pg 2 Background Section A “A shared memory serves as a communication medium between different cores residing in the same SM….In addition, there is a read-only constant memory accessible by all the threads” pg 4 “The Fetch stage is the initial stage of the execution pipeline and is responsible for fetching four or eight-byte CUDA binary instructions from system memory.” The system memory is accessible to the processors to fetch the instructions. As shown in Figure 2, the system memory is coupled to the SM blocks and the Microblaze through the AXI interface.)
	Further Fachini teaches, and a neural network unit comprising a processor (Abstract “A Neural Network (NN) scheduler and techniques to implement features of different possible NN schedulers are disclosed” ¶0062 “in FIG. 10, computing device 1000 includes a processing element such as processor” ¶0063 “memory 1010 may be operatively and communicatively coupled to processor” Examiner notes the neural network scheduler used to process neural networks is a neural network unit.)
	For the reasons to combine Andryc and Fachini see the rejection of claim 15.

Claims 16-18 and 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over Andryc/Fachini, further in view of Yu et al., “Bridging the Gap between Neural Networks and Neuromorphic Hardware with A Neural Network Compiler,” hereinafter Yu.

Regarding claim 16
Andryc/Fachini teaches the general purpose graphics processor of claim 15.
Andryc/Fachini does not explicitly teach, the neural network unit to: apply the reference inference engine to the test data set; and confirm that an output of the inference engine matches an output of the reference inference engine.
Yu, however, when addressing issues related to defining a computational neural network graph in terms of simple operations and precision for physical hardware, teaches, the neural network unit to: apply the reference inference engine to the test data set; and confirm that an output of the inference engine matches an output of the reference inference engine. (pg 6 right column ¶05 “As shown in Figure 3(g)(f), we use the corresponding supervised signal from graph G [inference engine] to fine-tune [match] current subgraph of G′ [reference engine]… Thus, the transformation of current subgraph will consider the error from previous transformed subgraphs, which can avoid error accumulation.” As described previously, the inference engine is fine-tuned using an input data set to match the output produced by the reference inference engine.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a method for translating a software defined graph defining the inference operation of the original graph in terms of predefined features such as bit depth (levels), operations to be conformed, and training sets for further training for optimal implementation on diverse hardware, as taught by Yu, into the disclosed invention of Andryc/Fachini.
One of ordinary skill in the art would have been motivated to make this modification in order to implement a system that “decouple[s] the NN applications from the target hardware by introducing a compiler that can transform an existing trained, unrestricted NN into an equivalent network that meets the given hardware’s constraints.” (Yu Abstract)

Regarding claim 17
Andryc/Fachini teaches the general purpose graphics processor of claim 15.
Andryc/Fachini does not explicitly teach, the neural network unit to: conform one or more operations executed by the inference engine to a profile identifying a group of operations which implemented by the reference inference engine.
Yu, however, when addressing issues related to defining a computational neural network graph in terms of simple operations and precision for physical hardware, teaches, the neural network unit to: conform one or more operations executed by the inference engine to a profile identifying a group of operations which implemented by the reference inference engine. (pg 6 ¶07 “In this step, we will turn Gˆ into G′. Since the core-op–like operations fˆ ∈ Fˆ can be combined by core-ops in F′, we expand all operations in Gˆ into individual subgraphs consisting of operations in F′ to form the graph G′…. Take dot-like operations as an example. Dot-like operations can be represented as a fully-connected layer. As shown in Figure 3(f)” As shown in the figure, groups of complex operations can be mapped to their component parts, so that an inference engine can be mapped to the profile of the reference hardware engine.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a method for translating a software defined graph defining the inference operation of the original graph in terms of predefined features such as bit depth (levels), operations to be conformed, and training sets for further training for optimal implementation on diverse hardware, as taught by Yu, into the disclosed invention of Andryc/Fachini.
One of ordinary skill in the art would have been motivated to make this modification in order to implement a system that “decouple[s] the NN applications from the target hardware by introducing a compiler that can transform an existing trained, unrestricted NN into an equivalent network that meets the given hardware’s constraints.” (Yu Abstract)

Regarding claim 18
Andryc/Fachini teaches the general purpose graphics processor of claim 15.
Andryc/Fachini does not explicitly teach, the neural network unit to: conform one or more levels identifying a bit depth of operation for the inference engine to one or more levels for the reference inference engine.
Yu, however, when addressing issues related to defining a computational neural network graph in terms of simple operations and precision for physical hardware, teaches, the neural network unit to: conform one or more levels identifying a bit depth of operation for the inference engine to one or more levels for the reference inference engine. (pg 5 ¶02 and Figure 3 and Table 1 “Since the original model G [first trained model] usually use floating point numbers for computation and data representation… To encode a floating-point vector with a low precision vector, we employ an auto encoder to get the low-precision representation.” The original trained model, inference engine, is mapped, or conformed, to the constraints imposed by the inference hardware as discussed in previous rejections. The first set of levels corresponds to the desired bit precision as set forth by the hardware constraints. As shown in Table 1, there are different hardware constraints for different chips, depending on which is determined as the reference inference engine.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a method for translating a software defined graph defining the inference operation of the original graph in terms of predefined features such as bit depth (levels), operations to be conformed, and training sets for further training for optimal implementation on diverse hardware, as taught by Yu, into the disclosed invention of Andryc/Fachini.
One of ordinary skill in the art would have been motivated to make this modification in order to implement a system that “decouple[s] the NN applications from the target hardware by introducing a compiler that can transform an existing trained, unrestricted NN into an equivalent network that meets the given hardware’s constraints.” (Yu Abstract)

Regarding claim 20
	Claim 20 is rejected at least for the reasons set forth in the rejection of claim 16 in view of the rejection of claim 19.

Regarding claim 21
	Claim 21 is rejected at least for the reasons set forth in the rejection of claim 17 in view of the rejection of claim 19.

Regarding claim 22
	Claim 22 is rejected at least for the reasons set forth in the rejection of claim 18 in view of the rejection of claim 19.

 
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHNATHAN R GERMICK whose telephone number is (571)272-8363. The examiner can normally be reached M-F 7:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at 571-272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
	
/J.R.G./Examiner, Art Unit 2122    
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122