DETAILED ACTION
1.	This office action is in response to the Application No. 16330906 filed on 09/04/2017. Claims 10-18 are presented for examination and are currently pending.

Notice of Pre-AIA or AIA Status
2.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
3.	Claims 10, 12, 13, 16, and 18 are objected to because of the following informalities:
	Claims 10, 12, 13, 16, and 18 recite the acronym “DMA”. The first instance of the acronym should be defined. It should be “Direct Memory Access” (see [0032] of published specification).
	Appropriate correction is required.

Double Patenting
4.	The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.


This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.


Instant application 16330906
10. A hardwired hardware-implemented model calculation unit for calculating a multi-layer perceptron model, the model calculation unit comprising:
a processor core;
a memory; and
a DMA unit; wherein
the processor core is configured to calculate one or more output quantities of a neuron layer of the multi-layer perceptron model having a number of neurons as a function of one or more input quantities of an input quantity vector;
the memory includes, for each neuron layer, a configuration memory region for storing configuration parameters in a respective configuration memory segment* and a data storage region for storing the input quantities of the input quantity vector and the one or more output quantities in a respective data storage segment; and
the DMA unit is configured to successively instruct the processor core to calculate a respective neuron layer based on the configuration parameters of each configuration memory segment, and to calculate the input quantities defined thereby of the input quantity vector, and to store respectively resulting output quantities in a data storage segment, defined by the corresponding configuration parameters, of the data storage region*.
*See below for mapping of underlined limitations to dependent claim 16 of co-pending application

Copending application 16330625
12. A model calculation unit for calculating a multilayer perceptron model having a plurality of neuron layers, the model calculation unit being designed in hardware and being hardwired, the model calculation unit comprising:
a processor core which is configured to calculate one or multiple output variables of an output variable vector of a neuron layer of the multilayer perceptron model having a number of neurons as a function of one or of multiple input variables of an input variable vector;
a memory which a data memory area is provided, in which each neuron layer is assigned a data memory section for storing the input variables of the input variable vector and a data memory section for storing the output variables of the output variable vector; and
a DMA unit which is configured to successively instruct the processor core to calculate a neuron layer, in each case based on input variables of the assigned input variable vector and to store the respectively resulting output variables of the output variable vector in the assigned data memory section;
wherein the data memory section for the input variable vector assigned to at least one of the neuron layers at least partially includes in each case the data memory sections of at least two of the output variable vectors of two different neuron layers of the neuron layers.

Instant application 16330906
Claim 10, continued
…, a configuration memory region for storing configuration parameters in a respective configuration memory segment
calculate a respective neuron layer based on the configuration parameters of each configuration memory segment…defined by the corresponding configuration parameters, of the data storage region.

Copending application 16330625
16. The model calculation unit as recited in claim 12, wherein the memory for each of the neuron layers includes a configuration memory area for storing configuration parameters in a respective configuration memory section, and wherein DMA unit is configured to successively instruct the processor core to calculate a neuron layer in each case based on the configuration parameters of a respective configuration memory section and on the input variable vector defined as a result, and to store the respectively resulting output variable vector in a data memory section of the data memory area defined by the corresponding configuration parameters.

Instant application 16330906
13. The model calculation unit of claim 10, wherein the processor core is configured to signal the end of the current calculation of the neuron layer to the DMA unit or to an external location, the DMA unit subsequently starting the calculation of the next neuron layer based on configuration parameters stored in a further configuration memory segment.

Copending application 16330625
17. The model calculation unit as recited in claim 12, wherein the processor core is configured to signal the DMA unit or to signal externally, an end of an instantaneous calculation of the neuron layer, the DMA unit starting the calculation of a next neuron layer of the neuron layers based on configuration parameters stored in an additional configuration memory section.



Instant application 16330906
14. The model calculation unit of claim 10, wherein the processor core is …

Copending application 16330625
The model calculation unit as recited in claim 12, wherein the processor core is configured to calculate an output variable for each neuron of a neuron layer of the multilayer perceptron model having a number of neurons as a function of one or multiple input variables of an input variable vector, as a function of a weighting matrix having weighting factors and of an offset value predefined for each neuron, a sum of values of the input variables weighted with a weighting factor determined by the neuron and the input variable being calculated for each neuron, and a result being transformed with an activation …





Instant application 16330906
15. The model calculation unit of claim 10, wherein the processor core is arranged in a surface region of an integrated module.

Copending application 16330625
19. The model calculation unit as recited in claim 12, wherein the processor core is formed in a surface area of an integrated module.



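Instant application 16330906
16. A control device comprising:
a microprocessor; and
one or more hardwired hardware-implemented model calculation units for calculating a multi-layer perceptron model and that each includes a processor core, a memory, and a DMA unit;
wherein:
the processor core is configured to calculate one or more output quantities of a neuron layer of the multi-layer perceptron model having a number of neurons as a function of one or more input quantities of an input quantity vector;
the memory includes, for each neuron layer, a configuration memory region for storing configuration parameters in a respective configuration memory segment* and a data storage region for storing the input quantities of the input quantity vector and the one or more output quantities in a respective data storage segment; and
the DMA unit is configured to successively instruct the processor core to calculate a respective neuron layer based on the configuration parameters of each configuration memory segment*, and to calculate the input quantities defined thereby of the input quantity vector, and to store respectively resulting output quantities in a data storage segment, defined by the corresponding configuration parameters, of the data storage region.*
*See below for mapping of underlined limitations to dependent claim 16 of co-pending application

Copending application 16330625
12. A model calculation unit for calculating a multilayer perceptron model having a plurality of neuron layers, the model calculation unit being designed in hardware and being hardwired, the model calculation unit comprising:
a processor core which is configured to calculate one or multiple output variables of an output variable vector of a neuron layer of the multilayer perceptron model having a number of neurons as a function of one or of multiple input variables of an input variable vector;
a memory which a data memory area is provided, in which each neuron layer is assigned a data memory section for storing the input variables of the input variable vector and a data memory section for storing the output variables of the output variable vector; and
a DMA unit which is configured to successively instruct the processor core to calculate a neuron layer, in each case based on input variables of the assigned input variable vector and to store the respectively resulting output variables of the output variable vector in the assigned data memory section;
wherein the data memory section for the input variable vector assigned to at least one of the neuron layers at least partially includes in each case the data memory sections of at least two of the output variable vectors of two different neuron layers of the neuron layers.

Instant application 16330906
Claim 16, continued
…, a configuration memory region for storing configuration parameters in a respective configuration memory segment
calculate a respective neuron layer based on the configuration parameters of each configuration memory segment…defined by the corresponding configuration parameters, of the data storage region.

Copending application 16330625
16. The model calculation unit as recited in claim 12, wherein the memory for each of the neuron layers includes a configuration memory area for storing configuration parameters in a respective configuration memory section, and wherein DMA unit is configured to successively instruct the processor core to calculate a neuron layer in each case based on the configuration parameters of a respective configuration memory section and on the input variable vector defined as a result, and to store the respectively resulting output variable vector in a data memory section of the data memory area defined by the corresponding configuration parameters.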

	Claims 12 and 16 of the copending application 16330625 include all the limitations of claim 16 of the instant application 16330906 except for the underlined limitation identified in the table above. Claim 12 of the copending application 16330625 does not specify “a control device comprising: a microprocessor”. It would be obvious to a person of ordinary skill in the art that the calculation unit of claim 12 of the copending application 16330625 will be present and incorporated in a control device for the proper functioning of the control device.


Instant application 16330906
18. A method comprising:
a control device controlling an engine system of a motor vehicle;
wherein:
the control device includes (a) a microprocessor and (b) one or more hardwired hardware-implemented model calculation units for calculating a multi-layer perceptron model and that each includes a processor core, a memory, and a DMA unit;
the memory includes, for each neuron layer, a configuration memory region for storing configuration parameters in a respective configuration memory segment* and a data storage region for storing the input quantities of the input …
… calculate a respective neuron layer based on the configuration parameters of each configuration memory segment*, and to calculate the input quantities defined thereby of the input quantity vector, and to store respectively resulting output quantities in a data storage segment, defined by the corresponding configuration parameters, of the data storage region*.
*See below for mapping of underlined limitations to dependent claim 16 of co-pending application

Copending application 16330625
12. A model calculation unit for calculating a multilayer perceptron model having a plurality of neuron layers, the model calculation unit being designed in hardware and being hardwired, the model calculation unit comprising:
a processor core which is configured to calculate one or multiple output variables of an output variable vector of a neuron layer of the multilayer perceptron model having a number of neurons as a function of one or of multiple input variables of an input variable vector;
a memory which a data memory area is provided, in which each neuron layer is assigned a data memory section for storing the input variables of the input variable vector and a data memory section for storing the output variables of the output variable vector; and
a DMA unit which is configured to successively instruct the processor core to calculate a neuron layer, in each case based on input variables of the assigned input variable vector and to store the respectively resulting output variables of the output variable vector in the assigned data memory section;
wherein the data memory section for the input variable vector assigned to at least one of the neuron layers at least partially includes in each case the data memory sections of at least two of the output variable vectors of two different neuron layers of the neuron layers.

Instant application 16330906
Claim 18, continued
…, a configuration memory region for storing configuration parameters in a respective configuration memory segment
calculate a respective neuron layer based on the configuration parameters of each configuration memory segment…defined by the corresponding configuration parameters, of the data storage region.

Copending application 16330625
16. The model calculation unit as recited in claim 12, wherein the memory for each of the neuron layers includes a configuration memory area for storing configuration parameters in a respective configuration memory section, and wherein DMA unit is configured to successively instruct the processor core to calculate a neuron layer in each case based on the configuration parameters of a respective configuration memory section and on the input variable vector defined as a result, and to store the respectively resulting output variable vector in a data memory section of the data memory area defined by the corresponding configuration parameters.

	Claims 12 and 16 of the copending application 16330625 include all the limitations of claim 18 except for the underlined limitation identified in the table above.
	Claim 12 of the copending application 16330625 is a device claim reciting “a model calculation unit”, whereas claim 18 of the instant application 16330906 is a method claim reciting “a method comprising”. It would be obvious to a person of ordinary skill in the art that the device of claim 12 of the copending application 16330625 will perform the method of the instant application 16330906 when the device operates.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.



5.	Claims 10, 11, 13, and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Maaninen (US20150199963) in view of Qiu et al. (Going Deeper with Embedded FPGA Platform for Convolutional Neural Network, FPGA ’16, February 21-23, 2016, ACM).

	Regarding claim 10, Maaninen teaches a hardwired hardware-implemented model calculation unit for calculating a multi-layer perceptron model, the model calculation unit (implementing the hardware accelerator (model calculation unit) as a field-programmable gate array (FPGA) [0063], which is a hardware circuit; hardware accelerator 312 may be configured to perform the processor-intensive calculations (e.g., algorithm 200) that are required for neural network computations [0041] of a multilayer perceptron model, Figure 1) comprising:
	a processor core (hardware accelerator 312 (model calculation unit) may include a MAC unit 10 to carry out matrix multiply and add operations, an activation function unit 12 to apply an activation function to the output of MAC unit 10 [0043]; the hardware accelerator is implemented as an Application Specific Integrated Circuit (ASIC) core [0010]);
	a memory; (Hardware accelerator 312 may also include various buffers or registers (e.g., RAMs 13-18) to store, for example, weights, bias terms, activation function co-efficients, input data, and intermediate output data, etc. [0043]; internal memories (e.g., RAMs 14-18) in hardware accelerator 312 [0054]); and
	a DMA unit; (System bus 21 may, for example, contain a DMA engine [0044]);
	wherein: the processor core is configured to calculate one or more output quantities of a neuron layer of the multi-layer perceptron model having a number of neurons as a function of one or more input quantities of an input quantity vector; (Utilizing the hardware accelerator includes sending matrix data representing one or more frames of an audio signal as input data for a first layer of a neural network to the hardware accelerator, and using a multiplier-accumulator (MAC) unit in the hardware accelerator to perform multiply-accumulate operations. The multiply-accumulate operations include multiplying the received matrix data representing one or more frames of the audio signal with a weight matrix, adding a bias matrix to the multiplication results, and accumulating the addition results. The method further includes using circuitry in the hardware accelerator to pass the accumulated results through an activation function to generate an output matrix representing an output of the first layer of the neural network [0011]; Attention is directed here to speech recognition approaches that use a neural network model for probability computation. In such approaches, a properly-trained multi-layer neural network may be used for pattern recognition at each level [0025]);
	the memory includes, for each neuron layer, a configuration memory region for storing configuration parameters in a respective configuration memory segment (decoding or decompressing a weight matrix and bias terms (as configuration parameters) for the layer received from external memory e.g., weight and bias terms 24 into separate internal memory buffers e.g., RAMs 14, 15 and 16 (as configuration memory segments) [0048]) 
	and a data storage region for storing the input quantities of the input quantity vector (Similarly, at least one column of input matrix 26 may be cached or buffered at a time (e.g., in RAM 18) [0048]) 
	and the one or more output quantities in a respective data storage segment; (and the result (e.g., in 8-bit integer) stored into an intermediate output buffer (e.g., RAM 17-18) [0049]); 
	the processor core (processor core comprises MAC unit 10 and activation function unit 12 (Figure 4); multiple-accumulate (MAC) module 242 and a curve-fitting or activation function 244, Figure 2; the hardware accelerator is implemented as an Application Specific Integrated Circuit (ASIC) core [0010])
	to calculate a respective neuron layer based on the configuration parameters of each configuration memory segment, and to calculate the input quantities defined thereby of the input quantity vector, (each layer of neural network 240 may be computed through a multiple-accumulate (MAC) module 242 and a curve-fitting or activation function 244 operation as P (AX+B), where X is the input matrix or vector, and where A, B and P are the weight matrix, the bias vector and the activation function, respectively, for the layer [0035]),
	and to store respectively resulting output quantities in a data storage segment, defined by the corresponding configuration parameters, of the data storage region. (The output of MAC unit (e.g., an N-bit integer) may be processed through activation function unit 12 and the result (e.g., in 8-bit integer) stored into an intermediate output buffer (e.g., RAM 17-18) [0049]) 
	Maaninen does not explicitly teach that the model calculation unit comprises a DMA unit, or that the DMA unit is configured to successively instruct the processor core.
	Qiu teaches that the model calculation unit comprises a DMA unit (programmable logic, PL (as model calculation unit) comprises DMA, page 30 last para., left column, Fig. 4A, pg. 31)
	and that the DMA unit is configured to successively instruct the processor core (The configured DMA loads data and instructions to the controller, triggers a computation process on Programmable Logic (PL) (pg. 30, right col, Data Processing) which consist of Processing Elements (PEs) (as processor core) (Fig. 4A, pg. 31) which take charge of the majority of computational tasks in CNN, including CONV layers, Pooling layers, and FC layers, pg. 30, last para., left col-first para., right col).
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the device of Maaninen to incorporate the teachings of Qiu for the benefit of an accelerator achieving the highest performance, resource efficiency, and power efficiency compared with previous designs (Qiu, pg. 34, right col, last para.).
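	By way of illustration only, and not as a reproduction of any circuit disclosed in Maaninen or Qiu, the per-layer operation P(AX+B) cited above from Maaninen [0035] can be sketched in software as follows. The names compute_layer and sigmoid are hypothetical, and the choice of a sigmoid for P is an assumption, since the cited passage does not fix a particular activation function.

```python
import math


def sigmoid(v):
    """Example activation P (an assumption; the cited text leaves P open)."""
    return 1.0 / (1.0 + math.exp(-v))


def compute_layer(A, X, B, P=sigmoid):
    """Return P(A*X + B) for one neuron layer.

    A is the weight matrix (one row per neuron), X the input vector,
    B the bias vector, and P the activation function.
    """
    if len(A) != len(B):
        raise ValueError("one bias value is needed per neuron")
    outputs = []
    for row, bias in zip(A, B):
        if len(row) != len(X):
            raise ValueError("weight row length must match input length")
        acc = bias
        for w, x in zip(row, X):
            acc += w * x  # multiply-accumulate step
        outputs.append(P(acc))  # activation applied to the accumulated sum
    return outputs
```

	In this sketch the inner loop plays the role attributed to MAC unit 10 and the call to P the role attributed to activation function unit 12; the mapping is offered only as a reading aid.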

	Regarding claim 11, Maaninen modified by Qiu teaches the model calculation unit of claim 10; Maaninen teaches wherein the configuration parameters of configuration memory segments (decoding or decompressing a weight matrix and bias terms (as configuration parameters) for the layer received from external memory e.g., weight and bias terms 24 into separate internal memory buffers e.g., RAMs 14, 15 and 16 (as configuration memory segments) [0048])
	successively taken into account indicate a data storage region for the resulting output quantities that correspond to the data storage region for the input quantities for the calculation of a subsequent neuron layer. (at least one column of input matrix 26 may be cached or buffered at a time (e.g., in RAM 18) [0048]; The output of MAC unit (e.g., an N-bit integer) may be processed through activation function unit 12 and the result (e.g., in 8-bit integer) stored into an intermediate output buffer (e.g., RAM 17-18) …, hardware accelerator 312 may start processing a next layer of the neural network by fetching a new bias vector, starting to decode the next weight matrix, etc., and using the preceding layer's output circulated through internal memory as MAC input for the next layer [0049])
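	By way of illustration only, the arrangement recited in claim 11, in which the output region indicated for one layer corresponds to the input region of the subsequent layer, can be sketched as follows. The function run_layers, the dictionary keys (in_addr, out_addr, n_in, n_out, weights, bias), the flat list used as data memory, and the ReLU activation are all hypothetical choices made for this sketch; neither Maaninen nor Qiu is asserted to disclose this exact addressing scheme.

```python
def relu(v):
    """Example activation function (an assumption made for this sketch)."""
    return v if v > 0.0 else 0.0


def run_layers(config_segments, data_memory, activation=relu):
    """Process the layers one after another, as a sequencer standing in for
    the DMA unit might, using one configuration segment per layer.

    Each segment names where the layer reads its inputs (in_addr) and where
    it writes its outputs (out_addr) within the shared data memory.
    """
    for cfg in config_segments:
        n_in, n_out = cfg["n_in"], cfg["n_out"]
        x = data_memory[cfg["in_addr"]:cfg["in_addr"] + n_in]
        y = []
        for j in range(n_out):
            acc = cfg["bias"][j]
            for k in range(n_in):
                acc += cfg["weights"][j][k] * x[k]
            y.append(activation(acc))
        # Store the outputs in the region named by this layer's configuration.
        data_memory[cfg["out_addr"]:cfg["out_addr"] + n_out] = y
    return data_memory


# Two layers: the second layer's in_addr equals the first layer's out_addr.
segments = [
    {"n_in": 2, "n_out": 2, "in_addr": 0, "out_addr": 2,
     "weights": [[1.0, 0.0], [0.0, -1.0]], "bias": [0.0, 0.0]},
    {"n_in": 2, "n_out": 1, "in_addr": 2, "out_addr": 4,
     "weights": [[2.0, 3.0]], "bias": [0.5]},
]
memory = run_layers(segments, [1.0, 2.0, 0.0, 0.0, 0.0])
```

	Because the second segment reads from the addresses the first segment wrote to, no copy of the intermediate result is needed between layers, which is the point of the limitation as the Office reads it.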

	Regarding claim 13, Maaninen modified by Qiu teaches the model calculation unit of claim 10; Maaninen teaches the calculation of the next neuron layer based on configuration parameters stored in a further configuration memory segment (each layer of neural network 240 may be computed through a multiple-accumulate (MAC) module 242 and a curve-fitting or activation function 244 operation as P (AX+B), where X is the input matrix or vector, and where A, B and P are the weight matrix, the bias vector and the activation function, respectively, for the layer [0035]).
	Qiu teaches wherein the processor core is configured to signal the end of the current calculation of the neuron layer to the DMA unit or to an external location (In each computation phase, the Controller decodes a 16-bit instruction to generate control signals for on-chip buffers and PEs. One instruction is composed with the following signals: Phase Type … (pg. 31, right col, 6.3.2 Controller System); Several phases need to be specifically taken care of. For example, for the last phase in the last layer in the last output image, no more weights or data should be loaded in, pg. 32, left col, first para.),
	the DMA unit subsequently starting the calculation (The configured DMA loads data and instructions to the controller, triggers a computation process on Programmable Logic (PL) (pg. 30, right col, Data Processing) which consist of Processing Elements (PEs) (as processor core) (Fig. 4A, pg. 31)).
	The same motivation to combine as independent claim 10 applies here.

	Regarding claim 15, Maaninen modified by Qiu teaches the model calculation unit of claim 10; Qiu further teaches wherein the processor core is arranged in a surface region of an integrated module (we place the Computing Complex on the FPGA chip (that is, on its surface) and the Computing Complex consists of Processing Elements (PEs) which take charge of the majority of computation tasks in CNN, including CONV layers, Pooling layers, and FC layers, pg. 30, last para., left col-first para., right col).
	The same motivation to combine as independent claim 10 applies here.

	Regarding claim 16, Maaninen teaches a control device comprising: a microprocessor; and one or more hardwired hardware-implemented model calculation units for calculating a multi-layer perceptron model (implementing the hardware accelerator (model calculation unit) as a field-programmable gate array (FPGA) [0063], which is a hardware circuit; hardware accelerator 312 may be configured to perform the processor-intensive calculations (e.g., algorithm 200) that are required for neural network computations [0041] of a multilayer perceptron model, Figure 1; the hardware accelerator is implemented as an Application Specific Integrated Circuit (ASIC) core [0010]) and that each includes
	a processor core (hardware accelerator 312 (model calculation unit) may include a MAC unit 10 to carry out matrix multiply and add operations, an activation function unit 12 to apply an activation function to the output of MAC unit 10 [0043]; the hardware accelerator is implemented as an Application Specific Integrated Circuit (ASIC) core [0010]);
	a memory, (Hardware accelerator 312 may also include various buffers or registers (e.g., RAMs 13-18) to store, for example, weights, bias terms, activation function co-efficients, input data, and intermediate output data, etc. [0043]; internal memories (e.g., RAMs 14-18) in hardware accelerator 312 [0054]); and
	a DMA unit; (System bus 21 may, for example, contain a DMA engine [0044]);
	wherein: the processor core is configured to calculate one or more output quantities of a neuron layer of the multi-layer perceptron model having a number of neurons as a function of one or more input quantities of an input quantity vector; (Utilizing the hardware accelerator includes sending matrix data representing one or more frames of an audio signal as input data for a first layer of a neural network to the hardware accelerator, and using a multiplier-accumulator (MAC) unit in the hardware accelerator to perform multiply-accumulate operations. The multiply-accumulate operations include multiplying the received matrix data representing one or more frames of the audio signal with a weight matrix, adding a bias matrix to the multiplication results, and accumulating the addition results. The method further includes using circuitry in the hardware accelerator to pass the accumulated results through an activation function to generate an output matrix representing an output of the first layer of the neural network [0011]; Attention is directed here to speech recognition approaches that use a neural network model for probability computation. In such approaches, a properly-trained multi-layer neural network may be used for pattern recognition at each level [0025]; the hardware accelerator is implemented as an Application Specific Integrated Circuit (ASIC) core [0010]);
	the memory includes, for each neuron layer, a configuration memory region for storing configuration parameters in a respective configuration memory segment (decoding or decompressing a weight matrix and bias terms (as configuration parameters) for the layer received from external memory e.g., weight and bias terms 24 into separate internal memory buffers e.g., RAMs 14, 15 and 16 (as configuration memory segments) [0048]) 
	and a data storage region for storing the input quantities of the input quantity vector (Similarly, at least one column of input matrix 26 may be cached or buffered at a time (e.g., in RAM 18) [0048]) 
	and the one or more output quantities in a respective data storage segment; (and the result (e.g., in 8-bit integer) stored into an intermediate output buffer (e.g., RAM 17-18) [0049]);
	the processor core (processor core comprises MAC unit 10 and activation function unit 12 (Figure 4); multiple-accumulate (MAC) module 242 and a curve-fitting or activation function 244, Figure 2)
(each layer of neural network 240 may be computed through a multiple-accumulate (MAC) module 242 and a curve-fitting or activation function 244 operation as P (AX+B), where X is the input matrix or vector, and where A, B and P are the weight matrix, the bias vector and the activation function, respectively, for the layer [0035]),
	and to store respectively resulting output quantities in a data storage segment, defined by the corresponding configuration parameters, of the data storage region. (The output of MAC unit (e.g., an N-bit integer) may be processed through activation function unit 12 and the result (e.g., in 8-bit integer) stored into an intermediate output buffer (e.g., RAM 17-18) [0049]) 
	Maaninen does not explicitly teach that the model calculation unit comprises a DMA unit, or that the DMA unit is configured to successively instruct the processor core.
	Qiu teaches that the model calculation unit comprises a DMA unit (programmable logic, PL (as model calculation unit) comprises DMA, page 30 last para., left column, Fig. 4A, pg. 31)
	and that the DMA unit is configured to successively instruct the processor core (The configured DMA loads data and instructions to the controller, triggers a computation process on Programmable Logic (PL) (pg. 30, right col, Data Processing) which consist of Processing Elements (PEs) (as processor core) (Fig. 4A, pg. 31) which take charge of the majority of computational tasks in CNN, including CONV layers, Pooling layers, and FC layers, pg. 30, last para., left col-first para., right col).
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the device of Maaninen to incorporate the teachings of Qiu for the benefit of an accelerator achieving the highest performance, resource efficiency, and power efficiency compared with previous designs (Qiu, pg. 34, right col, last para.).

	Regarding claim 17, Maaninen modified by Qiu teaches the control device of claim 16; Maaninen teaches wherein the control device is implemented as an integrated circuit (Method 600 may include implementing the hardware accelerator as a field-programmable gate array (FPGA) [0063]).

6.	Claims 12, 14, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Maaninen (US20150199963) in view of Qiu et al. (Going Deeper with Embedded FPGA Platform for Convolutional Neural Network, FPGA ’16, February 21-23, 2016, ACM) and further in view of Markert et al. (US20150012575).

	Regarding claim 12, Maaninen modified by Qiu teaches the model calculation unit of claim 10; Maaninen teaches of the neuron layer, the configuration parameters for a next one of the neuron layers, the calculation being terminated as a function of one or more configuration parameters (Utilizing the hardware accelerator includes configuring or preparing the hardware accelerator for processing using parameters (e.g., number of layers, dimension or number of neurons, activation function co-efficients, etc.) of the multi-layered neural network (611), Fig. 6, 611-620).
	Modified Maaninen does not explicitly teach wherein the DMA unit is configured to provide to the processor core, after termination of the calculation.
	Markert teaches that the DMA unit is configured to provide to the processor core, after termination of the calculation (Upon conclusion of the final calculation, the end of the calculations may be communicated in a step S16 to main processing unit 2, depending on the characteristics of second DMA unit 7. This may be carried out by the generation of an interrupt of second DMA unit 7, which is either forwarded directly to main processing unit 2 or to first DMA unit 6, which then forwards the interrupt to main processing unit 2 [0046]).
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the calculation unit of modified Maaninen to incorporate the teachings of Markert for the benefit of significantly reducing the calculation time of the model calculation unit 3 (Fig. 1) as compared to a software algorithm (Markert, [0030]).

	Regarding claim 14, Maaninen modified by Qiu teaches the model calculation unit of claim 10. Maaninen teaches wherein the processor core is configured to calculate, for a neuron layer of a multi-layer perceptron model having a number of neurons, an output quantity for each neuron as a function of one or more input quantities of an input quantity vector, a weighting matrix having weighting factors, (Utilizing the hardware accelerator includes sending matrix data representing one or more frames of an audio signal as input data for a first layer of a neural network to the hardware accelerator, and using a multiplier-accumulator (MAC) unit in the hardware accelerator to perform multiply-accumulate operations. The multiply-accumulate operations include multiplying the received matrix data representing one or more frames of the audio signal with a weight matrix, adding a bias matrix to the multiplication results, and accumulating the addition results. The method further includes using circuitry in the hardware accelerator to pass the accumulated results through an activation function to generate an output matrix representing an output of the first layer of the neural network [0011]; Attention is directed here to speech recognition approaches that use a neural network model for probability computation. In such approaches, a properly-trained multi-layer neural network may be used for pattern recognition at each level [0025]; the hardware accelerator is implemented as an Application Specific Integrated Circuit (ASIC) core [0010])
	weighted with a weighting factor determined by the neuron and the input quantity, and the result is transformed with an activation function in order to obtain the output quantity for the neuron. (The input layer distributes the values to each of the neurons in the hidden layer. Arriving at a neuron in the hidden layer, the value from each input neuron is multiplied by a weight (wjk), and the resulting weighted values are summed together and added to a weighted bias value producing a combined value. The combined value is passed through a transfer or activation function, which outputs a value hj [0029])
	Modified Maaninen does not explicitly teach and an offset value specified for each neuron, such that, for each neuron, the offset value assigned to the neuron is applied to a sum of the values of the input quantities.
	Markert teaches and an offset value specified for each neuron, such that, for each neuron, the offset value assigned to the neuron is applied to a sum of the values of the input quantities, (Configuration registers 32 of configuration register block 31 are configured to receive the parameters and address pointers necessary for calculating the function model …, the address range in which the hyperparameters and the node data for calculating the function model are situated. In addition, the initialization values for the loops to be calculated may be predefined, as well as an offset value, based on which the function value of the data-based function model is calculated [0033])
	The same motivation to combine as set forth for claim 12 applies here.
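	For illustration only, the per-neuron calculation addressed in claim 14 (inputs weighted by weighting factors, summed, offset applied, then passed through an activation function) can be sketched as follows. This sketch is not taken from Maaninen, Qiu, or Markert; the function names `sigmoid` and `perceptron_layer` and the choice of a sigmoid activation are assumptions made for the example.

```python
import math


def sigmoid(x):
    """Example activation function (an assumption; the references do not fix one)."""
    return 1.0 / (1.0 + math.exp(-x))


def perceptron_layer(inputs, weights, offsets, activation=sigmoid):
    """Compute one output quantity per neuron of a single layer.

    inputs  -- input quantity vector (list of floats)
    weights -- weighting matrix, one row of weighting factors per neuron
    offsets -- one offset value per neuron
    """
    if len(weights) != len(offsets):
        raise ValueError("need exactly one offset value per neuron")
    outputs = []
    for row, offset in zip(weights, offsets):
        if len(row) != len(inputs):
            raise ValueError("each weight row must match the input vector length")
        # Weight each input quantity and sum the weighted values.
        weighted_sum = sum(w * x for w, x in zip(row, inputs))
        # Apply the neuron's offset to the sum, then the activation function.
        outputs.append(activation(weighted_sum + offset))
    return outputs
```

	A call such as `perceptron_layer([1.0, 2.0], [[0.5, -0.25], [1.0, 1.0]], [0.0, -3.0])` returns one output quantity for each of the two neurons.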

	Regarding claim 18, Markert teaches a method comprising: a control device controlling an engine system of a motor vehicle; (FIG.1 schematically shows the design of a control unit 1 (as control device), in particular for operating a physical unit, such as, for example, an internal combustion engine, in a motor vehicle [0027]) 
	wherein: the control device includes (a) a microprocessor (the control unit including a software-controlled main processor unit, a strictly hardware-based model calculation unit for calculating an algorithm, in particular for carrying out a Bayesian regression method (a machine learning method) based on configuration data, and a memory unit, a model memory area being defined in the memory unit to which a configuration register block is assigned for providing the configuration data in the model calculation unit [0009]) and 
(Control unit 1 includes a microcontroller as main processing unit 2, which together with a model calculation unit 3 is integrally configured. Model calculation unit 3 is essentially a hardware unit which is able to carry out hardware-based function calculations [0027], Fig. 1)
	a memory, (model calculation unit 3 may be configured to integrate configuration register block 31 having configuration registers 32 into the memory area of the system [0032]) and
	a configuration memory region for storing configuration parameters in a respective configuration memory segment (The configuration registers are used to deliver start values to the model calculation unit for the algorithm for calculating a function value, and further to define the address ranges in which parameters for parametric function models or hyperparameters and node data for the calculation of the nonparametric, data-based function model are stored [0010]) 
	and a data storage region for storing the input quantities of the input quantity vector and the one or more output quantities in a respective data storage segment; (the configuration registers may be assigned to a contiguous memory area of the internal memory, the model memory area, so that they are transferred from the model memory area in which the configuration data for the model calculation unit are stored to the configuration register block of the model calculation unit by a simple incremental memory copying process [0011]) and
	based on the configuration parameters of each configuration memory segment, and to calculate the input quantities defined thereby of the input quantity vector,  (The configuration registers are used to deliver start values to the model calculation unit for the algorithm for calculating a function value, and further to define the address ranges in which parameters for parametric function models or hyperparameters and node data for the calculation of the nonparametric, data-based function model are stored [0010]) 
	Markert does not explicitly teach a multi-layer perceptron model, and that each includes a processor core, a DMA unit; the processor core is configured to calculate one or more output quantities of a neuron layer of the multi-layer perceptron model having a number of neurons as a function of one or more input quantities of an input quantity vector; the memory includes, for each neuron layer, the DMA unit is configured to successively instruct the processor core to calculate a respective neuron layer and to store respectively resulting output quantities in a data storage segment, defined by the corresponding configuration parameters, of the data storage region.
	Maaninen teaches a multi-layer perceptron model, (Different types of activation functions may be used for different layers of the multi-layer neural network. The output Y of the multi-layer neural network may be computed recursively through MAC module 242 using output of each layer as input for the next layer [0035])
	and that each includes a processor core, (Hardware accelerator 312 may be configured to perform the processor-intensive calculations (e.g., algorithm 200) that are required for neural network computations [0041]; hardware accelerator 312 (model calculation unit) may include a MAC unit 10 to carry out matrix multiply and add operations, an activation function unit 12 to apply an activation function to the output of MAC unit 10 [0043]; the hardware accelerator is implemented as an Application Specific Integrated Circuit (ASIC) core [0010]);
	the processor core is configured to calculate one or more output quantities of a neuron layer of the multi-layer perceptron model having a number of neurons as a function of one or more input quantities of an input quantity vector; (Utilizing the hardware accelerator includes sending matrix data representing one or more frames of an audio signal as input data for a first layer of a neural network to the hardware accelerator, and using a multiplier-accumulator (MAC) unit in the hardware accelerator to perform multiply-accumulate operations. The multiply-accumulate operations include multiplying the received matrix data representing one or more frames of the audio signal with a weight matrix, adding a bias matrix to the multiplication results, and accumulating the addition results. The method further includes using circuitry in the hardware accelerator to pass the accumulated results through an activation function to generate an output matrix representing an output of the first layer of the neural network [0011]; Attention is directed here to speech recognition approaches that use a neural network model for probability computation. In such approaches, a properly-trained multi-layer neural network may be used for pattern recognition at each level [0025]; the hardware accelerator is implemented as an Application Specific Integrated Circuit (ASIC) core [0010]);
	the memory includes, for each neuron layer, (decoding or decompressing a weight matrix and bias terms for the layer received from external memory (e.g., weight and bias terms 24) into separate internal memory buffers (e.g., RAMs 14, 15 and 16) [0048])
 (each layer of neural network 240 may be computed through a multiple-accumulate (MAC) module 242 and a curve-fitting or activation function 244 operation as P (AX+B), where X is the input matrix or vector, and where A, B and P are the weight matrix, the bias vector and the activation function, respectively, for the layer [0035]),
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Markert to incorporate the teachings of Maaninen for the benefit of providing configuration settings for the hardware accelerator as well as providing bitstreams of compressed or uncompressed weights and bias terms for the neural network calculations to the hardware accelerator (Maaninen, [0006]).
	Qiu teaches the model calculation unit comprises a DMA unit (programmable logic, PL (as model calculation unit) comprises DMA, page 30 last para., left column, Fig. 4A, pg. 31)
	the DMA unit is configured to successively instruct the processor core (The configured DMA loads data and instructions to the controller, triggers a computation process on Programmable Logic (PL) (pg. 30, right col, Data Processing) which consists of Processing Elements (PEs) (as processor core) (Fig. 4A, pg. 31) which take charge of the majority of computational tasks in CNN, including CONV layers, Pooling layers, and FC layers, pg. 30, last para., left col-first para., right col)
(Qiu, pg. 34, right col, last para.)
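
	For illustration only, the claim 18 limitation of successively calculating each neuron layer from its own configuration memory segment and storing the resulting output quantities in the data storage segment that segment defines can be sketched as follows. This is a software analogy under stated assumptions, not the hardware of any cited reference; `LayerConfig`, `compute_layer`, `run_layers`, and the flat list standing in for the data storage region are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LayerConfig:
    """Hypothetical configuration memory segment for one neuron layer."""
    input_addr: int             # start of this layer's inputs in the data storage region
    num_inputs: int             # length of the input quantity vector
    output_addr: int            # start of the segment that receives the outputs
    weights: List[List[float]]  # one row of weighting factors per neuron
    offsets: List[float]        # one offset value per neuron


def compute_layer(inputs, cfg, activation):
    """Stand-in for the processor core: weighted sum plus offset, then activation."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + offset)
        for row, offset in zip(cfg.weights, cfg.offsets)
    ]


def run_layers(data: List[float], configs: List[LayerConfig],
               activation: Callable[[float], float]) -> List[float]:
    """Stand-in for the sequencing role: one layer after another, in order."""
    for cfg in configs:
        # Read the input quantity vector from the segment named in the configuration.
        inputs = data[cfg.input_addr:cfg.input_addr + cfg.num_inputs]
        outputs = compute_layer(inputs, cfg, activation)
        # Store the outputs in the data storage segment named in the configuration.
        data[cfg.output_addr:cfg.output_addr + len(outputs)] = outputs
    return data
```

	In this sketch, pointing the `input_addr` of one layer at the `output_addr` of the preceding layer makes each layer's outputs the next layer's inputs, which mirrors the layer-to-layer chaining described in Maaninen [0035].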

Conclusion
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to MORIAM MOSUNMOLA GODO whose telephone number is (571)272-8670. The examiner can normally be reached Monday-Friday 7:30am-5:30pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B. Zhen can be reached on (571)272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx 

/M.G./Examiner, Art Unit 2121                                    




/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121