DETAILED ACTION
This office action is in response to Application No. 16404232 filed on 4/11/2018. Claims 3 and 10 have been cancelled; claims 1, 2, 4-9 and 11-15 are presented for examination and are currently pending. Applicant’s arguments have been carefully and respectfully considered.

Claim Objections
2.	Claims 1 and 8 are objected to because the limitation “a general topology structure being designed” is incomplete. Appropriate correction is required.

				Response to Arguments
3.	On page 8 of the remarks, the applicant argues that “The method does not need to change an underlying circuit design of a hardware accelerator, and just need to know the topology structure of the convolution neural network and the parameter for each layer (see paragraph [0027])”. The Office respectfully disagrees with the above argument. Chakradhar teaches “domain-experts or users can specify arbitrary CNN networks by using a simple, high abstraction level software programming API through a host system 252 (other methods may be employed for specifying as well). A CNN compiler automatically translates the entire high abstraction network specification into a parallel microprogram (a sequence of low-level VLIW instructions) that is mapped, scheduled and executed by the controller 260 of coprocessor 202. Instructions to facilitate complex control and data flows, as well as on-the-fly packing of intermediate data to minimize off-chip memory transfers, are also natively supported by the coprocessor 202. In one example, using a 115 MHz prototype co-processor implementation (that may emulated, e.g., on an off-the-shelf PCI FPGA card), diverse CNN workloads were executed 30× to 40× faster than a parallel software implementation on a 2.2 GHz AMD Opteron processor” [0056].
	On page 8 of the remarks, the applicant argues that “It is not explicitly shown that preset of RBF function parameters could be used in the forward propagation phase and exhibit good performance”, referring to Zhang. The Office respectfully disagrees with the above argument. The applicant is arguing limitations that are not claimed; none of the claims recites a forward propagation phase.
	On page 9 of the remarks, the applicant argues that “Regarding Claim 4 and Claim 11, Chakradhar uses only one type of non-linear function, that is tanh for the specific convolutional neural network. For comparison, the present application presets multiple functions in the including BatchNorm, Scale, Eltwise, ReLU, Sigmoid, Tanh, Pooling, max pooling, mean pooling, root mean square pooling, FC and Softmax in hardware design to suit for most custom CNN models. This could not be achievable with the method of Chakradhar modified by Zhang”. The Office respectfully disagrees with the above argument because claims 4 and 11 recite “one or more of the following functions: BatchNorm, Scale, Eltwise, ReLU, Sigmoid, Tanh, Pooling, max pooling, mean pooling, root mean square pooling, FC, and Softmax”. Because the claims require only one or more of the listed functions, a reference that discloses a single listed function, such as Tanh, meets the limitation.
	On page 10 of the remarks, the applicant argues that “Chakradhar does not disclose using the configuration parameters to determine the calling order of the selected function module at all”. The Office respectfully disagrees with the above argument. LiKamWa teaches using the configuration parameters to determine the calling order of the selected function module in Figure 2, page 258, based on programming input into the control plane, which generates flow control.
	On page 10 of the remarks, the applicant argues that “Esmaeilzadel teaches step size of the convolution calculation for the training phase. As mentioned before, the method according to the present application is applied to the forward propagation phase. Esmaeilzadel does not show how to configure the step size in a forward propagation phase. Besides, Esmaeilzadel teaches the method of Parrot transformation which can yield significant performance and energy improvements. However, the Parrot transformation has three key phases to selects and trains a suitable neural network [pg. 450, left col, first para], but does not explicitly show how to accelerate a custom CNN model with specific configurations”. The Office respectfully disagrees with the above argument. The applicant is arguing limitations that are not claimed; none of the claims recites a forward propagation phase or accelerating a custom CNN model with specific configurations. Also, Esmaeilzadeh has been applied as a secondary reference which teaches ‘step size of the convolution calculation’. As a secondary reference, Esmaeilzadeh does not need to teach all the limitations recited in the instant claim. Furthermore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the device of Chakradhar modified by Zhang to incorporate the teachings of Esmaeilzadeh for the benefit of a neural network that can be efficiently accelerated using dedicated hardware, yielding significant performance and energy improvements (Esmaeilzadeh, pg. 450, left col, first para.).
	On page 10 of the remarks, the applicant argues that “Amir teaches a method for mapping a neural network onto a neurosynaptic substrate. The system comprises a reordering unit for reordering at least one dimension of an adjacency matrix representation of the neural network [ABSTRACT]. The reordering unit works on the matrix, but not the synaptic weight. Besides, the synaptic weights are types of hardware-related constraints of the substrate [0033], which has different meanings with the weight parameter mentioned in the method of the present application”. The Office respectfully disagrees with the above argument because Amir refers in [0050] to the synaptic weights (e.g., in terms of their value, the number of different weights or dynamic range of weights). Therefore, the synaptic weight of Amir teaches the weight parameter recited in claims 2 and 9.
	On page 11 of the remarks, the applicant argues that “For comparison, Ando introduces two types of the row-wise SRAM buffers with "multi-port" and "multi-bank" data selection and concludes that "multi-bank" memories require more write accesses, but use lower energy and area. Ando does not explicitly show how to ensure the performance of convolution calculation module, make full use of local memory, and support different specifications of input data”. The Office respectfully disagrees with the above argument. According to MPEP 2144 (IV), it is not necessary that the prior art suggest the combination to achieve the same advantage or result discovered by applicant. See, e.g., In re Kahn, 441 F.3d 977, 987, 78 USPQ2d 1329, 1336 (Fed. Cir. 2006) (motivation question arises in the context of the general problem confronting the inventor rather than the specific problem solved by the invention). As a result, it is not necessary that Ando suggest ensuring the performance of the convolution calculation module, making full use of local memory, and supporting different specifications of input data. Rather, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Chakradhar modified by Zhang to incorporate the teachings of Ando for the benefit of CNN accelerators that reduce external memory access and maximally utilize the locality of data for low-power embedded applications (Ando, pg. 167, third para.).
	On page 11 of the remarks, the applicant argues that “For comparison, Zhou teaches a method to replace the large weight matrices by combinations of multiple Kronecker products of smaller matrices [Abstract]. In particular, Formula (13) describes how to approximate the weight tensor with two smaller 4-dimensional tensors. However, the method is applied to weight matrices of convolutional networks, which is different from data to be identified in the method according to the present application”. The Office respectfully disagrees with the above argument because Zhou teaches: “Note that we need not to calculate the Kronecker product A ⊗ B explicitly. When the KFC (Kronecker fully-connected layer) layer is fed with inputs of batch k, we can forward the KFC layer efficiently, according to Eq. (2)” (page 4, last para.). The inputs of batch k are the input data associated with each weight in the matrices of Zhou. The input data of Zhou is split into batches of k.
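	The quoted statement that the Kronecker product A ⊗ B need not be calculated explicitly can be pictured with a minimal sketch. It relies on the standard identity (A ⊗ B) vec(X) = vec(A X Bᵀ) for a row-major vec, which is assumed here for illustration and is not asserted to be the form of Zhou's Eq. (2). The helper names (`kron`, `kron_apply_explicit`, `kron_apply_factored`) and the sample matrices are hypothetical and do not appear in Zhou or in the instant application.

```python
def matmul(P, Q):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def transpose(M):
    return [list(row) for row in zip(*M)]


def kron(A, B):
    """Explicit Kronecker product A (m x n) with B (p x q) -> (m*p x n*q)."""
    p, q = len(B), len(B[0])
    return [[A[r // p][c // q] * B[r % p][c % q]
             for c in range(len(A[0]) * q)]
            for r in range(len(A) * p)]


def kron_apply_explicit(A, B, x):
    """Baseline: build the large matrix, then multiply it by the input vector."""
    return [sum(k * v for k, v in zip(row, x)) for row in kron(A, B)]


def kron_apply_factored(A, B, x):
    """Same result without forming the Kronecker product:
    reshape x row-major to X (n x q), then compute A @ X @ B^T."""
    n, q = len(A[0]), len(B[0])
    X = [x[i * q:(i + 1) * q] for i in range(n)]
    Y = matmul(matmul(A, X), transpose(B))
    return [v for row in Y for v in row]


# Hypothetical small factors and one input vector of length n*q = 2*3.
A = [[1, 2], [3, 4]]
B = [[0, 1, 2], [1, 0, 1]]
x = [1, 2, 3, 4, 5, 6]
same = kron_apply_explicit(A, B, x) == kron_apply_factored(A, B, x)
```

	In this sketch the input vector `x` plays the role of one input fed to the layer, while `A` and `B` hold the weights; the factored path touches only the two small matrices.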
	On page 11 of the remarks, the applicant argues that “For comparison, Chakradhar teaches that a learning engine controller is used to enable the configuration of the input switch and the output switch in accordance with user input [0055]. However, the controller is used to configure the hardware architecture, which has different function with the second electronic equipment in the method according to the present application”. The Office respectfully disagrees with the above argument. Chakradhar states in [0055] that the coprocessor 202 includes a learning engine controller 260 which supports configurations for the computational elements 141 and output branches 142, … Controller 260 may include a memory interface 266, which interfaces with memory subsystem 204. Memory system 204 includes instructions 271 (e.g., VLIW instructions) stored therein. These instructions 271 are fetched using an instruction fetch unit 268, and decoded and executed by a decode and execution unit 270. A data fetch unit 272 fetches the data from memory subsystem 204 in accordance with executed instructions. The learning engine controller 260 (Figure 2) includes a data fetch unit 272 that fetches the data from memory, which performs the function of the second electronic equipment recited in claim 15.
	On page 12 of the remarks, the argument of the applicant is not persuasive because the applicant claims “a parameter for each layer of the neural network model of the first electronic equipment which is trained from an opensource framework” in claim 15, which implies training of a neural network model. However, the applicant has argued on page 12 that “In the present application, no models are trained but just extract parameters from pre-trained models”, which contradicts the training recited in instant claim 15. As a result, claim 15 is met because Shan, which modifies Chakradhar, teaches extracting a topology structure (layered structure of neurons [0030]) and a parameter for each layer (parameters of the layers are extracted [0031]) of the neural network model (structure of a Deep Neural Network (DNN) [0009]) which is trained from an open source framework (XGBoost (open-source software library) determines the initial parameters of the layer, thereby commencing training), and, based on the topology structure (layered structure of neurons [0030]) and the parameter for each layer which is extracted (parameters of the layers are extracted [0031]), generating the configuration parameter (weight matrices [0020]).
	The argument against the Droppo reference is moot because the prior art rejection is no longer applicable.
	The Office withdraws the 112(b) rejection of claim 7 in view of the amendments made to claim 7.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

4.	Claims 1, 4, 8 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Chakradhar et al (US20110029471) in view of Zhang et al ("Intrusion detection using hierarchical neural networks." Pattern Recognition Letters 26.6 (2005): 779-791.) and further in view of LiKamWa et al. (RedEye: Analog ConvNet Image Sensor Architecture for Continuous Mobile Vision, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture).

	Regarding claim 1, Chakradhar teaches a method (method generates a program to configure the coprocessor architecture [0013])
	for hardware acceleration (hardware acceleration is necessary… to accelerate the learning phase [0029])
	of a neural network model (convolutional neural network [0029])
	of a first electronic equipment; (FPGA [0056])
	without changing an underlying circuit design of the first electronic equipment, a general topology structure being designed (Domain-experts or users can specify arbitrary CNN networks by using a simple, high abstraction level software programming API through a host system 252 (other methods may be employed for specifying as well). A CNN compiler automatically translates the entire high abstraction network specification into a parallel microprogram (a sequence of low-level VLIW instructions) that is mapped, scheduled and executed by the controller 260 of coprocessor 202. Instructions to facilitate complex control and data flows, as well as on-the-fly packing of intermediate data to minimize off-chip memory transfers, are also natively supported by the coprocessor 202. In one example, using a 115 MHz prototype co-processor implementation (that may emulated, e.g., on an off-the-shelf PCI FPGA card), diverse CNN workloads were executed 30× to 40× faster than a parallel software implementation on a 2.2 GHz AMD Opteron processor [0056]) the method comprising:
	obtaining data to be identified (input images (Y1, Y2 and Y3)[0061])
	and a configuration parameter (hardware architecture determines the optimal configuration of processing elements and memory architecture [0059] and hardware implementation will approximate kernel values (weights) as parameters [0068])
	for the neural network model (of the CNN workload [0059])
	of the first electronic equipment; (FPGA [0056])
	proceeding the hardware acceleration (configuring a coprocessor to address accelerating CNNs [0003] and hardware architecture automatically analyzes CNN workloads and dynamically configures its hardware and software components to match the workload characteristics [0059])
	of a convolution calculation (convolution operation [0031]; convolved sum of Y images with Y corresponding kernels [0044]) 
	matched (the architecture is dynamically configured to match the workload characteristics [0059])
	with the neural network model (of the CNN workload [0059])
	of the first electronic equipment (FPGA [0056])
	for the data to be identified (input images (Y1, Y2 and Y3)[0061], Fig. 5)
	according to the configuration parameter, (hardware architecture determines the optimal configuration of processing elements and memory architecture [0059] and hardware implementation will approximate kernel values (weights) as parameters [0068])
	and generating a convolution result (convolution of fewer than n images with their respective kernels results in an intermediate (partial)output image [0031])
	of the neural network model (of the CNN workload [0059])
	of the first electronic equipment (FPGA [0056])
	for the data to be identified; (input images (Y1, Y2 and Y3)[0061], Fig. 5)
	and proceeding the hardware acceleration of a function calculation (processing also includes employing functions (e.g., non-linear functions and sampling inputs) [0076])
	for the convolution result (convolution of fewer than n images with their respective kernels results in an intermediate (partial)output image [0031])
	by calling one or more function modules (non-linear function 18 and non-linear function 22, Fig. 1, which involves calling non-linear functions; after the convolution step, the value of every hidden unit is subjected to a squashing function (non-linearity) [0011], which also involves calling a non-linear function)
	which match with the neural network model (the architecture is dynamically configured to match the workload characteristics [0059])
	of the first electronic equipment (FPGA [0056])
	from at least one function module which preset according to the configuration parameter (function module in Fig. 1 is preset in an order and the order is that non-linear function 18 is called first and non-linear function 22 is called second). The applicant discloses that “if Eltwise and ReLU are required, what are parameters of Eltwise, whether to call Eltwise first, or to call ReLU first. It is to be noted that, Function modules can be set in any order when preset in an equipment, but usually have a sequential requirement when called” (instant specification, [0050])
	and generating (generate select signals [0061])
	a recognition result (of one output element X1 [0061], Fig. 5)
	of the neural network model (of the CNN workload [0059])
	of the first electronic equipment (FPGA [0056])
	for the data to be identified (input images (Y1, Y2 and Y3)[0061], Fig. 5)
	Chakradhar does not explicitly teach at least one function module which preset according to the configuration parameter, wherein the hardware acceleration of the function calculation for the convolution result comprises: connecting one or more function modules by Bypass in the general topology structure and calling the functions in sequence according to the configuration parameter to skip over the function which is irrelevant to the configuration parameter in multiple functions.
	Zhang teaches from at least one function module which preset according to the configuration parameter, (function parameters of a neural network can be preset in accordance with the prior understanding of the training data or requirements of the output, pg. 782, left col. second para.)
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Chakradhar to incorporate the teachings of Zhang for the benefit of training new classifiers, and modifying their structures adaptively after new classifiers are trained (Zhang, abstract).
	LiKamWa teaches wherein the hardware acceleration of the function calculation for the convolution result comprises: connecting one or more function modules by Bypass in the general topology structure and calling the functions in sequence according to the configuration parameter to skip over the function which is irrelevant to the configuration parameter in multiple functions (Finally, the quantization module exports the digital representation of the RedEye ConvNet result. A flow control mechanism routes dataflow through RedEye modules, skipping modules as necessary (pg. 258, left col, section B: RedEye hardware architecture); If any layer is unneeded in a ConvNet dataflow, the bypass flow control of each module provides an alternate signal route to circumvent the corresponding module. For example, if pooling is not required, the module can be skipped entirely (starting from last sentence on pg. 258 to first para. on pg. 259); The ConvNet program includes the layer ordering, layer dimensions, and convolutional kernel weights. RedEye uses the digitally-clocked controller to load the program from the SRAM into the cyclic signal flow, issuing the specified kernel weights and noise parameters (pg. 259, right col, section C. ConvNet Programming Interface); The ConvNet structure of the framework maps to a sequential execution of RedEye modules (pg. 259, right col, section D: RedEye ConvNet simulation framework); To compare against specialized digitalized hardware acceleration, ... In comparison, when performing 7 layers of convolutions in a Depth4 configuration, RedEye consumes 1.3 mJ per frame (pg. 263, RedEye with hardware acceleration section, starting from last para. of left col. to right col., second full para.)).
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Chakradhar as modified by Zhang to incorporate the teachings of LiKamWa for the benefit of using RedEye, which accelerates execution for the CPU from 1.83 fps to 3.36 fps and maintains GPU timing, i.e., “real-time” 30 fps. Thus, when paired with the GPU and CPU, using RedEye can save 44.3% and 45.6% of the energy per frame, respectively (pg. 263, left col, RedEye with CPU/GPU execution section).
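	The limitation of calling preset function modules in sequence according to a configuration parameter, while bypassing modules the configuration does not name, can be pictured with a minimal sketch. The module table, the `call_order` key, and the function `run_function_stage` are hypothetical names chosen for illustration; they appear in neither the claims nor the cited references, and the sketch is not a description of the RedEye circuit or of the applicant's hardware.

```python
import math


def relu(values):
    return [max(0.0, v) for v in values]


def tanh(values):
    return [math.tanh(v) for v in values]


def max_pool2(values):
    """1-D max pooling over non-overlapping windows of two elements."""
    return [max(values[i:i + 2]) for i in range(0, len(values), 2)]


# Modules that are all present ("preset") regardless of the network being run.
PRESET_MODULES = {"ReLU": relu, "Tanh": tanh, "MaxPool": max_pool2}


def run_function_stage(conv_result, config):
    """Apply only the modules named in config["call_order"], in that order.

    Any preset module that the configuration does not name is bypassed:
    the data is routed past it unchanged.
    """
    data = list(conv_result)
    for name in config["call_order"]:
        data = PRESET_MODULES[name](data)
    return data


# Hypothetical configuration: ReLU first, then pooling; Tanh is bypassed.
result = run_function_stage([-1.0, 2.0, 3.0, -4.0],
                            {"call_order": ["ReLU", "MaxPool"]})
# result == [2.0, 3.0]
```

	Changing only the configuration (for example, an empty `call_order`) changes which modules run, while the set of preset modules stays fixed.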

	Regarding claim 4, Chakradhar as modified by Zhang and LiKamWa teaches the method of claim 1. Chakradhar further teaches wherein at least one function module which preset comprises one or more of the following functions: BatchNorm, Scale, Eltwise, ReLU, Sigmoid, Tanh, Pooling, max pooling, mean pooling, root mean square pooling, FC, and Softmax (tanh, [0031]).

	Regarding claim 8, Chakradhar teaches a device (a computer [0028])
	for hardware acceleration (hardware acceleration is necessary … to accelerate the learning phase [0029])
	of a neural network model (convolutional neural network [0029]) 
	of a first electronic equipment (FPGA [0056]) 
	without changing an underlying circuit design of the first electronic equipment, a general topology structure being designed, (Domain-experts or users can specify arbitrary CNN networks by using a simple, high abstraction level software programming API through a host system 252 (other methods may be employed for specifying as well). A CNN compiler automatically translates the entire high abstraction network specification into a parallel microprogram (a sequence of low-level VLIW instructions) that is mapped, scheduled and executed by the controller 260 of coprocessor 202. Instructions to facilitate complex control and data flows, as well as on-the-fly packing of intermediate data to minimize off-chip memory transfers, are also natively supported by the coprocessor 202. In one example, using a 115 MHz prototype co-processor implementation (that may emulated, e.g., on an off-the-shelf PCI FPGA card), diverse CNN workloads were executed 30× to 40× faster than a parallel software implementation on a 2.2 GHz AMD Opteron processor [0056]) the device comprising:
	an acquisition module (the input switch 220 is constructed from a simple … that accepts inputs [0060], Fig. 5)
	obtaining data to be identified (input images (Y1, Y2 and Y3) [0061])
	and a configuration parameter (hardware architecture determines the optimal configuration of processing elements and memory architecture [0059] and hardware implementation will approximate kernel values (weights) as parameters [0068])
	for the neural network model (of the CNN workload [0059])
	of the first electronic equipment (FPGA [0056])
	a convolution calculation module (2D Convolver, 150, Fig. 4) 
	used for proceeding the hardware acceleration (configuring a coprocessor to address accelerating CNNs [0003] and hardware architecture automatically analyzes CNN workloads and dynamically configures its hardware and software components to match the workload characteristics [0059])
	of a convolution calculation (convolution operation [0031]; convolved sum of Y images with Y corresponding kernels [0044]) 
	matched (the architecture is dynamically configured to match the workload characteristics [0059])
	with the neural network model (of the CNN workload [0059])
	of the first electronic equipment (FPGA [0056])
	for the data to be identified (input images (Y1, Y2 and Y3)[0061], Fig. 5)
	according to the configuration parameter, (hardware architecture determines the optimal configuration of processing elements and memory architecture [0059] and hardware implementation will approximate kernel values (weights) as parameters [0068])
	and generating a convolution result (convolution of fewer than n images with their respective kernels results in an intermediate (partial)output image [0031])
	of the neural network model (of the CNN workload [0059])
	of the first electronic equipment (FPGA [0056])
	for the data to be identified; (input images (Y1, Y2 and Y3)[0061], Fig. 5)
	a function calculation module (functional Unit 280, Fig. 4)
	used for proceeding the hardware acceleration of a function calculation (processing also includes employing functions (e.g., non-linear functions and sampling inputs) [0076])
	for the convolution result (convolution of fewer than n images with their respective kernels results in an intermediate (partial)output image [0031])
	by calling one or more function modules (non-linear function 18 and non-linear function 22, Fig. 1, which involves calling non-linear functions; after the convolution step, the value of every hidden unit is subjected to a squashing function (non-linearity) [0011], which also involves calling a non-linear function)
	which match with the neural network model (the architecture is dynamically configured to match the workload characteristics [0059])
	of the first electronic equipment (FPGA [0056])
	from at least one function module which preset according to the configuration parameter (function module in Fig. 1 is preset in an order and the order is that non-linear function 18 is called first and non-linear function 22 is called second). The applicant discloses that “if Eltwise and ReLU are required, what are parameters of Eltwise, whether to call Eltwise first, or to call ReLU first. It is to be noted that, Function modules can be set in any order when preset in an equipment, but usually have a sequential requirement when called” (instant specification, [0050])
	and generating (generate select signals [0061]) 
	a recognition result (of one output element X1 [0061], Fig. 5)
	of the neural network model (of the CNN workload [0059])
	of the first electronic equipment (FPGA [0056])
	for the data to be identified (input images (Y1, Y2 and Y3) [0061], Fig. 5)
	wherein the function calculation module (functional Unit 280, Fig. 4) comprises
	Chakradhar does not explicitly teach at least one function module which preset according to the configuration parameter, a function skip module, each function is used for implementing a function calculation of a specific function, and the function skip module is used for connecting one or more function modules by Bypass in the general topology structure and calling the functions in sequence according to the configuration parameter to skip over the function which is irrelevant to the configuration parameter in multiple functions.
	Zhang teaches from at least one function module which preset according to the configuration parameter, (function parameters of a neural network can be preset in accordance with the prior understanding of the training data or requirements of the output, pg. 782, left col. second para.)
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the device of Chakradhar to incorporate the teachings of Zhang for the benefit of training new classifiers, and modifying their structures adaptively after new classifiers are trained (Zhang, abstract).
	LiKamWa teaches a function skip module, each function is used for implementing a function calculation of a specific function, (The design exploits the stackable structure of ConvNet layers to reuse RedEye modules for ConvNet processing in the analog domain. Of the four types of RedEye modules, enumerated in Figure 3, the convolutional and max pooling modules perform the operations of neurons in a ConvNet layer. Meanwhile, the storage module interfaces the captured or processed signal data with the processing flow. Finally, the quantization module exports the digital representation of the RedEye ConvNet result. A flow control mechanism routes dataflow through RedEye modules, skipping modules as necessary. We next describe the design of each RedEye module, each itself designed for reusability and programmability, pg. 258, left col, 1) Cyclic module reuse for deep execution) and 
	the function skip module is used for connecting one or more function modules by Bypass in the general topology structure and calling the functions in sequence according to the configuration parameter to skip over the function which is irrelevant to the configuration parameter in multiple functions (Finally, the quantization module exports the digital representation of the RedEye ConvNet result. A flow control mechanism routes dataflow through RedEye modules, skipping modules as necessary (pg. 258, left col, section B: RedEye hardware architecture); If any layer is unneeded in a ConvNet dataflow, the bypass flow control of each module provides an alternate signal route to circumvent the corresponding module. For example, if pooling is not required, the module can be skipped entirely (starting from last sentence on pg. 258 to first para. on pg. 259); The ConvNet structure of the framework maps to a sequential execution of RedEye modules (pg. 259, right col, section D: RedEye ConvNet simulation framework); To compare against specialized digitalized hardware acceleration, ... In comparison, when performing 7 layers of convolutions in a Depth4 configuration, RedEye consumes 1.3 mJ per frame (pg. 263, RedEye with hardware acceleration section, starting from last para. of left col. to right col., second full para.)
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the device of Chakradhar as modified by Zhang to incorporate the teachings of LiKamWa for the benefit of using RedEye, which accelerates execution for the CPU from 1.83 fps to 3.36 fps and maintains GPU timing, i.e., “real-time” 30 fps. Thus, when paired with the GPU and CPU, using RedEye can save 44.3% and 45.6% of the energy per frame, respectively (pg. 263, left col, RedEye with CPU/GPU execution section).

	Regarding claim 11, Chakradhar as modified by Zhang and LiKamWa teaches the device of claim 8. Chakradhar further teaches wherein at least one function module which preset comprises one or more of the following functions: BatchNorm, Scale, Eltwise, ReLU, Sigmoid, Tanh, Pooling, max pooling, mean pooling, root mean square pooling, FC, and Softmax (tanh, [0031]).

6.	Claims 2 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Chakradhar et al (US20110029471) in view of Zhang et al ("Intrusion detection using hierarchical neural networks." Pattern Recognition Letters 26.6 (2005): 779-791.) in view of LiKamWa et al. (RedEye: Analog ConvNet Image Sensor Architecture for Continuous Mobile Vision, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture) in view of Esmaeilzadeh et al. ("Neural acceleration for general purpose approximate programs." 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE, 2012.) and further in view of Amir et al (US2016335535)

	Regarding claim 2, Chakradhar as modified by Zhang and LiKamWa teaches the method of claim 1. Chakradhar further teaches wherein the configuration parameter (hardware architecture determines the optimal configuration of processing elements and memory architecture [0059] and hardware implementation will approximate kernel values (weights) as parameters [0068]) comprises: 
	a weight parameter (small kernels of weights [0010] and hardware implementation will approximate kernel values (weights) [0068])
	of the neural network model (of convolutional neural networks [0029])
	of the first electronic equipment, (FPGA [0056])
	a convolution calculation parameter (convolutional kernels (2D array of weights) are of size 5X5 in the entire network [0037]) 
	and one or more of called function parameters which are required; (tanh [0031])
	the convolution calculation parameter comprises: specification of the data to be identified, (input images (Y1, Y2, and Y3) [0061], Fig. 5; n images I1…In as inputs [0031], Fig. 1)
quantity of a convolution kernel, (convolved with kernels k11…kn1 [0031]) 
size of the convolution kernel, (convolutional kernels (2D array of weights) are of size 5X5 in the entire network [0037]) 
	and one or more of number of layers of the neural network model; (first two layers 102 and 104, the third layer 106 is only a convolutional layer while the last layer 108 is a traditional fully-connected layer [0037], Fig. 2)
	called function parameters which are required comprise: a function name, (tanh is the non-linear function [0031]) 
	a function parameter (tanh [0031]) 
	and a calling sequence, which is called by the neural network model (calling sequence is that non-linear function 18 is called first and non-linear function 22 is called second, Fig. 1)
	of the first electronic equipment (FPGA [0056])
	according to requirement (network specification, [0056]). Applicant discloses “Called function parameters which are required comprise ReLU and Pooling” (instant specification, [00109])
	Chakradhar modified by Zhang does not explicitly teach step size of the convolution calculation.
	Esmaeilzadeh teaches step size of the convolution calculation (the learning rate, a value between 0 and 1, is the step size of the gradient descent and identifies how much a single example affects the weights, pg. 452, left col, last para.)
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the device of Chakradhar modified by Zhang to incorporate the teachings of Esmaeilzadeh for the benefit of a neural network that can be efficiently accelerated using dedicated hardware, yielding significant performance and energy improvements (Esmaeilzadeh, pg. 450, left col, first para.)
	Chakradhar modified by Zhang does not explicitly teach wherein the weight parameter is generated by rearranging an original weight parameter of the neural network model of the first electronic equipment based on a format needed by the first electronic equipment; 
	Amir teaches wherein the weight parameter (synaptic weight wij [0057])
	is generated by rearranging an original weight parameter (rearranges rows 120 and columns 110 of a matrix representation 100 [0059], and an entry in the matrix 100 at a particular row i and a particular column j is a synaptic weight wij [0057])
	of the neural network model (of a neural network algorithm)
	of the first electronic equipment based on a format needed by the first electronic equipment; (field-programmable gate arrays (FPGA) [0127])
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the device of Chakradhar modified by Zhang to incorporate the teachings of Amir for the benefit of reordering and/or algorithm-level iteration to reduce resource utilization (Amir, [0074]).

	Regarding claim 9, Chakradhar modified by Zhang teaches the device of claim 8, Chakradhar teaches wherein the configuration parameter (hardware architecture determines the optimal configuration of processing elements and memory architecture [0059] and hardware implementation will approximate kernel values (weights) as parameters [0068]) comprises: 
	a weight parameter (small kernels of weights [0010] and hardware implementation will approximate kernel values (weights) [0068])
	of the neural network model (of convolutional neural networks [0029])
	of the first electronic equipment (FPGA [0056])
	a convolution calculation parameter (convolutional kernels (2D array of weights) are of size 5X5 in the entire network [0037]) 
	and one or more of function parameters which need to be called (tanh [0031])
	the convolution calculation parameter comprises: specification of the data to be identified, (input images (Y1, Y2, and Y3) [0061], Fig. 5; n images I1…In as inputs [0031], Fig. 1)
	quantity of a convolution kernel (convolved with kernels k11…kn1 [0031]) 
	size of the convolution kernel, (convolutional kernels (2D array of weights) are of size 5X5 in the entire network [0037]) 
	and one or more of number of layers of the neural network model; (first two layers 102 and 104, the third layer 106 is only a convolutional layer while the last layer 108 is a traditional fully-connected layer [0037], Fig. 2)
	function parameters called which are required comprise: a function name, (tanh is the non-linear function [0031]) 
	a function parameter (tanh [0031]) 
	and a calling sequence, which is called by the neural network model (calling sequence is that non-linear function 18 is called first and non-linear function 22 is called second, Fig. 1)
	of the first electronic equipment (FPGA [0056])
	according to requirement (network specification, [0056]). Applicant discloses “Called function parameters which are required comprise ReLU and Pooling” (instant specification, [00109])
	Chakradhar modified by Zhang does not explicitly teach step size of the convolution calculation.
	Esmaeilzadeh teaches step size of the convolution calculation (the learning rate, a value between 0 and 1, is the step size of the gradient descent and identifies how much a single example affects the weights, pg. 452, left col, last para.)
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the device of Chakradhar modified by Zhang to incorporate the teachings of Esmaeilzadeh for the benefit of a neural network that can be efficiently accelerated using dedicated hardware, yielding significant performance and energy improvements (Esmaeilzadeh, pg. 450, left col, first para.)
	Chakradhar modified by Zhang does not explicitly teach wherein the weight parameter is generated by rearranging an original weight parameter of the neural network model of the first electronic equipment based on a format needed by the first electronic equipment; 
	Amir teaches wherein the weight parameter (synaptic weight wij [0057])
	is generated by rearranging an original weight parameter (rearranges rows 120 and columns 110 of a matrix representation 100 [0059], and an entry in the matrix 100 at a particular row i and a particular column j is a synaptic weight wij [0057])
	of the neural network model (of a neural network algorithm)
	of the first electronic equipment based on a format needed by the first electronic equipment; (field-programmable gate arrays (FPGA) [0127])
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the device of Chakradhar modified by Zhang to incorporate the teachings of Amir for the benefit of reordering and/or algorithm-level iteration to reduce resource utilization (Amir, [0074]).


7.	Claims 5, 6, 12 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Chakradhar et al (US20110029471) in view of Zhang et al ("Intrusion detection using hierarchical neural networks." Pattern Recognition Letters 26.6 (2005): 779-791.) in view of LiKamWa et al. (RedEye: Analog ConvNet Image Sensor Architecture for Continuous Mobile Vision, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture) and further in view of Ando et al ("A multithreaded CGRA for convolutional neural network processing." Circuits and Systems 8.6 (2017): 149-170.)

	Regarding claim 5, Modified Chakradhar teaches the method of claim 1, Chakradhar teaches wherein the data to be identified (input images (Y1, Y2, and Y3) [0061], Fig. 5)
	and the configuration parameter (hardware architecture determines the optimal configuration of processing elements and memory architecture [0059] and hardware implementation will approximate kernel values (weights) as parameters [0068]) 
	of the neural network model (of the CNN workload [0059])
	of the first electronic equipment (FPGA [0056]) comprises: 
	Chakradhar modified by Zhang is silent about reading the data to be identified and the configuration parameter of the neural network model of the first electronic equipment from an external memory, and writing the data to be identified and the configuration parameter of the neural network model of the first electronic equipment which is read into a local memory.
	Ando teaches reading the data to be identified (input data are transferred from the external memory, pg. 165, second to the last para.)
	and the configuration parameter of the neural network model (data size (including weights of convolutional layers as parameters) read from external memory, pg.167, third para.)
	of the first electronic equipment (CNN accelerator built on FPGA, pg. 167, second para.)
	from an external memory, (from the external memory, pg. 165, second to the last para.)
	and writing the data to be identified and the configuration parameter of the neural network model of the first electronic equipment which is read into a local memory. (write port of SRAM (as local memory) acquires the data, pg. 165, second to the last para.)
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Chakradhar modified by Zhang to incorporate the teachings of Ando for the benefit of CNN accelerators that reduce external memory access and maximally utilize the locality of data for low-power embedded applications (Ando, pg. 167, third para.).

	Regarding claim 6, Chakradhar modified by Zhang and further modified by Ando teaches the method of claim 5, Ando teaches wherein when the data (input data, pg. 165, fourth para.)
	to be identified is reading and writing, (read/write, pg. 165, fourth para.)
	each separate data file is read and written only once. (simple 1-write 1-read, pg. 165, second to the last para., input data are stored only once, pg. 165, fourth para. Input data are transferred from the external memory only once, pg. 165, second to the last para.)
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Chakradhar modified by Zhang to incorporate the teachings of Ando for the benefit of CNN accelerators that reduce external memory access and maximally utilize the locality of data for low-power embedded applications (Ando, pg. 167, third para.).

	Regarding claim 12, Chakradhar modified by Zhang teaches the device of claim 8, Chakradhar teaches the configuration parameter (hardware architecture determines the optimal configuration of processing elements and memory architecture [0059] and hardware implementation will approximate kernel values (weights) as parameters [0068]) 
	of the neural network model (of the CNN workload [0059])
	of the first electronic equipment (FPGA [0056]) comprises: 
	Chakradhar modified by Zhang is silent about a read and write control module used for reading the data to be identified and the configuration parameter of the neural network model of the first electronic equipment from an external memory, and writing the data to be identified and the configuration parameter of the neural network model of the first electronic equipment which is read into a local memory.
	Ando teaches a read and write control module (-read/write SRAM, pg. 165, third para.)
	reading the data to be identified (input data are transferred from the external memory, pg. 165, second to the last para.)
	and the configuration parameter of the neural network model (data size (including weights of convolutional layers as parameters) read from external memory, pg.167, third para.)
	of the first electronic equipment (CNN accelerator built on FPGA, pg. 167, second para.)
	from an external memory, (from the external memory, pg. 165, second to the last para.)
	and writing the data to be identified and the configuration parameter of the neural network model of the first electronic equipment which is read into a local memory. (write port of SRAM (as local memory) acquires the data, pg. 165, second to the last para.)
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the device of Chakradhar modified by Zhang to incorporate the teachings of Ando for the benefit of CNN accelerators that reduce external memory access and maximally utilize the locality of data for low-power embedded applications (Ando, pg. 167, third para.).

	Regarding claim 13, Chakradhar modified by Zhang and further modified by Ando teaches the device of claim 12, Ando teaches wherein the read and write control module is used to implement that (-read/write SRAM, pg. 165, third para.)
	when the data (input data, pg. 165, fourth para.)
	to be identified is reading and writing, (read/write, pg. 165, fourth para.)
	each separate data file is read and written only once. (simple 1-write 1-read, pg. 165, second to the last para., input data are stored only once, pg. 165, fourth para. Input data are transferred from the external memory only once, pg. 165, second to the last para.)
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the device of Chakradhar modified by Zhang to incorporate the teachings of Ando for the benefit of CNN accelerators that reduce external memory access and maximally utilize the locality of data for low-power embedded applications (Ando, pg. 167, third para.).

8.	Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Chakradhar et al (US20110029471) in view of Zhang et al ("Intrusion detection using hierarchical neural networks." Pattern Recognition Letters 26.6 (2005): 779-791.) in view of LiKamWa et al. (RedEye: Analog ConvNet Image Sensor Architecture for Continuous Mobile Vision, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture) in view of Ando et al ("A multithreaded CGRA for convolutional neural network processing." Circuits and Systems 8.6 (2017): 149-170.) and further in view of Zhou et al. ("Exploiting local structures with the kronecker layer in convolutional networks." arXiv preprint arXiv:1512.09194 (2015))

	Regarding claim 7, Chakradhar modified by Zhang and further modified by Ando teaches the method of claim 5, Ando teaches wherein if specification of the data to be identified (input data, pg. 165, fourth para.)
	which is read (read/write, pg. 165, fourth para.)
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Chakradhar modified by Zhang to incorporate the teachings of Ando for the benefit of CNN accelerators that reduce external memory access and maximally utilize the locality of data for low-power embedded applications (Ando, pg. 167, third para.).
	However, they do not explicitly teach M*N*K, according to a split method of M*(N1+N2)*(K1+K2) a data to be identified is split into some small three-dimensional matrix at the time of writing when the data to be identified is writing; for a picture file, M is a width of a picture, N is a height of the picture, K is number of channels of the picture; K1+K2=K, N1+N2=N.
	Zhou teaches a data to be identified is split into some small three-dimensional matrix at the time of writing (R^(o×c×h×w), pg. 8, first para.)
	when the data to be identified is writing, for a picture file, M is a width of a picture, N is a height of the picture, K is number of channels of the picture; (where c is the number of channels, h is the height, w is the width, pg. 7, Shape Selection)
	K1+K2=K, N1+N2=N (spatial dimensions of the kernel, h1+h2−1 = h, w1+w2−1 = w, c1c2 = c, pg. 8, first para.)
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Chakradhar modified by Zhang modified by Ando to incorporate the teachings of Zhou for the benefit of reducing computational time and model size with minor loss in accuracy (Zhou, pg. 14, conclusion).

	Regarding claim 14, Chakradhar modified by Zhang and further modified by Ando teaches the device of claim 12, Ando teaches if specification of the data (input data, pg. 165, fourth para.)
	which is read by the read and write control module (-read/write SRAM, pg. 165, third para.)
	when the read and write control module is writing (-read/write SRAM, pg. 165, third para.)
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the device of Chakradhar modified by Zhang to incorporate the teachings of Ando for the benefit of CNN accelerators that reduce external memory access and maximally utilize the locality of data for low-power embedded applications (Ando, pg. 167, third para.).
	However, they do not explicitly teach M*N*K, according to a split method of M*(N1+N2)*(K1+K2) the processing data is split into some small three-dimensional matrix when the read and write control module is writing; for a picture file, M is a width of a picture, N is a height of the picture, K is number of channels of the picture; K1+K2=K, N1+N2=N.
	Zhou teaches the processing data is split into some small three-dimensional matrix at the time of writing (R^(o×c×h×w), pg. 8, first para.)
	when the data to be identified is writing, for a picture file, M is a width of a picture, N is a height of the picture, K is number of channels of the picture; (where c is the number of channels, h is the height, w is the width, pg. 7, Shape Selection)
	K1+K2=K, N1+N2=N. (spatial dimensions of the kernel, h1+h2−1 = h, w1+w2−1 = w, c1c2 = c, pg. 8, first para.)
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the device of Chakradhar modified by Zhang modified by Ando to incorporate the teachings of Zhou for the benefit of reducing computational time and model size with minor loss in accuracy (Zhou, pg. 14, conclusion).

9.	Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Chakradhar et al (US20110029471) in view of Zhang et al ("Intrusion detection using hierarchical neural networks." Pattern Recognition Letters 26.6 (2005): 779-791.) in view of LiKamWa et al. (RedEye: Analog ConvNet Image Sensor Architecture for Continuous Mobile Vision, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture) and further in view of Shan et al (US20180060728)

	Regarding claim 15, Chakradhar modified by Zhang teaches a method for the hardware acceleration of the neural network model of the first electronic equipment according to claim 1, Chakradhar teaches a method (method for processing convolutional neural networks [0014])
	an auxiliary (supports configurations for the computational elements [0055])
	acceleration of a neural network model (configuring a coprocessor to address accelerating CNNs [003])
	of a second electronic equipment (learning engine controller 260 [0055]) comprising: 
	of the first electronic equipment (FPGA [0056])
	Chakradhar modified by Zhang does not explicitly teach extracting a topology structure and a parameter for each layer of the neural network model, which is trained from an open source framework, and based on the topology structure and the parameter for each layer which is extracted, generating the configuration parameter 
	Shan teaches extracting a topology structure (layered structure of neurons [0030])
	and a parameter for each layer (parameters of the layers are extracted [0031])
	of the neural network model (structure of a Deep Neural Network (DNN) [0009])
	which is trained from an open source framework, (XGBoost (open-source software library) determines the initial parameters of the layer thereby commencing training)
	and based on the topology structure (layered structure of neurons [0030])
	and the parameter for each layer which is extracted, (parameters of the layers are extracted [0031])
	generating the configuration parameter (weight matrices [0020])
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Chakradhar modified by Zhang to incorporate the teachings of Shan for the benefit of superior performance both in accuracy and runtime speed (Shan, [0033])

Conclusion
	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to MORIAM MOSUNMOLA GODO whose telephone number is (571)272-8670. The examiner can normally be reached Monday-Friday 7:30am-5:30pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B. Zhen can be reached on (571)272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/M.G./Examiner, Art Unit 2121

/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121