DETAILED ACTION
This action is in response to the claims filed April 20, 2019. Claims 1-20 are pending and have been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding Claim 1
Step 1 Analysis: Claim 1 is directed to a computer system method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The Claim recites a computer system method. Each of the following limitations:
parsing, by a processor, at least one item of information related to a neural network operation from an input neural network model;
determining, by the processor, information of at least one dedicated hardware device;
and generating, by the processor, a reshaped neural network model by changing information of the input neural network model
according to a result of determining the information of the at least one dedicated hardware device

as drafted, is a process that, under its broadest reasonable interpretation, covers an abstract idea, but for the recitation of a generic computing system method and processor. The above limitations in the context of this claim encompass parsing, determining, generating, and determining (mental processes). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea, or merely uses a computer as a tool to perform an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a generic computer to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer. Mere instructions to apply an exception using a generic computer cannot provide an inventive concept. The claim is not patent eligible.

Regarding Claims 2-5 and 8-12
Step 1 Analysis: The rejection of Claim 1 is incorporated; therefore, each of Claims 2-5 and 8-12 is directed to a computer system method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The Claim recites a computer system method.

The claim depends on Claim 1. As such, the incorporated rejection is directed to an abstract idea. Therefore, each of Claims 2-5 and 8-12 recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea, or merely uses a computer as a tool to perform an abstract idea. Further, “at least one item of information includes at least one item among a size of a kernel applied to the input neural network model and a data type of a kernel applied to the input neural network model,” “computing resource information,” “computing context information,” and “executing the input neural network model” only generally link the judicial exception to a particular field. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a generic computer to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer.

Regarding Claim 6
Step 1 Analysis: The rejection of Claim 1 is incorporated, therefore Claim 6 is directed to a computer system method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The Claim recites a computer system method.
receiving at least one item of user selection information through the user interface,

As drafted, this limitation is a process that, under its broadest reasonable interpretation, covers an abstract idea, but for the recitation of a generic computing system method. The above limitation in the context of this claim encompasses receiving (mental processes). As such, the claim recites an abstract idea.
Furthermore, the claim depends on Claim 1. As such the incorporated rejection is directed to an abstract idea. Therefore Claim 6 recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea, or merely uses a computer as a tool to perform an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Further, “providing, by the processor, a user interface for a user of an electronic system employing the neural network system;” simply appends conventional activities specified at a high level of generality (MPEP 2106.05(d)). As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a generic computer to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer. Mere instructions to apply an exception using a generic computer cannot provide an inventive concept. The claim is not patent eligible.

Regarding Claim 7
Step 1 Analysis: The rejection of Claim 6 is incorporated, therefore Claim 7 is directed to a computer system method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: 
assigning, by the processor, the reshaped neural network model to a dedicated hardware device selected by the user
As drafted, this limitation is a process that, under its broadest reasonable interpretation, covers an abstract idea, but for the recitation of a generic computing system method. The above limitation in the context of this claim encompasses assigning (mental processes). As such, the claim recites an abstract idea.
Furthermore, the claim depends on Claim 6. As such the incorporated rejection is directed to an abstract idea. Therefore Claim 7 recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea, or merely uses a computer as a tool to perform an abstract idea. Further, “at least one item of user selection information includes information indicating a kind of the dedicated hardware device by which the input neural network model is to be executed” only generally links the judicial exception to a particular field. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed 

Regarding Claims 13-20
	Claims 13-20 are rejected for the same reason as stated above.



Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
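A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
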
Claims 1-5, 8-9, 13-18, and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Yu et al., “Scalpel: Customizing DNN Pruning to the Underlying Hardware Parallelism,” hereinafter Yu.

Regarding Claim 1
	Yu teaches, A method of operating a neural network system, the method comprising: a processor (Abstract “we propose Scalpel that customizes DNN pruning to the underlying hardware by matching the pruned network structure to the data-parallel hardware organization. Scalpel consists of two techniques: SIMD-aware weight pruning and node pruning” Examiner notes that the Scalpel method that operates on a DNN to match underlying hardware using pruning techniques implicitly utilizes a processor.) parsing, by a processor, at least one item of information related to a neural network operation from an input neural network model; (Section 2.1 ¶004 “As shown in Figure 2, weight pruning removes redundant weights and the dense weight matrix W (Figure 2(A)) is converted into a sparse matrix Wsparse” Examiner notes that in order to perform the weight pruning on a dense weight matrix, the weight matrix is parsed from an original neural network, wherein the at least one item of information is the dense weight matrix. Additionally, an “input neural network model” is taken to describe a neural network model that is input into the processor to be reshaped. For clarity, “input” is not taken to describe the particular structure or function of the neural network itself, but only as a nominal description.) determining, by the processor, information of at least one dedicated hardware device; (Section 3.1 ¶01 “Figure 6 shows the overview of Scalpel. The first step of Scalpel is profiling and determining the parallelism level of the hardware platform”
[Figure: media_image1.png (Yu, Figure 6)]
Examiner notes that the flow chart depicted describes the process of profiling and determining information regarding the degree of hardware parallelism available to a dedicated hardware device.) and generating, by the processor, a reshaped neural network model by changing information of the input neural network model according to a result of determining the information of the at least one dedicated hardware device such that the reshaped neural network model is tailored for execution by the dedicated hardware device. (Section 3.1 ¶005 “By customizing pruning technique for different hardware platforms, Scalpel can reduce both the DNN model size and execution time across all the general-purpose processors without accuracy loss for DNNs” Examiner notes that the generated pruned DNN is the result of the Scalpel process of Figure 6. The resulting pruned DNN is a reshaped neural network whose model size is smaller than the original model. The pruned DNN is specifically designed based on “the information of the at least one dedicated hardware device.”)

Regarding Claim 2
	Yu teaches the method of claim 1
Further Yu teaches, wherein, the at least one item of information includes at least one item among a size of a kernel applied to the input neural network model and a data type of a kernel applied to the input neural network model, (Section 3.3 ¶02 “The first step is weights grouping. All the weights are divided into aligned groups with the same size equal to the supported SIMD width. Figure 8 (A) shows a simple example of weights grouping”
[Figure: media_image2.png (Yu, Figure 8)]
Examiner notes that the first step in weight pruning is fetching the original input neural network model weight matrix which includes the size of the matrix and grouping the weights in order to create a reduced weight kernel. Yu teaches at least one item, that item being the size of the kernel.) and the generating of the reshaped neural network model includes changing at least one item among the size and the data type of the kernel of the input neural network model. (Section 3.3 ¶02 “SIMD-aware weight pruning can reduce both model sizes and execution time of DNNs on low-parallelism hardware” Examiner notes that the reduced model sizes generated by the SIMD aware weight pruning correspond to the reshaped neural network that has changed the size of the kernel from the input neural network model.)
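For illustration only, the weights grouping quoted above can be sketched as follows. This is a minimal sketch and not Yu's implementation: the function name `prune_groups`, the use of the root-mean-square of each group as its importance measure, and the threshold value are assumptions introduced here. It shows only that weights are handled in aligned groups whose size equals the SIMD width, so that a group is either kept whole or removed whole.

```python
import math

def prune_groups(row, simd_width, threshold):
    """Zero out whole aligned groups of `simd_width` weights.

    Assumption (hypothetical criterion): a group is removed when the
    root-mean-square of its weights falls below `threshold`.
    """
    pruned = list(row)
    for start in range(0, len(row), simd_width):
        group = row[start:start + simd_width]
        rms = math.sqrt(sum(w * w for w in group) / len(group))
        if rms < threshold:
            # Remove the entire group so the remaining weights stay SIMD-aligned.
            for i in range(start, start + len(group)):
                pruned[i] = 0.0
    return pruned

# Example: one row of a weight matrix, grouped for a 2-wide SIMD unit.
row = [0.9, -0.8, 0.01, 0.02, 0.0, 0.03, 0.7, 0.6]
result = prune_groups(row, simd_width=2, threshold=0.1)
```

In this example the two middle groups are removed and the first and last groups are kept intact.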

Regarding Claim 3
	Yu teaches the method of claim 1
Further Yu teaches, wherein the information of the at least one dedicated hardware device includes at least one item among, computing resource information as static information before the input neural network model is executed, and computing context information as dynamic information generated during runtime of the input neural network model. (Section 3 ¶001 “Scalpel customizes the DNN pruning for different hardware platforms based on their parallelism.” Examiner notes that the information regarding particular dedicated hardware corresponds to their parallelism. This is information that is indicative of static computing resources information as it is characteristic of the underlying hardware. Yu teaches at least one item, that item being the computing resource information.)

Regarding Claim 4
	Yu teaches the method of claim 3
Further Yu teaches, wherein the computing resource information includes at least one of operation method information, kernel structure information, dataflow information, or data reuse information of the at least one dedicated hardware device. (Section 3 ¶01 “It consists of two methods: SIMD-aware weight pruning and node pruning [operation method]” Examiner notes that the chosen operation method is determined by the degree of parallelism of the dedicated hardware. The choice of method is depicted in the flow chart of Fig. 6.)

Regarding Claim 5
	Yu teaches the method of claim 3
Further Yu teaches, wherein the computing context information includes at least one of, information related to execution of the input neural network model during runtime, information about a change in computing resource status with respect to the at least one dedicated hardware device during runtime, and information about an application executed during runtime. (Section 3.3 ¶004 “SIMD-aware weight pruning works layer by layer. The execution time for each layer will be generated at the beginning, and the pruning process starts with the layer of the highest execution time. Every time after retraining the pruned DNN [during runtime], the execution time of each layer will be updated. The new slowest layer will be pruned in the next iteration if the retrained DNN does not lose the original accuracy” Examiner notes that the claim presents the limitation in the alternative wherein “at least one of” the underlined portion appears to be taught by Yu. The context information related to runtime execution corresponds to the execution time of each layer.)
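For illustration only, the layer-by-layer iteration quoted above can be sketched as follows. This is a minimal sketch under stated assumptions, not Yu's implementation: the function `prune_slowest_first`, the callbacks `prune_layer` and `keeps_accuracy`, the stopping rule, and all numeric values are hypothetical stand-ins for the pruning, retraining, and accuracy check described in the quotation.

```python
def prune_slowest_first(layer_times, prune_layer, keeps_accuracy, max_iters=10):
    """Repeatedly prune the layer with the highest execution time.

    layer_times:    dict mapping layer name -> measured execution time
    prune_layer:    hypothetical callback returning the layer's new time
    keeps_accuracy: hypothetical callback standing in for the check that
                    the retrained network has not lost accuracy
    """
    times = dict(layer_times)
    pruned_order = []
    for _ in range(max_iters):
        # Pick the current slowest layer; times are updated every iteration.
        slowest = max(times, key=times.get)
        candidate_time = prune_layer(slowest, times[slowest])
        if not keeps_accuracy(pruned_order + [slowest]):
            break  # assumed stopping rule: stop once accuracy would be lost
        times[slowest] = candidate_time
        pruned_order.append(slowest)
    return pruned_order, times

order, times = prune_slowest_first(
    {"conv1": 8.0, "fc1": 5.0, "fc2": 1.0},
    prune_layer=lambda name, t: t / 2,                  # stand-in: pruning halves the time
    keeps_accuracy=lambda history: len(history) <= 3,   # stand-in accuracy check
)
```

The sketch shows how per-layer execution time, updated after each pruning step, determines which layer is processed next.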

Regarding Claim 8
	Yu teaches the method of claim 1
Further Yu teaches, wherein the generating of the reshaped neural network model comprises: generating N reshaped neural network models from the input neural network model, the N reshaped neural network models respectively corresponding to N different dedicated hardware devices, N being an integer greater than 1. (Section 4 ¶01 “We test Scalpel on three different general purpose processors: microcontrollers, CPU and GPU. They are representatives of hardware with low, moderate and high parallelism, respectively.” As shown in Fig. 15, N=3 different hardware types with 3 different reshaped models respectively, are generated with the same original input neural network model.)

Regarding Claim 9
	Yu teaches the method of claim 1
Further Yu teaches, wherein the generating of the reshaped neural network model comprises: generating the reshaped neural network model from N different neural network models, the reshaped neural network model corresponding to a particular dedicated hardware device, N being an integer greater than 1. (Section 4 ¶06 “We compare the performance and the model size of five DNNs: LeNet-300-100 [32], LeNet-5 [32], ConvNet [28], Network-in-Network (NIN) [34] and AlexNet [29]” Examiner notes, in the experiments presented by Yu, a reshaped model for a particular dedicated hardware device is generated by 5 different original input neural network models. Table 3 presents the relative speedup of the reshaped version between the 5 different input neural network models.)

Regarding Claim 10
	Yu teaches the method of claim 1
Further Yu teaches, wherein, the parsing of the at least one item of information comprises: parsing, from the input neural network model, at least one item among, layer topology information, kernel information, data characteristic information, and compression method information; and the generating of the reshaped neural network model includes changing the parsed at least one item. (Section 3.4 ¶004 “One neuron in the fully-connected layers or one feature map in the convolutional layers is considered as one node. Removing nodes in DNNs only shrinks the size of each layer but will not incur sparsity into the network. The remaining DNN model [reshaped neural network model] after node pruning keeps the regular dense DNN structure and will not suffer from the overheads of network sparsity” Examiner notes as shown in Fig. 11, that in order for the method to shrink the size of each layer, the size of the original layer must be known corresponding to information relating to layer topology. Taking the DNN as input to the Node pruning method corresponds to parsing at least one item.)
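For illustration only, the effect of node pruning on layer size described above can be sketched as follows. This is a minimal sketch and not Yu's implementation: the function `remove_nodes`, the list-of-lists matrix layout, and the assumption that the set of nodes to keep is already known are all introduced here. It shows that removing a node shrinks the layer and the following layer's input dimension while both matrices remain dense.

```python
def remove_nodes(w_in, w_out, keep):
    """Drop nodes from a fully-connected layer while keeping it dense.

    w_in:  weights of this layer, one row per node
    w_out: weights of the next layer, one column per node of this layer
    keep:  indices of the nodes that remain (assumed to be given)
    """
    new_in = [w_in[i] for i in keep]                     # fewer rows
    new_out = [[row[i] for i in keep] for row in w_out]  # fewer columns
    return new_in, new_out

w_in = [[1, 2], [3, 4], [5, 6]]      # 3 nodes, 2 inputs each
w_out = [[7, 8, 9], [10, 11, 12]]    # 2 next-layer nodes, 3 inputs each
new_in, new_out = remove_nodes(w_in, w_out, keep=[0, 2])
```

After the call the layer has two nodes instead of three, and no zero entries have been introduced into either matrix.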

Regarding Claim 13
Yu teaches, An application processor comprising: memory storing computer-executable instructions; and a processor configured to execute the computer-executable instructions such that the processor is configured to perform operations including (Examiner notes that Figures 16-20 depict plots generated from traditional DNNs, input neural networks, and reshaped models. In order to reshape the input neural networks, an application processor that contains memory and instructions to implement the Scalpel method taught by Yu is implicitly required.) determining information of at least one dedicated hardware device, (Section 3.1 ¶01 “Figure 6 shows the overview of Scalpel. The first step of Scalpel is profiling and determining the parallelism level of the hardware platform”
[Figure: media_image1.png (Yu, Figure 6)]
Examiner notes that the flow chart depicted describes the process of profiling and determining information regarding the degree of hardware parallelism available to a dedicated hardware device.) and generating a reshaped neural network model by changing information of an input neural network model according to a result of determining the information of the at least one dedicated hardware device. (Section 3.1 ¶005 “By customizing pruning technique for different hardware platforms, Scalpel can reduce both the DNN model size and execution time across all the general-purpose processors without accuracy loss for DNNs” Examiner notes that the generated pruned DNN is the result of the Scalpel process of Figure 6. The resulting pruned DNN is a reshaped neural network whose model size is smaller than the original model. Specifically designed based on “the information of the at least one dedicated hardware device”)

Regarding Claim 14
	Yu teaches the application processor of claim 13
	Yu teaches, wherein the processor is further configured to execute the computer-executable instructions such that the processor is configured to (Examiner notes that the processor is discussed in the rejection of claim 13.) parse at least one item of information related to a neural network operation from the input neural network model (Section 2.1 ¶004 “As shown in Figure 2, weight pruning removes redundant weights and the dense weight matrix W (Figure 2(A)) is converted into a sparse matrix Wsparse” Examiner notes that in order to perform the weight pruning on a dense weight matrix, the weight matrix is parsed from an original neural network, wherein the at least one item of information is the dense weight matrix. Additionally, an “input neural network model” is taken to describe a neural network model that is input into the processor to be reshaped. As stated previously, “input” is not taken to describe the particular structure or function of the neural network itself.) and generate the reshaped neural network model by changing at least one item of information among the parsed at least one item of information. (Section 3.1 ¶005 “By customizing pruning technique for different hardware platforms, Scalpel can reduce both the DNN model size and execution time across all the general-purpose processors without accuracy loss for DNNs” Examiner notes that the generated pruned DNN is the result of the Scalpel process of Figure 6. The resulting pruned DNN is a reshaped neural network whose model size is smaller than the original model. The pruned DNN is specifically designed based on “the information of the at least one dedicated hardware device.”)

Regarding Claim 15
	Yu teaches the application processor of claim 14
Yu teaches, wherein the processor is further configured to execute the computer-executable instructions such that (Examiner notes that the processor is discussed in the rejection of claim 13.) the parsed at least one item of information includes kernel size information of the input neural network model, (Section 3.3 ¶02 “The first step is weights grouping. All the weights are divided into aligned groups with the same size equal to the supported SIMD width. Figure 8 (A) shows a simple example of weights grouping”
[Figure: media_image2.png (Yu, Figure 8)]
Examiner notes that the first step in weight pruning is fetching the original input neural network model weight matrix which includes the size of the matrix and grouping the weights in order to create a reduced weight kernel. Yu teaches at least one item, that item being the size of the kernel) related to a convolution operation (Section 2.1 ¶01 “The fundamental building block of all DNNs is the neuron… DNNs integrate convolutional (CONV) layers and fully-connected (FC) layers into an end-to-end multi-layer network” Examiner notes that the input neural network corresponds to the Convolutional network described) and the processor generates the reshaped neural network model by changing a kernel size according to a dedicated processor to which the reshaped neural network model will be assigned. (Section 3.3 ¶02 “SIMD-aware weight pruning can reduce both model sizes and execution time of DNNs on low-parallelism hardware” Examiner notes that the reduced model sizes generated by the SIMD aware weight pruning corresponds to the reshaped neural network that has changed the size of the kernel from the input neural network model.)
Regarding Claim 16
	Yu teaches the application processor of claim 13
Yu teaches, wherein the processor is further configured to, determine at least one item among, data reuse information of the reshaped neural network model, and layer type information of the reshaped neural network model, (Section 3.5 ¶03 “To apply both SIMD-aware weight pruning and node pruning to the same network, we will first use node pruning to remove redundant nodes in the convolutional layers” Examiner notes that the pruning method described corresponds to producing a reshaped neural network. In order to remove redundant nodes in the convolutional layers, the processor is aware or determines what type of layer is being processed by the node pruning method.) execute the computer-executable instructions such that the processor is configured to assign the reshaped neural network model to one of a plurality of dedicated hardware devices according to a result of. (Section 3 ¶01 “To address the challenges from traditional pruning techniques, we propose Scalpel in this paper. It consists of two methods: SIMD aware weight pruning and node pruning. Scalpel customizes the DNN pruning for different hardware platforms based on their parallelism.” Examiner notes that the Scalpel method is a method executed by a computer that assigns a pruned DNN, reshaped neural network, to dedicated hardware.)
Regarding Claim 17
Yu teaches, A neural network system comprising: a neural network adaptor module configured to, (Examiner notes that the system described, Scalpel, takes as input a neural network model, which corresponds to a neural network adaptor) receive a first neural network model, determine information parsed from (Section 3.1 ¶01 “Figure 6 shows the overview of Scalpel. The first step of Scalpel is profiling and determining the parallelism level of the hardware platform” 
[Figure: media_image1.png (Yu, Figure 6)]
Examiner notes that the flow chart depicted describes the process of profiling and determining information regarding the degree of hardware parallelism available to a dedicated hardware device.) and generate a second neural network model by changing the first neural network model based on a result of the determining, the second neural network model being assigned to the first dedicated hardware device; and a neural network device including the first dedicated hardware device, (Section 3.1 ¶005 “By customizing pruning technique for different hardware platforms [assigning to dedicated hardware], Scalpel can reduce both the DNN model size and execution time across all the general-purpose processors without accuracy loss for DNNs” Examiner notes that the generated pruned DNN is the result of the Scalpel process of Figure 6. The resulting pruned DNN is a reshaped neural network whose model size is smaller than the original model. The pruned DNN is specifically designed based on “the information of the at least one dedicated hardware device.”) the neural network device configured to perform an operation on input data according to the second neural network model to generate an information signal. (Section 5 ¶01 “Table 3 is the overview of the results. Scalpel achieves mean speedups of 3.54x, 2.61x, and 1.25x on the microcontroller, CPU and GPU, respectively.” Examiner notes that in order to access results of the second neural network model, the reshaped model, an output information signal must be produced based on input data.)

Regarding Claim 18
Yu teaches the neural network system of claim 17
Yu teaches, wherein the neural network adaptor module is configured to, parse various items of information including kernel size information from the first neural network model (Section 3.3 ¶02 “The first step is weights grouping. All the weights are divided into aligned groups with the same size equal to the supported SIMD width. Figure 8 (A) shows a simple example of weights grouping” 
[Figure: media_image2.png (Yu, Figure 8)]
Examiner notes that the first step in weight pruning is fetching the original input neural network model weight matrix which includes the size of the matrix and grouping the weights in order to create a reduced weight kernel. Yu teaches at least one item, that item being the size of the kernel.) manage at least one item of information among static information of at least one hardware device, the static information including, operation method information, kernel structure information, dataflow information, and data reuse information of the at least one hardware device (Section 3 ¶01 “It consists of two methods: SIMD-aware weight pruning and node pruning [operation method]” Examiner notes that the chosen operation method is determined by the degree of parallelism of the dedicated hardware, a static attribute of the specific hardware. The choice of method is depicted in the flow chart of Fig. 6.) and generate the second neural network model by using, at least one of the parsed various items of information, and the at least one item of information among the static information of at least one hardware device. (Section 3.1 ¶01 “Figure 6 shows the overview of Scalpel. The first step of Scalpel is profiling and determining the parallelism level of the hardware platform”
[Figure: media_image1.png (Yu, Figure 6)]
Examiner notes that the flow chart depicted describes the process of using static information regarding the operation method of specific hardware related to the degree of hardware parallelism available to a dedicated hardware device to generate a pruned DNN.)

Regarding Claim 20
Yu teaches the neural network system of claim 17
Yu teaches, a processor configured to execute the neural network adaptor module, wherein the neural network adaptor module includes programs for reshaping the first neural network model to generate the second neural network model. (Section 3.1 ¶01 “Figure 6 shows the overview of Scalpel. The first step of Scalpel is profiling and determining the parallelism level of the hardware platform” 
[Figure: media_image1.png (Yu, Figure 6)]
Examiner notes that the Scalpel system corresponds to the neural network adaptor module, wherein the trained DNN, the first neural network model, is reshaped to produce a pruned DNN.)

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 6, 7, 11, 12, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Yu and further in view of Wang et al., “Real-Time Meets Approximate Computing: An Elastic CNN Inference Accelerator with Adaptive Trade-off between QoS and QoR,” hereinafter Wang.

Regarding Claim 6
	Yu teaches the method of claim 1
	Yu does not explicitly teach, providing, by the processor, a user interface for a user of an electronic system employing the neural network system; and receiving at least one item of user selection information through the user interface, wherein the generating of the reshaped neural network model includes changing the information of the input neural network model by further using the at least one item of user selection information. 
	Wang, however, when addressing issues related to a configurable neural network inference device teaches, providing, by the processor, a user interface for a user of an electronic system employing the neural network system; (Introduction “so that the accelerator [neural network system] can operate in the most energy-efficient point within the user-specified constraint of accuracy and speed” Examiner notes that a device that is mediated by a user-specified constraint corresponds to “a user interface for a user of an electronic system”) and receiving at least one item of user selection information through the user interface, (Section 3.1 ¶01 “ELNA comprises a reconfigurable CNN accelerator that switches in different operation modes and a software CNN synthesizer that searches for the proper CNN topologies and suitable parameters to meet the user-specified QoR and QoS constraint” Examiner notes that in order for the system to synthesize topologies that are suitable, the user selection must have been received) wherein the generating of the reshaped neural network model includes changing the information of the input neural network model by further using the at least one item of user selection information. (Section 3.1 ¶01 “re-optimizes [changing] the neural network model [generating the reshaped neural network model] without impairing its function and searches for the perfect software/hardware configuration that meets the requirement of QoR and QoS [using at least one item of user selection information]”)
It would have been obvious for one of ordinary skill in the arts before the effective filling date of the claimed invention to incorporate for modifying and existing neural network model in accordance with parameters specified by a user selection as taught by Wang to the disclosed invention of Yu.
One of ordinary skill in the arts would have been motivated to make this modification in order to have “a better chance of satisfying real-time processing (Wang Conclusion)

Regarding Claim 7
	Yu/Wang teaches the method of claim 6.
	Further Wang teaches, assigning, by the processor, the reshaped neural network model to a dedicated hardware device selected by the user, wherein the at least one item of user selection information includes information indicating a kind of the dedicated hardware device by which the input neural network model is to be executed. (Section 3.3 ¶03 “the accelerator can operate in different precision/throughput modes. For example in Fig. 3, each PE can work in unison-mode as a single-issue 16-bit PE or in separate mode as double-issue 8-bit PE to suit the precision-mode of the reshaped CNN model” Examiner notes that the kind of dedicated hardware on which the reshaped model is to be run is mediated by the user's selected accuracy and speed parameters. The parameters determine the appropriate hardware configuration.)

Regarding Claim 11
	Yu/Wang teaches the method of claim 1.
Further Wang teaches, further comprising: determining, by a processor, a kind of an application currently being executed during runtime of the input neural network model, wherein the reshaped neural network model is generated by changing accuracy of the input neural network model according to a result of determining the kind of the application. (Section 3.5 ¶02 “Fig. 4 gives a detailed view about how the system manager orchestrates all these components to generate the QoS and QoR guaranteed configuration. It is shown that the user-specified QoS/QoR constraint will be used to obtain the performance/ accuracy headroom” 
[Figure: media_image3.png]
Examiner notes that the constraint manager tunes the input neural network model in accordance with the specified user constraints. During training or runtime, the kind of application is determined by the decision tree of figure 4 which generates the reshaped model by modifying the accuracy of the input neural network model as a result of the actions taken along the decision tree to balance performance and accuracy.)

Regarding Claim 12
	Yu/Wang teaches the method of claim 11.
Further Yu teaches, executing the input neural network model using a first hardware device before generating the reshaped neural network model, wherein a kernel size used in an operation of the first hardware device is different from a kernel size used in an operation of the at least one dedicated hardware device. (Section 5 ¶01 “Table 3 is the overview of the results. Scalpel achieves mean speedups of 3.54x, 2.61x, and 1.25x on the microcontroller, CPU and GPU, respectively” Examiner notes that in order to perform the comparison between the original DNN model and the pruned dedicated hardware model on the microcontroller, the input network model must first have been executed on a first hardware device. The corresponding reshaped model is executed on dedicated hardware, the microcontroller. Section 3.3 ¶04 “One neuron in the fully-connected layers or one feature map in the convolutional layers is considered as one node. Removing nodes in DNNs only shrinks the size of each layer” Examiner notes that the described method reshapes the input neural network. Shrinking the size of a layer corresponds to reducing the kernel size.)

Regarding Claim 19
Yu teaches the method of claim 17.
Yu teaches, wherein the neural network adaptor module is configured to, parse various items of information comprising kernel size information from the first neural network model, (Examiner notes that this limitation is addressed in the rejection of claim 18) manage information regarding execution of the first neural network model and information about a change in computing resource status of at least one hardware device during runtime; (Section 3.3 ¶004 “SIMD-aware weight pruning works layer by layer. The execution time for each layer will be generated at the beginning, and the pruning process starts with the layer of the highest execution time. Every time after retraining the pruned DNN [during runtime], the execution time of each layer will be updated. The new slowest layer will be pruned in the next iteration if the retrained DNN does not lose the original accuracy” Examiner notes, the SIMD-aware system manages information about computing resource status during runtime execution of the hardware.)
Yu does not explicitly teach, provide, using a user application programming interface (API), a user interface for a user of an electronic system employing the neural network system, receive, via the API, at least one item of 
Wang, however, when addressing issues related to a configurable neural network inference device teaches, provide, using a user application programming interface (API), a user interface for a user of an electronic system employing the neural network system, (Introduction “so that the accelerator [neural network system] can operate in the most energy-efficient point within the user-specified constraint of accuracy and speed” Examiner notes that a device that is mediated by a user-specified constraint corresponds to “a user interface for a user of an electronic system”) receive, via the API, at least one item of user selection information (Section 3.1 ¶01 “ELNA comprises a reconfigurable CNN accelerator that switches in different operation modes and a software CNN synthesizer that searches for the proper CNN topologies and suitable parameters to meet the user-specified QoR and QoS constraint” Examiner notes that in order for the system to synthesize suitable topologies, the user selection must have been received) and generate the second neural network model using, at least one of the parsed various items of information, the at least (Section 3.5 ¶02 “Fig. 4 gives a detailed view about how the system manager orchestrates all these components to generate the QoS and QoR guaranteed configuration. It is shown that the user-specified QoS/QoR constraint will be used to obtain the performance/ accuracy headroom” 
[Figure: media_image3.png]
Examiner notes that the constraint manager tunes the input neural network model in accordance with the specified user constraints. During training or runtime, the kind of application is determined by the decision tree of figure 4 which generates the reshaped model by modifying the accuracy of the input neural network model as a result of the actions taken along the decision tree to balance performance and accuracy.)
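It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate modifying an existing neural network model in accordance with parameters specified by a user selection, as taught by Wang, into the disclosed invention of Yu.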
One of ordinary skill in the art would have been motivated to make this modification in order to have “a better chance of satisfying real-time processing requirement in the embedded system through computation approximation, or work in more energy-efficient mode” (Wang Conclusion).

Conclusion
Prior art
US 20160328644 A1 teaches a method for adaptively selecting local processing units based on an assessment of system resources and performance specifications, and for selecting between a current and a new configuration based on a dynamic assessment.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHNATHAN R GERMICK whose telephone number is (571)272-8363. The examiner can normally be reached on Monday-Friday 7:30 am – 4:00 pm (EST).

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://portal.uspto.gov/external/portal. Should you have questions about access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
	
/J.R.G./Examiner, Art Unit 2122
/ERIC NILSSON/Primary Examiner, Art Unit 2122