DETAILED ACTION 
Status of Claims 
This action is in response to the amendment filed on 9/1/2022 for application 16/001,923. Claims 1, 3 – 9 and 11 – 17 are pending and have been examined. 
The claim rejection under 35 U.S.C. 112 has been withdrawn in light of the applicant’s amendment and remarks.
 
Notice of Pre-AIA or AIA Status 
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. 
 
Information Disclosure Statement 
The information disclosure statements (IDS) submitted on 4/13/2020 and 11/16/2022 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner. 

Claim Rejections - 35 USC § 112 
 
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.


Claims 1, 3 – 9 and 11 – 17 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Claims 1 and 9 were amended to add the limitation “analyzing an intra-layer and inter-layer sparsity of the DNN model”. However, the examiner does not find support for this limitation in the original disclosure. For examination purposes, the limitation is interpreted as an analysis of single-layer sparsity and multi-layer sparsity of the DNN model.  
Dependent claims 3-8 and 11-17 are rejected for the same reason.

Response to Argument
Applicant's arguments filed on 9/1/2022 regarding the claim rejections under 35 U.S.C. 112(b) and 35 U.S.C. 103 have been fully considered but are not persuasive. 
Regarding the claim rejection under 35 U.S.C. 112(b), applicant failed to point out where the original disclosure supports the rejected limitation, in particular the terms “inter-layer sparsity” and “intra-layer sparsity”. Thus, for the same reason set forth above, the claims are rejected under 35 U.S.C. 112(b). 
Regarding the claim rejection under 35 U.S.C. 103, applicant states that Han in view of Han 2 fails to teach compression using low-rank approximation in combination with pruning, quantization and sparsity analysis. In particular, applicant relies on the diagram from https://faculty. ucmerced. edu/mcarreira-perpinan/papers/ij cnn2 l c. pdf and the low-rank approximation technique disclosed in https://openaccess.thecvfcom/content ICCV 2017/papers/He Channel Pruning for IC CV 2017 paper.pdf. Examiner respectfully disagrees. The two references mentioned are not part of the original disclosure of the application. Instead, paragraph 0031 in the specification of the instant application discloses: a low-rank approximation method to the hidden layers and the output layer to reduce in the pre-trained DNN model according to an analysis result. As mentioned above, the pre-trained DNN model comprises a plurality of neurons, each neuron corresponding to multiple parameters, e.g. the weight w and the bias b. Among these parameters, some are redundant and do not contribute a lot to the output. If the neurons could be ranked in the network according to the contribution, the low ranking neurons from the network could be removed to generate a smaller and faster network, i.e. the reconfigured model. At least, Han 2 discloses pruning the low-weight connections (Han 2, section 3, paragraph 2, line 1); after pruning connections, neurons with zero input connections or zero output connections may be safely pruned (Han 2, section 3.5, paragraph 1, line 1 – 2). In other words, the neurons that have low weights/ranks in the input connections and/or output connections do not contribute a lot to the output and thus could be removed. Thus, Han in view of Han 2 teaches compression, low-rank approximation, pruning, quantization and sparsity analysis as pointed out in the prior art rejection section.  
Applicant further states that in the instant application, the low-rank approximation is performed on the pruned and quantized network; the sparsity analysis taught by the instant application is carried out on an uncompressed network and the LRA is then carried out on a pruned and quantized network, and thus Han in view of Han 2 does not teach the claimed limitations in Claims 1 and 9. Examiner respectfully disagrees. The features upon which applicant relies are not recited in the rejected claims. Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).   

Claim Rejections - 35 USC § 103 
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.   
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention. 
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: 
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. 
 
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows: 
1. Determining the scope and contents of the prior art. 
2. Ascertaining the differences between the prior art and the claims at issue. 
3. Resolving the level of ordinary skill in the pertinent art. 
4. Considering objective evidence present in the application indicating obviousness or nonobviousness. 


Claims 1, 3, 5 – 9, 11 and 13 – 17 are rejected under 35 U.S.C. 103 as being unpatentable over Han (Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, arXiv, Feb. 2016) in view of Han 2 (Learning both Weights and Connections for Efficient Neural Networks, arXiv, 2015). 

Regarding Claim 1, Han discloses a self-tuning model compression methodology (sec. 2, para. 1, where the system learns the connectivity to prune connections of the model [self-tuning, model compression]) for reconfiguring a Deep Neural Network (DNN), comprising: 
receiving a DNN model and a data set (sec. 5.1, para. 1, where the MNIST dataset [data set] and the LeNet-5 network [DNN] are used), wherein the DNN model comprises an input layer, at least one hidden layer and an output layer, and said at least one hidden layer and the output layer of the DNN model comprise a plurality of neurons (sec. 5.1, para. 1, & tbl. 3, where LeNet-5 [DNN] is a convolutional network that has two convolutional layers and two fully connected layers; the first layer is the input layer, the last layer is the output layer, and the hidden layers are between the input and output layers. Each layer has at least one neuron and thus LeNet-5 has a plurality of neurons); 
compressing the DNN model into a reconfigured model according to the data set (tbl. 3, where the original model [DNN model] in column 1 is pruned and quantized based on the MNIST dataset into the compressed model [reconfigured model] in columns 2 and 3), wherein the reconfigured model comprises an input layer, at least one hidden layer and an output layer, and said at least one hidden layer and the output layer of the reconfigured model comprise a plurality of neurons, and a size of the reconfigured model is smaller than a size of the DNN model (tbl. 3, where after compression each layer has fewer weights but the model still retains multiple layers: first layer [input layer], last layer [output layer] and middle layers [hidden layers]. The compressed model [reconfigured model] has fewer weights and bits [smaller size] than the original model [DNN model]);
wherein the step of compressing the DNN model into a reconfigured model according to the data set comprises: analyzing an intra-layer sparsity of DNN model to generate an analysis result, pruning and quantizing a network redundancy of the DNN model (Han, sec. 2, para. 1, fig. 1 & fig. 4, where by learning the connectivity via normal network training, … prune the small-weight connections: all connections with weights below a threshold are removed; i.e., the weight sparsity within a layer is analyzed by checking whether each weight is close to zero; fig. 1, where the network is pruned and quantized); and applying a low-rank approximation method to said at least one hidden layer and the output layer of the DNN model according to the analysis result (Han, abs. ln. 6 – 7, where the method prunes the network by learning only the important connections; i.e., approximates the network model by using only the low ranked connections; tbl. 5, where all layers of the DNN including the hidden layers and the last layer are pruned);
and executing the reconfigured model on a user terminal for an end-user application (sec. 6.3, para. 3, where the pruned sparse model [reconfigured model] is benchmarked [execute an end-user application] on off-the-shelf hardware [user terminal]).
Han does not explicitly disclose: 
analyzing an intra-layer sparsity of DNN model to generate an analysis result 
Han 2 explicitly discloses:
analyzing an intra-layer sparsity of DNN model to generate an analysis result (Han2, section 3.5, where neurons with zero input connections [from prior layer] or zero output connections [to following layer] may be safely pruned)
Han and Han 2 both teach neural network compression techniques and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Han’s teaching of neural network compression with weight pruning and quantization with Han 2’s teaching of neural network compression with neuron pruning to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification in order to reduce the storage and computation of neural networks by an order of magnitude without affecting their accuracy (Han 2, abs. ln. 4 – 5). 
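
For illustration only, the threshold-based connection pruning cited from Han (sec. 2, para. 1) and the resulting sparsity within a single layer can be sketched as follows. This is a minimal sketch and is not taken from either reference or from the instant application; the function names, the threshold value and the toy weights are assumptions chosen for the example.

```python
# Illustrative sketch only: names, threshold and weights are hypothetical.

def prune_layer(weights, threshold):
    """Zero every connection whose absolute weight is below the threshold."""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def layer_sparsity(weights):
    """Return the fraction of zero-valued connections within one layer."""
    total = sum(len(row) for row in weights)
    zeros = sum(1 for row in weights for w in row if w == 0.0)
    return zeros / total if total else 0.0


# Toy layer: 2 neurons, each with 3 input connections.
layer = [[0.80, -0.02, 0.01],
         [0.03, -0.60, 0.00]]

pruned = prune_layer(layer, threshold=0.05)
print(pruned)                  # small-weight connections are now zero
print(layer_sparsity(pruned))  # share of removed connections in this layer
```

In this toy layer only the two large-magnitude weights survive, so four of the six connections are counted as sparse.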
 
Regarding Claim 3, Han in view of Han 2 further discloses: 
wherein a number of the plurality of neurons of the reconfigured model is less than a number of the plurality of neurons of the DNN model (Han 2, fig. 3, where the model after pruning [reconfigured model] has fewer neurons than the model before pruning [DNN model]). 
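
For illustration only, the neuron pruning cited from Han 2 (section 3.5), in which a neuron with zero input connections or zero output connections may be removed, can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from the reference; the function name, the weight layout and the toy values are hypothetical.

```python
# Illustrative sketch only: function name, layout and weights are hypothetical.

def prunable_neurons(w_in, w_out):
    """Return indices of hidden neurons with no input or no output connections.

    w_in[j]     : weights of the connections entering hidden neuron j
    w_out[k][j] : weight of the connection from hidden neuron j to output k
    """
    prunable = []
    for j, incoming in enumerate(w_in):
        no_input = all(w == 0.0 for w in incoming)
        no_output = all(row[j] == 0.0 for row in w_out)
        if no_input or no_output:
            prunable.append(j)
    return prunable


# Toy network after connection pruning: 3 hidden neurons, 2 outputs.
w_in = [[0.5, 0.0],    # neuron 0 still receives input
        [0.0, 0.0],    # neuron 1 has no input connections left
        [0.3, -0.4]]   # neuron 2 still receives input
w_out = [[0.7, 0.2, 0.0],    # output 0
         [-0.1, 0.0, 0.0]]   # output 1 (neuron 2 feeds no output)

print(prunable_neurons(w_in, w_out))
```

Removing the reported neurons leaves a network with fewer neurons than the network before pruning.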

Regarding Claim 5, Han further discloses: retraining the reconfigured model with the data set (abs. ln. 8, where after pruning and quantization the network is retrained [retraining] to fine-tune the remaining connections and the quantized centroids). 

Regarding Claim 6, Han in view of Han 2 further discloses: wherein the DNN model is used for computer vision targeted application models including AlexNet, a VGG16, a ResNet, and a MobileNet (Han, see sections 5.2 and 5.3 for AlexNet and VGG-16 on ImageNet [vision targeted]; in para. 0021 of the specification of the instant application, AlexNet, VGG16, ResNet or MobileNet are examples of pre-trained models, and the examiner interprets the claim as an alternative limitation of any one of the examples) and natural language understanding application models (Han 2, see sec. 1, para. 1 for speech recognition and natural language processing [natural language understanding application model]). 
Han and Han 2 both teach neural network compression techniques and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Han’s teaching of neural network compression with weight pruning and quantization with Han 2’s teaching of using a DNN for natural language processing to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification in order to address language processing needs using a DNN model (Han 2, sec. 1, para. 1).

Regarding Claim 7, Han further discloses: wherein each of said at least one hidden layer and the output layer of the reconfigured model is a convolutional layer or a fully-connected layer (sec. 5.1 & tbl. 3, where in LeNet-5 one middle layer is a convolutional layer and the last layer [output layer] is a fully connected layer). 

Regarding Claim 8, Han in view of Han 2 further discloses: wherein the end-user application is a visual recognition application or a speech recognition application (Han 2, intro, ln. 1, where neural network applications include speech recognition). 
The reason for the combination is the same as for Claim 6. 

Regarding Claim 9, Claim 9 is the electronic device claim corresponding to Claim 1 without the last limitation. Han further discloses: an electronic device, comprising: a storage device, arranged to store a program code; and a processor, arranged to execute the program code; wherein when loaded and executed by processor, the program code instructs the processor to execute the following steps (sec. 6.3, para. 3 – 4, where the method is programmed using BLAS [program code] loaded in DRAM [storage device] and executed by a CPU [processor] in off-the-shelf hardware [electronic device]). Claim 9 is rejected for the same reason as Claim 1.  

Regarding Claims 11, 13 – 15 and 17, these claims are the electronic device claims corresponding to Claims 3, 5 – 7 and 8. Claims 11, 13 – 15 and 17 are rejected for the same reasons as Claims 3, 5 – 7 and 8.

Regarding Claim 16, Claim 16 is the electronic device claim corresponding to Claim 1. Claim 16 is rejected for the same reason as Claim 1. 

Claims 4 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Han (Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, arXiv, Feb. 2016) in view of Han 2 (Learning both Weights and Connections for Efficient Neural Networks, arXiv, 2015), further in view of Judd, US20170357891, Accelerator for Deep Neural Networks, and further in view of Porrmann, Implementation of Artificial Neural Networks on a Reconfigurable Hardware Accelerator, Proceedings of the 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing, 2002. 
 
Regarding Claim 4, Han in view of Han 2 shows compressing a neural network according to an analysis result (see Han, page 2, sec. 2, Network Pruning, where connections below a threshold are removed from the network); 
however, Han in view of Han 2 does not specify that the neurons are realized by a logic circuit with a multiplexer or an adder, and does not explicitly disclose: wherein each of the plurality of neurons of the reconfigured model corresponds to at least one logic circuit comprising at least one of a multiplexer and an adder, each of the plurality of neurons of the DNN model corresponds to at least one logic circuit comprising at least one of a multiplexer and an adder. 
Judd explicitly discloses the logic circuit and multiplexer or adder hardware realization for Han (in view of Han 2): wherein each of the plurality of neurons of the reconfigured model corresponds to at least one logic circuit comprising at least one of a multiplexer and an adder (Judd, para. 0112, 0113 & fig. 13, where SIP 270a [logic circuit] comprises multiplexer 1327 and adder 1330; SIP 270a is used to accelerate neuron operations as it multiplies weights WR1220 with input 1320 at 1310, sums the product at 1330 and 1340, and produces neuron output nbout), each of the plurality of neurons of the DNN model corresponds to at least one logic circuit comprising at least one of a multiplexer and an adder (Judd, para. 0112, 0113 & fig. 13, where SIP 270a [logic circuit] comprises multiplexer 1327 and adder 1330; SIP 270a is used to accelerate neuron operations as it multiplies weights WR1220 with input 1320 at 1310, sums the product at 1330 and 1340, and produces neuron output nbout). 
Han (in view of Han 2) and Judd both disclose neural network acceleration applications and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Han (in view of Han 2)’s teaching of a neural network compression method with Judd’s teaching of an acceleration apparatus to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification in order to improve energy efficiency and reduce computation demand (Judd, para. 0002, ln. 6 – 9). 
Han in view of Han 2 and Judd does not explicitly disclose: removing a portion of the logic circuits in the DNN model according to the analysis result so that a number of logic circuits in the reconfigured model is less than a number of logic circuits in the DNN model. 
Porrmann explicitly discloses: removing a portion of the logic circuits in the DNN model according to the analysis result so that a number of logic circuits in the reconfigured model is less than a number of logic circuits in the DNN model (Porrmann, sec. 1, para. 3, ln. 12 – 22, where reconfigurable hardware accelerator … system can be reconfigured for the different task within one application … hardware can always be mapped optimally; i.e., the hardware can be reconfigured after pruning and quantization [according to the analysis result] to a model which includes fewer [a portion] neurons [logic circuits] than the model before pruning [DNN model]). 
Han (in view of Han 2 and Judd) and Porrmann both teach neural network hardware implementation and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Han (in view of Han 2 and Judd)’s teaching of a neural network compression method and apparatus with Porrmann’s teaching of a reconfigurable neural network hardware accelerator to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification in order to improve resource efficiency with respect to speed, compactness and power consumption (Porrmann, sec. 6, ln. 8 – 11).
 
Regarding Claim 12, Claim 12 is the electronic device claim corresponding to Claim 4. Claim 12 is rejected for the same reason as Claim 4.

Conclusion 
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHIEN MING CHOU whose telephone number is (571) 272-9354.  The examiner can normally be reached Monday - Friday, 9 am - 5 pm. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CHAKI KAKALI can be reached on (571) 272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 


/S.C./Examiner, Art Unit 2122 

/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122