DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 2, 9, 10, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Han et al., “Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding” (herein Han) in view of Wu et al., “Quantized Convolutional Neural Network for Mobile Devices” (herein Wu).

Regarding claims 1, 9, and 17, taking claim 9 as exemplary, Han teaches an apparatus for compressing a neural network, the apparatus comprising: 
at least one processor [Application running on a processor (e.g. mobile processor). Han at section 6.3]; and 
a memory storing instructions, the instructions when executed by the at least one processor, cause the at least one processor to perform operations [DRAM (i.e. memory) storing the application. Han at section 6.3], the operations comprising: 
acquiring a to-be-compressed trained neural network [Receiving a pruned and trained neural network to be quantized (i.e. compressed). See Han at FIG. 1 and Section 3.1]; 
selecting at least one layer from layers of the neural network as a to-be-compressed layer [Quantization is performed for each layer of the neural network. Han at Section 3.1. Therefore, a given layer of the neural network is selected each iteration/round of quantization.]; 
performing following processing steps sequentially on each of the to-be-compressed layers [Quantization is performed for each layer of the neural network. Han at Section 3.1.]: 
quantifying parameters of the to-be-compressed layer based on a specified number [The weights (i.e. parameters) of each layer are quantized into k (i.e. specified number) clusters. Han at section 3 and 3.1; FIGS. 1 and 3], and 
training the quantified neural network based on a preset training sample using a machine learning method [The neural network is retrained using backpropagation. See Han at section 3.3; FIG. 1]; and 
determining the neural network obtained after performing the processing steps on the selected at least one to-be-compressed layer as a compressed neural network, and storing the compressed neural network [The quantized model is a compressed neural network that is stored in DRAM/memory of the mobile device. See Han at section 5 and 6.3].
Further regarding claim 17, Han teaches a non-transitory computer-readable storage medium storing a computer program, the computer program when executed by one or more processors, causes the one or more processors to perform the operations above [DRAM (i.e. computer-readable storage medium) storing the application to execute on a processor. Han at section 6.3].
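For illustration only, the clustering step that the rejection maps to “quantifying parameters of the to-be-compressed layer based on a specified number” can be sketched as a one-dimensional k-means over a layer's weights. This is a minimal sketch under stated assumptions, not Han's implementation or the applicant's: the function name quantize_layer, the evenly spaced initial centroids, and the fixed iteration count are all introduced here for the example.

```python
import numpy as np


def quantize_layer(weights, k, iters=20):
    """Cluster one layer's weights into k shared values (1-D k-means).

    Hypothetical helper: name, initialization, and iteration count are
    assumptions of this sketch, not details taken from the cited art.
    """
    flat = weights.ravel().astype(float)
    # Assumed initialization: k centroids evenly spaced over the weight range.
    centroids = np.linspace(flat.min(), flat.max(), k)
    for _ in range(iters):
        # Assign every weight to its nearest centroid.
        codes = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
        # Move each non-empty centroid to the mean of its assigned weights.
        for j in range(k):
            members = flat[codes == j]
            if members.size:
                centroids[j] = members.mean()
    # Final assignment against the converged centroids.
    codes = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
    quantized = centroids[codes].reshape(weights.shape)
    return quantized, codes.reshape(weights.shape), centroids
```

After this step the layer holds at most k distinct weight values, so each weight can be stored as a small index into the centroid table; the retraining step cited at Han section 3.3 is not shown.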
Han doesn’t explicitly teach that the processing steps are performed sequentially on each of the to-be-compressed layers in descending order of a number of level of the to-be-compressed layer in the neural network. In the same field of compressing neural networks, Wu teaches performing quantization sequentially on each of the to-be-compressed layers in descending order of a number of level of the to-be-compressed layer in the neural network [Previous layers are quantized before following layers (i.e. in descending order of a number of level). Wu at section 3.3.3]. Wu teaches that quantizing previous layers before following layers (i.e. in descending order of a number of level) reduces the accumulation of errors, thereby increasing the accuracy and performance of the compressed neural network [Wu at section 3.3, 1st and 2nd paragraphs; section 3.3.3]. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the apparatus of Han so that the processing steps are performed sequentially on each of the to-be-compressed layers in descending order of a number of level of the to-be-compressed layer in the neural network, as taught by Wu, in order to increase performance of the compressed neural network.
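For illustration only, the combined teaching relied upon above (quantify one selected layer, retrain, then move to the next selected layer in descending order of level number) can be sketched as follows. This is a sketch under stated assumptions: the function process_layers_in_order, the layer_levels mapping, and the quantify and retrain callbacks are hypothetical names introduced here, and how level numbers are assigned to layers is not specified in this action.

```python
from typing import Callable, Dict, List


def process_layers_in_order(
    layer_levels: Dict[str, int],
    selected: List[str],
    quantify: Callable[[str], None],
    retrain: Callable[[], None],
) -> List[str]:
    """Quantify then retrain, one selected layer at a time.

    Hypothetical sketch: layers are visited in descending order of an
    assumed integer level number supplied by the caller.
    """
    order = sorted(selected, key=lambda name: layer_levels[name], reverse=True)
    for name in order:
        quantify(name)  # quantify the parameters of this layer only
        retrain()       # retrain the whole network before the next layer
    return order
```

Retraining after each layer, rather than once at the end, is what lets later layers adapt to the quantization error already introduced in the layers processed before them.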

Regarding claims 2 and 10, taking claim 10 as exemplary, Han and Wu teach the apparatus according to claim 9, wherein the selecting at least one layer from layers of the neural network as a to-be-compressed layer comprises: 
selecting, in response to the neural network comprising a convolutional layer and a fully connected layers, at least one of at least one convolutional layer or at least one fully connected layers as the to-be-compressed layer [The neural network is a convolutional neural network (CNN) comprising fully-connected layers and convolutional layers, each of which is quantized and, thereby, selected. Han at section 3 and 3.1].


Allowable Subject Matter
Claims 3-8 and 11-16 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter: the prior art of record doesn’t teach or suggest “stopping execution of the quantifying training operations in response to determining that an accuracy of the current trained neural network is not lower than a preset accuracy; and expanding the specified number and re-executing the quantifying training operations in response to determining that the accuracy of the currently trained neural network is lower than the preset accuracy” after quantizing a layer and training the quantized neural network, as recited in claims 3 and 11.
Wu et al., “Quantized Convolutional Neural Network for Mobile Devices”, (herein Wu) teaches compressing a convolutional neural network (CNN) by quantizing the fully-connected layers and convolutional layers of the CNN while minimizing the estimation error of each layer. However, Wu doesn’t teach or suggest “stopping execution of the quantifying training operations in response to determining that an accuracy of the current trained neural network is not lower than a preset accuracy; and expanding the specified number and re-executing the quantifying training operations in response to determining that the accuracy of the currently trained neural network is lower than the preset accuracy” after quantizing a layer and training the quantized neural network, as recited in claims 3 and 11.
Zhou et al., “Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights”, (herein Zhou) teaches compressing a CNN by dividing the weights into a first group that is quantized to a low precision and a second group that is re-trained to compensate for the accuracy loss from the quantization. However, Zhou doesn’t teach or suggest “stopping execution of the quantifying training operations in response to determining that an accuracy of the current trained neural network is not lower than a preset accuracy; and expanding the specified number and re-executing the quantifying training operations in response to determining that the accuracy of the currently trained neural network is lower than the preset accuracy” after quantizing a layer and training the quantized neural network, as recited in claims 3 and 11.
Gong et al., “Compressing Deep Convolutional Networks Using Vector Quantization”, (herein Gong) teaches compressing CNNs using different vector quantization methods. However, Gong doesn’t teach or suggest “stopping execution of the quantifying training operations in response to determining that an accuracy of the current trained neural network is not lower than a preset accuracy; and expanding the specified number and re-executing the quantifying training operations in response to determining that the accuracy of the currently trained neural network is lower than the preset accuracy” after quantizing a layer and training the quantized neural network, as recited in claims 3 and 11.
Claims 4-8 and 12-16 depend from claims 3 and 11, respectively, and are considered allowable for at least the reasons given above for claims 3 and 11.
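For illustration only, the limitation of claims 3 and 11 quoted in the reasons above (stop when the accuracy is not lower than the preset accuracy, otherwise expand the specified number and re-execute) can be sketched as a simple loop. This is a sketch under stated assumptions: the function quantify_until_accurate, the quantify_and_train callback, the doubling rule used for “expanding”, and the max_rounds cap are hypothetical and are not taken from the claims or from the prior art of record.

```python
from typing import Callable, Tuple


def quantify_until_accurate(
    quantify_and_train: Callable[[int], float],
    specified_number: int,
    preset_accuracy: float,
    expand: Callable[[int], int] = lambda n: n * 2,  # assumed expansion rule
    max_rounds: int = 10,                            # assumed safety cap
) -> Tuple[int, float]:
    """Repeat quantify-and-train, enlarging the specified number on failure.

    quantify_and_train(n) is a hypothetical callback that quantifies the
    layer using n, retrains the network, and returns the measured accuracy.
    """
    number = specified_number
    for _ in range(max_rounds):
        accuracy = quantify_and_train(number)
        if accuracy >= preset_accuracy:
            # Accuracy is not lower than the preset accuracy: stop.
            return number, accuracy
        # Accuracy is lower than the preset accuracy: expand and re-execute.
        number = expand(number)
    raise RuntimeError("preset accuracy not reached within max_rounds")
```

The accuracy check is what distinguishes this loop from a single pass of quantization followed by retraining, where the specified number is fixed in advance.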

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BENJAMIN P GEIB whose telephone number is (571)272-8628. The examiner can normally be reached on Monday - Friday 8:30 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, ALEXEY SHMATOV, can be reached on (571)270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/BENJAMIN P GEIB/Primary Examiner, Art Unit 2123