Detailed Action
This action is in response to Applicant's communications filed 06 September 2017.  
Claims 1-20 are pending in this Application.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 08 November 2017 was filed prior to the first action on the merits (FAOM).  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
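A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.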

Claim(s) 1-2, 8-9, 11-12, and 18-20 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Han et al. (Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, hereinafter "Han").

Regarding Claim 1,
Han teaches a method for network quantization using multi-dimensional vectors, comprising: constructing multi-dimensional vectors representing network parameters ("Weight sharing is illustrated in Figure 3. Suppose we have a layer that has 4 input neurons and 4 output neurons, the weight is a 4 × 4 matrix. On the top left is the 4 × 4 weight matrix" sec. 3, p. 3) from a trained neural network model ("Network quantization and weight sharing further compresses the pruned network by reducing the number of bits required to represent each weight." sec. 3, p. 3); 
quantizing the multi-dimensional vectors to obtain shared quantized vectors as cluster centers ("We use k-means clustering to identify the shared weights for each layer of a trained network, so that all the weights that fall into the same cluster will share the same weight." sec. 3.1, p. 4); 
fine-tuning the shared quantized vectors/cluster centers ("The centroids of the one-dimensional k-means clustering are the shared weights. There is one level of indirection during feed forward phase and back-propagation phase looking up the weight table. An index into the shared weight table is stored for each connection. During back-propagation, the gradient for each shared weight is calculated and used to update the shared weight. This procedure is shown in Figure 3." sec. 3.3, p. 5); and 
encoding using the shared quantized vectors/cluster centers ("A Huffman code is an optimal prefix code commonly used for lossless data compression(Van Leeuwen, 1976). It uses variable-length codewords to encode source symbols. The table is derived from the occurrence probability for each symbol. More common symbols are represented with fewer bits." sec. 4, p. 5; "We pruned, quantized, and Huffman encoded four networks" sec. 5, p. 5).
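
For illustration only, the weight sharing and centroid fine-tuning quoted above from Han can be sketched as follows. This is a minimal sketch and not Han's implementation: the function names `kmeans_1d` and `fine_tune_step`, the evenly spaced centroid initialization, and the learning-rate parameter `lr` are assumptions introduced here.

```python
def kmeans_1d(weights, k, iters=20):
    """Cluster scalar weights into k shared values using plain Lloyd iterations."""
    lo, hi = min(weights), max(weights)
    # Assumption: centroids start evenly spaced between the min and max weight.
    if k > 1:
        centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    else:
        centroids = [(lo + hi) / 2]

    def nearest(w):
        return min(range(k), key=lambda c: abs(w - centroids[c]))

    for _ in range(iters):
        assign = [nearest(w) for w in weights]
        for c in range(k):
            members = [w for w, a in zip(weights, assign) if a == c]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[c] = sum(members) / len(members)
    # Final assignment against the final centroids: one index per connection.
    assign = [nearest(w) for w in weights]
    return centroids, assign


def fine_tune_step(centroids, assign, grads, lr):
    """Sum the gradients of all weights sharing a centroid, then update it."""
    summed = [0.0] * len(centroids)
    for a, g in zip(assign, grads):
        summed[a] += g
    return [c - lr * s for c, s in zip(centroids, summed)]
```

Each connection stores only its cluster index, and the shared value is looked up in `centroids`; a gradient step moves every weight in a cluster together.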

Regarding Claim 2,
Han teaches the method of claim 1.  Han further teaches wherein encoding comprises: creating binary codewords using entropy coding on the shared quantized vectors/cluster centers ("A Huffman code is an optimal prefix code commonly used for lossless data compression(Van Leeuwen, 1976). It uses variable-length codewords to encode source symbols. The table is derived from the occurrence probability for each symbol. More common symbols are represented with fewer bits." sec. 4, p. 5; "We pruned, quantized, and Huffman encoded four networks" sec. 5, p. 5; Examiner notes that Huffman coding is a common entropy encoding technique).
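
As a non-limiting illustration of the entropy coding noted above, a Huffman code over cluster indices can be sketched as follows. The function name `huffman_code` and the example index sequence are hypothetical and do not appear in Han; the sketch only shows that more frequent symbols receive shorter binary codewords.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return a dict mapping each symbol to a prefix-free binary codeword."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tiebreak, {symbol: partial codeword}).
    # The unique tiebreak keeps the dicts from ever being compared.
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        n0, _, c0 = heapq.heappop(heap)  # two least frequent subtrees
        n1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (n0 + n1, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


indices = [0, 0, 0, 0, 1, 1, 2, 3]  # hypothetical cluster indices
code = huffman_code(indices)
total_bits = sum(len(code[s]) for s in indices)
```

For this index sequence the variable-length code uses 14 bits in total, compared with 16 bits for a fixed 2-bit index.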

Regarding Claim 8,
(Figure 3, cluster index, p.3; "A Huffman code is an optimal prefix code commonly used for lossless data compression(Van Leeuwen, 1976). It uses variable-length codewords to encode source symbols. The table is derived from the occurrence probability for each symbol. More common symbols are represented with fewer bits." sec. 4, p. 5).

Regarding Claim 9,
Han teaches the method of claim 8.  Han further teaches wherein converting the quantized vectors into symbols comprises: using a lookup table to find symbols corresponding to the quantized vectors (Figure 3, centroids, cluster index; each of the symbols 0, 1, 2, and 3 corresponds to a centroid value; "A Huffman code is an optimal prefix code commonly used for lossless data compression(Van Leeuwen, 1976). It uses variable-length codewords to encode source symbols. The table is derived from the occurrence probability for each symbol. More common symbols are represented with fewer bits." sec. 4, p. 5).

Regarding Claim(s) 11-12 and 18,
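Claim(s) 11-12 and 18 recite(s) an apparatus including a processor (Han: "processor", sec. 6.3, p. 9) and non-transitory computer-readable media (Han: "DRAM", sec. 6.3, p. 9) storing instructions for performing functions corresponding to the method steps recited in claim(s) 1-2 and 9, respectively.  Han teaches the limitations of claim(s) 11-12 and 18 as set forth above in connection with claim(s) 1-2 and 9.  Therefore, claim(s) 11-12 and 18 is/are rejected under the same rationale as respective claim(s) 1-2 and 9.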

Regarding Claim(s) 19,
Claim(s) 19 recite(s) a method for manufacturing a chipset (Han: comparison of hardware performance in sec. 6.3, p. 9) comprising a processor (Han: "processor", sec. 6.3, p. 9) and non-transitory computer-readable media (Han: "DRAM", sec. 6.3, p. 9) storing instructions for performing functions corresponding to the method steps recited in claim(s) 1.  Han teaches the limitations of claim(s) 19 as set forth above in connection with claim(s) 1.  Therefore, claim(s) 19 is/are rejected under the same rationale as respective claim(s) 1.

Regarding Claim(s) 20,
Claim(s) 20 recite(s) a method for testing an apparatus (Han: comparison of hardware performance in sec. 6.3, p. 9) including a processor (Han: "processor", sec. 6.3, p. 9) and non-transitory computer-readable media (Han: "DRAM", sec. 6.3, p. 9) storing instructions for performing functions corresponding to the method steps recited in claim(s) 1, respectively.  Han teaches the limitations of claim(s) 20 as set forth above in connection with claim(s) 1.  Therefore, claim(s) 20 is/are rejected under the same rationale as respective claim(s) 1.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim(s) 3-7, 10, and 13-17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, hereinafter "Han") in view of Zamir et al. (On Universal Quantization by Randomized Uniform/Lattice Quantizers, hereinafter "Zamir").

Regarding Claim 3,
Han teaches the method of claim 1.  Han does not explicitly teach wherein quantizing the multi-dimensional vectors comprises: using lattice quantization to quantize the multi-dimensional vectors.
Zamir teaches wherein quantizing the multi-dimensional vectors comprises: using lattice quantization to quantize the multi-dimensional vectors ("Uniform quantization with dither, or more generally lattice quantization with dither, followed by a universal lossless source encoder (entropy encoder) is a simple procedure for universal coding with distortion of a source that may take continuous real values.  This procedure is universal since it does not depend on the source statistics.  Due to the dither, the distortion in this procedure is independent of the source value.  In this correspondence, we consider the rate performance, i.e., the entropy, of the uniform (lattice) randomized quantizer as compared with the optimal rate given by the rate-distortion function of the source." sec. I.A., p. 429).
Han and Zamir are analogous art because both are directed to quantization. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the compression methods of Han with the lattice quantizer of Zamir.  The modification would have been obvious because one of ordinary skill in the art would be motivated to implement a simple procedure with universal coding with distortion of a source that may take continuously many values, as suggested by Zamir (Zamir: Abstract, pp. 428-429).
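
For illustration only, lattice quantization of multi-dimensional parameter vectors can be sketched with the simplest lattice, the scaled integer (cubic) lattice. This is an assumption made for the example: Zamir addresses general lattices, and the names `group_into_vectors` and `quantize_to_cubic_lattice` and the `step` parameter are hypothetical.

```python
import math


def group_into_vectors(params, dim):
    """Split a flat parameter list into consecutive dim-dimensional vectors."""
    if len(params) % dim:
        raise ValueError("dim must divide the number of parameters")
    return [tuple(params[i:i + dim]) for i in range(0, len(params), dim)]


def quantize_to_cubic_lattice(vec, step):
    """Map a vector to the nearest point of the lattice step * Z^K."""
    # floor(v / step + 0.5) rounds to the nearest integer with ties going up,
    # which avoids the round-half-to-even behavior of Python's round().
    return tuple(step * math.floor(v / step + 0.5) for v in vec)


vectors = group_into_vectors([0.12, -0.49, 0.74, 0.26], dim=2)
quantized = [quantize_to_cubic_lattice(v, step=0.5) for v in vectors]
```

Every quantized vector is a lattice point, so distinct vectors that fall in the same lattice cell share one quantized value.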

Regarding Claim 4,
Han teaches the method of claim 1.  Han does not explicitly teach wherein quantizing the multi-dimensional vectors comprises: using an iterative algorithm to solve an entropy-constrained vector quantization.
Zamir teaches wherein quantizing the multi-dimensional vectors comprises: using an iterative algorithm to solve an entropy-constrained vector quantization ("The quantizers above can be used to encode a source vector x ∈ R^n as follows.  In the scalar case, a dither is added independently to each source component and the result is then quantized component by component.  In the lattice case, we assume that K divides n and the input is considered as a concatenation of n/K K-dimensional vectors, quantized independently, using independent vector samples of the dither.  In both cases the entropy coder will then take into account the statistical properties of the entire n-dimensional vector." sec. I.A, p. 429; "under these conditions it was shown that the optimal, entropy-constrained, quantizer becomes a uniform (or lattice) quantizer" sec. II.C, p. 433).
Han and Zamir are analogous art because both are directed to quantization. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the compression methods of Han with the lattice quantizer of Zamir.  The modification would have been obvious because one of ordinary skill in the art would be motivated to implement a simple procedure with universal coding with distortion of a source that may take continuously many values, as suggested by Zamir (Zamir: Abstract, pp. 428-429).

Regarding Claim 5,
Han teaches the method of claim 1.  Han does not explicitly teach wherein quantizing the multi-dimensional vectors comprises: using universal quantization to quantize the multi-dimensional vectors.
Zamir teaches using universal quantization to quantize the multi-dimensional vectors ("Uniform quantization with dither, or more generally lattice quantization with dither, followed by a universal lossless source encoder (entropy encoder) is a simple procedure for universal coding with distortion of a source that may take continuous real values.  This procedure is universal since it does not depend on the source statistics.  Due to the dither, the distortion in this procedure is independent of the source value.  In this correspondence, we consider the rate performance, i.e., the entropy, of the uniform (lattice) randomized quantizer as compared with the optimal rate given by the rate-distortion function of the source." sec. I.A., p. 429).
Han and Zamir are analogous art because both are directed to quantization. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the compression methods of Han with the lattice quantizer of Zamir.  The modification would have been obvious because one of ordinary skill in the art would be motivated to implement a simple procedure with universal coding with distortion of a source that may take continuously many values, as suggested by Zamir (Zamir: Abstract, pp. 428-429).

Regarding Claim 6,
Han teaches the method of claim 5.  Han does not explicitly teach randomizing the network parameters using dithering before quantizing.
Zamir teaches randomizing the network parameters using dithering before quantizing ("Uniform quantization with dither, or more generally lattice quantization with dither, followed by a universal lossless source encoder (entropy encoder) is a simple procedure for universal coding with distortion of a source that may take continuous real values.  This procedure is universal since it does not depend on the source statistics.  Due to the dither, the distortion in this procedure is independent of the source value.  In this correspondence, we consider the rate performance, i.e., the entropy, of the uniform (lattice) randomized quantizer as compared with the optimal rate given by the rate-distortion function of the source." sec. I.A., p. 429; [image: excerpt from Zamir], sec. I.A, p. 429).
Han and Zamir are analogous art because both are directed to quantization. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the compression methods of Han with the lattice quantizer of Zamir.  The modification would have been obvious because one of ordinary skill in the art would be motivated to implement a simple procedure with universal coding with distortion of a source that may take continuously many values, as suggested by Zamir (Zamir: Abstract, pp. 428-429).

Regarding Claim 7,
Han teaches the method of claim 6.  Han does not explicitly teach wherein randomizing the network parameters using dithering comprises: using a pseudo-random number generator to generate random dithering values.
Zamir teaches using a pseudo-random number generator to generate random dithering values ("Uniform quantization with dither, or more generally lattice quantization with dither, followed by a universal lossless source encoder (entropy encoder) is a simple procedure for universal coding with distortion of a source that may take continuous real values.  This procedure is universal since it does not depend on the source statistics.  Due to the dither, the distortion in this procedure is independent of the source value.  In this correspondence, we consider the rate performance, i.e., the entropy, of the uniform (lattice) randomized quantizer as compared with the optimal rate given by the rate-distortion function of the source." sec. I.A., p. 429; [image: excerpt from Zamir], sec. I.A, p. 429).
Han and Zamir are analogous art because both are directed to quantization. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the compression methods of Han with the lattice quantizer of Zamir.  The modification would have been obvious because one of ordinary skill in the art would be motivated to implement a simple procedure with universal coding with distortion of a source that may take continuously many values, as suggested by Zamir (Zamir: Abstract, pp. 428-429).
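
As a non-limiting illustration of the dithering discussed for claims 6 and 7, uniform quantization with pseudo-random dither can be sketched as follows. The sketch assumes a subtractive-dither scheme in which the encoder and decoder regenerate identical dither values from a shared seed; the names `make_dither`, `encode`, and `decode` and the `seed` and `step` parameters are hypothetical, and Python's `random.Random` is used only as one example of a pseudo-random number generator.

```python
import math
import random


def make_dither(seed, n, step):
    """Generate n pseudo-random dither values, uniform over one quantization cell."""
    rng = random.Random(seed)  # seeded generator, so the sequence is reproducible
    return [rng.uniform(-step / 2, step / 2) for _ in range(n)]


def encode(params, step, seed):
    """Add dither to each parameter, then quantize to an integer index."""
    dither = make_dither(seed, len(params), step)
    return [math.floor((x + u) / step + 0.5) for x, u in zip(params, dither)]


def decode(indices, step, seed):
    """Rebuild the same dither from the seed and subtract it after dequantizing."""
    dither = make_dither(seed, len(indices), step)
    return [step * k - u for k, u in zip(indices, dither)]


params = [0.31, -0.07, 1.25, 0.0]  # hypothetical network parameters
indices = encode(params, step=0.5, seed=7)
restored = decode(indices, step=0.5, seed=7)
```

Only the integer indices and the seed need to be shared; each reconstruction error is bounded by half the step size regardless of the parameter value.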

Regarding Claim 10,
Han teaches the method of claim 8.  Han does not explicitly teach wherein encoding comprises: using universal coding.
Zamir teaches wherein encoding comprises: using universal coding ("Uniform quantization with dither, or more generally lattice quantization with dither, followed by a universal lossless source encoder (entropy encoder) is a simple procedure for universal coding with distortion of a source that may take continuous real values.  This procedure is universal since it does not depend on the source statistics.  Due to the dither, the distortion in this procedure is independent of the source value.  In this correspondence, we consider the rate performance, i.e., the entropy, of the uniform (lattice) randomized quantizer as compared with the optimal rate given by the rate-distortion function of the source." sec. I.A., p. 429).
Han and Zamir are analogous art because both are directed to quantization. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the compression methods of Han with the lattice quantizer of Zamir.  The modification would have been obvious because one of ordinary skill in the art would be motivated to implement a simple procedure with universal coding with distortion of a source that may take continuously many values, as suggested by Zamir (Zamir: Abstract, pp. 428-429).

Regarding Claim(s) 13-17,
Claim(s) 13-17 recite(s) an apparatus including a processor (Han: "processor", sec. 6.3, p. 9) and non-transitory computer-readable media (Han: "DRAM", sec. 6.3, p. 9) storing instructions for performing functions corresponding to the method steps recited in claim(s) 3-7, respectively.  The Han/Zamir combination teaches the limitations of claim(s) 13-17 as set forth above in connection with claim(s) 3-7.  Therefore, claim(s) 13-17 is/are rejected under the same rationale as respective claim(s) 3-7.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHARLES C KUO whose telephone number is (571)270-7477.  The examiner can normally be reached on M-F: 9:00 a.m. - 6:00 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo, can be reached at (571) 272-9767.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information 






/CHARLES C KUO/Examiner, Art Unit 2126  
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126