DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The present application was filed on 09/22/2017.
This action is in response to arguments and/or remarks filed on 02/23/2021. In the current amendment, claims 1, 6-11 and 13 have been amended, claims 4-5, 12 and 14 have been cancelled, and claims 15-19 have been added. Claims 1-3, 6-11, 13 and 15-19 are currently pending and have been examined.
In response to amendments and/or arguments filed on 02/23/2021, the 35 USC 101 rejections made in the previous Office Action have been withdrawn.
In response to amendments and/or arguments filed on 02/23/2021, the 35 USC 112(b) rejections made in the previous Office Action have been withdrawn.

Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The certified copy has been filed in parent Application No. JP2016-188412, filed on 09/27/2016.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
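(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.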

Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
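This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: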
Claim 1: 
“division unit configured to divide a weight parameter…”
“encoding unit configured to approximate the weight…”
Claim 2: 
“division unit configured to divide a weight parameter…”
Claim 7: 
“reconstruction unit reads and uses…”
Claim 9: 
“instruction unit configured to allow a user to instruct…”
Claim 15: 
“encoding unit performs learning…”
Claim 16: 
“instruction unit receives, from user…”
Claim 17: 
“encoding unit encodes the weight parameter…”
Claim 18: 
“dividing unit divides the weight parameter…”
Claim 19: 
“reconstruction unit configured to reconstruct the weight parameter…”

The “encoding unit,” “division unit,” “reconstruction unit” and “instruction unit” all have sufficient structure in the specification, as shown in the following paragraphs.
[0019] FIG. 1 is a diagram illustrating a functional configuration of an information processing apparatus according to this embodiment. The information processing apparatus includes a parameter division unit 101 which divides weight parameters of the neural network into parameters of a predetermined size and a parameter encoding unit 102 which performs codebook encoding on the individual divided parameters and which generates a codebook coefficient. The information processing apparatus further includes a codebook storage 103 which stores a codebook generated by the parameter encoding unit 102 and a codebook coefficient used for reconstruction of parameters. The information processing apparatus further includes a parameter reconstruction unit 104 which receives the codebook and the codebook coefficient and which performs approximate reconstruction on the weight parameters of the neural network and a neural network calculator 105 which receives the weight parameters and which performs calculation processes of the neural network.
[0020] The information processing apparatus further includes, as peripheral functions, a data input unit 106 which supplies data to be processed to the neural network and a result output unit 107 which outputs a result of a process performed in the neural network. The information processing apparatus further includes a neural network parameter storage 108 which stores parameters of the neural network before compression and which supplies the parameters to the parameter division unit 101 and a user instruction unit 109 which is used by a user to input various conditions when parameters are to be divided or encoded. 
[0021] The information processing apparatus includes a hardware configuration including a central processing unit (CPU), a read only memory (ROM), a random access memory (RAM), and a hard disk drive (HDD), and various functional configurations and processes in flowcharts described below are realized when the CPU executes programs stored in the ROM or a hard disk (HD), for example. The RAM includes a storage region functioning as a work area used by the CPU developing and executing the programs. The ROM includes a storage region which stores the programs to be executed by the CPU. The HD includes a storage region which stores various programs and various data including data on parameters to be used when the CPU executes processes. 
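Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.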
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.



Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-3, 6-11, 13 and 15-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
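Claim limitation “dividing unit” invokes 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function.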
Claim 18 recites the limitation "the dividing unit" in line 2. There is insufficient antecedent basis for this limitation in the claim.
Claims 1, 11 and 13 each recite “…second layer of the convolutional neural network, which is higher than the first layer.” Because there is no agreed-upon meaning of which layer is “lower” or “higher,” the term “higher” is ambiguous, and one of ordinary skill in the art would not have been able to clearly identify which layer is lower or higher in a convolutional neural network. For the purpose of examination, the term “higher layer” has been interpreted as the layer next to the first layer in a convolutional neural network.
Dependent claims 2-3, 6-10, and 15-19 are rejected based on their dependency from independent claim 1.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-2, 11, 13 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (“Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding”, hereinafter: Han) in view of Gong et al. (“Compressing Deep Convolutional Networks Using Vector Quantization”, hereinafter: Gong).
Regarding claim 1 (Currently Amended)
Han teaches an information processing apparatus comprising: one or more processors, wherein the one or more processors function as: a division unit (pg. 9 section 6.3 “We compare three different off-the-shelf hardware: the NVIDIA GeForce GTX Titan X and the Intel Core i7 5930K as desktop processors (same package as NVIDIA Digits Dev Box) and NVIDIA Tegra K1 as mobile processor.”) configured to divide a weight parameter of a convolutional neural network into a plurality of groups; (Examiner notes that the weight parameters are divided into four different colors see pg. 3 section 3 “Weight sharing is illustrated in Figure 3. Suppose we have a layer that has 4 input neurons and 4 output neurons, the weight is a 4 × 4 matrix. On the top left is the 4 × 4 weight matrix, and on the bottom left is the 4 × 4 gradient matrix. The weights are quantized to 4 bins (denoted with 4 colors), all the weights in the same bin share the same value, thus for each weight, we then need to store only a small index into a table of shared weights. During update, all the gradients are grouped by the color and summed together, multiplied by the learning rate and subtracted from the shared centroids from last iteration.”)
and an encoding unit (pg. 9 section 6.3 “We compare three different off-the-shelf hardware: the NVIDIA GeForce GTX Titan X and the Intel Core i7 5930K as desktop processors (same package as NVIDIA Digits Dev Box) and NVIDIA Tegra K1 as mobile processor.”) configured to approximate and encode the weight parameter using a plurality of codebooks, (Examiner notes that quantizing the weights with a codebook corresponds to approximating the weight parameter with a codebook; see pg. 2, FIG. 1, “Figure 1: The three stage compression pipeline: pruning, quantization and Huffman coding. Pruning reduces the number of weights by 10×, while quantization further improves the compression rate: between 27× and 31×.” Also see pg. 4, section 3.1, “where weight sharing is determined by a hash function before the networks sees any training data, our method determines weight sharing after a network is fully trained, so that the shared weights approximate the original network.”)
Han does not teach wherein a codebook used by the encoding unit for performing encoding in a case where the encoding unit encodes the weight parameter of a first layer of the convolutional neural network is different from a codebook used by the encoding unit for performing encoding in a case where the encoding unit encodes the weight parameter of a second layer of the convolutional neural network, which is higher than the first layer.
Gong teaches wherein a codebook used by the encoding unit (pg. section 4.1 “The network was trained on 1 GPU for about 5 days after 70 epochs.”) for performing encoding in a case where the encoding unit encodes the weight parameter of a first layer of the convolutional neural network is different from a codebook used by the encoding unit for performing encoding in a case where the encoding unit encodes the weight parameter of a second layer of the convolutional neural network, (Examiner notes that Gong's compression of the CNN parameters corresponds to encoding weight parameters (see abstract), and that Fig. 4 shows the PQ compression method applied to layers 8 and 9, which correspond to the first and second layers and have different vectors [corresponding to codebooks]; see pg. 7, section 4.4, “We also conducted additional analysis on the classification error rate for compressing each single layer while fixing other layers as uncompressed. The results are reported in Figure 4 (for accuracy@1 only). We found that compressing the eighth and ninth hidden layers did not usually lead to significant decrease of performance, but that compressing the tenth and final classification layer led to a much larger decrease of accuracy. Compressing all three layers together usually led to larger error, especially when the compression rate was high.”) which is higher than the first layer. (Examiner notes that Gong teaches a CNN with 5 convolutional layers and 3 dense connected layers, where the second layer is next to the first layer, with filter sizes of 7, 5, etc.; see pg. section 4.1 “The convolutional neural network we used, from Zeiler & Fergus (2013), contains 5 convolutional layers and 3 dense connected layers. All of the input images were first resized to minimal dimensions of 257, after which we performed random cropping to 225×225 patches. Then the images were fed into 5 different convolutional layers with respective filter sizes of 7, 5, 3, 3, and 3.”)
Han and Gong are analogous art because they are both directed to neural networks.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Han to incorporate the teaching of Gong to include compressing a convolutional neural network using vector quantization.
One of ordinary skill in the art would have been motivated to make this modification in order to address model storage issues by using vector quantization methods to compress the parameters of a CNN, and to apply k-means clustering to the weights to improve the balance between model size and recognition accuracy, as disclosed by Gong (abstract).
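For illustration only, the weight sharing described by Han, applied with a separate codebook for each layer as in the proposed combination, may be sketched as follows. This is a minimal sketch assuming scalar k-means clustering with linear centroid initialization; the function names, the sample weights and the choice of k are hypothetical and are not taken from Han or Gong.

```python
# Illustrative sketch only; all names and values are hypothetical.

def kmeans_1d(values, k, iters=20):
    """Cluster scalar weights into k shared values (a codebook)."""
    if k < 2:
        raise ValueError("this sketch assumes k >= 2")
    lo, hi = min(values), max(values)
    # Assumption: centroids start evenly spaced over the weight range.
    codebook = [lo + (hi - lo) * i / (k - 1) for i in range(k)]

    def assign():
        # Index of the nearest centroid for every weight.
        return [min(range(k), key=lambda j: abs(v - codebook[j])) for v in values]

    for _ in range(iters):
        indices = assign()
        # Move each centroid to the mean of the weights assigned to it.
        for j in range(k):
            members = [v for v, idx in zip(values, indices) if idx == j]
            if members:
                codebook[j] = sum(members) / len(members)
    return codebook, assign()


def encode_layers(layers, k):
    """Build one codebook per layer, so each layer may use a different codebook."""
    return {name: kmeans_1d(weights, k) for name, weights in layers.items()}


layers = {
    "layer1": [-1.0, -0.9, 0.1, 0.2, 1.0, 1.1],
    "layer2": [-0.2, -0.1, 0.0, 0.4, 0.5, 0.6],
}
encoded = encode_layers(layers, k=2)
for name, (codebook, indices) in encoded.items():
    print(name, codebook, indices)
```

Each layer is stored as a small table of shared values plus one index per weight, and the two layers end up with different tables because their weights are clustered independently.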

Regarding claim 11
Claim 11 recites analogous limitations to independent claim 1 and therefore is rejected on the same ground as independent claim 1.
Regarding claim 13
Claim 13 recites analogous limitations to independent claim 1 and therefore is rejected on the same ground as independent claim 1.

Regarding claim 2
Han in view of Gong teaches claim 1.  
Han further teaches wherein the division unit (pg. 9 section 6.3 “We compare three different off-the-shelf hardware: the NVIDIA GeForce GTX Titan X and the Intel Core i7 5930K as desktop processors (same package as NVIDIA Digits Dev Box) and NVIDIA Tegra K1 as mobile processor.”)
divides the weight parameter into the plurality of groups after aligning the weight parameter by a predetermined method. (Examiner notes that the weight parameters are divided into four different colors see pg. 3 section 3 “Weight sharing is illustrated in Figure 3. Suppose we have a layer that has 4 input neurons and 4 output neurons, the weight is a 4 × 4 matrix. On the top left is the 4 × 4 weight matrix, and on the bottom left is the 4 × 4 gradient matrix. The weights are quantized to 4 bins (denoted with 4 colors), all the weights in the same bin share the same value, thus for each weight, we then need to store only a small index into a table of shared weights. During update, all the gradients are grouped by the color and summed together, multiplied by the learning rate and subtracted from the shared centroids from last iteration.”)

Regarding claim 18 (New)
Han in view of Gong teaches claim 1. 
Han further teaches wherein the dividing unit divides the weight parameter into the plurality of groups such that the weight parameter after division is equal in size. (Pg. 3 “For example, Figure 3 shows the weights of a single layer neural network with four input units and four output units. There are 4×4 = 16 weights originally but there are only 4 shared weights: similar weights are grouped together to share the same value.”)
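For illustration only, the limitation that the weight parameter after division is equal in size may be pictured with the following minimal sketch. It assumes the weights are flattened into a single list and split into consecutive groups; the function name and the group size are hypothetical and are not taken from Han or from the application.

```python
# Illustrative sketch only; names and sizes are hypothetical.

def divide_equal(weights, group_size):
    """Split a flat list of weights into consecutive groups of equal size."""
    if group_size <= 0 or len(weights) % group_size != 0:
        raise ValueError("weights must divide evenly into groups")
    return [weights[i:i + group_size] for i in range(0, len(weights), group_size)]


# A 4 x 4 weight matrix flattened to 16 values, divided into 4 groups of 4.
flat_weights = [float(i) for i in range(16)]
groups = divide_equal(flat_weights, 4)
print(len(groups), [len(g) for g in groups])
```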

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (“Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding”, hereinafter: Han) in view of Gong et al. (“Compressing Deep Convolutional Networks Using Vector Quantization”, hereinafter: Gong) and further in view of Courbariaux et al. (“BinaryConnect: Training Deep Neural Networks with binary weights during propagations”, hereinafter: Courbariaux).
Regarding claim 3
Han in view of Gong teaches claim 1.
Han in view of Gong does not teach wherein the weight parameter has elements of a binary value or a ternary value.  
Courbariaux teaches wherein the weight parameter has elements of a binary value or a ternary value. (abstract “Binary weights, i.e., weights which are constrained to only two possible values (e.g. -1 or 1), would bring great benefits to specialized DL hardware by replacing many multiply-accumulate operations by simple accumulations, as multipliers are the most space and powerhungry components of the digital implementation of neural networks…” Also see pg. section 2.4 “Since the binarization operation is not influenced by variations of the real-valued weights w when its magnitude is beyond the binary values ±1, and since it is a common practice to bound weights (usually the weight vector) in order to regularize them, we have chosen to clip the real-valued weights within the [−1, 1] interval right after the weight updates, as per Algorithm 1.”)
Han, Gong and Courbariaux are analogous art because they are all directed to neural networks.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Han in view of Gong to incorporate the teaching of Courbariaux to include binary weights during propagation in deep neural networks.
One of ordinary skill in the art would have been motivated to make this modification in order to achieve faster computation at both training and test time using a DNN with binary weights during the forward and backward propagations, while retaining precision of the stored weights, as disclosed by Courbariaux (abstract).
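For illustration only, the binary weights and the clipping of the stored real-valued weights quoted from Courbariaux may be sketched as follows. This is a minimal sketch assuming deterministic, sign-based binarization; the function names, the gradient values and the learning rate are hypothetical.

```python
# Illustrative sketch only; names and numeric values are hypothetical.

def binarize(w):
    """Assumed deterministic binarization: map a real weight to +1 or -1 by sign."""
    return 1.0 if w >= 0 else -1.0


def clip(w, lo=-1.0, hi=1.0):
    """Keep a stored real-valued weight within the [-1, 1] interval."""
    return max(lo, min(hi, w))


def update(real_weights, grads, lr):
    """Update the stored real-valued weights, then clip them right after the update."""
    return [clip(w - lr * g) for w, g in zip(real_weights, grads)]


real_weights = [0.3, -0.7, 0.95]
binary = [binarize(w) for w in real_weights]  # values used during propagation
updated = update(real_weights, grads=[-1.0, 0.5, -1.0], lr=0.1)
print(binary, updated)
```

The binary values are used only for propagation, while the real-valued weights are the ones that are stored, updated and clipped.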


Claims 7, 10 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (“Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding”, hereinafter: Han) in view of Gong et al. (“Compressing Deep Convolutional Networks Using Vector Quantization”, hereinafter: Gong) further in view of Chen et al. (“Adaptive Blurring Estimation for Learning-Based Super Resolution”, hereinafter: Chen).
Regarding claim 19 (New)
Han in view of Gong teaches claim 1. 
Han further teaches wherein each of the plurality of codebooks is composed of a plurality of codebook vectors, (Examiner notes that the matrices of codebook corresponds to codebook vectors see pg. 10 section 6.4 “Pruning makes the weight matrix sparse, so extra space is needed to store the indexes of non-zero elements. Quantization adds storage for a codebook. The experiment section has already included these two factors. Figure 11 shows the breakdown of three different components when quantizing four networks”)
a reconstruction unit; (pg. 9 section 6.3 “We compare three different off-the-shelf hardware: the NVIDIA GeForce GTX Titan X and the Intel Core i7 5930K as desktop processors (same package as NVIDIA Digits Dev Box) and NVIDIA Tegra K1 as mobile processor.”)
Han in view of Gong does not teach wherein the one or more processors further function as … configured to reconstruct the weight parameter by a linear sum of a codebook coefficient determined by the encoding unit and a corresponding codebook vector that corresponds to the codebook coefficient.
Chen teaches wherein the one or more processors further function as a …configured to reconstruct the weight parameter by a linear sum of a codebook coefficient determined by the encoding unit (pg. 656 right col “Any local LR patch can be reconstructed by a weighted linear combination[corresponds to linear sum] of its nearer codes in the LR codebook. The calculated weight coefficient significantly affects the reconstruction errors of local patches, which in turn plays a key role in assuring the quality of the recovered HR image.”)
and a corresponding codebook vector that corresponds to the codebook coefficient. (Examiner notes that the input LR patches are images which are input as vectors see introduction and the input images are then calculate their coefficient see pg. 656 left col “In test step, we calculate the coefficient c for each input LR patches. As [8] indicates, LLC can be approximated by using M (M<<N) nearest neighbors of xi”)
Han, Gong and Chen are analogous art because they are all directed to data automation. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Han in view of Gong to incorporate the teaching of Chen.
One of ordinary skill in the art would have been motivated to make this modification in order to improve the generation of a high-resolution (HR) image from a single low-resolution (LR) image, as disclosed by Chen (abstract).
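For illustration only, reconstruction by a linear sum of codebook coefficients and their corresponding codebook vectors, as recited in claim 19, may be sketched as follows. This is a minimal sketch; the function name and the sample codebook and coefficients are hypothetical and are not taken from Chen or from the application.

```python
# Illustrative sketch only; names and values are hypothetical.

def reconstruct(coefficients, codebook):
    """Approximate a weight vector as sum_k coefficients[k] * codebook[k]."""
    dim = len(codebook[0])
    out = [0.0] * dim
    for c, vec in zip(coefficients, codebook):
        for i in range(dim):
            out[i] += c * vec[i]
    return out


codebook = [
    [1.0, 0.0, -1.0],  # codebook vector 0
    [0.0, 1.0, 1.0],   # codebook vector 1
]
coefficients = [0.5, 2.0]  # one coefficient per codebook vector
approx = reconstruct(coefficients, codebook)
print(approx)  # [0.5, 2.0, 1.5]
```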

Regarding claim 7 (Currently Amended)
Han in view of Gong with Chen teaches claim 19. 
Han further teaches the reconstruction unit reads (pg. 9 section 6.3 “We compare three different off-the-shelf hardware: the NVIDIA GeForce GTX Titan X and the Intel Core i7 5930K as desktop processors (same package as NVIDIA Digits Dev Box) and NVIDIA Tegra K1 as mobile processor.”)
and uses different codebook sets depending on a layer of the neural network which is a reconstruction target of the weight parameter. (Pg. 3 second paragraph “To compress further, we store the index difference instead of the absolute position, and encode this difference in 8 bits for conv layer and 5 bits for fc layer. When we need an index difference larger than the bound, we the zero padding solution shown in Figure 2: in case when the difference exceeds 8, the largest 3-bit (as an example) unsigned number, we add a filler zero.” Examiner notes that, in the stage that quantizes the weights with a codebook, Han teaches retraining the codebook and looping back for each layer, which means that different codebook sets are used for different layers.)

Regarding claim 10 (Currently Amended)
Han in view of Gong with Chen teaches claim 5.
Gong further teaches wherein the neural network is a convolutional neural network. (Pg. 2 third paragraph “We are among the first to systematically explore vector quantization methods for compressing the dense connected layers of deep CNNs[corresponds to Convolutional Neural Network] to reduce storage;”)
Han, Gong and Chen are analogous art because they are all directed to neural networks.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Han in view of Chen to incorporate the teaching of Gong to include compressing a convolutional neural network using vector quantization.
One of ordinary skill in the art would have been motivated to make this modification in order to improve model storage issues using vector quantization methods for compressing parameters of CNN and apply k-means clustering to the weight to improve balance between model size and recognition accuracy as disclosed by Gong (abstract).

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (“Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding”, hereinafter: Han) in view of Gong et al. (“Compressing Deep Convolutional Networks Using Vector Quantization”, hereinafter: Gong) in view of Chen et al. (“Adaptive Blurring Estimation for Learning-Based Super Resolution”, hereinafter: Chen) and further in view of Wang et al. (“Small-Footprint High-Performance Deep Neural Network-Based Speech Recognition using Split-VQ”, hereinafter: Wang).
Regarding claim 6 (Currently Amended)
Han in view of Gong with Chen teaches claim 19. 
Han in view of Gong with Chen does not teach wherein the encoding unit determines a weight coefficient by optimizing a loss function including a loss term of approximation accuracy of the weight parameter of the neural network and a loss term as a sparse term of the weight coefficient in accordance with the plurality of codebooks.
Wang teaches wherein the encoding unit determines a weight coefficient by optimizing a loss function including a loss term of approximation accuracy of the weight parameter of the neural network and a loss term as a sparse term of the weight coefficient in accordance with the plurality of codebooks. (Pg. 4985 “The weight matrices A(l)’s and the bias vectors b(l)’s of the DNN can be estimated by minimizing the following cross entropy based loss function… where Xtr = (x1, . . . , xt) is a set of training feature vectors; st is the senone label of xt. The optimization is usually done by backpropagation using stochastic gradient descent. For example, given a mini-batch of training feature vectors, Xmb, and the corresponding labels, the weight matrix A(l) is updated using.” Pg. 4986 further explains how the codebook can be fine-tuned using the loss function; see pg. 4986 “When an aggressive quantization is used, a significant WER increase will be observed. In this case, the codebook can be fine-tuned to minimize the cross entropy-based loss function in Eq. (6). The gradient of the loss function with respect to the codeword mk can be obtained using the chain rule”)
Han, Gong, Chen and Wang are analogous art because they are all directed to data automation. 
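It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Han in view of Gong with Chen to incorporate the teaching of Wang to include weight coefficients in a quantized matrix structure that can save a portion of the computation time in computing a neural network model, as disclosed by Wang (pg. 4987 right col section 5).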
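For illustration only, a loss function of the general shape recited in claim 6, with one term for approximation accuracy and one sparse term on the weight coefficients, may be sketched as follows. This is a minimal sketch assuming a squared-error approximation term and an L1 penalty as the sparse term; these choices, the function name and the sample values are hypothetical and are not taken from Wang, which describes a cross entropy based loss, or from the application.

```python
# Illustrative sketch only; the squared-error and L1 terms are assumptions.

def codebook_loss(weights, coefficients, codebook, sparsity_weight):
    """Approximation error of the reconstructed weights plus a sparsity penalty."""
    dim = len(weights)
    # Reconstruct the weights as a linear sum of coefficient * codebook vector.
    recon = [sum(c * vec[i] for c, vec in zip(coefficients, codebook))
             for i in range(dim)]
    approx_term = sum((w - r) ** 2 for w, r in zip(weights, recon))
    sparse_term = sum(abs(c) for c in coefficients)
    return approx_term + sparsity_weight * sparse_term


weights = [1.0, 2.0]
codebook = [[1.0, 0.0], [0.0, 1.0]]
loss_value = codebook_loss(weights, [1.0, 1.5], codebook, sparsity_weight=0.1)
print(loss_value)
```

A larger sparsity weight favors coefficient vectors with fewer or smaller non-zero entries, at the cost of a less accurate reconstruction.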

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (“Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding”, hereinafter: Han) in view of Gong et al. in view of Chen et al. and further in view of Courbariaux et al. (“BinaryConnect: Training Deep Neural Networks with binary weights during propagations”, hereinafter: Courbariaux).
Regarding claim 8 (Currently Amended)
Han in view of Gong with Chen teaches claim 19.
Han in view of Gong with Chen does not teach wherein at least one of the weight coefficient and the codebook vector has a binary value or a ternary value as an element.
Courbariaux teaches wherein at least one of the weight coefficient and the codebook vector has a binary value or a ternary value as an element. (Abstract “Binary weights, i.e., weights which are constrained to only two possible values (e.g. -1 or 1), would bring great benefits to specialized DL hardware by replacing many multiply-accumulate operations by simple accumulations, as multipliers are the most space and powerhungry components of the digital implementation of neural networks…” Also see pg. section 2.4 “Since the binarization operation is not influenced by variations of the real-valued weights w when its magnitude is beyond the binary values ±1, and since it is a common practice to bound weights (usually the weight vector) in order to regularize them, we have chosen to clip the real-valued weights within the [−1, 1] interval right after the weight updates, as per Algorithm 1.”)
Han, Gong, Chen and Courbariaux are analogous art because they are all directed to data automation.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Han in view of Gong with Chen to incorporate the teaching of Courbariaux to include binary weights during propagation in deep neural networks.
One of ordinary skill in the art would have been motivated to make this modification in order to achieve faster computation at both training and test time using a DNN with binary weights during the forward and backward propagations, while retaining precision of the stored weights, as disclosed by Courbariaux (abstract).

Claims 9 and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (“Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding”, hereinafter: Han) in view of Gong et al. (“Compressing Deep Convolutional Networks Using Vector Quantization”, hereinafter: Gong) in view of Chen et al. in view of Lane et al. (“DeepX: A Software Accelerator for Low-Power Deep Learning Inference on Mobile Devices”, hereinafter: Lane). 
Regarding claim 9 (Currently Amended)
Han in view of Gong with Chen teaches claim 19. 
Han in view of Gong with Chen does not teach wherein the one or more processors further function as an instruction unit configured to allow a user to instruct a constraint condition on a learning parameter.
Lane teaches wherein the one or more processors further function as an instruction unit (abstract “The foundation of DeepX is a pair of resource control algorithms, designed for the inference stage of deep learning, that: (1) decompose monolithic deep model network architectures into unit-blocks of various types, that are then more efficiently executed by heterogeneous local device processors (e.g., GPUs, CPUs);”) configured to allow a user to instruct a constraint condition on a learning parameter. (Pg. 9 left col “In comparison to these baselines, DeepX is free to use any supported unit, and has constrained use of RLC; specifically we only set ℰ𝑇𝐻 to allow expected accuracy drops of < 5%. To validate the accuracy drop, we use the original datasets used to train the respective models and run a large number of offline experiments with varying parametric settings used for RLC and DAD (See Algorithm 1).” Also see pg. 4 left col “The mobile CPU (or another constrained processor) supports initial model layers that have been compacted to meet its memory and computational limits. The remaining majority of model layers are then completed by GPU computation. Note, the model is compressed only where needed by resource constraints, instead of compression being applied across all layers”)
Han, Gong, Chen and Lane are analogous art because they are all directed to data automation. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Han in view of Gong with Chen to incorporate the teaching of Lane to improve a software accelerator for low-power deep learning inference on mobile devices.
One of ordinary skill in the art would have been motivated to make this modification in order to improve a method or system that can “automatically decompose a deep model across available processors to maximize energy-efficiency and execution time, within fluctuating mobile resource constraints such as computation and memory” as disclosed by Lane (pg. 1 right col third paragraph).

Regarding claim 15 (New)
Han in view of Gong with Chen and Lane teaches claim 9. 
Lane further teaches wherein the encoding unit (abstract “The foundation of DeepX is a pair of resource control algorithms, designed for the inference stage of deep learning, that: (1) decompose monolithic deep model network architectures into unit-blocks of various types, that are then more efficiently executed by heterogeneous local device processors (e.g., GPUs, CPUs);”) performs learning such that the constraint condition instructed by the instruction unit is satisfied (pg. 6 right col “But due to the large number of units and layers that comprise typical deep models, a large variety of potential decompositions exist. Consequently, the search for this plan must balance the speed and efficiency it identifies the plan, along with the need to satisfy user performance goals.”)
and then encodes the weight parameter based on a result of the learning. (pg. 6 right col “Algorithm 1 details the approach by DAD to cope with these competing concerns. Three specific techniques are employed, each narrow the search space by encoding an understanding of the deep learning algorithms and how they execute on within the resource limits presented by hardware. First, the architecture of each deep learning model includes a series of dependencies based on factors such as layer type, which determines the units must be computed in series. This limits groups of layers (Algorithm 1, line 2−7) and units that can be packed together to maximize desirable properties like parallel execution. Second, hardware resource limits dictate if a unit-block of the model is viable or not (line 5 and 7).”)
Han, Gong, Chen and Lane are analogous art because they are all directed to data automation. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Han in view of Gong with Chen to incorporate the teaching of Lane to improve a software accelerator for low-power deep learning inference on mobile devices.
One of ordinary skill in the art would have been motivated to make this modification in order to improve a method or system that can “automatically decompose a deep model across available processors to maximize energy-efficiency and execution time, within fluctuating mobile resource constraints such as computation and memory” as disclosed by Lane (pg. 1 right col third paragraph).

Regarding claim 16 (New)
Han in view of Gong with Chen and Lane teaches claim 15. 
Han further teaches …and wherein the encoding unit encodes the weight parameter such that the weight parameter after compression coding becomes able to be stored into the memory. (Examiner notes that after compression is done Han teaches storing the index difference see pg. 3 “To compress further, we store the index difference instead of the absolute position, and encode this difference in 8 bits for conv layer and 5 bits for fc layer. When we need an index difference larger than the bound, we the zero padding solution shown in Figure 2: in case when the difference exceeds 8, the largest 3-bit (as an example) unsigned number, we add a filler zero.”)
Lane further teaches wherein the instruction unit (pg. 4 right col section 4.1 “The matrix operations can be efficiently computed using, e.g., a GPU, while applying new vectorization techniques [21].”)
receives, from the user, (Examiner notes that DeepX is run by user and the intent of a user see pg. 5 “There are two key components to RLC. First, a dimensionality reduction process (§IV-A) used to lower the computations required as one layer feeds into the next. Second, an estimator (§IV-B) that regulates the level of dimensionality reduction to be applied before model accuracy is effected beyond the intent of the DeepX user.”) an instruction of the  (pg. 4 right col “Selection of a decomposition plan is strongly influenced by the current available resources. Via OS hooks DeepX receives current resource usage levels before performing an inference. But better decisions can be made using accurate predictions of resource load, and planning for predicted levels”)
Han, Gong, Chen and Lane are analogous art because they are all directed to data automation. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Han in view of Gong with Chen to incorporate the teaching of Lane to improve a software accelerator for low-power deep learning inference on mobile devices.
One of ordinary skill in the art would have been motivated to make this modification in order to improve a method or system that can “automatically decompose a deep model across available processors to maximize energy-efficiency and execution time, within fluctuating mobile resource constraints such as computation and memory” as disclosed by Lane (pg. 1 right col third paragraph).
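
For illustration only, the index-difference storage with filler zeros described in the passage of Han quoted for claim 16 can be sketched as follows. This is a minimal sketch and not Han's implementation: the function names, the convention that the position before the first element is -1, and the default 3-bit width (the width Han gives only as an example) are assumptions introduced here.

```python
def encode_relative(dense, bits=3):
    """Encode the nonzero entries of `dense` as (index difference, value) pairs.

    Each difference must fit the bound 2**bits. When a gap exceeds the
    bound, a filler entry with value 0.0 is emitted at the bound and the
    remaining gap is encoded from there.
    """
    bound = 2 ** bits
    pairs = []
    prev = -1  # assumed starting position, so every difference is >= 1
    for pos, value in enumerate(dense):
        if value == 0:
            continue
        gap = pos - prev
        while gap > bound:
            pairs.append((bound, 0.0))  # filler zero
            prev += bound
            gap -= bound
        pairs.append((gap, value))
        prev = pos
    return pairs


def decode_relative(pairs, length):
    """Rebuild the dense list of the given length from the pairs."""
    dense = [0.0] * length
    pos = -1
    for gap, value in pairs:
        pos += gap
        dense[pos] = value
    return dense
```

With a 16-element list whose nonzero values sit at positions 1, 4 and 15, the gap from 4 to 15 is 11, which exceeds the bound of 8, so one filler zero is stored at position 12 before the final value.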


Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (“Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding”, hereinafter: Han) in view of Gong et al. (“Compressing Deep Convolutional Networks Using Vector Quantization”, hereinafter: Gong) in view of Shie et al. (“Transfer Representation Learning for Medical Image Analysis”, hereinafter: Shie). 
Regarding claim 17 (New)
Han in view of Gong teaches claim 1. 
Han in view of Gong does not teach wherein the encoding unit encodes the weight parameter using a codebook that differs depending on a pixel size of a convolution calculation in a convolution layer of the convolutional neural network.
Shie teaches wherein the encoding unit encodes the weight parameter using a codebook that differs depending on a pixel size of a convolution calculation in a convolution layer of the convolutional neural network. (Pg. 712 left col “For each image input, we obtain a feature vector using the codebook. The information of the image moves from the input layer to the output layer through the inner layers. Each layer is a weighted combination of the previous layer and stands for a feature representation of the input image. Since the computation is hierarchical, higher layers intuitively represent higher concepts. For images, the neurons from lower levels describe rudimental perceptual elements like edges and corners, while higher layers represent object parts such as contours and categories. To capture higher-level abstractions, we extract transfer-learned features of OM images from the fifth, sixth and seventh layer, denoted as pool5, fc6 and fc7 in Fig.1, respectively.” Examiner notes that Fig. 1 shows multiple layers of CNN that takes different image/pixel size through its convolution calculations.)
Han, Gong and Shie are analogous art because they are all directed to neural networks. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Han in view of Gong to incorporate the teaching of Shie to improve training a classifier to classify images using deep learning.
One of ordinary skill in the art would have been motivated to make this modification in order to overcome major challenges in developing a classifier to perform automatic disease diagnosis using a codebook and to train a supervised learning algorithm to achieve faster detection accuracy as disclosed by Shie (abstract).


Response to Arguments
Applicant asserts that “Claim Interpretation 35 U.S.C. 112(f) The Office Action has interpreted claims 1, 2, 5, 7, 9 and 14 under 35 U.S.C. 112(f). Claim 1 has been amended to avoid invoking the interpretation under 35 U.S.C. 112(f).” (Remarks pg. 6)
Examiner’s response:
The Examiner respectfully disagrees. Even though amended claim 1 recites a “processor,” the nonce terms in independent claim 1 would still invoke 35 U.S.C. 112(f) because the claim language does not recite sufficient hardware components or any mention of a memory or of instructions being performed.

Claim Rejections - 35 U.S.C. § 101 
Claims 1-14 were rejected under 35 U.S.C. 101 as allegedly being directed to an abstract idea without significantly more. Weight parameters in higher layers in a convolutional neural network (CNN), in particular, tend to be sparse and …zero values are used in higher layers; therefore, by using different codebooks suited respectively for weight parameters having different properties from one another depending on these layers, it is possible to improve approximation accuracy without increasing the amount of memory used." Therefore, as compared with related art, claim 1 provides an inventive concept of "improving approximation accuracy without increasing the amount of memory used in encoding in a convolutional neural network", which is significantly more than the judicial exception, as described in, for example, [0058] of the specification. Therefore, Applicant believes that claim 1 is patent eligible subject matter.” (Remarks pg. 6)
Examiner’s response:
In response to amendments and/or arguments filed on 02/23/2021, the 35 USC 101 rejections made in the previous Office Action have been withdrawn. 

Claim Rejections - 35 U.S.C. § 102 & § 103 
Claim 1 as amended includes the following features: "divide a weight parameter of a convolutional neural network into a plurality of groups", "approximate and encode the weight parameter using a plurality of codebooks" and "wherein a codebook used by the encoding unit for performing encoding in a case where the encoding unit encodes the weight parameter of a second layer of the convolutional neural network, which is higher than the first layer".  None of the cited references discloses or suggests the above described features. Han discloses that encoding is performed using a representative value (geometric center) for each group. As can be read from the following description, Han discloses that the number of quantization bits for conv layer and the number of quantization bits for fc layer are different from each other when "index difference" is encoded: "To compress further, we store the index difference instead of the absolute position, and encode this difference in 8 bits for conv layer and 5 bits for fc layer. When we need an index difference larger than the bound, we the zero padding solution shown in Figure 2: in case when the difference exceeds 8, the largest 3-bit (as an example) unsigned number, we add a filler zero." However, Han does not disclose that "index difference" is encoded using a codebook(s). Therefore, Han does not disclose that "wherein a codebook used by the encoding unit for performing encoding in a case where the encoding unit 

Examiner’s response:
Applicant’s arguments with respect to claim(s) 1-3, 6-11 and 13 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Han et al.
Xie et al. (“Sparse deep feature learning for facial expression recognition”) teaches a feature sparseness-based regularization that learns deep features with better generalization capability. 
Chen et al. (“Compressing Convolutional Neural Networks in the Frequency Domain”) teaches a network architecture, Frequency-Sensitive Hashed Nets (FreshNets), which exploits inherent redundancy in both convolutional layers and fully-connected layers of a deep learning model, leading to dramatic savings in memory and storage consumption. 
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VAN C MANG whose telephone number is (571)270-7598.  The examiner can normally be reached on Mon - Fri 8:00-5:00pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo, can be reached at (571)272-9767.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/V.M./Examiner, Art Unit 2126
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126