Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Remarks
	Claims 1-18, 21, and 24 are currently pending and under consideration, in accordance with the preliminary amendment filed with the application.  

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  
Claims 18 and 21 invoke § 112(f) because they include the following terms that are interpreted under § 112(f):
“parameter quantizer configured to…” recited in claim 18
“neural network interface configured to…” recited in claim 18
“sample quantizer configured to…” recited in claim 18
The terms “parameter quantizer” and “sample quantizer” are used as a substitute for “means” because they are generic placeholders that have no specific structural meaning. In this context, “quantizer” is equivalent to “means for quantizing.” The term “neural network interface” is also a generic placeholder because it similarly does not have specific structural meaning, especially given that paragraph 29 of the specification describes system 200, the object that neural network interface 120 interacts with, as being “any system,” which is not a specific type of object. Accordingly, “interface” in the present context is not regarded as having a specific structural meaning of, for example, a communications-network hardware interface. Since these terms are modified by functional language and are not modified by sufficient structure, material, or acts for performing the claimed function, they are regarded as invoking § 112(f).
Support for the above limitations is provided by FIG. 11, which illustrates hardware in the form of a general purpose computer, and the algorithms shown in FIG. 3 and described in the associated descriptions in paragraphs 44-97 of the specification.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-2, 4-5, 17-18, 21 and 24 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Yao (US 2018/0046894 A1) (cited by applicant).
As to claim 1, Yao teaches an artificial neural network (ANN) quantization method for generating an output ANN by quantizing an input ANN, [[0013]: “a method for optimizing an Artificial Neural Network (ANN)”; Abstract: “fix-point quantization and compiling the neural network model”; [0102]: “the proposed quantization flow mainly consists of two phases: Step 610: the weight quantization phase, and Step 620: the data quantization phase.” In general, the reference teaches a method for optimizing an ANN by quantization, which generates a quantized ANN.] the ANN quantization method comprising:
obtaining second parameters by quantizing first parameters of the input ANN; [[0103]: “In step 610, the weight quantization phase aims to find the optimal fl for weights in one layer, as shown in Equation 10: $f_l = \arg\min_{f_l} \sum \left| W_{\text{float}} - W(bw, f_l) \right|$ … where W is a weight and W(bw; fl) represents the fixed-point format of W under the given bw and fl.” [0099]: “bw is the bit width of the number and fl is the fractional length.” Note that W(bw; fl) refers to the quantized weights for a given fractional length fl. The weights that are being quantized correspond to the claim limitation of “first parameters of the input ANN.”]  
obtaining a sample distribution from an intermediate ANN [In describing the data quantization phase, which is the second quantization phase, [0110] teaches that the optimization target is $f_l = \arg\min_{f_l} \sum \left| x^{+}_{\text{float}} - x^{+}(bw, f_l) \right|$ as given in equation 12, where “x+ represents the result of a layer when we denote the computation of a layer as $x^{+} = A \cdot x$.” Here, “x+” reads on the limitation of a “sample distribution,” since it is a feature map (which has a distribution of values) that has been obtained as a result of data input through the neural network.] in which the obtained second parameters have been applied to the input ANN; [[0114]: “step 610 is conducted before step 620. That is, it finishes weight quantization of all CONV layers and FC layers of the ANN, and then conducts data quantization for each feature map set on the basis of the quantized CONV layers and FC layers.” That is, the above weight quantization phase results in an “intermediate ANN,” described as the “quantized CONV layers and FC layers.” This intermediate ANN serves as the basis for the feature map set in step 620, which is the data quantization phase noted above (see [0108]: “In step 620, the data quantization phase aims to find the optimal fl for a set of feature maps between two layers.”).] and
obtaining a fractional length for the obtained sample distribution by quantizing the obtained sample distribution. [For the data quantization phase, paragraph [0110] teaches that the optimization target is $f_l = \arg\min_{f_l} \sum \left| x^{+}_{\text{float}} - x^{+}(bw, f_l) \right|$ (equation 12), where fl is the “fractional length,” as taught in [0099] (“fl is the fractional length.”). An optimal fractional length (fl) is selected from candidates, as described in claim 11 of Yao (“designating an initial value for fl, searching for an optimal f1 in the adjacent domains of the initial value”) (see also [0105], which describes the same process but for the weight quantization step; claim 11 of Yao teaches that the same method is also used for the data quantization step). The act of “quantizing the obtained sample distribution” is taught by the operation $x^{+}(bw, f_l)$, which is the fixed-point (i.e., quantized) format of the feature map, based on the description in paragraph [0104].]
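
The ordering relied upon above (weight quantization first, then data quantization on the feature map produced with the quantized weights) can be sketched as follows. This is a minimal illustration under stated assumptions, not Yao's disclosed implementation: the function names, the signed round-and-clip fixed-point scheme, the single-layer model x+ = A·x, and the candidate range for fl are all hypothetical choices made for the example.

```python
import numpy as np

def to_fixed_point(values, bw, fl):
    """Assumed scheme: signed fixed point, bit width bw, fractional length fl."""
    scale = 2.0 ** fl
    q = np.clip(np.round(values * scale), -(2 ** (bw - 1)), 2 ** (bw - 1) - 1)
    return q / scale

def search_fl(values, bw, candidates):
    """Return the candidate fl with the lowest summed absolute quantization error."""
    errors = [np.sum(np.abs(values - to_fixed_point(values, bw, fl)))
              for fl in candidates]
    return candidates[int(np.argmin(errors))]

def two_phase_quantization(A, x, bw=8, candidates=tuple(range(0, 8))):
    # Phase 1 (cf. step 610): choose fl for the weights and quantize them.
    fl_w = search_fl(A, bw, candidates)
    A_q = to_fixed_point(A, bw, fl_w)
    # "Intermediate" layer: quantized weights applied, data still floating point.
    x_plus = A_q @ x
    # Phase 2 (cf. step 620): choose fl for the resulting feature map.
    fl_x = search_fl(x_plus, bw, candidates)
    return fl_w, fl_x, to_fixed_point(x_plus, bw, fl_x)
```

In this sketch the feature map is computed only after the weights have been replaced by their fixed-point values, which mirrors the statement in [0114] that data quantization is conducted “on the basis of the quantized CONV layers and FC layers.”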

As to claim 2, Yao teaches the ANN quantization method of claim 1, wherein the obtaining the second parameters comprises:
obtaining quantized parameters by quantizing the first parameters according to a given fractional length, [In equation (10) in paragraphs [0103]-[0104], i.e., the equation $f_l = \arg\min_{f_l} \sum \left| W_{\text{float}} - W(bw, f_l) \right|$, the term $W(bw, f_l)$ corresponds to quantizing the weights according to a given fractional length $f_l$] and calculating quantization errors between the first parameters and the obtained quantized parameters; [In the above equation, the $W_{\text{float}} - W(bw, f_l)$ values are the quantization errors between the weights and the quantized weights.]
calculating an evaluation value of the given fractional length, based on the calculated quantization errors; [In the above equation, the sum absolute-value quantization error $\sum \left| W_{\text{float}} - W(bw, f_l) \right|$ corresponds to an “evaluation value”] and
obtaining a final fractional length for the second parameters, based on a plurality of evaluation values corresponding to a plurality of fractional lengths. [[0103]: “In step 610, the weight quantization phase aims to find the optimal fl for weights in one layer, as shown in Equation 10.” The “optimal fl” corresponds to a “final fractional length.” In equation (10), a summed quantization error is obtained for each of a plurality of candidate fractional lengths, and the “argmin” operator indicates that the fractional length having the lowest summed quantization error is selected.]
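
The mapping of claim 2 onto equation (10) can be illustrated with a short sketch: quantize at a given fractional length, take the per-weight errors, sum their absolute values into an evaluation value, and select the fractional length with the lowest evaluation value. The function names, the round-and-clip fixed-point scheme, and the example weights are assumptions introduced for illustration; they are not taken from Yao or from the application.

```python
import numpy as np

def to_fixed_point(values, bw, fl):
    """Assumed scheme: signed fixed point, bit width bw, fractional length fl."""
    scale = 2.0 ** fl
    q_min, q_max = -(2 ** (bw - 1)), 2 ** (bw - 1) - 1
    return np.clip(np.round(values * scale), q_min, q_max) / scale

def evaluation_value(values, bw, fl):
    """Sum of absolute quantization errors for one candidate fl."""
    errors = values - to_fixed_point(values, bw, fl)  # W_float - W(bw, fl)
    return float(np.sum(np.abs(errors)))

def best_fractional_length(values, bw, candidates):
    """Evaluate every candidate fl and return the argmin with all scores."""
    scores = {fl: evaluation_value(values, bw, fl) for fl in candidates}
    return min(scores, key=scores.get), scores

# Hypothetical weights; a narrow bit width makes the trade-off visible.
weights = np.array([1.5, -0.75, 0.3, 0.05])
best_fl, scores = best_fractional_length(weights, bw=4, candidates=range(0, 4))
```

With a small bit width, a fractional length that is too short loses precision through rounding, while one that is too long saturates large weights at the clip limits, so the evaluation value has a minimum between the two extremes.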

As to claim 4, Yao teaches the ANN quantization method of claim 1, wherein:
the input ANN comprises layers and channels, each having at least one parameter; [[0037]: “As shown in FIG. 1A, a typical CNN consists of a number of layers that run in sequence.” [0038]: “The parameters of a CNN model are called ‘weights’. The first layer of a CNN reads an input image and outputs a series of feature maps. The following layers read the feature maps generated by previous layers and output new feature maps…” [0130]: “Assuming the input feature map is N*N, having C channels. For example, RGB image has three channels, and assuming the ANN accelerator can process D channels of input feature maps of M*M at one time”] and
the first parameters comprise one or more parameters from among the at least one parameter of each of the layers and the channels. [As noted in the rejection of claim 1, in the “weight quantization phase,” weights are quantized. See [0103]: “…find the optimal fl for weights in one layer, as shown in Equation 10: $fl = \arg\min_{fl} \sum \left| W_{float} - W(bw, fl) \right|$ … where W is a weight and W(bw; fl) represents the fixed-point format of W under the given bw and fl.” ([0103]). Thus, quantization is performed on one or more weights of a layer.]
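
For illustration only, the search expressed by Equation 10 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not Yao's implementation: the function names, the signed saturating rounding scheme, the example weights, and the candidate range 0 to 7 are all hypothetical and are not taken from the reference.

```python
def quantize(value, bw, fl):
    """Round `value` to signed fixed point with `bw` total bits and `fl` fractional bits.

    Assumption: two's-complement range with saturation on overflow.
    """
    scale = 2.0 ** fl
    lowest, highest = -(2 ** (bw - 1)), 2 ** (bw - 1) - 1
    code = max(lowest, min(highest, round(value * scale)))  # saturate on overflow
    return code / scale


def summed_error(values, bw, fl):
    """Summed quantization error: sum of |W_float - W(bw, fl)| over all values."""
    return sum(abs(v - quantize(v, bw, fl)) for v in values)


def best_fractional_length(values, bw, candidates):
    """argmin over the candidate fl values of the summed quantization error."""
    return min(candidates, key=lambda fl: summed_error(values, bw, fl))


# Hypothetical floating-point weights of one layer.
layer_weights = [1.5, -0.3, 0.7]
chosen_fl = best_fractional_length(layer_weights, bw=8, candidates=range(0, 8))
print(chosen_fl)
```

With these example weights and an 8-bit width, fl = 7 saturates the value 1.5 and smaller fl values lose precision, so the search selects fl = 6. Each candidate fl yields one summed error, and the minimum over those errors determines the selected fractional length.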

As to claim 5, Yao teaches the ANN quantization method of claim 4, wherein the first parameters comprise at least one of weights, biases, and thresholds. [As noted in the rejection of claims 1 and 4, above, the “weight quantization phase” in Yao includes the quantization of weights. The phrase “at least one of weights, biases, and thresholds” is regarded as an alternate expression requiring “weights,” “biases,” and/or “thresholds” and is met on at least the basis of the item of “weights.”]

As to claim 17, Yao teaches the ANN quantization method of claim 1, further comprising obtaining a fixed-point ANN as the output ANN, based on the obtained second parameters and the obtained fractional length. [Yao, [0093]: “Using short fixed-point numbers instead of long floating-point numbers is efficient for implementations on the FPGA platform and can significantly reduce memory footprint and bandwidth requirements.” That is, the purpose of Yao’s method is to generate a fixed-point neural network defined by the fractional length. Specifically, the two quantization phases for weight and data quantization result in a fixed-point ANN. See [0104] (“W(bw; fl) represents the fixed-point format of W under the given bw and fl”); [0109] (“the intermediate data of the fixed-point CNN model”); [0151] (“converts floating-point numbers into fixed-point numbers”).]

As to claim 18, Yao teaches an apparatus for generating an output artificial neural network (ANN) by quantizing an input ANN, [[0013]: “optimizing an Artificial Neural Network (ANN)”; Abstract: “fix-point quantization and compiling the neural network model”; [0102]: “the proposed quantization flow mainly consists of two phases: Step 610: the weight quantization phase, and Step 620: the data quantization phase.” In general, the reference teaches a method for optimizing an ANN by quantization, which generates a quantized ANN.] the apparatus comprising: 
a memory storing computer-executable instructions; [[0015]: “an external memory, configured for storing weights and instructions of the ANN and input data to be processed by said ANN”] and
at least one processor configured to execute the stored computer-executable instructions to implement: [[0015]: “a deep processing unit (DPU)…comprising: a CPU, configured for scheduling a programmable logic module…a plurality of processing elements (PEs), configured for performing operations on the basis of the instructions, weights, and data; an input buffer, configured for preparing the input data, weights and instructions for the computing complex.”]
a parameter quantizer configured to obtain second parameters by quantizing first parameters of the input ANN; [[0103]: “In step 610, the weight quantization phase aims to find the optimal fl for weights in one layer, as shown in Equation 10: $fl = \arg\min_{fl} \sum \left| W_{float} - W(bw, fl) \right|$ … where W is a weight and W(bw; fl) represents the fixed-point format of W under the given bw and fl.” [0099]: “bw is the bit width of the number and fl is the fractional length.” Note that W(bw; fl) refers to the quantized weights for a given fractional length fl. The weights that are being quantized correspond to the claim limitation of “first parameters of the input ANN.”]
a neural network interface configured to obtain a sample distribution from an intermediate ANN [In describing the data quantization phase, which is the second quantization phase, [0110] teaches that the optimization target is $fl = \arg\min_{fl} \sum \left| x^{+}_{float} - x^{+}(bw, fl) \right|$ as given in equation 12, where “x+ represents the result of a layer when we denote the computation of a layer as $x^{+} = A \cdot x$.” Here, “x+” reads on the limitation of a “sample distribution,” since it is a feature map (which has a distribution of values) that has been obtained as a result of data input through the neural network.] in which the obtained second parameters have been applied to the input ANN; [[0114]: “step 610 is conducted before step 620. That is, it finishes weight quantization of all CONV layers and FC layers of the ANN, and then conducts data quantization for each feature map set on the basis of the quantized CONV layers and FC layers.” That is, the above weight quantization phase results in an “intermediate ANN,” described as the “quantized CONV layers and FC layers.” This intermediate ANN serves as the basis for the feature map set in step 620, which is the data quantization phase noted above (see [0108]: “In step 620, the data quantization phase aims to find the optimal fl for a set of feature maps between two layers.”).] and
a sample quantizer configured to obtain a fractional length for the obtained sample distribution. [For the data quantization phase, paragraph [0110] teaches that the optimization target is $fl = \arg\min_{fl} \sum \left| x^{+}_{float} - x^{+}(bw, fl) \right|$ (equation 12), where fl is the “fractional length,” as taught in [0099] (“fl is the fractional length.”). An optimal fractional length (fl) is selected from candidates, as described in claim 11 of Yao (“designating an initial value for fl, searching for an optimal f1 in the adjacent domains of the initial value”) (see also [0105], which describes the same process but for the weight quantization step; claim 11 of Yao teaches that the same method is also used for the data quantization step). The act of “quantizing the obtained sample distribution” is taught by the operation $x^{+}(bw, fl)$, which is the fixed-point (i.e., quantized) format of the feature map, based on the description in paragraph [0104].]
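
For illustration only, the ordering described in [0114] (weight quantization first, then data quantization on the feature map produced by the already-quantized layer) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not Yao's implementation: the helper names, the single dense layer standing in for a CONV or FC layer, the saturating rounding scheme, and the numeric values are all hypothetical.

```python
def quantize(value, bw, fl):
    """Signed fixed point with `bw` total bits and `fl` fractional bits (saturating)."""
    scale = 2.0 ** fl
    lowest, highest = -(2 ** (bw - 1)), 2 ** (bw - 1) - 1
    return max(lowest, min(highest, round(value * scale))) / scale


def summed_error(values, bw, fl):
    """Sum of |v_float - v(bw, fl)| over all values."""
    return sum(abs(v - quantize(v, bw, fl)) for v in values)


def best_fl(values, bw, candidates):
    """argmin over the candidate fl values of the summed quantization error."""
    return min(candidates, key=lambda fl: summed_error(values, bw, fl))


def layer_output(A, x):
    """x+ = A . x for one layer, with A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


bw = 8
candidates = range(0, 8)
A = [[0.3, -0.6], [1.2, 0.9]]  # hypothetical floating-point weights
x = [1.0, 0.3]                 # hypothetical layer input

# Phase 1 (step 610): choose fl for the weights and quantize them.
fl_weights = best_fl([a for row in A for a in row], bw, candidates)
A_quantized = [[quantize(a, bw, fl_weights) for a in row] for row in A]

# Phase 2 (step 620): obtain the feature map from the layer with quantized
# weights, then choose fl for that feature map.
x_plus = layer_output(A_quantized, x)
fl_data = best_fl(x_plus, bw, candidates)
print(fl_weights, fl_data)
```

In this sketch, `A_quantized` plays the role of the layer with quantized weights, and `x_plus` plays the role of the feature map whose fractional length is then selected.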

As to claim 21, Yao teaches the apparatus of claim 18, wherein:
the neural network interface is further configured to provide at least one candidate fractional length for the sample distribution, received from the sample quantizer, [Yao, [0110] and equation (12) teach that the optimization target is $fl = \arg\min_{fl} \sum \left| x^{+}_{float} - x^{+}(bw, fl) \right|$, where fl is the “fractional length” as taught in [0099] (“fl is the fractional length.”). The argmin operation refers to selecting a best value of fl from among candidate fractional lengths. An optimal fl is selected from candidates, as described in claim 11 of Yao (“designating an initial value for fl, searching for an optimal f1 in the adjacent domains of the initial value”) (see also [0105], which describes the same process but for the weight quantization step; claim 11 of Yao teaches that the same method is also used for the data quantization step). Here, the search domain is the “adjacent domains,” which includes a plurality of candidate fractional lengths.] to a test ANN, and to obtain a test sample distribution from the test ANN; [The $x^{+}(bw, fl)$ term in the above equation corresponds to a test sample distribution obtained from a test ANN, since “x+” is the output result of a neural network layer. The “test ANN” is taught in the form of an ANN that is already subject to the weight quantization ([0114]: “…and then conducts data quantization for each feature map set on the basis of the quantized CONV layers and FC layers.”) and further configured with the given bit width bw and the fractional length fl, as denoted by $x^{+}(bw, fl)$.] and
the sample quantizer is further configured to determine one from among the at least one candidate fractional length as the fractional length for the sample distribution, based on the obtained test sample distribution. [As noted above, Yao teaches that an optimal fl is selected from candidate fractional lengths. The “argmin” operation in equation (12) refers to selecting a best value of fl that minimizes the stated error measure. This operation is “based on the obtained test sample distribution” because it is based on the various values of $x^{+}(bw, fl)$.]
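
For illustration only, a search over the “adjacent domains” of an initial fl can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the cited passages do not specify how the initial value is designated or how wide the adjacent domain is, so the range-based initial value, the search radius of 1, and all function names below are hypothetical.

```python
import math


def quantize(value, bw, fl):
    """Signed fixed point with `bw` total bits and `fl` fractional bits (saturating)."""
    scale = 2.0 ** fl
    lowest, highest = -(2 ** (bw - 1)), 2 ** (bw - 1) - 1
    return max(lowest, min(highest, round(value * scale))) / scale


def summed_error(values, bw, fl):
    """Sum of |v_float - v(bw, fl)| over all values."""
    return sum(abs(v - quantize(v, bw, fl)) for v in values)


def initial_fl(values, bw):
    """Assumed heuristic: leave enough integer bits for the largest magnitude."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return bw - 1
    integer_bits = max(0, math.ceil(math.log2(max_abs)))
    return bw - 1 - integer_bits


def search_adjacent(values, bw, radius=1):
    """Evaluate each candidate fl near the initial value and keep the best one."""
    start = initial_fl(values, bw)
    candidates = list(range(start - radius, start + radius + 1))
    best = min(candidates, key=lambda fl: summed_error(values, bw, fl))
    return candidates, best


# Hypothetical feature-map values.
feature_map = [1.5, -0.3, 0.7]
candidate_fls, selected_fl = search_adjacent(feature_map, bw=8)
print(candidate_fls, selected_fl)
```

For these example values the initial fl is 6, the candidates are 5, 6, and 7, and the search selects 6, because fl = 7 saturates the value 1.5 and fl = 5 rounds more coarsely.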

As to claim 24, Yao teaches a method of quantizing a floating-point neural network, [[0013]: “a method for optimizing an Artificial Neural Network (ANN)”; Abstract: “fix-point quantization and compiling the neural network model”; [0102]: “the proposed quantization flow mainly consists of two phases: Step 610: the weight quantization phase, and Step 620: the data quantization phase.” The quantization process quantizes a “floating-point neural network,” as taught in [0013] (“weight quantization step, for converting weights…from floating-point numbers into fixed-point numbers… data quantization step, for converting data of feature map sets j from floating-point numbers into fixed-point numbers”).] the method comprising:
obtaining quantized parameters by quantizing parameters in a same category in the floating-point neural network; [[0103]: “In step 610, the weight quantization phase aims to find the optimal fl for weights in one layer, as shown in Equation 10: $f_l = \arg\min_{f_l} \sum \left| W_{float} - W(bw, f_l) \right|$ … where W is a weight and W(bw; fl) represents the fixed-point format of W under the given bw and fl.” [0099]: “bw is the bit width of the number and fl is the fractional length.” Note that W(bw; fl) refers to the quantized weights for a given fractional length fl. The weight values that are being quantized are parameters in a same category, namely the weights in the original floating-point neural network. Note that $W_{float}$ refers to floating-point weights.]
obtaining a sample distribution [In describing the data quantization phase, which is the second quantization phase, [0110] teaches that the optimization target is $f_l = \arg\min_{f_l} \sum \left| x^{+}_{float} - x^{+}(bw, f_l) \right|$ as given in equation 12, where “x+ represents the result of a layer when we denote the computation of a layer as $x^{+} = A \cdot x$.” Here, “x+” reads on the limitation of a “sample distribution,” since it is a feature map (which has a distribution of values) that has been obtained as a result of data input through the neural network.] from a semifixed-point artificial neural network (ANN) in which the obtained quantized parameters have been applied to the floating-point neural network; [[0114]: “step 610 is conducted before step 620. That is, it finishes weight quantization of all CONV layers and FC layers of the ANN, and then conducts data quantization for each feature map set on the basis of the quantized CONV layers and FC layers.” That is, the above weight quantization phase results in an “intermediate ANN,” described as the “quantized CONV layers and FC layers.” This intermediate ANN serves as the basis for the feature map set in step 620, which is the data quantization phase noted above (see [0108]: “In step 620, the data quantization phase aims to find the optimal fl for a set of feature maps between two layers.”). Regarding the limitation of “semifixed-point artificial neural network (ANN),” the Examiner recognizes that paragraph 36 of the specification states that “an ANN obtained by applying quantized parameters to the floating-point neural network may be referred to as a semifixed-point neural network.” Since the neural network in Yao has had the weights quantized, it meets the instant limitation.] and
obtaining a fractional length for the obtained sample distribution. [For the data quantization phase, paragraph [0110] teaches that the optimization target is $f_l = \arg\min_{f_l} \sum \left| x^{+}_{float} - x^{+}(bw, f_l) \right|$ (equation 12), where fl is the “fractional length,” as taught in [0099] (“fl is the fractional length.”). An optimal fractional length (fl) is selected from candidates, as described in claim 11 of Yao (“designating an initial value for fl, searching for an optimal f1 in the adjacent domains of the initial value”) (see also [0105], which describes the same process but for the weight quantization step; claim 11 of Yao teaches that the same method is also used for the data quantization step).]
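As an illustrative sketch only (not code from Yao or from the instant application), the two-phase flow mapped above can be expressed as follows. The function names, the round-and-saturate behavior, the bit width of 8, and all numeric values are assumptions made for the example; Yao's claim 11 searches a neighborhood of an initial value, whereas this sketch simply scans a small candidate range.

```python
def to_fixed_point(value, bw, fl):
    """Round `value` onto a signed fixed-point grid with `bw` total bits
    and `fl` fractional bits, saturating at the representable range.
    (Rounding and saturation rules are assumptions for this sketch.)"""
    step = 2.0 ** -fl
    q = round(value / step)
    q = max(-(2 ** (bw - 1)), min(2 ** (bw - 1) - 1, q))
    return q * step


def best_fractional_length(values, bw, candidates):
    """Return the candidate fl with the smallest summed absolute error,
    mirroring the argmin form of the equations quoted above."""
    def total_error(fl):
        return sum(abs(v - to_fixed_point(v, bw, fl)) for v in values)
    return min(candidates, key=total_error)


# Phase 1 (cf. step 610): pick fl for the weights, then quantize them.
A_float = [[0.75, -0.5], [0.3, 1.2]]          # hypothetical layer weights
flat_weights = [w for row in A_float for w in row]
fl_w = best_fractional_length(flat_weights, bw=8, candidates=range(8))
A_quant = [[to_fixed_point(w, 8, fl_w) for w in row] for row in A_float]

# Phase 2 (cf. step 620): run floating-point inputs through the layer
# that now carries quantized weights, and collect the outputs x+ = A.x.
inputs = [[1.0, 2.0], [-3.0, 0.5]]            # hypothetical input samples
feature_values = [
    sum(a * x for a, x in zip(row, sample))
    for sample in inputs
    for row in A_quant
]
fl_x = best_fractional_length(feature_values, bw=8, candidates=range(8))
```

In this sketch the list `feature_values` plays the role of the sample distribution, and `fl_x` is the fractional length obtained for it.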

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

1.	Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Yao in view of Tremeau et al. “Color quantization error in terms of perceived image quality,” Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing, 1994, pp. V/93-V/96 vol.5 (“Tremeau”).
As to claim 3, Yao teaches the ANN quantization method of claim 2, wherein:
the calculating the evaluation value comprises calculating, as the evaluation value, a sum of […] the calculated quantization errors; [Yao, Equation (10), teaches the sum of calculated quantization errors $\sum \left| W_{float} - W(bw, f_l) \right|$, which corresponds to an “evaluation value.”] and
the obtaining the final fractional length comprises determining, as the final fractional length, a fractional length corresponding to a minimum evaluation value from among the plurality of evaluation values. [As noted in the rejection of claim 2, in Equation (10), there is a plurality of summed absolute-value quantization errors, and the “argmin” operator indicates that the best fractional length is selected as the one that has the lowest summed absolute-value quantization error.]
Yao does not teach the limitation that the measure of quantization error is “a sum of squares of the calculated quantization errors.”
Tremeau, in an analogous art, teaches the above limitation, which is a well-known measure of error. Tremeau generally pertains to “color quantization error in terms of perceived image quality” (see title), and is therefore at least reasonably pertinent to the problems faced by the inventors of the claimed invention, namely quantization.
In particular, Tremeau teaches “a sum of squares of the calculated quantization errors” [§ I, paragraph 1: “The perceived quality of a quantization process is generally studied through the use of the sum-of-squared error (SSE measure). This measure corresponds to: [see equation in the text] where h represents the original image, f its quantized representation.” Note that h and f are analogous to $W_{float}$ and $W(bw, f_l)$ in Yao.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Yao with the teachings of Tremeau by modifying the measure of quantization error to be “a sum of squares of the calculated quantization errors,” such that the operation of calculating the evaluation value comprises calculating, as the evaluation value, “a sum of squares of the calculated quantization errors.” The motivation for doing so would have been to evaluate the quality of a quantization process using a commonly used error measure, to yield the predictable result of evaluating the quantization process based on such metric, as suggested by Tremeau (see part quoted above).
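For illustration only, the difference between the two error measures discussed above (a sum of absolute errors versus a sum of squared errors) can be sketched as below. This is not code from Yao or Tremeau; the helper names, the 4-bit width, the saturation rule, and the sample values are all assumptions chosen to keep the arithmetic exact.

```python
def to_fixed_point(value, bw, fl):
    """Signed fixed-point rounding with saturation (assumed behavior)."""
    step = 2.0 ** -fl
    q = round(value / step)
    q = max(-(2 ** (bw - 1)), min(2 ** (bw - 1) - 1, q))
    return q * step


def evaluation_value(values, bw, fl, squared):
    """Sum of squared errors if `squared` is True, otherwise sum of
    absolute errors, for one candidate fractional length."""
    errors = [v - to_fixed_point(v, bw, fl) for v in values]
    if squared:
        return sum(e * e for e in errors)
    return sum(abs(e) for e in errors)


def select_fractional_length(values, bw, candidates, squared):
    """Pick the candidate with the minimum evaluation value."""
    return min(candidates,
               key=lambda fl: evaluation_value(values, bw, fl, squared))


# Hypothetical samples: six small values and one larger value.
samples = [0.25] * 6 + [2.5]
fl_abs = select_fractional_length(samples, bw=4, candidates=range(4), squared=False)
fl_sq = select_fractional_length(samples, bw=4, candidates=range(4), squared=True)
```

With these made-up samples the absolute-error measure selects fl = 2 and the squared-error measure selects fl = 1, because squaring weights the single large saturation error more heavily. The selection step itself (taking the minimum evaluation value) is unchanged by the substitution.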

2.	Claims 6-10, 12-13, and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Yao in view of Lin et al. (US 2016/0328646 A1) (“Lin”) (cited by applicant).
 As to claim 6, Yao teaches the ANN quantization method of claim 1, but does not teach that “the obtaining the fractional length for the obtained sample distribution” comprises the further limitations recited in the instant claim.
Lin, in an analogous art, teaches the further limitations. Lin teaches “floating point neural network quantization” (title). Therefore, Lin is in the same field of endeavor as the claimed invention, namely machine learning, and is also specifically pertinent to the problem of quantizing neural networks.
In particular, Lin teaches:
splitting the obtained sample distribution into a plurality of sample groups; [[0064]: “FIGS. 5A and 5B illustrate distributions of activation values and weights in different layers of an exemplary deep convolutional network. FIG. 5A shows the activation values for convolution layers zero to five (conv0, . . . , conv5) and fully connected layers one and two (fc1, fc2).” As shown in FIG. 5A, there is a plurality of sample groups across multiple layers. The Examiner notes that the instant claim does not specifically define what the “sample distribution” is or what it is output from. Therefore, the entire set of activations in different layers may be regarded as a sample distribution, and the determination of individual distributions for respective layers may be regarded as the act of splitting the sample distribution.]
approximating the plurality of sample groups to a plurality of continuous probability distributions (CPDs), respectively; [[0078]: “μ and σ are mean and standard deviation of the input to compute an effective sigma value σ′.” See also claim 7: “determining the quantizer step size…based at least in part on a mean value and standard deviation of the bias, the weight, and/or the activation values from the floating point machine learning network.” The computation of a mean and standard deviation constitutes approximating the distribution as a continuous probability distribution, such as a Gaussian distribution.]
obtaining a plurality of step sizes, based on the plurality of sample groups and the plurality of CPDs; [[0078]: “Application of quantization to the weights, biases, and activation values in artificial neural networks includes the determination of a step size. For fixed point representations, the step sizes may be limited to powers of 2… Equations for determining step size may be specified as…where μ and σ are mean and standard deviation of the input to compute an effective sigma value σ′. Next, an effective step size is computed based on the effective sigma value σ′ as follows.” That is, a step size is computed for each distribution based on the effective sigma value, which is derived from the mean and standard deviation.] and
selecting the fractional length, based on the obtained plurality of step sizes. [[0078]: “Determining the step size that is a power of the 2 may correspond to determining the number of fractional bits in the fixed point number representation…Finally, a closest power of 2 is determined for the step size as follows: n = … (16) where n is the number of fractional bits that may be specified to represent the quantizer input and $2^{-n}$ may be specified as the step size.” Note that “the closest power of 2” refers to “n”, which is the number of fractional bits, which is the fractional length. The Examiner notes that the term “based on” does not require the plurality of step sizes to be utilized in a specific manner to determine the fractional length, and is satisfied when the fractional length is determined based on any part of the set of step sizes denoted by the plurality of step sizes. Furthermore, while Lin teaches obtaining a particular fractional length for a layer, the obtained fractional length may be regarded as analogous to the “initial value for fl” in claim 11 of Yao, which serves as the basis for further fractional length values to be searched.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Yao and Lin by modifying the operation of obtaining the fractional length for the obtained sample distribution to include “splitting the obtained sample distribution into a plurality of sample groups; approximating the plurality of sample groups to a plurality of continuous probability distributions (CPDs), respectively; obtaining a plurality of step sizes, based on the plurality of sample groups and the plurality of CPDs; and selecting the fractional length, based on the obtained plurality of step sizes.” The motivation would have been to obtain suitable fractional lengths that account for the distribution characteristic of an input, as suggested by Lin ([0010]: “determining quantizer parameters for quantizing values of the floating point machine learning network based at least in part on the at least one selected moment of the input distribution of the floating point machine learning network to obtain corresponding values of the fixed point machine learning network.”), so as to improve the quantization of the artificial neural network ([0062]: “improving quantization of the weights, biases, and/or activations in artificial neural networks by applying various optimizations”).
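The relationship relied on above (a step size derived from distribution statistics, then snapped to a power of two whose exponent gives the number of fractional bits) can be sketched as follows. This is a hedged illustration, not Lin's equations: the quoted passage elides the actual formulas, so the rule `step = scale * sigma`, the `scale` constant, the log-domain rounding, and the function name are all hypothetical stand-ins.

```python
import math
import statistics


def fractional_bits_from_samples(samples, scale=0.1):
    """Estimate a step size from the spread of `samples` and snap it to
    a power of two, 2**-n, returning (n, snapped_step).

    ASSUMPTION: `scale * sigma` is a placeholder rule, not the
    effective-sigma or step-size equations of the cited reference.
    """
    sigma = statistics.pstdev(samples)   # population standard deviation
    if sigma == 0:
        raise ValueError("samples have no spread; step size is undefined")
    step = scale * sigma
    n = round(-math.log2(step))          # nearest power of two in log scale
    return n, 2.0 ** -n


# Hypothetical sample group with standard deviation 1.0.
n_bits, snapped_step = fractional_bits_from_samples([-1.0, 1.0, -1.0, 1.0])
```

Applying the function once per sample group would yield one step size (and one candidate number of fractional bits) per group.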

As to claim 7, the combination of Yao and Lin teaches the ANN quantization method of claim 6, wherein the splitting the obtained sample distribution into the plurality of sample groups comprises splitting the obtained sample distribution into a first sample group including negative samples and zero and a second sample group including positive samples. [Lin, FIG. 5A shows multiple distributions of the activation values (i.e., multiple sample groups) that include samples that are negative, zero, and positive, as denoted by the x-axis labels. For example, the sample group labeled “conv1” has “negative samples and zero,” and the sample group labeled “conv2” has “positive samples.” The instant claim does not limit the sample groups to consist of only the recited types of samples.] 

As to claim 8, the combination of Yao and Lin teaches the ANN quantization method of claim 6, wherein: the splitting of the sample distribution into the plurality of sample groups comprises splitting the sample distribution into a first sample group and a second sample group; and the first sample group includes negative samples, and the second sample group includes zero and positive samples. [Lin, FIG. 5A shows multiple distributions of the activation values (i.e., multiple sample groups) that include samples that are negative, zero, and positive, as denoted by the x-axis labels. For example, the sample group labeled “conv1” (a first sample group) has “negative samples,” and the sample group labeled “conv2” (a second sample group) has “zero and positive samples.” The instant claim does not limit the sample groups to consist of only the recited types of samples.]

As to claim 9, the combination of Yao and Lin teaches the ANN quantization method of claim 6, wherein: the splitting the sample distribution into the plurality of sample groups comprises splitting the sample distribution into a first sample group and a second sample group; and the first sample group includes negative samples, and the second sample group includes positive samples. [Lin, FIG. 5A shows multiple distributions of the activation values (i.e., multiple sample groups) that include samples that are negative, zero, and positive, as denoted by the x-axis labels. For example, the sample group labeled “conv1” (a first sample group) has “negative samples,” and the sample group labeled “conv2” (a second sample group) has “positive samples.” The instant claim does not limit the sample groups to consist of only the recited types of samples.]
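For reference, one narrow way to read the claim language of claims 7 through 9 is a split of a single list of samples by sign, with zero assigned to the first group, the second group, or neither. The sketch below illustrates only that reading; it is not taken from Lin, Yao, or the specification, and the function name and the `zero_group` parameter are hypothetical.

```python
def split_by_sign(samples, zero_group):
    """Split `samples` into (first_group, second_group) by sign.

    zero_group = "first"   -> zeros join the negative samples
    zero_group = "second"  -> zeros join the positive samples
    zero_group = "neither" -> zeros are left out of both groups
    """
    if zero_group not in ("first", "second", "neither"):
        raise ValueError("zero_group must be 'first', 'second', or 'neither'")
    first = [s for s in samples
             if s < 0 or (s == 0 and zero_group == "first")]
    second = [s for s in samples
              if s > 0 or (s == 0 and zero_group == "second")]
    return first, second


samples = [-2, -1, 0, 0, 3]                    # hypothetical sample values
groups_a = split_by_sign(samples, "first")     # negatives and zero | positives
groups_b = split_by_sign(samples, "second")    # negatives | zero and positives
groups_c = split_by_sign(samples, "neither")   # negatives | positives
```

As noted in the rejections above, the claims recite that the groups "include" the listed sample types, so this strict partition is only one arrangement that would satisfy the language.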

As to claim 10, the combination of Yao and Lin teaches the ANN quantization method of claim 6, wherein the approximating the plurality of sample groups comprises approximating each of the plurality of sample groups to a generalized gamma distribution, a Gaussian distribution, or a Laplacian distribution. [Lin, [0064]: “For example, the step sizes of a symmetric uniform quantizer for Gaussian, Laplacian, and Gamma distributions may be calculated with a deterministic function of the standard deviation of the input distribution, if it is assumed that the distributions have zero mean and unit variance… In one configuration, both the weights and activation values are assumed to have Gaussian distributions”].

As to claim 12, the combination of Yao and Lin teaches the ANN quantization method of claim 6, wherein the selecting the fractional length comprises:
obtaining candidate fractional lengths, [Yao, [0110] and equation (12) teaches that the optimization target is                         
                            
                                
                                    f
                                
                                
                                    l
                                
                            
                            =
                            
                                
                                    
                                        
$\arg\min_{f_l} \sum \left| x^{+}_{\mathrm{float}} - x^{+}(bw, f_l) \right|$
, where fl is the “fractional length” as taught in [0099] (“fl is the fractional length.”). The argmin operation refers to selecting a best value of fl from among candidate fractional lengths. An optimal fl is selected from candidates, as described in claim 11 of Yao (“designating an initial value for fl, searching for an optimal fl in the adjacent domains of the initial value”) (see also [0105], which describes the same process but for the weight quantization step; claim 11 of Yao teaches that the same method is also used for the data quantization step). Here, the search domain is the “adjacent domains,” which includes a plurality of candidate fractional lengths.] based on the obtained plurality of step sizes; [Lin, in the combination of Yao and Lin set forth above, teaches determining a fractional length based on the step size, as discussed in the rejection of claim 6. See, e.g., Lin, [0078]: “Determining the step size that is a power of the 2 may correspond to determining the number of fractional bits in the fixed point number representation… n is the number of fractional bits that may be specified to represent the quantizer input and 2^−n may be specified as the step size.”] and
selecting, as the fractional length, one from among the obtained candidate fractional lengths. [As noted in the rejection of claim 1, Yao teaches that an optimal fl is selected from candidates. The “argmin” operation in equation (12) refers to selecting a best value of fl that minimizes the stated error quantity.]

As to claim 13, the combination of Yao and Lin teaches the ANN quantization method of claim 12, wherein the obtaining the candidate fractional lengths comprises:
obtaining fractional lengths corresponding to step sizes adjacent to the obtained plurality of step sizes; [Lin teaches obtaining fractional lengths, as discussed in the rejections of the parent claims. Paragraph [0078] teaches that the ceiling operation is used on the quantity “log2sfloat” in equation (16), and the last sentence of this paragraph teaches that other operators are also possible: “Other rounding functions may be used to obtain an integer n in addition to ⌈·⌉ (ceiling operation), including round (•) and ⌊·⌋ (floor operation).” Since different rounding functions may return the same value corresponding to a different input of a different rounding function (e.g., the ceiling of one value may be equal to the “round” of another value), the limitation of “corresponding to adjacent step sizes” is met. The Examiner notes that the claim language of “corresponding to” does not require the use of a specific sequence of mathematical operations in relation to the “step sizes adjacent to.” Therefore, because Lin teaches fractional lengths that are integers derived from step sizes by rounding, it is understood that these fractional lengths may correspond to various other possible adjacent step sizes, so as to satisfy the instant limitation.] and
determining a range of fractional lengths, based on the obtained fractional lengths. [Yao, [0105]: “the fl is initialized to avoid data overflow…we search for the optimal fl in the adjacent domains of the initial fl.”]
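For illustration only, the rounding discussed above for claim 13 can be sketched in Python. The sketch assumes the relationship n = −rounding(log2 s) between a step size s and the number of fractional bits n, so that 2^−n is the power-of-two step size. The function name and the example step size are hypothetical and are not taken from Lin or Yao.

```python
import math


def fractional_length_candidates(step_size: float) -> dict:
    """Map a positive step size to integer fractional lengths.

    Assumption (hypothetical helper): n = -rounding(log2(step_size)), where
    the rounding function is ceiling, round, or floor, and 2**-n is the
    resulting power-of-two step size.
    """
    if step_size <= 0:
        raise ValueError("step_size must be positive")
    log_step = math.log2(step_size)
    return {
        "ceil": -math.ceil(log_step),
        "round": -round(log_step),
        "floor": -math.floor(log_step),
    }


# A step size of 0.3 lies between the power-of-two step sizes 0.25 and 0.5,
# so the rounding functions return the two adjacent fractional lengths 1 and 2.
example = fractional_length_candidates(0.3)
```

Under these assumptions, different rounding functions applied to one step size yield fractional lengths whose power-of-two step sizes (here 0.5 and 0.25) neighbor the original step size.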

As to claim 15, the combination of Yao and Lin teaches the ANN quantization method of claim 13, wherein the selecting the one from among the obtained candidate fractional lengths comprises:
calculating errors corresponding to the obtained candidate fractional lengths, based on the obtained candidate fractional lengths [Yao, [0110] and equation (12), teaches the optimization process $f_l = \arg\min_{f_l} \sum \left| x^{+}_{\mathrm{float}} - x^{+}(bw, f_l) \right|$, where the term $\sum \left| x^{+}_{\mathrm{float}} - x^{+}(bw, f_l) \right|$ is the summed absolute-value error corresponding to the candidate fractional lengths fl and is calculated as a function of the candidate fractional lengths fl, as indicated by $x^{+}(bw, f_l)$.] and the plurality of CPDs; [As set forth in the rejection of claim 6, Lin teaches the use of CPDs to obtain fractional lengths, which correspond to the initial fractional length in the search method of Yao.] and
selecting the one candidate fractional length, based on the calculated errors. [The above optimization process of Yao finds the optimal fractional length ([0108]: “In step 620, the data quantization phase aims to find the optimal fl”) that minimizes the error term $\sum \left| x^{+}_{\mathrm{float}} - x^{+}(bw, f_l) \right|$, as indicated by the “argmin” operator of equation (12).]
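For illustration only, the selection described for claim 15 (computing a summed absolute-value error for each candidate fractional length and taking the argmin) can be sketched in Python. The quantizer below is an assumption: a signed two's-complement fixed-point format with bit width bw and round-to-nearest, which is not specified in the passages quoted from Yao. All function names and sample values are hypothetical.

```python
def quantize(x: float, bw: int, fl: int) -> float:
    """Quantize x to a signed fixed-point value with bit width bw and
    fractional length fl (assumed format: two's complement, round to nearest)."""
    step = 2.0 ** -fl
    q_min = -(2 ** (bw - 1))
    q_max = 2 ** (bw - 1) - 1
    code = max(q_min, min(q_max, round(x / step)))  # saturate on overflow
    return code * step


def select_fractional_length(samples, bw, candidates):
    """Return the candidate fl that minimizes sum |x_float - x(bw, fl)|,
    together with the error computed for every candidate."""
    errors = {
        fl: sum(abs(x - quantize(x, bw, fl)) for x in samples)
        for fl in candidates
    }
    best = min(errors, key=errors.get)  # argmin over the candidates
    return best, errors


# Hypothetical floating-point layer outputs and candidate fractional lengths.
samples = [0.1, -0.4, 0.7, 1.5]
best_fl, errors = select_fractional_length(samples, bw=8, candidates=range(3, 8))
```

In this example, a fractional length of 7 saturates the sample 1.5 and therefore incurs a large error, while smaller fractional lengths lose resolution, so the search settles on an intermediate value.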

As to claim 16, the combination of Yao and Lin teaches the ANN quantization method of claim 13, wherein the selecting the one from among the obtained candidate fractional lengths comprises:
obtaining test sample distributions from test ANNs respectively depending on the obtained candidate fractional lengths; [Yao, [0110] and equation (12), teaches the optimization process $f_l = \arg\min_{f_l} \sum \left| x^{+}_{\mathrm{float}} - x^{+}(bw, f_l) \right|$, where $x^{+}(bw, f_l)$ is the output of the layer as a function of the candidate fractional length fl. Here, $x^{+}(bw, f_l)$ is a sample distribution because it is a feature map (which has a distribution of values) that has been obtained as a result of data input through the neural network.]
calculating errors corresponding to the obtained test sample distributions, based on the obtained test sample distributions [In the above equation, the term $\sum \left| x^{+}_{\mathrm{float}} - x^{+}(bw, f_l) \right|$ is the summed absolute-value error corresponding to the candidate fractional lengths fl and is calculated as a function of the test sample distribution $x^{+}(bw, f_l)$.] and the plurality of CPDs; [As set forth in the rejection of claim 6, Lin teaches the use of CPDs to obtain fractional lengths, which correspond to the initial fractional length in the search method of Yao.] and
selecting the one candidate fractional length, based on the calculated errors. [The above optimization process of Yao finds the optimal fractional length ([0108]: “In step 620, the data quantization phase aims to find the optimal fl”) that minimizes the error term $\sum \left| x^{+}_{\mathrm{float}} - x^{+}(bw, f_l) \right|$, as indicated by the “argmin” operator of equation (12).]

3.	Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Yao in view of Lin, and further in view of Hui et al., “When is overload distortion negligible in uniform scalar quantization?” Proceedings of IEEE International Symposium on Information Theory, 1997, p. 517 (“Hui”) and P. Kabal, "Quantizers for the gamma distribution and other symmetrical distributions," in IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 32, no. 4, pp. 836-841, August 1984 (“Kabal”). 
As to claim 11, the combination of Yao and Lin teaches the ANN quantization method of claim 6, but does not teach that “the obtaining the plurality of step sizes” comprises the further limitations of the instant claim.
Hui, in an analogous art, teaches the limitations of “obtaining an overload distortion and a granular distortion.” Hui generally pertains to “uniform scalar quantizers” (see abstract), and is therefore at least reasonably pertinent to the problem of quantization faced by the inventors of the claimed invention.
In particular, Hui teaches “obtaining an overload distortion and a granular distortion according to a step size for each of the plurality of CPDs” [§ II, paragraph 1: “The MSE [i.e., mean square error] DN of an optimal uniform quantizer for p(x) can be written as the sum of granular distortion DN,gran and overall distortion DN,over, which are, respectively, the contributions from within and outside the support interval.” Note that “DN,over” is defined as the “overload distortion” in the bottom line of the first column.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Yao and Lin with the teachings of Hui by modifying the operation of “obtaining the plurality of step sizes” to include “obtaining an overload distortion and a granular distortion.” The motivation would have been to compute an error measure that is relevant to the characteristics of a quantizer, including the step size, as suggested by Hui. See Hui, § I: “asymptotics of optimal step size ΔN and the minimum mean-squared error (MSE) DN for generalized Gaussian source densities were found… the minimum MSE for almost any source density p(x) is given by [a function that maps N, the number of quantization points, to DN, the error]” and Hui, § II, Theorem 1: “the optimal ΔN satisfies…”.
The thus-far combination of references does not explicitly teach the remaining limitations that the obtaining of the distortions is “according to a step size for each of the plurality of CPDs” and that the obtaining the plurality of step sizes further comprises “obtaining each of the plurality of step sizes, based on” said distortions.
Kabal, in an analogous art, teaches or suggests the remaining limitations. Kabal teaches “quantizers for the gamma distribution and other symmetrical distributions” (title), and is therefore at least reasonably pertinent to the problem of quantization faced by the inventors of the claimed invention.  
In particular, Kabal suggests obtaining the distortions “according to a step size for each of the plurality of CPDs” [Page 839, bottom right paragraph (portion below equation 16): “The step size and offset were calculated using a two-dimensional minimization with the mean-square error as the objective function. The three numbers at the bottom of each entry in the table are the mean-square error …This table shows that for N even, the offset for non-symmetric quantizers is nearly equal to one half of the step size.” Thus, the mean square error is calculated for combinations of step size and offsets, as shown in Table II on page 840. Note that the “mean-square error” here is analogous to the overload and granular distortions, which are components of the error.] and that the obtaining the plurality of step sizes further comprises “obtaining each of the plurality of step sizes, based on” the overload and granular distortions [As noted above, Kabal teaches that “The step size and offset were calculated using a two-dimensional minimization with the mean-square error as the objective function.” See also page 839, bottom right paragraph (portion above equation 16): “the optimal quantizers…Table II compares symmetric and nonsymmetric uniformly spaced quantizers. The table entries are the interval between levels, Δ and the offset of the quantizer relative to a symmetrical quantizer, e.” That is, an optimal step size is calculated based on the minimization with the mean-square error. Since the “mean-square error” here is analogous to the overload and granular distortions, which are components of the error, Kabal suggests obtaining step sizes based on the overload and granular distortions.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Yao, Lin, and Hui with the teachings of Kabal by further modifying the operation of “obtaining the plurality of step sizes” such that the obtaining of the overload and granular distortions is “according to a step size for each of the plurality of CPDs” and that this operation further comprises “obtaining each of the plurality of step sizes, based on” said distortions. The motivation would have been to obtain optimal step sizes for optimizing the quantizer, as suggested by Kabal, § I, paragraph 2 (“tables for quantizers for the gamma distribution giving both the optimal quantizers and the best symmetric quantizers are presented.”).
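For illustration only, the decomposition of mean-square error into granular and overload distortion, and the choice of a step size that minimizes their sum, can be sketched numerically in Python. The sketch assumes a zero-mean, unit-variance Gaussian density and a symmetric uniform quantizer with an even number of levels; neither assumption is taken from Hui or Kabal, and all names and candidate step sizes are hypothetical.

```python
import math


def gaussian_pdf(x: float) -> float:
    """Standard normal density (an assumed source density for this sketch)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def distortions(step, levels, pdf=gaussian_pdf, span=8.0, points=8001):
    """Approximate (granular, overload) distortion of a symmetric uniform
    quantizer with an even number of levels by a Riemann sum over [-span, span]."""
    half_support = levels * step / 2.0
    dx = 2.0 * span / (points - 1)
    granular = overload = 0.0
    for i in range(points):
        x = -span + i * dx
        # Cell index, saturated to the outermost cells outside the support.
        idx = max(-(levels // 2), min(levels // 2 - 1, math.floor(x / step)))
        reconstruction = (idx + 0.5) * step
        contribution = (x - reconstruction) ** 2 * pdf(x) * dx
        if abs(x) <= half_support:
            granular += contribution   # error from within the support interval
        else:
            overload += contribution   # error from outside the support interval
    return granular, overload


def best_step(candidates, levels):
    """Pick the candidate step size minimizing granular + overload distortion."""
    return min(candidates, key=lambda s: sum(distortions(s, levels)))


chosen = best_step([0.1, 0.2, 0.35, 0.6, 1.0], levels=16)
```

Under these assumptions, a small step size leaves much of the density outside the support interval, so overload distortion dominates, while a large step size makes granular distortion dominate. Minimizing the sum selects an intermediate step size.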

4.	Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Yao in view of Lin, and further in view of Iizuka et al. (US 2012/0134555 A1) (“Iizuka”) and Lin et al. (US 2016/0328644 A1) (hereinafter referred to as “Lin ‘644” to differentiate from the previously-applied “Lin” reference) (cited by applicant).
As to claim 14, the combination of Yao and Lin teaches the ANN quantization method of claim 13, but does not teach the further limitations of the instant claim that the determining the range of the fractional lengths comprises “determining, as a lower limit of the range, a value obtained by subtracting a first margin from a minimum fractional length among the obtained fractional lengths, and determining, as an upper limit of the range, a value obtained by adding a second margin from a maximum fractional length among the obtained fractional lengths” and “the first margin and the second margin are determined based on a performing ability of the ANN quantization method.”
Iizuka, in an analogous art, teaches or suggests the first limitation quoted above. Iizuka generally relates to image analysis and search based on image characteristics (see abstract). Since Iizuka is in the field of data analysis, it is considered to be in the same general field of endeavor as the claimed invention.
In particular, Iizuka teaches “determining, as a lower limit of the range, a value obtained by subtracting a first margin from a minimum fractional length among the obtained fractional lengths, and determining, as an upper limit of the range, a value obtained by adding a second margin from a maximum fractional length among the obtained fractional lengths” [[0049]: “the minimum value and the maximum value are determined on each coordinate axis of coordinates of the ROI [region of interest] image and a range surrounded by these maximum and minimum values (m-dimensional rectangle) is defined as a range…a range R2 of image characteristics is derived by extending the range R1 of image characteristics by a predetermined margin (a predetermined value is subtracted from the minimum value and the predetermined value is added to the maximum value).” That is, Iizuka teaches a technique of extending a range of interest by adding and subtracting a margin, as stated above. Note that the margin described here corresponds to both the first margin and the second margin of the claim, as the claim does not require the two margins to be different from one another in terms of numerical value.] 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Yao and Lin with the teachings of Iizuka by modifying the operation of “the determining the range of the fractional lengths” to include “determining, as a lower limit of the range, a value obtained by subtracting a first margin from a minimum fractional length among the obtained fractional lengths, and determining, as an upper limit of the range, a value obtained by adding a second margin from a maximum fractional length among the obtained fractional lengths,” in order to extend a range of interest, as suggested by Iizuka ([0049], parts quoted above). Furthermore, the instant limitation would also have been obvious as a simple combination of known elements yielding predictable results because extending a range of values by adding and subtracting a margin to the maximum and minimum values is a known mathematical method, as taught by Iizuka, and extending the range of the candidate fractional lengths in Yao would have yielded the predictable result of taking a greater range of fractional lengths into consideration when finding an optimal fractional length.
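The range extension described above can be illustrated with a minimal sketch. The function name, the use of integer fractional lengths, and the example values are assumptions made for illustration and do not appear in Yao, Lin, Iizuka, or the instant application. The sketch subtracts a first margin from the minimum value and adds a second margin to the maximum value, then enumerates the resulting candidate range.

```python
def fractional_length_range(fractional_lengths, first_margin, second_margin):
    """Extend the span of the given fractional lengths by two margins.

    Returns every integer candidate from (min - first_margin) to
    (max + second_margin), inclusive.
    """
    lower = min(fractional_lengths) - first_margin
    upper = max(fractional_lengths) + second_margin
    return list(range(lower, upper + 1))

if __name__ == "__main__":
    # Hypothetical fractional lengths obtained for three step sizes.
    print(fractional_length_range([3, 5, 4], first_margin=1, second_margin=2))
```

The two margins are separate parameters here, but nothing prevents them from taking the same value, which matches the single predetermined margin described in Iizuka.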
Lin ‘644, in an analogous art, teaches the remaining limitation that “the first margin and the second margin are determined based on a performing ability of the ANN quantization method.” Lin ‘644 teaches “adaptive selection of artificial neural networks” (title) and is therefore in the same field of endeavor as the claimed invention.
In particular, Lin ‘644 teaches that “the first margin and the second margin are determined based on a performing ability of the ANN quantization method.” [Abstract: “A method of adaptively selecting a configuration for a machine learning process includes determining current system resources and performance specifications of a current system. A new configuration for the machine learning process is determined based at least in part on the current system resources and the performance specifications.” For example, the process may include “a multi-dimensional optimization problem” ([0057]), such as “hyper-parameter optimization” ([0065]), which is a process of model optimization analogous to the context of determining an optimal fractional length. The Examiner notes that the instant application refers to “a performing ability (or computing resources) of the quantization system 100” in paragraph 84 of the specification. Thus, “performing ability” includes the computing resources of a system.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Yao, Lin, and Iizuka with the teachings of Lin ‘644 by implementing the feature that “the first margin and the second margin are determined based on a performing ability of the ANN quantization method.” The motivation would have been to adapt a computational process to the available system resources, as suggested by Lin ‘644 ([0053]: “adaptively selecting configurations for a machine learning process based on factors such as system resources and performance specifications.”).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The following document depicts the state of the art.
	Gysel et al., “Hardware-oriented approximation of convolutional neural networks,” arXiv:1604.03168v3 [cs.CV] 20 Oct 2016 teaches quantization with the selection of a fractional length.
	Lin et al., “Fixed Point Quantization of Deep Convolutional Networks,” arXiv:1511.06393v3 [cs.LG] 2 Jun 2016 teaches quantization techniques similar to those described in the “Lin” patent document.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to YAO DAVID HUANG whose telephone number is (571)270-1764. The examiner can normally be reached Monday - Friday 9:00 am - 5:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571) 270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/Y.D.H./Examiner, Art Unit 2124

/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124