Detailed Action
This action is in response to Applicant's communications filed 26 September 2021.  
Claims 1 and 12 were amended.  Claim 7 was cancelled.  Claims 21 and 22 were added.  Therefore, claims 1-6 and 8-22 are pending in this Application.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments/Arguments
Applicant's arguments, filed 26 September 2021, regarding the rejections of claims 1 and 12 under 35 USC 102 have been fully considered but are moot because the arguments do not apply to any of the references being used in the current rejection.
Applicant's arguments, filed 26 September 2021, regarding claims 21 and 22 under 35 USC 102 concern newly added claims and are addressed in the current rejection.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-6 and 8-22 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1 Analysis:  Claims 1-6 and 8-22 are within the four statutory categories.  
Claims 1-6, 8-11, and 21 are drawn to a computation method used in a convolutional neural network, which is within the four statutory categories (i.e. process).  Claims 12-20 and 22 are drawn to a computation device used in a convolutional neural network, which is within the four statutory categories (i.e. machine).

Step 2 Analysis: Claims 1-6 and 8-22 are directed to an abstract idea, do not recite additional elements that would integrate the judicial exception into a practical application, and do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding Independent Claim 1,
	Claim 1 recites:
1. A computation method used in a convolutional neural network, comprising: 
receiving original data; 
determining a first optimal quantization step size according to a distribution of the original data, wherein the step of determining the first optimal quantization step size comprises: calculating a mean and a variance of the distribution of the original data; calculating a first quantization parameter according to the mean and variance of the distribution of the original data; and determining the first optimal quantization step size according to the first quantization parameter; 
performing fixed-point processing to the original data according to the first optimal quantization step size to generate first data; 
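inputting the first data to a first layer of the convolutional neural network to generate first output data; 
determining a second optimal quantization step size according to a distribution of the first output data, wherein the step of determining the second optimal quantization step size comprises: calculating a mean and a variance of the distribution of the first output data; calculating a second quantization parameter according to the mean and variance of the distribution of the first output data; and determining the second optimal quantization step size according to the second quantization parameter; 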
performing the fixed-point processing to the first output data according to the second optimal quantization step size to generate second data; and 
inputting the second data to a second layer of the convolutional neural network; 
wherein before performing the fixed-point processing to the first output data according to the second optimal quantization step size, the first output data is output to a rectified linear (ReLU) layer, 
wherein the ReLU layer is implemented by using a Sigmoid function or a Tanh function.

Step 2A Prong One Analysis: The limitations of determining a first optimal quantization step size (comprising calculating a mean and a variance, calculating a first quantization parameter, and determining the first optimal quantization step size), performing fixed-point processing, determining a second optimal quantization step size (comprising calculating a mean and a variance, calculating a second quantization parameter, and determining the second optimal quantization step size), performing the fixed-point processing, and implementing a ReLU layer using a Sigmoid function or a Tanh function, given the broadest reasonable interpretation, cover the abstract idea of a mathematical concept. Any limitations not identified above as part of the abstract idea(s) are deemed “additional elements,” and will be discussed in further detail below.
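Step 2A Prong Two Analysis:  The claim does not recite additional elements that would integrate the judicial exception into a practical application.  The additional element of a convolutional neural network is recited at such a high level of generality as to be a black box, and thus merely indicates a technological environment in which to apply a judicial exception (see MPEP §2106.05(h)).  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  The additional element of a convolutional neural network is recited at such a high level of generality as to be a black box, and thus merely indicates a technological environment in which to apply a judicial exception (see MPEP §2106.05(h)).  Accordingly, the additional element is not sufficient to amount to significantly more than the judicial exception. Claim 1 is not patent eligible.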
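For illustration only, the mathematical character of the steps identified under Prong One can be seen in a minimal Python sketch. The rule used for the quantization parameter (|mean| + k times the standard deviation), the helper names, the 8-bit width, and the single-weight toy_layer are all assumptions made for this sketch; none of them is taken from the claims or from Applicant's specification.

```python
import statistics


def quantization_parameter(data, k=3.0):
    # Assumed rule (not from the claims): half-range = |mean| + k * std.
    mean = statistics.fmean(data)
    variance = statistics.pvariance(data, mu=mean)
    return abs(mean) + k * variance ** 0.5


def step_size(q_param, bits=8):
    # Spread the symmetric range [-q_param, q_param] over 2**bits levels.
    return 2.0 * q_param / (2 ** bits)


def fixed_point(data, step, bits=8):
    # Round each value to the nearest multiple of `step`,
    # saturating at the signed `bits`-bit integer range.
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return [max(lo, min(hi, round(x / step))) * step for x in data]


def toy_layer(data, weight=0.5):
    # Hypothetical stand-in for a network layer: scale by one weight.
    return [weight * x for x in data]


original = [0.1, -0.4, 0.25, 0.9, -0.75]

# First step size from the distribution of the original data.
step1 = step_size(quantization_parameter(original))
first_data = fixed_point(original, step1)

# Second step size from the distribution of the first layer's output.
first_output = toy_layer(first_data)
step2 = step_size(quantization_parameter(first_output))
second_data = fixed_point(first_output, step2)
```

Under these assumptions every operation is arithmetic on a list of numbers; the network layer appears only as the place where the quantized values are consumed.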

Regarding Dependent Claims 2-6 and 8-11,
Step 2A Prong One Analysis: Dependent Claims 2-6 and 8-11 include other limitations.  For example, Claim 2 recites wherein the data format is a floating-point format, Claim 3 recites wherein the first and second optimal quantization step sizes are different, Claim 5 recites wherein the first input data comprises image data, weight data and/or first bias data, Claim 6 recites fixed-point processing using a non-uniform quantization method, Claim 9 recites additional steps for determining fixed-point format and performing the fixed-point processing, Claim 10 recites wherein the original data comprises original image data, weight data and/or bias data, and Claim 11 recites converting data from a fixed-point format to a floating-point format. However, these limitations only serve to further limit the abstract idea, and hence are nonetheless directed towards fundamentally the same abstract idea as independent Claim 1.
Step 2A Prong Two Analysis:  Dependent Claims 2-6 and 8-11 are not integrated into a practical application.  Dependent Claims 4 and 8 contain additional elements. Claim 4 recites wherein the first layer and the second layer are convolutional layers of the convolutional neural network, and Claim 8 recites wherein the determination is performed offline.  However, these additional elements merely indicate a technological environment in which to apply a judicial exception (see MPEP §2106.05(h)).  Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
Step 2B Analysis: Dependent Claims 2-6 and 8-11 do not include additional elements that are sufficient to amount to "significantly more" than the judicial exception.  As discussed with respect to Step 2A Prong Two, the additional elements in Claims 4 and 8 amount to no more than indicating a technological environment in which to apply a judicial exception (see MPEP §2106.05(h)).  Accordingly, the additional elements are not sufficient to amount to significantly more than the judicial exception. Claims 2-6 and 8-11 are not patent eligible.	

Regarding Independent Claim 12,
	Claim 12 recites:
12. A computation device used in a convolutional neural network, comprising: 
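one or more processors; and 
one or more computer storage media for storing one or more computer-readable instructions, wherein the processor is configured to drive the computer storage media to execute the following tasks: 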
receiving original data; 
determining a first optimal quantization step size according to a distribution of the original data, wherein the step of determining the first optimal quantization step size comprises: calculating a mean and a variance of the distribution of the original data; calculating a first quantization parameter according to the mean and variance of the distribution of the original data; and determining the first optimal quantization step size according to the first quantization parameter; 
performing fixed-point processing to the original data according to the first optimal quantization step size to generate first data; 
inputting the first data to a first layer of the convolutional neural network to generate first output data; 
determining a second optimal quantization step size according to a distribution of the first output data, wherein the step of determining the second optimal quantization step size comprises: calculating a mean and a variance of the distribution of the first output data; calculating a second quantization parameter according to the mean and variance of the distribution of the first output data; and determining the second optimal quantization step size according to the second quantization parameter; 
performing the fixed-point processing to the first output data according to the second optimal quantization step size to generate second data; and 
inputting the second data to a second layer of the convolutional neural network; 
wherein before performing the fixed-point processing to the first output data according to the second optimal quantization step size, the first output data is output to a rectified linear (ReLU) layer; 
wherein the ReLU layer is implemented by using a Sigmoid function or a Tanh function.


Step 2A Prong One Analysis: The limitations of determining a first optimal quantization step size (comprising calculating a mean and a variance, calculating a first quantization parameter, and determining the first optimal quantization step size), performing fixed-point processing, determining a second optimal quantization step size (comprising calculating a mean and a variance, calculating a second quantization parameter, and determining the second optimal quantization step size), performing the fixed-point processing, and implementing a ReLU layer using a Sigmoid function or a Tanh function, given the broadest reasonable interpretation, cover the abstract idea of a mathematical concept, but for the recitation of generic computer components (i.e. processor, computer readable storage media). Any limitations not identified above as part of the abstract idea(s) are deemed “additional elements,” and will be discussed in further detail below.
Step 2A Prong Two Analysis:  The claim does not recite additional elements that would integrate the judicial exception into a practical application.  The additional element of a convolutional neural network is recited at such a high level of generality as to be a black box, and thus merely indicates a technological environment in which to apply a judicial exception (see MPEP §2106.05(h)).  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  The additional element of a convolutional neural network is recited at such a high level of generality as to be a black box, and thus merely indicates a technological environment in which to apply a judicial exception (see MPEP §2106.05(h)).  Accordingly, the additional element is not sufficient to amount to significantly more than the judicial exception. Claim 12 is not patent eligible.

Regarding Dependent Claims 13-20,
Step 2A Prong One Analysis: Dependent Claims 13-20 include other limitations.  For example, Claim 13 recites wherein the data format is a floating-point format and fixed-point format, Claim 14 recites wherein the first and second optimal quantization step sizes are different, Claim 16 recites wherein the first input data comprises image data, weight data and/or first bias data, Claim 17 recites fixed-point processing using a non-uniform quantization method, and Claim 20 recites additional steps for determining fixed-point format and performing the fixed-point processing. However, these limitations only serve to further limit the abstract idea, and hence are nonetheless directed towards fundamentally the same abstract idea as independent Claim 12.
Step 2A Prong Two Analysis:  Dependent Claims 13-20 are not integrated into a practical application.  Dependent Claims 15, 18, and 19 contain additional elements. Claim 15 recites wherein the first layer and the second layer are convolutional layers of the convolutional neural network, Claim 18 recites outputting to a ReLU layer and/or pooling layer, and Claim 19 recites wherein the determination is performed offline.  However, these additional elements merely indicate a technological environment in which to apply a judicial exception (see MPEP §2106.05(h)).  Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
Step 2B Analysis: Dependent Claims 13-20 do not include additional elements that are sufficient to amount to "significantly more" than the judicial exception.  As discussed with respect to Step 2A Prong Two, the additional elements in Claims 15, 18, and 19 amount to no more than indicating a technological environment in which to apply a judicial exception (see MPEP §2106.05(h)).  Accordingly, the additional elements are not sufficient to amount to significantly more than the judicial exception. Claims 13-20 are not patent eligible.	

Regarding Independent Claim 21,
	Claim 21 recites:
21. (New) A computation method used in a convolutional neural network, comprising: 
receiving original data; 
determining a first optimal quantization step size according to a distribution of the original data, wherein the step of determining the first optimal quantization step size comprises: calculating a mean and a variance of the distribution of the original data; calculating a first quantization parameter according to the mean and variance of the distribution of the original data and an adjustment function; and iteratively determining the first optimal quantization step size according to the first quantization parameter and a quantization error function; 
performing fixed-point processing to the original data according to the first optimal quantization step size to generate first data; 
inputting the first data to a first layer of the convolutional neural network to generate first output data; 
determining a second optimal quantization step size according to a distribution of the first output data, wherein the step of determining the second optimal quantization step size comprises: calculating a mean and a variance of the distribution of the first output data; calculating a second quantization parameter according to the mean and variance of the distribution of the first output data and the adjustment function; and iteratively determining the second optimal quantization step size according to the second quantization parameter and the quantization error function; 
performing the fixed-point processing to the first output data according to the second optimal quantization step size to generate second data; and 
inputting the second data to a second layer of the convolutional neural network.

Step 2A Prong One Analysis: The limitations of determining a first optimal quantization step size comprising calculating mean and variance, calculating a first quantization parameter, and determining optimal step size, performing fixed-point processing, determining a second optimal quantization step size comprising calculating a mean and a variance, calculating a second quantization parameter, and determining the second optimal quantization step size, performing the fixed-point processing, given the broadest reasonable interpretation, cover the abstract idea of a mathematical concept. Any limitations not identified above as part of the abstract idea(s) are deemed “additional elements,” and will be discussed in further detail below.
Step 2A Prong Two Analysis:  The claim does not recite additional elements that would integrate the judicial exception into a practical application.  The additional element of a convolutional neural network is recited at such a high level of generality as to be a black box, and thus merely indicates a technological environment in which to apply a judicial exception (see MPEP §2106.05(h)).  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  The additional element of a convolutional neural network is recited at such a high level of generality as to be a black box, and thus merely indicates a technological environment in which to apply a judicial exception (see MPEP §2106.05(h)).  Accordingly, the additional element is not sufficient to amount to significantly more than the judicial exception. Claim 21 is not patent eligible.
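For illustration only, one way to read "iteratively determining" a step size from a quantization parameter and a quantization error function is a simple search over candidate step sizes. The sketch below is an assumption throughout: the starting rule (|mean| + 3 times the standard deviation), the mean-squared-error function, the 4-bit width, and the shrink factor are hypothetical choices, and the claim does not specify any of them.

```python
import statistics


def quantize(data, step, bits=4):
    # Round to the nearest multiple of `step`, saturating at `bits` bits.
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return [max(lo, min(hi, round(x / step))) * step for x in data]


def quantization_error(data, step, bits=4):
    # Assumed error function: mean squared difference between the
    # data and its quantized version.
    q = quantize(data, step, bits)
    return sum((x - y) ** 2 for x, y in zip(data, q)) / len(data)


def iterative_step_size(data, bits=4, iterations=20, shrink=0.9):
    mean = statistics.fmean(data)
    std = statistics.pstdev(data, mu=mean)
    # Assumed starting point derived from the mean and spread.
    step = 2.0 * (abs(mean) + 3.0 * std) / (2 ** bits)
    if step == 0:
        raise ValueError("data must not be all zeros")
    best_step, best_err = step, quantization_error(data, step, bits)
    for _ in range(iterations):
        step *= shrink  # try a slightly finer step on each pass
        err = quantization_error(data, step, bits)
        if err < best_err:
            best_step, best_err = step, err
    return best_step, best_err


data = [0.05, -0.3, 0.7, 1.2, -0.9, 0.4]
best_step, best_err = iterative_step_size(data)
```

The loop keeps whichever candidate gives the smallest error, so the returned error is never worse than that of the starting step size.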

Regarding Independent Claim 22,
	Claim 22 recites:
22. (New) A computation device used in a convolutional neural network, comprising: 
one or more processors; and 
one or more computer storage media for storing one or more computer-readable instructions, wherein the processor is configured to drive the computer storage media to execute the following tasks: 
receiving original data; 
determining a first optimal quantization step size according to a distribution of the original data, wherein the step of determining the first optimal quantization step size comprises: calculating a mean and a variance of the distribution of the original data; calculating a first quantization parameter according to the mean and variance of the distribution of the original data and an adjustment function; and iteratively determining the first optimal quantization step size according to the first quantization parameter and a quantization error function; 
performing fixed-point processing to the original data according to the first optimal quantization step size to generate first data; 
inputting the first data to a first layer of the convolutional neural network to generate first output data; 
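determining a second optimal quantization step size according to a distribution of the first output data, wherein the step of determining the second optimal quantization step size comprises: calculating a mean and a variance of the distribution of the first output data; calculating a second quantization parameter according to the mean and variance of the distribution of the first output data and the adjustment function; and iteratively determining the second optimal quantization step size according to the second quantization parameter and the quantization error function; 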
performing the fixed-point processing to the first output data according to the second optimal quantization step size to generate second data; and 
inputting the second data to a second layer of the convolutional neural network.

Step 2A Prong One Analysis: The limitations of determining a first optimal quantization step size comprising calculating mean and variance, calculating a first quantization parameter, and determining optimal step size, performing fixed-point processing, determining a second optimal quantization step size comprising calculating a mean and a variance, calculating a second quantization parameter, and determining the second optimal quantization step size, and performing the fixed-point processing, given the broadest reasonable interpretation, cover the abstract idea of a mathematical concept, but for the recitation of generic computer components (i.e. processor, computer readable storage media). Any limitations not identified above as part of the abstract idea(s) are deemed “additional elements,” and will be discussed in further detail below.
Step 2A Prong Two Analysis:  The claim does not recite additional elements that would integrate the judicial exception into a practical application.  The additional element of a convolutional neural network is recited at such a high level of generality as to be a black box, and thus merely indicates a technological environment in which to apply a judicial exception (see MPEP §2106.05(h)).  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  The additional element of a convolutional neural network is recited at such a high level of generality as to be a black box, and thus merely indicates a technological environment in which to apply a judicial exception (see MPEP §2106.05(h)).  Accordingly, the additional element is not sufficient to amount to significantly more than the judicial exception. Claim 22 is not patent eligible.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 21 and 22 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Lin et al. (US 2016/0328646, hereinafter "Lin").

Regarding Claim 21,
Lin teaches a computation method used in a convolutional neural network ("Convolutional neural networks (CNNs)" [0006]), comprising: 
receiving original data ("Quantization efficiency in artificial neural networks, according to aspects of the present disclosure, may be better understood by a review of quantization according to the probability distribution function 400 shown in FIG. 4. For example, an input to the quantizer may be uniformly distributed over [Xmin, Xmax], where Xmin and Xmax define the range of a fixed point representation." [0063]);
determining a first optimal quantization step size according to a distribution of the original data (FIG. 8, Determine quantizer parameters for quantizing values of the floating point network based on the selected moment(s) to obtain corresponding values of a fixed point machine learning network 804; "Application of quantization to the weights, biases, and activation values in artificial neural networks includes the determination of a step size. For example, the step sizes of a symmetric uniform quantizer for Gaussian, Laplacian, and Gamma distributions may be calculated with a deterministic function of the standard deviation of the input distribution, if it is assumed that the distributions have zero mean and unit variance." [0064]); 

([image: media_image1.png] [0078]),

wherein the step of determining the first optimal quantization step size comprises: calculating a mean ("measuring the mean activations, µCl)" [0076]) and a variance (σ², "Equations for determining step size may be specified as: σ′=|μ|+σ or σ′=σ or σ′=√(μ²+σ²) (14) where μ and σ are mean and standard deviation of the input to compute an effective sigma value σ′." [0078]; σ² is the square of the standard deviation and the square of the standard deviation is the variance) of the distribution of the original data ("FIG. 7A illustrates an input distribution 700 of activation values for an exemplary deep convolutional network also having a mean (μ) and a variance (σ)" [0069]);
calculating a first quantization parameter according to the mean and variance of the distribution of the original data and an adjustment function ("Equations for determining step size may be specified as: σ′=|μ|+σ or σ′=σ or σ′=√(μ²+σ²) (14) where μ and σ are mean and standard deviation of the input to compute an effective sigma value σ′." [0078]; σ′ teaches a first quantization parameter);
and iteratively determining the first optimal quantization step size according to the first quantization parameter ([image: media_image2.png] [0078], the optimal step size as shown herein is 2^(-n), which is calculated using σ′ in equation 15) and a quantization error function ("a signal to quantization noise ratio (SQNR), assuming M is the number of integer bits is: [image: media_image3.png]" [0063]; the signal to quantization noise ratio (SQNR) teaches a quantization error function as the noise is the deviation of points from the quantization step; "The additional adjustment factor, α, is a value that may be adjusted to improve the classification performance. For example, α may be specified to a value different from 1 in certain scenarios, such as: (1) the input distribution is not Gaussian (e.g., potentially longer tails); or (2) the calculated fixed point representation for the DCN does not agree with the representation calculated based on consideration of the signal to quantization noise ratio (SQNR). In an exemplary DCN model directed towards scene detection, an α different from 1, such as α=1.5, improves performance." [0080]; this teaches that the SQNR is used to make adjustments to the fixed point representation);
performing fixed-point processing to the original data according to the first optimal quantization step size to generate first data ([image: media_image4.png] [0076]);
inputting the first data to a first layer of the convolutional neural network to generate first output data ("Deep learning architectures, such as deep belief networks and deep convolutional networks, are layered neural networks architectures in which the output of a first layer of neurons becomes an input to a second layer of neurons, the output of a second layer of neurons becomes an input to a third layer of neurons, and so on." [0007]);
determining a second optimal quantization step size according to a distribution of the first output data ([image: media_image5.png] [0078]; "The additional adjustment factor, α, is a value that may be adjusted to improve the classification performance." [0080]; "In addition, the step size adjustment factor α may be specified differently throughout the model. For example, α may be specified individually for weights and activations of each layer. In addition, the weights and biases may have very different dynamic ranges. For example, weights and biases may be specified to have different Q number representations and different bit-widths. Additionally, the bit-width of weights and biases in the same layer may be the same. In one configuration, for a given layer, weights have a format of Q 3.18 and biases have a format of Q 6.9." [0081]), 
wherein the step of determining the second optimal quantization step size comprises: calculating a mean ("measuring the mean activations, µCl)" [0076]) and a variance (σ², "Equations for determining step size may be specified as: σ′=|μ|+σ or σ′=σ or σ′=√(μ²+σ²) (14) where μ and σ are mean and standard deviation of the input to compute an effective sigma value σ′." [0078]; σ² is the square of the standard deviation and the square of the standard deviation is the variance) of the distribution of the first output data ("Deep learning architectures, such as deep belief networks and deep convolutional networks, are layered neural networks architectures in which the output of a first layer of neurons becomes an input to a second layer of neurons, the output of a second layer of neurons becomes an input to a third layer of neurons, and so on." [0007]; "the output of each layer may serve as an input of a succeeding layer in the deep convolutional network" [0054]);
calculating a second quantization parameter according to the mean and variance of the distribution of the first output data and the adjustment function ("Equations for determining step size may be specified as: σ′=|μ|+σ or σ′=σ or σ′=√(μ²+σ²) (14) where μ and σ are mean and standard deviation of the input to compute an effective sigma value σ′." [0078]; σ′ teaches a second quantization parameter);
and iteratively determining the second optimal quantization step size according to the second quantization parameter and the quantization error function ([image: media_image2.png] [0078], the optimal step size as shown herein is 2^(-n), which is calculated using σ′ in equation 15);
performing the fixed-point processing to the first output data according to the second optimal quantization step size to generate second data ([image: media_image4.png] [0076]);
and inputting the second data to a second layer of the convolutional neural network ("Deep learning architectures, such as deep belief networks and deep convolutional networks, are layered neural networks architectures in which the output of a first layer of neurons becomes an input to a second layer of neurons, the output of a second layer of neurons becomes an input to a third layer of neurons, and so on." [0007]).
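
For illustration only, the three effective-sigma expressions quoted from Lin's equation (14) can be computed as in the Python sketch below. The function names and variant labels are hypothetical. The sqnr_db helper uses the ordinary decibel definition of a signal-to-noise ratio measured on sample data; it is an assumption of this sketch and does not reproduce the closed-form SQNR expression of Lin [0063], which appears only as an image in the record.

```python
import math
import statistics


def effective_sigma(data, variant="abs_sum"):
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data, mu=mu)
    if variant == "abs_sum":   # sigma' = |mu| + sigma
        return abs(mu) + sigma
    if variant == "sigma":     # sigma' = sigma
        return sigma
    if variant == "rms":       # sigma' = sqrt(mu^2 + sigma^2)
        return math.sqrt(mu * mu + sigma * sigma)
    raise ValueError(f"unknown variant: {variant}")


def sqnr_db(signal, quantized):
    # Empirical ratio of signal power to quantization-noise power, in dB.
    signal_power = sum(x * x for x in signal)
    noise_power = sum((x - q) ** 2 for x, q in zip(signal, quantized))
    if noise_power == 0:
        return math.inf
    if signal_power == 0:
        return -math.inf
    return 10.0 * math.log10(signal_power / noise_power)


sample = [1.0, 3.0]  # mean 2, population standard deviation 1
sigmas = {v: effective_sigma(sample, v) for v in ("abs_sum", "sigma", "rms")}
```

For any input, the "rms" variant is no larger than the "abs_sum" variant, so the choice among the three changes how conservative the resulting step size is.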

Regarding Claim 22,
Claim 22 recites a computation device including a processor (Lin: "processor" [0089]) and memory (Lin: "memory" [0090]) storing instructions for performing functions corresponding to the method steps recited in claim 21.  Lin teaches the limitations of claim 22 as set forth above in connection with claim 21.  Therefore, claim 22 is rejected under the same rationale as claim 21.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-5, 8-10, 12-16, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Lin et al. (US 2016/0328646, hereinafter "Lin") in view of Lin et al. (Fixed Point Optimization of Deep Convolutional Networks, hereinafter "Lin 2").

Regarding Claim 1,
Lin teaches a computation method used in a convolutional neural network ("Convolutional neural networks (CNNs)" [0006]), comprising:
receiving original data ("Quantization efficiency in artificial neural networks, according to aspects of the present disclosure, may be better understood by a review of quantization according to the probability distribution function 400 shown in FIG. 4. For example, an input to the quantizer may be uniformly distributed over [Xmin, Xmax], where Xmin and Xmax define the range of a fixed point representation." [0063]);
determining a first optimal quantization step size according to a distribution of the original data (FIG. 8, Determine quantizer parameters for quantizing values of the floating point network based on the selected moment(s) to obtain corresponding values of a fixed point machine learning network 804; "Application of quantization to the weights, biases, and activation values in artificial neural networks includes the determination of a step size. For example, the step sizes of a symmetric uniform quantizer for Gaussian, Laplacian, and Gamma distributions may be calculated with a deterministic function of the standard deviation of the input distribution, if it is assumed that the distributions have zero mean and unit variance." [0064]; [image: media_image1.png], [0078]),
wherein the step of determining the first optimal quantization step size comprises calculating a mean ("measuring the mean activations, µCl)" [0076]) and a variance (σ^2, "Equations for determining step size may be specified as: σ′ = |μ| + σ or σ′ = σ or σ′ = √(μ^2 + σ^2) (14) where μ and σ are mean and standard deviation of the input to compute an effective sigma value σ′." [0078]; σ^2 is the square of the standard deviation, and the square of the standard deviation is the variance) of the distribution of the original data ("FIG. 7A illustrates an input distribution 700 of activation values for an exemplary deep convolutional network also having a mean (μ) and a variance (σ)" [0069]);
calculating a first quantization parameter according to the mean and variance of the distribution of the original data ("Equations for determining step size may be specified as: σ′ = |μ| + σ or σ′ = σ or σ′ = √(μ^2 + σ^2) (14) where μ and σ are mean and standard deviation of the input to compute an effective sigma value σ′." [0078]; σ′ teaches a first quantization parameter);
determining the first optimal quantization step size according to the first quantization parameter ([image: media_image2.png], [0078]; the optimal step size as shown herein is 2^(-n), which is calculated using σ′ in equation 15);
performing fixed-point processing to the original data according to the first optimal quantization step size to generate first data ([image: media_image4.png], [0076]);
inputting the first data to a first layer of the convolutional neural network to generate first output data ("Deep learning architectures, such as deep belief networks and deep convolutional networks, are layered neural networks architectures in which the output of a first layer of neurons becomes an input to a second layer of neurons, the output of a second layer of neurons becomes an input to a third layer of neurons, and so on." [0007]);
determining a second optimal quantization step size according to a distribution of the first output data ([image: media_image5.png], [0078]; "The additional adjustment factor, α, is a value that may be adjusted to improve the classification performance." [0080]; "In addition, the step size adjustment factor α may be specified differently throughout the model. For example, α may be specified individually for weights and activations of each layer. In addition, the weights and biases may have very different dynamic ranges. For example, weights and biases may be specified to have different Q number representations and different bit-widths. Additionally, the bit-width of weights and biases in the same layer may be the same. In one configuration, for a given layer, weights have a format of Q 3.18 and biases have a format of Q 6.9." [0081]),
wherein the step of determining the second optimal quantization step size comprises calculating a mean ("measuring the mean activations, µCl)" [0076]) and a variance (σ^2, "Equations for determining step size may be specified as: σ′ = |μ| + σ or σ′ = σ or σ′ = √(μ^2 + σ^2) (14) where μ and σ are mean and standard deviation of the input to compute an effective sigma value σ′." [0078]; σ^2 is the square of the standard deviation, and the square of the standard deviation is the variance) of the distribution of the first output data ("Deep learning architectures, such as deep belief networks and deep convolutional networks, are layered neural networks architectures in which the output of a first layer of neurons becomes an input to a second layer of neurons, the output of a second layer of neurons becomes an input to a third layer of neurons, and so on." [0007]; "the output of each layer may serve as an input of a succeeding layer in the deep convolutional network" [0054]);
calculating a second quantization parameter according to the mean and variance of the distribution of the output data ("Equations for determining step size may be specified as: σ′ = |μ| + σ or σ′ = σ or σ′ = √(μ^2 + σ^2) (14) where μ and σ are mean and standard deviation of the input to compute an effective sigma value σ′." [0078]; σ′ teaches a second quantization parameter); and
determining the second optimal quantization step size according to the second quantization parameter ([image: media_image2.png], [0078]; the optimal step size as shown herein is 2^(-n), which is calculated using σ′ in equation 15);
performing the fixed-point processing to the first output data according to the second optimal quantization step size to generate second data ([image: media_image4.png], [0076]);
inputting the second data to a second layer of the convolutional neural network ("Deep learning architectures, such as deep belief networks and deep convolutional networks, are layered neural networks architectures in which the output of a first layer of neurons becomes an input to a second layer of neurons, the output of a second layer of neurons becomes an input to a third layer of neurons, and so on." [0007]); and
wherein before performing the fixed-point processing to the first output data according to the second optimal quantization step size, the first output data is output to a rectified linear (ReLU) layer (FIG. 3A; "Quantization, however, need not be applied to the pooling layer if max pooling is specified, and/or the neuron layer if rectified linear units (ReLU) are specified." [0062]).

Lin does not explicitly disclose wherein the ReLU layer is implemented by using a Sigmoid function or a Tanh function.
Lin 2 teaches wherein the ReLU layer is implemented by using a Sigmoid function or a Tanh function ("Other nonlinear activation functions such as tanh, sigmoid, PReLU functions are much harder to model and analyze. However, in Section 5.1 we will see that applying the analysis in this section to a network with PReLU activation functions still yields useful enhancements." sec. 4.1.5, p. 5).
Lin and Lin 2 are analogous art because both are directed to fixed point neural network quantization. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the activation function of Lin with the activation function alternatives such as tanh and sigmoid of Lin 2.  The modification would have been obvious because one of ordinary skill in the art would be motivated to use known alternatives to yield useful enhancements that reduce model size without any loss of accuracy, as suggested by Lin 2 ("applying the analysis in this section to a network with PReLU activation functions still yields useful enhancements." sec. 4.1.5, p. 5; "the fixed point DCNs with optimized bit width allocation offer >20% reduction in model size without any loss in accuracy" sec. Abstract, p. 1).
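
For illustration only, the per-layer step-size determination mapped above to Lin [0076]-[0078] can be sketched in Python as follows. The sketch computes the effective sigma σ′ from the mean and standard deviation in the three forms quoted from equation (14), derives a power-of-two step size 2^(-n), quantizes the data, and repeats the process on the output of a layer. Equation (15) is not reproduced in this action, so the rule used to pick n (the `scale` constant and the rounding in `power_of_two_step`) is an assumption, as are all function names, the 8-bit width, and the toy layer.

```python
import math
from statistics import fmean, pstdev


def effective_sigma(values, mode="rms"):
    """Effective sigma in the three forms quoted from equation (14)."""
    mu = fmean(values)
    sigma = pstdev(values)  # population standard deviation; sigma**2 is the variance
    if mode == "abs_sum":   # sigma' = |mu| + sigma
        return abs(mu) + sigma
    if mode == "sigma":     # sigma' = sigma
        return sigma
    if mode == "rms":       # sigma' = sqrt(mu^2 + sigma^2)
        return math.sqrt(mu * mu + sigma * sigma)
    raise ValueError(f"unknown mode: {mode}")


def power_of_two_step(sigma_eff, scale=0.25):
    """ASSUMPTION (stand-in for equation 15, which is not reproduced here):
    pick the integer n whose step 2**-n is nearest, on a log2 scale,
    to scale * sigma_eff."""
    if sigma_eff <= 0 or scale <= 0:
        raise ValueError("sigma_eff and scale must be positive")
    n = round(-math.log2(scale * sigma_eff))
    return n, 2.0 ** -n


def quantize(values, step, bit_width=8):
    """Uniform fixed-point quantization with saturation to a signed range."""
    lo = -(2 ** (bit_width - 1))
    hi = 2 ** (bit_width - 1) - 1
    return [max(lo, min(hi, round(v / step))) * step for v in values]


def toy_layer(values, weight=2.0):
    """Hypothetical stand-in for a network layer: scale, then ReLU."""
    return [max(0.0, weight * v) for v in values]


original = [0.5, -0.25, 1.0, 0.75]

# First step size from the distribution of the original data.
n1, step1 = power_of_two_step(effective_sigma(original))
first_data = quantize(original, step1)

# Second step size from the distribution of the first layer's output.
first_output = toy_layer(first_data)
n2, step2 = power_of_two_step(effective_sigma(first_output))
second_data = quantize(first_output, step2)

print(step1, step2)
```

Because each step size is recomputed from the statistics of the data entering the next layer, the two step sizes generally differ, as they do for this toy input.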

Regarding Claim 2,
The Lin/Lin 2 combination teaches the method of claim 1.  Lin further teaches wherein the data format of the original data is a floating-point format and the data formats of the first data and the second data are a fixed-point format (FIG. 8, Determine quantizer parameters for quantizing values of the floating point network based on the selected moment(s) to obtain corresponding values of a fixed point machine learning network 804; "In an aspect of the present disclosure, the instructions loaded into the general-purpose processor 102 may comprise code for quantizing a floating point neural network to obtain a fixed point neural network. The instructions loaded into the general-purpose processor 102 may also comprise code for providing a fixed point representation when quantizing weights, biases and activation values in the network." [0036]).

Regarding Claim 3,
The Lin/Lin 2 combination teaches the method of claim 1.  Lin further teaches wherein the first optimal quantization step size is different from the second optimal quantization step size ("In addition, the step size adjustment factor α may be specified differently throughout the model. For example, α may be specified individually for weights and activations of each layer. In addition, the weights and biases may have very different dynamic ranges. For example, weights and biases may be specified to have different Q number representations and different bit-widths. Additionally, the bit-width of weights and biases in the same layer may be the same. In one configuration, for a given layer, weights have a format of Q 3.18 and biases have a format of Q 6.9." [0081]).

Regarding Claim 4,
The Lin/Lin 2 combination teaches the method of claim 1.  Lin further teaches wherein the first layer and the second layer are convolutional layers of the convolutional neural network ("In some artificial neural networks (ANNs), such as a deep convolutional network (DCN), quantization may be applied to activations of the normalization layer; weights, biases, and activations of the fully connected layer; and/or weights, biases, and activations of the convolution layer." [0032]).

Regarding Claim 5,
The Lin/Lin 2 combination teaches the method of claim 4.  Lin further teaches wherein the first data input to the first layer of the convolutional neural network comprises first image data ("A DCN may be trained with supervised learning. During training, a DCN may be presented with an image, such as a cropped image of a speed limit sign, and a “forward pass” may then be computed to produce an output 322." [0044]), first weight data and/or first bias data ("FIG. 6A illustrates an input distribution 600 for an exemplary deep convolutional network. In this example, the input distribution 600 includes a variance (σ) and a mean value (μ). Aspects of the present disclosure are directed towards specifying a zero mean (μ=0) for the distributions of weights, biases and activation values, for example, as shown in FIG. 6B." [0066]).

Regarding Claim 8,
The Lin/Lin 2 combination teaches the method of claim 1.  Lin further teaches wherein the first optimal quantization step size and/or the second optimal quantization step size are determined offline ("A processor may also be implemented as a combination of computing devices" [0089]).

Regarding Claim 9,
The Lin/Lin 2 combination teaches the method of claim 1.  Lin further teaches wherein the step of performing the fixed-point processing to the original data according to the first optimal quantization step size to generate the first data further comprises: determining a first fixed-point format ("Fixed point numbers may be specified for using less complex software and/or hardware designs at the cost of reduced accuracy because floating point numbers have a greater dynamic range compared to fixed point numbers. Converting floating point numbers to fixed point numbers through the process of quantization may decrease the complexity of hardware and/or software implementations. The floating point numbers may assume a single-precision binary format including a sign bit, an 8-bit exponent, and a 23-bit fraction component. Aspects of the disclosure are directed to using the Q number format to represent fixed point numbers. Still, other formats may be considered. The Q number format is represented as Qm.n, where m is a number of bits for an integer part and n is a number of bits for a fraction part. In one configuration, m does not include a sign bit. Each Qm.n format may use an m+n+1 bit signed integer container with n fractional bits. In one configuration, the range is [−(2^m), 2^m − 2^(-n)] and the resolution is 2^(-n). For example, a Q14.1 format number may use sixteen bits. In this example, the range is [−2^14, 2^14 − 2^(-1)] (e.g., [−16384.0, +16383.5]) and the resolution is 2^(-1) (e.g., 0.5)." [0059]-[0060]) according to the first optimal quantization step size; and performing the fixed-point processing to the original data according to the first fixed-point format to generate the first data ([image: media_image4.png], [0076]).
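
As a worked check of the Qm.n arithmetic quoted from Lin [0059]-[0060], the following Python sketch computes the container width, range, and resolution of a Qm.n number and converts values to and from that format. The helper names are hypothetical, and `format_from_step`, which derives a Qm.n format from a power-of-two step size and a chosen container width, is an assumption about how a format could follow from a step size rather than a procedure taken from Lin.

```python
import math


def q_format_info(m, n):
    """Container width, range, and resolution of a signed Qm.n number
    (m integer bits excluding the sign bit, n fractional bits)."""
    resolution = 2.0 ** -n
    return {
        "bits": m + n + 1,
        "min": -(2.0 ** m),
        "max": 2.0 ** m - resolution,
        "resolution": resolution,
    }


def format_from_step(step, total_bits):
    """Hypothetical helper: choose Qm.n for a power-of-two step size 2**-n
    stored in a signed container of total_bits bits."""
    n = round(-math.log2(step))
    if 2.0 ** -n != step:
        raise ValueError("step must be an exact power of two")
    m = total_bits - 1 - n
    if m < 0:
        raise ValueError("container too narrow for this step size")
    return m, n


def to_fixed(value, m, n):
    """Round to the nearest multiple of 2**-n and saturate to the Qm.n range;
    returns the raw stored integer."""
    raw = round(value * 2 ** n)
    raw_min = -(1 << (m + n))
    raw_max = (1 << (m + n)) - 1
    return max(raw_min, min(raw_max, raw))


def to_float(raw, n):
    """Recover the real value represented by a raw Qm.n integer."""
    return raw / 2 ** n


info = q_format_info(14, 1)
print(info)  # Q14.1: 16 bits, range [-16384.0, 16383.5], resolution 0.5
```

For the Q14.1 example in the quoted passage, the sketch reproduces the sixteen-bit container, the range [−16384.0, +16383.5], and the resolution 0.5.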

Regarding Claim 10,
The Lin/Lin 2 combination teaches the method of claim 1.  Lin further teaches wherein the original data comprises original image data ("A DCN may be trained with supervised learning. During training, a DCN may be presented with an image, such as a cropped image of a speed limit sign, and a “forward pass” may then be computed to produce an output 322." [0044]), first weight data and/or first bias data ("FIG. 6A illustrates an input distribution 600 for an exemplary deep convolutional network. In this example, the input distribution 600 includes a variance (σ) and a mean value (μ). Aspects of the present disclosure are directed towards specifying a zero mean (μ=0) for the distributions of weights, biases and activation values, for example, as shown in FIG. 6B." [0066]).

Regarding Claim(s) 12-16 and 19-20,
Claim(s) 12-16 and 19-20 recite(s) a computation device including a processor (Lin: "processor" [0089]) and memory (Lin: "memory" [0090]) storing instructions for performing functions corresponding to the method steps recited in claim(s) 1-5 and 8-9, respectively.  The Lin/Lin 2 combination teaches the limitations of claim(s) 12-16 and 19-20 as set forth above in connection with claim(s) 1-5 and 8-9.  Therefore, claim(s) 12-16 and 19-20 is/are rejected under the same rationale as respective claim(s) 1-5 and 8-9.

Regarding Claim 18,
The Lin/Lin 2 combination teaches the computation device of claim 12.  Lin further teaches wherein before performing the fixed-point processing to the first output data according to the second optimal quantization step size, the first output data is output to a rectified linear (ReLU) layer and/or a pooling layer (FIG. 3A; "Quantization, however, need not be applied to the pooling layer if max pooling is specified, and/or the neuron layer if rectified linear units (ReLU) are specified." [0062]).


Claim(s) 6 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Lin et al. (US 2016/0328646, hereinafter "Lin") in view of Lin et al. (Fixed Point Optimization of Deep Convolutional Networks, hereinafter "Lin 2") and Anwar et al. (Fixed Point Optimization of Deep Convolutional Neural Networks for Object Recognition, hereinafter "Anwar").

Regarding Claim 6,
The Lin/Lin 2 combination teaches the method of claim 5.  Lin does not explicitly teach wherein the weight data and the bias data are performed fixed-point processing by using a non-uniform quantization method.
Anwar teaches wherein the weight data and the bias data (Sec. 2.2, "Layer wise sensitivity analysis for non-uniform quantization"; "Figure 2 shows that the weights between the penultimate and final layer, rearlayer7 are most sensitive to quantization" sec. 2.2, p. 1133) are performed fixed-point processing ("optimization method for fixed point deep convolutional neural network" sec. Abstract, p. 1131) by using a non-uniform quantization method ("Table 2 shows a layer wise distribution of weights after undergoing non-uniform quantization and retraining." sec. 3, p. 1133).
Lin and Anwar are analogous art because both are directed towards quantization of convolutional neural networks. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the quantization method of the Lin/Lin 2 combination with the non-uniform quantization of Anwar.  The modification would have been obvious because one of ordinary skill in the art would be motivated to reduce 

Regarding Claim(s) 17,
Claim(s) 17 recite(s) a computation device including a processor (Lin: "processor" [0089]) and memory (Lin: "memory" [0090]) storing instructions for performing functions corresponding to the method steps recited in claim(s) 6.  The Lin/Lin 2/Anwar combination teaches the limitations of claim(s) 17 as set forth above in connection with claim(s) 6.  Therefore, claim(s) 17 is/are rejected under the same rationale as claim(s) 6.

Claim(s) 11 is/are rejected under 35 U.S.C. 103 as being unpatentable over Lin et al. (US 2016/0328646, hereinafter "Lin") in view of Lin et al. (Fixed Point Optimization of Deep Convolutional Networks, hereinafter "Lin 2") and Morcel et al. (FPGA-based Accelerator for Deep Convolutional Neural Networks for the SPARK Environment, hereinafter "Morcel").

Regarding Claim 11,
The Lin/Lin 2 combination teaches the method of claim 1.  Lin does not explicitly teach wherein before determining the second optimal quantization step size according to the distribution of the first output data further comprises: converting the data format of the first output data from a fixed-point format to a floating-point format.
Morcel teaches converting the data format of the first output data from a fixed-point format to a floating-point format ("Figure 2 shows the architecture of our accelerator. It contains a floating-to-fixed-point conversion circuit, a zero padding circuit, a Pipelined 2D convolution filter, a fixed-to-floating-point conversion circuit, and finally, a Control Unit." sec. IV.B, p. 130).
Lin and Morcel are analogous art because both are directed towards training deep convolutional neural networks. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the quantization method of the Lin/Lin 2 combination with the training accelerator of Morcel.  The modification would have been obvious because one of ordinary skill in the art would be motivated to accelerate training of deep convolutional neural networks, as suggested by Morcel (sec. Abstract, p. 126).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHARLES C KUO whose telephone number is (571)270-7477.  The examiner can normally be reached on M-F: 9:00 a.m. - 6:00 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/CHARLES C KUO/Examiner, Art Unit 2126 
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126