DETAILED ACTION
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .
This Office Action is in response to communications filed on June 5, 2018. 
Claims 1 – 10 are presented for examination and are pending. 
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The certified copy has been filed on July 23, 2018.
Oath/Declaration
For the record, the Examiner acknowledges that the Oath/Declaration filed on June 5, 2018 has been received. 
Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(4) because reference character “100” has been used to designate both a neuron and a neuron output value. See Para [0019] of the specification. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance. 
Specification
The disclosure is objected to because of the following informalities: Para [0025]: “training result close to that of using the exact floating-point numbers for the weights can be obtained” should be rewritten as “training results close to that of using the exact floating-point numbers for the weights can be obtained”. Para [0025]: “the training makes the inference result of the trained neural network converges…” should be rewritten as “the training makes the inference result of the trained neural network converge…”
Claim Objections
Claim 10 is objected to because of the following informalities: 
With respect to claim 10, “fix-point number” should be rewritten as “fixed-point number”.
Appropriate correction is required. 

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 2 – 10 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Regarding Claim 2,
Regarding Claim 4, 
With respect to claim 4, the term “suitable” is a relative term which renders the claim indefinite. The term "suitable" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. For examination purposes, “suitable” has been interpreted as a group within the mantissa capable of receiving a carry or borrow signal.  
Regarding Claim 5, 
With respect to claim 5, claim 5 recites “the most significant group” in lines 6 – 7 of the claim. There is insufficient antecedent basis for this limitation in the claim.  
With respect to claim 5, the limitation “wherein the neural network weight before adjustment is first adjusted in the mantissa of the neural network weight before adjustment followed by adjustment of the exponent of the neural network weight before adjustment,” lacks clarity because it is unclear how a weight, mantissa, and exponent can be adjusted before adjustment. A recommended amendment is “wherein the neural network weight  is first adjusted in the mantissa of the neural network weight  followed by adjustment of the exponent of the neural network weight ,”.
Regarding Claim 6, 
With respect to claim 6, claim 6 recites “the most significant group” in lines 11 – 12 and 17 – 20 of the claim. There is insufficient antecedent basis for these limitations in the claim. 
Regarding Claim 7, 
With respect to claim 7, the limitation “wherein the next group on the left is closer to the most significant group” lacks clarity because it is unclear what closer is being compared to. For examination purposes, “wherein the next group on the left is closer to the most significant group” has been interpreted as wherein the next group on the left is closer to the most significant group than the current adjustment group.  
Claim 7 recites “the most significant group” in lines 2 and 14 of the claim. There is insufficient antecedent basis for these limitations in the claim.
Claim 7 recites “the opposite limit” in lines 9 – 10 of the claim. There is insufficient antecedent basis for this limitation in the claim.
Claim 7 recites “the next group on the left” in lines 11 and 13 – 14 of the claim. There is insufficient antecedent basis for these limitations in the claim. 
Regarding Claim 8, 
With respect to claim 8, claim 8 recites “the most significant group” in lines 2, 3 – 4, 12, and 14 of the claim. There is insufficient antecedent basis for these limitations in the claim.
Regarding Claim 9, 


Dependent claims 3 – 10 are rejected as being directly or indirectly dependent on rejected claims. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Lai et al. (“Deep Convolutional Neural Network Inference with Floating-point Weights and Fixed-point Activations”, hereinafter “Lai”) in view of Chen et al. (“Grouped Signed Power-of-Two Algorithms for Low-Complexity Adaptive Equalization”, hereinafter “Chen”), further in view of Sze et al. (“Efficient Processing of Deep Neural Networks”, hereinafter “Sze”).

As per claim 1, Lai teaches A method of training a neural network, for a neural network, comprising: using a floating-point signed digit number to represent a neural network weight of the neural network, (Page 1: “In this work, we propose using floating-point numbers for representing the weights” and Section 3.2: “One example of floating-point number representation is shown in Fig. 2. For a floating-point representation, there are typically three parts: sign, mantissa and exponent. The sign bit determines whether the number is a positive or negative number” teaches using a floating-point signed digit number to represent a neural network weight)

[Image: media_image1.png (Lai, Fig. 2: example floating-point number representation)]

and an exponent of the neural network weight is represented by an exponent digit group; (Fig. 2 (shown above) teaches that the exponent of the neural network weight is represented by an exponent digit group because the exponent is a group of four bits)
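As an illustration of the sign, exponent, and mantissa parts described in the passage quoted from Lai, Section 3.2, the following minimal sketch splits a value into those three fields. It assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits) rather than the four-bit exponent shown in Lai's Fig. 2, and the function name `split_float32` is hypothetical.

```python
import struct


def split_float32(x: float):
    """Return (sign, biased_exponent, fraction) of x stored as an IEEE 754 float32.

    Assumption: single-precision layout, not the reduced format of Lai's Fig. 2.
    """
    # Reinterpret the 4 bytes of the float32 as an unsigned 32-bit integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit: 0 for positive, 1 for negative
    exponent = (bits >> 23) & 0xFF   # 8 bits: exponent stored with a bias of 127
    fraction = bits & 0x7FFFFF       # 23 bits: mantissa without the implicit leading 1
    return sign, exponent, fraction


# -6.5 = -1.625 * 2**2, so the biased exponent is 2 + 127
# and the fraction field encodes 0.625.
sign, exponent, fraction = split_float32(-6.5)
```

The exponent field here is the single "exponent digit group" in the sense used above; the fraction field is an ordinary binary mantissa, not a signed digit representation.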
Lai does not appear to explicitly teach: 
wherein a mantissa of the neural network weight is represented by multiple mantissa signed digit groups, 
and using the exponent digit group and at least one of the multiple mantissa signed digit groups of the neural network weight to perform weight adjustment computation and neural network inference computation. 
However, Chen teaches: 
(Section 2A: “Suppose that a 12-bit GSPT number is partitioned into three groups. Then, a GSPT number can be represented as

[Image: media_image2.png (Chen, Section 2A: example 12-bit GSPT number partitioned into three groups)]

where each group is marked by an underline and the signed digit ‘-1’ is represented by $\bar{1}$.” teaches that the mantissa is represented by multiple signed digit groups)
Lai and Chen are analogous art because they are directed to reducing computational complexity involving numeric systems. 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Chen’s GSPT number system into Lai’s method of deep convolutional neural network inference with floating-point weights with a motivation to “[reduce] the complexity of the linear filter… since fewer digits need processing in data-coefficient multiplication” (Chen, page 816).
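The grouped signed-digit representation relied on from Chen, Section 2A can be pictured with a short sketch. The code below is an illustration only: it assumes that each group is a list of digits drawn from {-1, 0, 1}, written most significant digit first, with at most one nonzero digit per group. The names `group_is_valid` and `grouped_value` and the example digits are hypothetical and are not taken from Chen.

```python
from typing import List

Group = List[int]  # digits in {-1, 0, 1}, most significant digit first


def group_is_valid(group: Group) -> bool:
    """Assumed rule: signed digits only, and at most one nonzero digit per group."""
    return all(d in (-1, 0, 1) for d in group) and sum(d != 0 for d in group) <= 1


def grouped_value(groups: List[Group]) -> int:
    """Evaluate the concatenated digit groups as a signed-digit binary integer."""
    value = 0
    for group in groups:
        if not group_is_valid(group):
            raise ValueError(f"invalid group: {group}")
        for digit in group:
            value = 2 * value + digit  # Horner evaluation, one digit at a time
    return value


# Hypothetical 12-digit number in three groups: 0100 | 0 0 -1 0 | 0001
example = [[0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]
example_value = grouped_value(example)  # 2**10 - 2**5 + 2**0
```

Because each group contributes at most one signed power of two, multiplying by such a number needs at most one shift-and-add (or shift-and-subtract) per group, which is consistent with the complexity motivation quoted from Chen.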
The combination of Lai and Chen does not appear to explicitly teach: 
and using the exponent digit group and at least one of the multiple mantissa signed digit groups of the neural network weight to perform weight adjustment computation and neural network inference computation.
However, Sze teaches: 
and using the exponent digit group and at least one of the multiple mantissa signed digit groups of the neural network weight to perform weight adjustment computation and neural network inference computation. (Page 2298: “When training a network, the weights ($w_{ij}$) are usually updated using a hill-climbing optimization process called gradient descent. A multiple of the gradient of the loss relative to each weight, which is the partial derivative of the loss with respect to the weight, is used to update the weight (i.e., updated $w_{ij}^{t+1} = w_{ij}^{t} - \alpha \, (\partial L / \partial w_{ij})$, where α is called the learning rate). Note that this gradient indicates how the weights should change in order to reduce the loss.” teaches using the weight of a neural network to compute a gradient (weight adjustment value). The gradient is a weight adjustment value because it indicates how the weights should change. Page 2317, Fig. 38(a) teaches that the weight can be represented by an exponent digit group and mantissa signed digit groups; Page 2297: “In the specific case of DNNs, this learning involves determining the value of the weights (and bias) in the network, and is referred to as training the network. Once trained, the program can perform its task by computing the output of the network using the weights determined during the training process. Running the program with these weights is referred to as inference.” teaches that training the neural network (performing weight adjustment computation) is used for neural network inference.)
Lai, Chen, and Sze are analogous art because they are directed to reducing computational complexity involving numeric systems. 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Sze’s method for efficient processing of deep neural networks into Lai’s method of deep convolutional neural network inference with floating-point weights as modified by Chen with a motivation to “[provide] techniques that enable efficient processing of DNNs to improve energy efficiency and throughput without sacrificing application accuracy or increasing hardware cost” (Sze, page 2295).
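The gradient descent update quoted from Sze, page 2298 can be illustrated with a minimal sketch for a single weight. The squared-error loss, the function names, and the numeric values below are assumptions chosen for illustration; they do not appear in Sze or in the claims.

```python
def loss_gradient(w: float, x: float, y: float) -> float:
    """Gradient of the assumed loss L = (w * x - y)**2 with respect to w."""
    return 2.0 * (w * x - y) * x


def gradient_descent_step(w: float, grad: float, learning_rate: float) -> float:
    """Apply w_new = w - learning_rate * grad (the quoted update rule)."""
    return w - learning_rate * grad


# Hypothetical values: weight 1.0, input 2.0, target 6.0, learning rate 0.1.
w_before = 1.0
grad = loss_gradient(w_before, x=2.0, y=6.0)        # 2 * (2 - 6) * 2 = -16
w_after = gradient_descent_step(w_before, grad, learning_rate=0.1)
```

Here the scaled gradient plays the role of the weight adjustment value, `w_before` the weight before adjustment, and `w_after` the weight after adjustment. The sketch uses ordinary floating-point arithmetic and does not model a signed digit mantissa.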

As per claim 2, the combination of Lai, Chen, and Sze as shown above teaches The method of claim 1, 
Sze further teaches: 
(Page 2298: “When training a network, the weights ($w_{ij}$) are usually updated using a hill-climbing optimization process called gradient descent. A multiple of the gradient of the loss relative to each weight, which is the partial derivative of the loss with respect to the weight, is used to update the weight (i.e., updated $w_{ij}^{t+1} = w_{ij}^{t} - \alpha \, (\partial L / \partial w_{ij})$, where α is called the learning rate). Note that this gradient indicates how the weights should change in order to reduce the loss.” teaches using the weight of a neural network to compute a gradient (weight adjustment value). The gradient is a weight adjustment value because it indicates how the weights should change. Page 2317, Fig. 38(a) teaches that the weight can be represented by an exponent digit group and mantissa signed digit groups)

[Image: media_image3.png]

adjusting the neural network weight before adjustment according to the weight adjustment value to generate a neural network weight after adjustment. (Page 2298: “When training a network, the weights ($w_{ij}$) are usually updated using a hill-climbing optimization process called gradient descent. A multiple of the gradient of the loss relative to each weight, which is the partial derivative of the loss with respect to the weight, is used to update the weight…” teaches updating the neural network weight according to the gradient (weight adjustment value) to generate a weight after adjustment; “Note that this gradient indicates how the weights should change in order to reduce the loss.” teaches that the gradient is a weight adjustment value because it indicates how the weights should change)

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Sze’s method for efficient processing of deep neural networks into Lai’s method of deep convolutional neural network inference with floating-point weights as modified by Chen with a motivation to “[provide] techniques that enable efficient processing of DNNs to improve energy efficiency and throughput without sacrificing application accuracy or increasing hardware cost” (Sze, page 2295).

As per claim 10, the combination of Lai, Chen, and Sze as shown above teaches The method of claim 2, 
Sze further teaches: 
wherein the weight adjustment value is generated by a computation involving the exponent digit group and the at least one of the multiple mantissa signed digit groups of the neural network weight and a neuron output value of the neural network, (Page 2298: “When training a network, the weights ($w_{ij}$) are usually updated using a hill-climbing optimization process called gradient descent. A multiple of the gradient of the loss relative to each weight, which is the partial derivative of the loss with respect to the weight, is used to update the weight (i.e., updated $w_{ij}^{t+1} = w_{ij}^{t} - \alpha \, (\partial L / \partial w_{ij})$, where α is called the learning rate). Note that this gradient indicates how the weights should change in order to reduce the loss.” teaches using the weight of a neural network to compute a gradient (weight adjustment value). The gradient is a weight adjustment value because it indicates how the weights should change. Page 2317, Fig. 38(a) teaches that the weight can be represented by an exponent digit group and mantissa signed digit groups. Page 2298 and Fig. 4: “An efficient way to compute the partial derivatives of the gradient is through a process called backpropagation. Backpropagation, which is a computation derived from the chain rule of calculus, operates by passing values backwards through the network to compute how the loss is affected by each weight.” teaches using the forward activations (neuron output value) in order to compute the gradient (weight adjustment value) along with the weights of the neural network, represented by an exponent digit group and mantissa signed digit groups; Page 2302: “input and output feature maps (ifmaps, ofmaps) are composed of activations (i.e., input and output neurons).” teaches that activations are neuron output values)

[Image: media_image4.png]

wherein the neuron output value is represented by a fix-point number or a floating-point number. (Page 2302: “input and output feature maps (ifmaps, ofmaps) are composed of activations (i.e., input and output neurons).” teaches that activations are neuron output values; Page 2317 – 2318: “Using dynamic fixed point, the bitwidth can be reduced to 8 b for the weights and 10 b for the activations without any fine tuning of the weights [123]; with fine tuning, both weights and activations can reach 8 b [124].” teaches that the activations are represented by fixed-point numbers)
Lai, Chen, and Sze are analogous art because they are directed to reducing computational complexity involving numeric systems. 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Sze’s method for efficient processing of deep neural networks into Lai’s method of deep convolutional neural network inference with floating-point weights as modified by Chen with a motivation to “[provide] techniques that enable efficient processing of DNNs to improve energy efficiency and throughput without sacrificing application accuracy or increasing hardware cost” (Sze, page 2295).

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Lai in view of Chen and Sze as shown above, further in view of Nystad (GB 2537419 A, hereinafter “Nystad”).

As per claim 3, the combination of Lai, Chen, and Sze as shown above teaches The method of claim 2, 
Lai further teaches: 
wherein the neural network weight before adjustment, the weight adjustment value, and the neural network weight after adjustment are represented by standard floating-point numbers, further comprising: (Section 3.2: “For a floating-point representation, there are typically three parts: sign, mantissa and exponent. The sign bit determines whether the number is a positive or negative number. The mantissa determines the significand part and the exponent determine the scale of the value. Usually, there are some special encodings used for representing some special numbers (e.g., 0, NaN and +/- infinity), For binary floating-point numbers, the mantissa can assume an implicit bit, which is also adopted by IEEE floating-point standard.” teaches that all of the weights used by the neural network are represented by floating point binary numbers that follow the IEEE floating-point standards.)
The combination of Lai, Chen, and Sze does not appear to explicitly teach: 
converting the mantissa of the neural network weight after adjustment from the standard floating-point number into the floating-point signed digit number, to generate the neural network weight represented by the floating-point signed digit number.
However, Nystad teaches: 
converting the mantissa of the neural network weight after adjustment from the standard floating-point number into the floating-point signed digit number, to generate the neural network weight represented by the floating-point signed digit number. (Page 10, lines 16 – 18: “the conversion circuitry is capable of converting a mantissa value of the floating-point input value into a sign magnitude representation for the intermediate format.” teaches converting the mantissa of the weight from standard floating-point to a floating point-signed digit number.)
Lai, Chen, Sze, and Nystad are analogous art because they are directed to reducing computational complexity involving numeric systems. 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Nystad’s signed floating point conversion into Lai’s method of deep convolutional neural network inference with floating-point weights as modified by Chen and Sze with a motivation to “[simplify] the summation of two values…” (Nystad, Page 9, line 27).

Claims 4 – 8 are rejected under 35 U.S.C. 103 as being unpatentable over Lai in view of Chen and Sze as shown above, further in view of Hemmert et al. (“Fast, Efficient Floating-Point Adders and Multipliers for FPGAs”, hereinafter “Hemmert”).

As per claim 4, the combination of Lai, Chen, and Sze as shown above teaches The method of claim 2, 
Lai further teaches: 
wherein the weight adjustment value is represented by a standard floating-point number, the neural network weight before adjustment, and the neural network weight after adjustment are represented by the floating-point signed digit numbers, further comprising: (Section 3.2: “For a floating-point representation, there are typically three parts: sign, mantissa and exponent. The sign bit determines whether the number is a positive or negative number. The mantissa determines the significand part and the exponent determine the scale of the value. Usually, there are some special encodings used for representing some special numbers (e.g., 0, NaN and +/- infinity), For binary floating-point numbers, the mantissa can assume an implicit bit, which is also adopted by IEEE floating-point standard.” teaches that all of the weights used by the neural network are represented by floating point binary numbers that follow the IEEE floating-point standards.)
Chen further teaches: 
wherein the weight adjustment value indicates at least one suitable group, (Fig. 2 teaches that the coefficient updating block (weight adjustment value) indicates at least one suitable group; each group is suitable because each group is capable of receiving a carry or borrow signal)

[Image: media_image5.png]

and each of the at least one suitable group corresponds to at least one carry signal or at least one borrow signal. (Fig. 2 teaches that each of the groups corresponds to at least a carry or borrow signal; each group is suitable because each group is capable of receiving a carry or borrow signal)
Lai, Chen, and Sze are analogous art because they are directed to reducing computational complexity involving numeric systems. 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Chen’s GSPT number system into Lai’s method of deep convolutional neural network inference with floating-point weights as modified by Sze with a motivation to “[reduce] the complexity of the linear filter… since fewer digits need processing in data-coefficient multiplication” (Chen, page 816).

The combination of Lai, Chen, and Sze does not appear to explicitly teach: 
converting the weight adjustment value from the standard floating-point number into the floating-point signed digit number;
However, Hemmert teaches: 
(Page 14: “(1) Sort numbers. (2) compute exponent: exp = exp_large + exp_small − BIAS. (3) If smaller number is denormal, perform zero detect on smaller mantissa, let S_d equal the number of leading zeros. (4) Left shift smaller mantissa by S_d and compute exp = exp − S_d.” teaches converting the weight adjustment value from standard floating point to signed floating point.)
Lai, Chen, Sze, and Hemmert are analogous art because they are directed to reducing computational complexity involving numeric systems. 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Hemmert’s system of fast, efficient floating-point adders and multipliers into Lai’s method of deep convolutional neural network inference with floating-point weights as modified by Chen and Sze with a motivation to “[optimize] designs for floating-point add and multiply… ” (Hemmert, page 2).

As per claim 5, the combination of Lai, Chen, Sze, and Hemmert as shown above teaches The method of claim 4, 
Hemmert further teaches: 
wherein the neural network weight before adjustment is first adjusted in the mantissa of the neural network weight before adjustment followed by adjustment of the exponent of the neural network weight before adjustment, (Page 14: “The basic floating-point multiplication algorithm, without details on exception handling, is as follows.” and “(6) Perform mantissa multiplication. (7) For denormal output (negative exponent), right shift result and set exponent to zero. (8) Round result. (9) If rounded mantissa is greater than or equal to 2, right shift mantissa and add 1 to exponent.” teaches multiplying the mantissa (adjusting the mantissa of the neural network weight) before zeroing or incrementing the exponent (adjusting the exponent). )

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Hemmert’s system of fast, efficient floating-point adders and multipliers into Lai’s method of deep convolutional neural network inference with floating-point weights as modified by Chen and Sze with a motivation to “[optimize] designs for floating-point add and multiply… ” (Hemmert, page 2).
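The ordering relied on from Hemmert, page 14 (mantissa operation first, exponent correction afterwards) can be sketched as follows. This is a simplified illustration under stated assumptions: both operands are normalized with mantissas in [1, 2), the exponents are stored with an assumed bias of 127, and sign handling, rounding, and denormal handling are omitted. The name `fp_multiply` is hypothetical.

```python
BIAS = 127  # assumed exponent bias (single precision)


def fp_multiply(mant_a: float, exp_a: int, mant_b: float, exp_b: int):
    """Multiply two normalized values given as (mantissa in [1, 2), biased exponent).

    Simplification: no sign, no rounding, no denormal handling.
    """
    # Combine the biased exponents (quoted step 2).
    exp = exp_a + exp_b - BIAS
    # Perform the mantissa multiplication (quoted step 6).
    mant = mant_a * mant_b
    # If the mantissa reached 2 or more, shift it right and
    # add 1 to the exponent (quoted step 9).
    if mant >= 2.0:
        mant /= 2.0
        exp += 1
    return mant, exp


# 3.0 = 1.5 * 2**1 and 1.5 = 1.5 * 2**0; the product 4.5 = 1.125 * 2**2.
product = fp_multiply(1.5, 128, 1.5, 127)
```

The final exponent is only known after the mantissa product has been inspected, which is the mantissa-then-exponent order the rejection relies on.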
Chen further teaches: 
and the adjustment of the mantissa part proceeds orderly from a current adjustment group in the multiple mantissa signed digit groups toward the most significant group of the multiple mantissa signed digit groups. (Fig. 2 teaches that the adjustment of the mantissa proceeds from updating unit focused on b0, b1, and b2 (current adjustment group) toward the most significant group; Section 2A: “Suppose that a 12-bit GSPT number is partitioned into three groups. Then, a GSPT number can be represented as 

[Image: media_image2.png (Chen, Section 2A: example 12-bit GSPT number partitioned into three groups)]

where each group is marked by an underline and the signed digit ‘-1’ is represented by $\bar{1}$.” teaches that the mantissa groups are comprised of signed digit groups)
Lai, Chen, Sze, and Hemmert are analogous art because they are directed to reducing computational complexity involving numeric systems. 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Chen’s GSPT number system into Lai’s method of deep convolutional neural network inference with floating-point weights as modified by Sze 

As per claim 6, the combination of Lai, Chen, Sze, and Hemmert as shown above teaches The method of claim 5, 
Sze further teaches: 
wherein adjusting the neural network weight before adjustment according to the weight adjustment value comprises: (Page 2298: “When training a network, the weights ($w_{ij}$) are usually updated using a hill-climbing optimization process called gradient descent. A multiple of the gradient of the loss relative to each weight, which is the partial derivative of the loss with respect to the weight, is used to update the weight…” teaches updating the neural network weight according to the gradient (weight adjustment value) to generate a weight after adjustment; “Note that this gradient indicates how the weights should change in order to reduce the loss.” teaches that the gradient is a weight adjustment value because it indicates how the weights should change)
Lai, Chen, Sze, and Hemmert are analogous art because they are directed to reducing computational complexity involving numeric systems. 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Sze’s method for efficient processing of deep neural networks into Lai’s method of deep convolutional neural network inference with floating-point weights as modified by Chen and Hemmert with a motivation to “[provide] techniques that enable efficient processing of DNNs to improve energy efficiency and throughput without sacrificing application accuracy or increasing hardware cost” (Sze, page 2295).

Chen further teaches: 
(Section 2B: “As mentioned before, the updating direction is more important than the magnitude of the update. Therefore, only the direction of the term, e[n] * x[n-k], is used for the coefficient updating in the GSPT LMS algorithm.” and 

[Image: media_image6.png (Chen, Section 2B: coefficient update rule of the GSPT LMS algorithm)]

“When e[n] * x[n-k] is positive (zero or negative), we increase (freeze or decrease) the linear filter coefficient wk[n]. The updating unit having four digits is shown in Fig. 1(a) and an illustration of the updating operation is given in Fig. 1(b).” teaches determining the update direction (current adjustment group) according to the adjustment value)
generating at least one carry signal or one borrow signal according to the weight adjustment value and transmitting the at least one carry signal or one borrow signal to the current adjustment group; (Section 2B: “When a positive trigger (increase) signal, carryin, is received, b3b2b1b0 of this updating unit ‘shifts up’ to increase its value. Likewise, when a negative trigger (decrease) signal, borrowin, is received, b3b2b1b0 of this updating unit ‘shifts down’.” teaches generating a carry or borrow signal and transmitting the signal to b3b2b1b0 (the current adjustment group); 

[Image: media_image6.png (Chen, Section 2B: coefficient update rule of the GSPT LMS algorithm)]

teaches generating a carry signal if e[n] * x[n-k] is greater than 0 or generating a borrow signal if e[n] * x[n-k] is less than 0 (the carry or borrow signal is determined according to the adjustment value))
(Fig. 2 and Section 2B: “If b3b2b1b0 has been already at the upper (lower) limit 1000 ($\bar{1}000$) and a positive (negative) trigger signal is received, the output signal carryout (borrowout) is sent to the next more significant unit and b3b2b1b0 is reset to 0000. Fig. 2 shows the coefficient updating block implemented by cascading four updating units in a 12-bit coefficient example.” teaches determining if the current adjustment group (which is not the most significant group) has reached a maximum or minimum limit and cannot be adjusted according to a carry or borrow signal)
when the current adjustment group is determined the most significant group of the multiple mantissa signed digit groups, determining whether before adjustment the most significant group has reached the maximum limit or the minimum limit and cannot be adjusted according to the carry signal or the borrow signal. (Section 2B: “If b3b2b1b0 has been already at the upper (lower) limit 1000 ($\bar{1}000$) and a positive (negative) trigger signal is received, the output signal carryout (borrowout) is sent to the next more significant unit and b3b2b1b0 is reset to 0000. Fig. 2 shows the coefficient updating block implemented by cascading four updating units in a 12-bit coefficient example.” teaches determining if the current adjustment group has reached a maximum or minimum limit and cannot be adjusted according to a carry or borrow signal; Fig 2 teaches determining that the most significant group has reached the maximum or minimum limit and must undergo carryout or borrowout)
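The cascaded carry and borrow behavior quoted above from Chen, Section 2B can be sketched in code. The sketch rests on several assumptions that are not stated in the quoted text: each four-digit group is modeled as an integer level from -4 (digit pattern $\bar{1}000$) through 0 (pattern 0000) to +4 (pattern 1000), a "shift up" or "shift down" moves the level by one, and a most significant group that is already at its limit is left unchanged and flagged. The names `trigger_from` and `update_groups` are hypothetical.

```python
from typing import List, Tuple

LIMIT = 4  # assumed: levels -4..+4 stand for patterns -1000 .. 0000 .. 1000


def trigger_from(error: float, sample: float) -> int:
    """Direction of e[n] * x[n-k]: +1 (carry), -1 (borrow), or 0 (freeze)."""
    product = error * sample
    return (product > 0) - (product < 0)


def update_groups(levels: List[int], trigger: int) -> Tuple[List[int], bool]:
    """Apply one carry (+1) or borrow (-1) starting at the least significant group.

    levels[0] is the least significant group. Returns (new_levels, overflowed).
    """
    new_levels = list(levels)
    if trigger == 0:
        return new_levels, False
    for i in range(len(new_levels)):
        at_limit = new_levels[i] == trigger * LIMIT
        if not at_limit:
            new_levels[i] += trigger   # shift this group up or down by one step
            return new_levels, False
        if i == len(new_levels) - 1:
            # Most significant group is at its limit and cannot be adjusted.
            # Assumed policy: leave every group unchanged and report overflow.
            return list(levels), True
        new_levels[i] = 0              # reset to 0000, pass carry/borrow onward
    return new_levels, False


carried, _ = update_groups([4, 0, 0], trigger_from(0.5, 2.0))
```

In this model the check on the most significant group corresponds to the determination recited in the limitation; what the claimed method does after that determination (for example, adjusting the exponent) is not modeled here.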
Lai, Chen, Sze, and Hemmert are analogous art because they are directed to reducing computational complexity involving numeric systems. 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Chen’s GSPT number system into Lai’s method of deep convolutional neural network inference with floating-point weights as modified by Sze 

As per claim 7, the combination of Lai, Chen, Sze, and Hemmert as shown above teaches The method of claim 6,
Chen further teaches:
wherein when the current adjustment group is determined not the most significant group of the multiple mantissa signed digit groups, determining whether before adjustment the current adjustment group has reached the maximum limit or the minimum limit and cannot be adjusted according to the carry signal or the borrow signal comprises: when determined before adjustment the current adjustment group has reached the maximum limit or the minimum limit and cannot be adjusted, setting the current adjustment group to the opposite limit (the minimum limit or the maximum limit); (Fig. 2 and Section 2B: “If b3b2b1b0 has been already at the upper (lower) limit 1000 ($\bar{1}000$) and a positive (negative) trigger signal is received, the output signal carryout (borrowout) is sent to the next more significant unit and b3b2b1b0 is reset to 0000.” teaches setting the current adjustment group to 0000 (the opposite limit) when the current adjustment group has reached the maximum limit and cannot be adjusted according to the carry signal.)
setting the next group on the left as the current adjustment group; (Fig. 2 teaches setting b5b4b3 (the group on the left of group b2b1b0) as the current adjustment group)
and transmitting the carry signal or the borrow signal to the new current adjustment group, (Fig. 2 teaches that the carry or borrow signal is transmitted to b5b4b3 (new current adjustment group))
wherein the next group on the left is closer to the most significant group; (Fig. 2 teaches that b5b4b3 is closer to b11b10b9 (the most significant group) than b2b1b0)
and when determined before adjustment the current adjustment group has not reached the maximum limit or the minimum limit and can be adjusted, adjusting the current adjustment group according to the carry signal or the borrow signal. (Section 2B: “When a positive trigger (increase) signal, carryin, is received… updating unit ‘shifts up’ to increase its value. Likewise, when a negative trigger (decrease) signal, borrowin, is received… updating unit ‘shifts down.’” teaches adjusting the group according to the carry or borrow signal if the adjustment group is not at the maximum or minimum limit)
Lai, Chen, Sze, and Hemmert are analogous art because they are directed to reducing computational complexity involving numeric systems. 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Chen’s GSPT number system into Lai’s method of deep convolutional neural network inference with floating-point weights as modified by Sze and Hemmert with a motivation to “[reduce] the complexity of the linear filter… since fewer digits need processing in data-coefficient multiplication” (Chen, page 816).
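For illustration only, the carry/borrow cascade described in the passages quoted from Chen can be sketched as follows. This is a simplified model under stated assumptions, not Chen's circuit: each updating unit is represented by a hypothetical integer state in which 0 stands for 0000 and ±4 stand for the limits 1000 and $\bar{1}000$, and the names `DIGITS`, `unit_value`, and `apply_trigger` are introduced here and do not appear in the reference.

```python
from typing import List

DIGITS = 4  # digits per updating unit (b3b2b1b0), as in the quoted passage


def unit_value(state: int) -> int:
    """Signed power-of-two value of one unit (assumed encoding).

    state 0 is 0000; state +k has a single 1 in digit k-1; state -k has a
    single negative 1 in digit k-1.
    """
    if state == 0:
        return 0
    sign = 1 if state > 0 else -1
    return sign * (1 << (abs(state) - 1))


def apply_trigger(units: List[int], trigger: int) -> bool:
    """Apply a positive (+1) or negative (-1) trigger to the cascade.

    units[0] is the least significant unit. A unit already at the limit in
    the trigger direction is reset to 0000 and the carryout/borrowout is
    passed to the next more significant unit. Returns True when the signal
    leaves the most significant unit without being absorbed.
    """
    for i in range(len(units)):
        at_limit = units[i] == trigger * DIGITS
        if not at_limit:
            units[i] += trigger  # "shift up" or "shift down"
            return False
        units[i] = 0  # reset to 0000 and pass the signal leftward
    return True


cascade = [4, 0, 0, 0]            # least significant unit is at 1000
overflow = apply_trigger(cascade, +1)
print(cascade, overflow)
```

The Boolean return value corresponds to the case in which the most significant group is itself at its limit, which is the situation the claim 8 mapping addresses.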
As per claim 8, the combination of Lai, Chen, Sze, and Hemmert as shown above teaches The method of claim 6, 
Chen further teaches: 
wherein when the current adjustment group is determined the most significant group of the multiple mantissa signed digit groups, determining whether before adjustment the most significant group has reached the maximum limit or the minimum limit and cannot be adjusted according to the carry signal or the borrow signal comprises:
(Section 2B: “If b3b2b1b0 has been already at the upper (lower) limit 1000 ($\bar{1}000$) and a positive (negative) trigger signal is received, the output signal carryout (borrowout) is sent to the next more significant unit and b3b2b1b0 is reset to 0000. Fig. 2 shows the coefficient updating block implemented by cascading four updating units in a 12-bit coefficient example.” teaches determining if the current adjustment group has reached a maximum or minimum limit and cannot be adjusted according to a carry or borrow signal; Fig. 2 teaches determining that the most significant group has reached the maximum or minimum limit and must undergo carryout or borrowout)
adjusting the most significant group according to the carry signal or the borrow signal; (Section 2B: “When a positive trigger (increase) signal, carryin, is received… updating unit ‘shifts up’ to increase its value. Likewise, when a negative trigger (decrease) signal, borrowin, is received… updating unit ‘shifts down.’” teaches adjusting the group according to the carry or borrow signal)
Lai, Chen, Sze, and Hemmert are analogous art because they are directed to reducing computational complexity involving numeric systems. 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Chen’s GSPT number system into Lai’s method of deep convolutional neural network inference with floating-point weights as modified by Sze 
Hemmert further teaches: 
when determined before adjustment the current adjustment group has reached the maximum limit or the minimum limit and cannot be adjusted, increasing the exponent digit group and moving all the mantissa signed digit groups toward right by at least one digit; (Page 10: “If resulting mantissa is greater than or equal to 2, right shift by 1 and add 1 to exponent.” teaches increasing the exponent by one and shifting the mantissa right by 1 (moving all the mantissa signed digit groups toward right by one digit) if the current group has reached 2 (the maximum limit); Page 7, Fig. 1 teaches that the exponent is composed of a group of 11 bits (exponent digit group), the mantissa is composed of groups of 52 bits with an additional sign bit (mantissa signed digit groups))
and determining whether after adjustment the most significant group is zero to determine whether to adjust the multiple mantissa signed digit groups. (Page 10: “Perform leading zero detect on mantissa and normalize mantissa appropriately.” teaches normalizing the mantissa (adjust mantissa signed digit groups) if leading zeroes are detected (most significant group is 0); Page 7, Fig. 1 teaches that the mantissa is composed of groups of 52 bits with an additional sign bit (mantissa signed digit groups).)
Lai, Chen, Sze, and Hemmert are analogous art because they are directed to reducing computational complexity involving numeric systems. 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Hemmert’s system of fast, efficient .
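For illustration only, the two normalization steps quoted from Hemmert (right shift and exponent increment when the mantissa is greater than or equal to 2, and leading-zero normalization) can be sketched as follows. This is a minimal sketch under assumptions: the mantissa is modeled as a non-negative integer scaled by 2^52, so that the value 1.0 is `1 << 52`, and rounding, sign handling, and exponent overflow are ignored. The names `FRACTION_BITS` and `normalize` are hypothetical and are not taken from the reference.

```python
from typing import Tuple

FRACTION_BITS = 52  # mantissa width taken from the cited Fig. 1


def normalize(mantissa: int, exponent: int) -> Tuple[int, int]:
    """Return (mantissa, exponent) with the mantissa in [1.0, 2.0).

    The mantissa is a fixed-point integer scaled by 2**FRACTION_BITS
    (an assumption of this sketch).
    """
    one = 1 << FRACTION_BITS
    if mantissa == 0:
        return 0, exponent  # nothing to normalize
    if mantissa >= 2 * one:
        # Mantissa >= 2: right shift by 1 and add 1 to the exponent.
        return mantissa >> 1, exponent + 1
    # Leading-zero case: shift left until the leading 1 is in place.
    while mantissa < one:
        mantissa <<= 1
        exponent -= 1
    return mantissa, exponent


print(normalize(3 << FRACTION_BITS, 0))
print(normalize(1 << (FRACTION_BITS - 2), 5))
```

The single right shift follows the quoted text; an input of 4.0 or more would need further shifts, which this sketch does not attempt.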


Allowable Subject Matter
Claim 9 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) set forth in this Office Action and to include all of the limitations of the base claim and any intervening claims. 
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:
Sumbul et al. (US 10,713,558 B2) discloses training a neural network with floating-point values and determining a weight update value to update the weights of the neural network. 
Koster et al. (Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks) discloses efficient training of neural networks using quantized floating-point values. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHOUN J ABRAHAM whose telephone number is (571)272-8144.  The examiner can normally be reached on Mon - Fri 08:00-16:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/S.J.A./               Examiner, Art Unit 2125   

/KAMRAN AFSHAR/               Supervisory Patent Examiner, Art Unit 2125