DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is responsive to the original application filed on 3/22/2018, the Remarks and Amendments filed on 7/12/2021, and the Terminal Disclaimer filed on 7/11/2021.  Acknowledgement is made of the priority claimed to Chinese Application No. CN 201810016819.5 filed on 1/5/2018.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-2, 4, 10-13, and 15 are rejected under 35 U.S.C. § 103 as being obvious over Falcon et al. (US 20160026912 A1, hereinafter “Falcon”) in view of Ma (US 20190392299 A1, hereinafter “Ma”) and Lin et al. (Lin et al., “NEURAL NETWORKS WITH FEW MULTIPLICATIONS”, Feb. 26, 2016, ICLR 2016, pp. 1-9, hereinafter “Lin”).

Regarding claim 1, Falcon discloses [a] micro-processor circuit, adapted to perform a neural network operation, and comprising: ([0083]; “FIG. 10 illustrates a more detailed embodiment for implementing an example neural network, in accordance with embodiments of the present disclosure. In one embodiment, example CNN [convolutional neural network] 900 using a weight-shifting mechanism for CNNs may be implemented using a processing device 1000.”; and [0128]; “a processing system may include any system that has a processor, such as, for example, a digital signal processor (DSP), a microcontroller, an application specific integrated circuit (ASIC), or a microprocessor.”)
a parameter generation circuit, receiving in parallel a plurality of input parameters and a plurality of weight parameters of the neural network operation, and ([0085, 0088] and FIG. 11; “Execution cluster 1114 [a parameter generation circuit] may include a number of calculation circuits 1118, distribution logics 1116, 1122, and delay elements 1120. Distribution logic 1116 may include multiplexers to transmit $x_i$ [a plurality of input parameters] to inputs of different calculation circuits 1118. Besides input signal $x_i$, distribution logic 1116 may also assign weight coefficients $w_i$, 1, …, N [a plurality of weight parameters] to different calculation circuits. … each of calculation circuits 1118 may accept sixteen input values in parallel to achieve modular and efficient computation.” Because FIG. 11 shows calculation circuits accepting inputs and weights together, it follows that weights may also be accepted in parallel.  Note that, in view of the 112f interpretation and 112a and 112b rejection above, the parameter generation module is a generic hardware processor such as a calculation circuit)
generating in parallel a plurality of sub-output parameters according to the input parameters and the weight parameters; ([0115-0126]; “FIG. 14 is a flowchart of an example embodiment of a method 1400 for weight-shifting, in accordance with embodiments of the present disclosure. Method 1400 may illustrate operations performed by, for example, CNN 900, processing device 1000, or calculation circuit 1200. … At 1440, the scaled weights may be used to determine suitable calculations, such as convolution or dot-product, on the input [generating a plurality of sub-output parameters according to the input parameters and the weight parameters]. The previous results may also be used, if available. … Furthermore, method 1400 may be performed fully or in part in parallel with each other.”);
a compute module, coupled to the parameter generation module, receiving in parallel the sub-output parameters, and summing the sub-output parameters to generate a summed parameter ([0089] and [0099]; “FIG. 12 illustrates an example embodiment of a calculation circuit 1200 that may be used to implement fully or in part calculation circuit 1118. … FIG. 13A is a more detailed illustration of MAC unit 1210. Given N input values from input latches 1302, which in turn may come from input data 1202 and weights 1204, elements of input data 1202 and weights 1204 are multiplied pair-wise at 1304 and then added together in accumulators 1306.”; and [0081]; “the multiplication and sum operations may be implemented in parallel on multi-core CPU or GPU,”).
Falcon fails to explicitly disclose a compare logic circuit, coupled to the compute module, receiving the summed parameter, and performing a comparison operation based on the summed parameter to generate an output parameter of the neural network operation, wherein the parameter generation circuit performs a bit encoding to encode the input parameters and the weight parameters according to a value range of the weight parameters to generate a plurality of encoded input parameters and a plurality of encoded weight parameters, and the parameter generation circuit generates the sub-output parameters according to the encoded input parameters and the encoded weight parameters, wherein the different value ranges of the weight parameters correspond to different bit encoding.
Ma discloses a compare logic circuit, coupled to the compute module, receiving the summed parameter, and performing a comparison operation based on the summed parameter to generate an output parameter of the neural network operation ([0082]; “In an embodiment, the sign determination of the activation function may be performed by a majority voter logic block which determines whether a majority of inputs to the logic block are logic 1s (corresponding to values of +1) or logic 0s (corresponding to values of -1). In a particular embodiment, a majority voter logic block may be constructed from an adder tree”, which discloses the compare logic circuit (majority voter logic block) that is coupled to the compute module; and [0088]-[0089]; “FIG. 15 illustrates an example arrangement of computational logic blocks 1100 to perform an activation function operation of a binary neural network in accordance with certain embodiments . . . The results of the bitwise multiplications are provided to a majority voter logic block 1502 that determines the sign of the summation of these results”, the output parameter being a sign of the summation of the results; and Figures 11, 12, and 15; the figures show the compare logic or majority voter block that is coupled to the compute modules that do addition operations; and [0024]; “Although the drawings depict particular computer systems, the concepts of various embodiments are applicable to any suitable integrated circuits and other logic devices”).

Lin discloses wherein the parameter generation circuit performs a bit encoding to encode the input parameters and the weight parameters according to a value range of the weight parameters to generate a plurality of encoded input parameters and a plurality of encoded weight parameters, (Abstract; “First we stochastically binarize weights to convert multiplications involved in computing hidden states to sign changes. Second, while back-propagating error derivatives, in addition to binarizing the weights, we quantize the representations at each layer to convert the remaining multiplications into binary shifts”, the bit encoding being the binarization operation; and Page 2, §3.1; “To ensure that P(Wij = 1) lies in a reasonable range, values in w¯ are forced to be a real value in the interval [-1, 1]”, which discloses that the binary quantization or bit encoding encodes the input and weight parameters according to a range of weight parameters [-1, 1]; and Page 2, §3.2; “We split the interval of [-1, 1], within which the full precision weight value w¯ij lies, into two sub-intervals: [−1, 0] and (0, 1].”; and Page 3, §4; “However, bounding the values is essential for quantization because we need to supply a fixed number of bits for each sampled value, and if that value varies too much, we will need too many bits for the exponent. This, in turn, will result in the need for more bits to store the sampled value and unnecessarily increase the required amount of computation. While h′(Wx + b) is not a good choice for quantization, x is a better choice, because it is the hidden representation at each layer, and we know roughly the distribution of each layer’s activation. Our approach is therefore to eliminate multiplications in Eq. 4 by quantizing each entry in x to an integer power of 2. That way the outer product in Eq. 4 becomes a series of bit shifts. Experimentally, we find that allowing a maximum of 3 to 4 bits of shift is sufficient to make the network work well.” (emphasis added), where “x” is the quantized or encoded input parameter; and Page 4, Algorithm 1; the algorithm discloses, in line 4, binarizing “w” which is an encoding of the weight parameters and encoding the input of each layer of the neural network during the backward propagation phase)
and the parameter generation circuit generates the sub-output parameters according to the encoded input parameters and the encoded weight parameters, (Page 4, Algorithm 1; the algorithm discloses the generation of the sub-output parameters according to the encoded input parameters and the encoded weight parameters as the algorithm returns an output from the forward and backward propagation procedures)
wherein the different value ranges of the weight parameters correspond to different bit encoding (Page 2, §3.1; “To ensure that P(Wij = 1) lies in a reasonable range, values in w¯ are forced to be a real value in the interval [-1, 1]”, which discloses that the binary quantization or bit encoding encodes the input and weight parameters according to a range of weight parameters [-1, 1], and this is a binary bit encoding; and Page 2, §3.2; “We split the interval of [-1, 1], within which the full precision weight value w¯ij lies, into two sub-intervals: [−1, 0] and (0, 1].”, which discloses a ternary quantization/bit encoding/ternary connect which is a different bit encoding with different ranges of weight parameters).
Falcon, Ma, and Lin are analogous because all are concerned with neural network computations.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in neural networks to combine the bit encoding of Lin with the parameter generation circuit of Falcon to yield the predictable result of wherein the parameter generation circuit performs a bit encoding to encode the input parameters and the weight parameters according to a value range of the weight parameters to generate a plurality of encoded input parameters and a plurality of encoded weight parameters, and the parameter generation circuit generates the sub-output parameters according to the encoded input parameters and the encoded weight parameters, wherein the different value ranges of the weight parameters correspond to different bit encoding.  The motivation for doing so would be to reduce computation for a neural network operation (Lin; Abstract and §1, ¶2).
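For illustration only, the encodings relied upon from Lin may be sketched as follows. This is a minimal sketch under stated assumptions, not Lin's algorithm: the deterministic sign and threshold rules stand in for Lin's stochastic sampling, the 0.5 threshold is arbitrary, and every function name is hypothetical and appears neither in Lin nor in the claims.

```python
import math

def binarize(w):
    """Binary encoding: map a weight in [-1, 1] to -1 or +1.
    Assumption: deterministic sign rule instead of stochastic sampling."""
    if not -1.0 <= w <= 1.0:
        raise ValueError("weight must lie in [-1, 1]")
    return 1 if w > 0 else -1

def ternarize(w, threshold=0.5):
    """Ternary encoding: map a weight in [-1, 1] to -1, 0 or +1.
    Assumption: a fixed, hypothetical threshold decides between 0 and +/-1."""
    if not -1.0 <= w <= 1.0:
        raise ValueError("weight must lie in [-1, 1]")
    if w > threshold:
        return 1
    if w < -threshold:
        return -1
    return 0

def quantize_to_shift(x, max_shift=4):
    """Quantize x to sign * 2**k, with k rounded in the log domain and
    clipped to [-max_shift, max_shift]. Returns (sign, k)."""
    if x == 0:
        return 0, 0
    k = round(math.log2(abs(x)))
    k = max(-max_shift, min(max_shift, k))
    return (1 if x > 0 else -1), k

def shift_multiply(value, sign, k):
    """Multiply an integer by sign * 2**k using only shifts and a sign change."""
    shifted = value << k if k >= 0 else value >> -k
    return sign * shifted
```

Under these assumptions, a quantized input of 0.3 becomes (1, -2), so multiplying the integer 64 by it reduces to a right shift by two bits.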

Regarding claim 12, Falcon discloses [a] method of performing a neural network operation, adapted to a micro-processor circuit comprising a parameter generation circuit, a compute module and a compare logic circuit, the method of performing the neural network operation comprises: ([0083]; “FIG. 10 illustrates a more detailed embodiment for implementing an example neural network, in accordance with embodiments of the present disclosure. In one embodiment, example CNN [convolutional neural network] 900 using a weight-shifting mechanism for CNNs may be implemented using a processing device 1000.”; and [0128]; “a processing system may include any system that has a processor, such as, for example, a digital signal processor (DSP), a microcontroller, an application specific integrated circuit (ASIC), or a microprocessor.”; and FIG. 11; “Execution cluster 1114 [a parameter generation module] may include a number of calculation circuits 1118”, the calculation circuits being the parameter generation and compute modules)
receiving in parallel a plurality of input parameters and a plurality of weight parameters of the neural network operation by the parameter generation circuit, and ([0085, 0088] and FIG. 11; “Execution cluster 1114 [a parameter generation module] may include a number of calculation circuits 1118, distribution logics 1116, 1122, and delay elements 1120. Distribution logic 1116 may include multiplexers to transmit $x_i$ [a plurality of input parameters] to inputs of different calculation circuits 1118. Besides input signal $x_i$, distribution logic 1116 may also assign weight coefficients $w_i$, 1, …, N [a plurality of weight parameters] to different calculation circuits. … each of calculation circuits 1118 may accept sixteen input values in parallel to achieve modular and efficient computation.” Because FIG. 11 shows calculation circuits accepting inputs and weights together, it follows that weights may also be accepted in parallel)
generating in parallel a plurality of sub-output parameters according to the input parameters and the weight parameters; ([0115-0126]; “FIG. 14 is a flowchart of an example embodiment of a method 1400 for weight-shifting, in accordance with embodiments of the present disclosure. Method 1400 may illustrate operations performed by, for example, CNN 900, processing device 1000, or calculation circuit 1200. … At 1440, the scaled weights may be used to determine suitable calculations, such as convolution or dot-product, on the input [generating a plurality of sub-output parameters according to the input parameters and the weight parameters]. The previous results may also be used, if available. … Furthermore, method 1400 may be performed fully or in part in parallel with each other.”)
receiving in parallel the sub-output parameters by the compute module, and summing the sub-output parameters to generate a summed parameter ([0089] and [0099]; “FIG. 12 illustrates an example embodiment of a calculation circuit 1200 that may be used to implement fully or in part calculation circuit 1118. … FIG. 13A is a more detailed illustration of MAC unit 1210. Given N input values from input latches 1302, which in turn may come from input data 1202 and weights 1204, elements of input data 1202 and weights 1204 are multiplied pair-wise at 1304 and then added together in accumulators 1306.”; and [0081]; “the multiplication and sum operations may be implemented in parallel on multi-core CPU or GPU,”).
Falcon fails to explicitly disclose compare logic and receiving the summed parameter by the compare logic circuit, and performing a comparison operation based on the summed parameter to generate an output parameter of the neural network operation, wherein the step of generating in parallel the sub-output parameters according to the input parameters and the weight parameters comprises: performing a bit encoding to encode the input parameters and the weight parameters by the parameter generation circuit according to a value range of the weight parameters, so as to generate a plurality of encoded input parameters and a plurality of encoded weight parameters; and generating the sub-output parameters by the parameter generation circuit according to the encoded input parameters and the encoded weight parameters, wherein the different value ranges of the weight parameters correspond to different bit encoding.
Ma discloses compare logic and receiving the summed parameter by the compare logic, and performing a comparison operation based on the summed parameter to generate an output parameter of the neural network operation ([0082]; “In an embodiment, the sign determination of the activation function may be performed by a majority voter logic block which determines whether a majority of inputs to the logic block are logic 1s (corresponding to values of +1) or logic 0s (corresponding to values of -1). In a particular embodiment, a majority voter logic block may be constructed from an adder tree”, which discloses the compare logic (majority voter logic block) that is coupled to the compute module; and [0088]-[0089]; “FIG. 15 illustrates an example arrangement of computational logic blocks 1100 to perform an activation function operation of a binary neural network in accordance with certain embodiments . . . The results of the bitwise multiplications are provided to a majority voter logic block 1502 that determines the sign of the summation of these results”, the output parameter being a sign of the summation of the results; and Figures 11, 12, and 15; the figures show the compare logic or majority voter block that is coupled to the compute modules that do addition operations).
The motivation to combine Falcon and Ma is the same as discussed above with respect to claim 1.
Lin discloses wherein the step of generating in parallel the sub-output parameters according to the input parameters and the weight parameters comprises: performing a bit encoding to encode the input parameters and the weight parameters by the parameter generation circuit according to a value range of the weight parameters, so as to generate a plurality of encoded input parameters and a plurality of encoded weight parameters; and (Abstract; “First we stochastically binarize weights to convert multiplications involved in computing hidden states to sign changes. Second, while back-propagating error derivatives, in addition to binarizing the weights, we quantize the representations at each layer to convert the remaining multiplications into binary shifts”, the bit encoding being the binarization operation; and Page 2, §3.1; “To ensure that P(Wij = 1) lies in a reasonable range, values in w¯ are forced to be a real value in the interval [-1, 1]”, which discloses that the binary quantization or bit encoding encodes the input and weight parameters according to a range of weight parameters [-1, 1]; and Page 2, §3.2; “We split the interval of [-1, 1], within which the full precision weight value w¯ij lies, into two sub-intervals: [−1, 0] and (0, 1].”; and Page 3, §4; “However, bounding the values is essential for quantization because we need to supply a fixed number of bits for each sampled value, and if that value varies too much, we will need too many bits for the exponent. This, in turn, will result in the need for more bits to store the sampled value and unnecessarily increase the required amount of computation. While h′(Wx + b) is not a good choice for quantization, x is a better choice, because it is the hidden representation at each layer, and we know roughly the distribution of each layer’s activation. Our approach is therefore to eliminate multiplications in Eq. 4 by quantizing each entry in x to an integer power of 2. That way the outer product in Eq. 4 becomes a series of bit shifts. Experimentally, we find that allowing a maximum of 3 to 4 bits of shift is sufficient to make the network work well.” (emphasis added), where “x” is the quantized or encoded input parameter; and Page 4, Algorithm 1; the algorithm discloses, in line 4, binarizing “w” which is an encoding of the weight parameters and encoding the input of each layer of the neural network during the backward propagation phase)
generating the sub-output parameters by the parameter generation circuit according to the encoded input parameters and the encoded weight parameters, (Page 4, Algorithm 1;  the algorithm discloses the generation of the sub-output parameters according to the encoded input parameters and the encoded weight parameters as the algorithm returns an output from the forward and backward propagation procedures)
wherein the different value ranges of the weight parameters correspond to different bit encoding (Page 2, §3.1; “To ensure that P(Wij = 1) lies in a reasonable range, values in w¯ are forced to be a real value in the interval [-1, 1]”, which discloses that the binary quantization or bit encoding encodes the input and weight parameters according to a range of weight parameters [-1, 1], and this is a binary bit encoding; and Page 2, §3.2; “We split the interval of [-1, 1], within which the full precision weight value w¯ij lies, into two sub-intervals: [−1, 0] and (0, 1].”, which discloses a ternary quantization/bit encoding/ternary connect which is a different bit encoding with different ranges of weight parameters).
The motivation to combine Falcon, Ma, and Lin is the same as discussed above with respect to claim 1.

	Regarding claim 2, the rejection of claim 1 is incorporated but Falcon fails to explicitly disclose wherein the compare logic circuit compares the number of a first value type and the number of a second value type of the sub-output parameters according to the summed parameter to determine the output parameter.
Ma discloses wherein the compare logic circuit compares the number of a first value type and the number of a second value type of the sub-output parameters according to the summed parameter to determine the output parameter ([0082]; “In an embodiment, the sign determination of the activation function may be performed by a majority voter logic block which determines whether a majority of inputs to the logic block are logic 1s (corresponding to values of +1) or logic 0s (corresponding to values of -1). In a particular embodiment, a majority voter logic block may be constructed from an adder tree”, which discloses the compare logic (majority voter logic block) that compares the number of a first value type (+1) to a number of a second value type (-1) using majority voting; and [0088]-[0089]; “FIG. 15 illustrates an example arrangement of computational logic blocks 1100 to perform an activation function operation of a binary neural network in accordance with certain embodiments . . . The results of the bitwise multiplications are provided to a majority voter logic block 1502 that determines the sign of the summation of these results”, the output parameter being a sign of the summation of the results; and Figures 11, 12, and 15; the figures show the compare logic or majority voter block that is coupled to the compute modules that do addition operations).
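For illustration only, the majority-voter comparison cited from Ma may be sketched in software as follows. The sketch assumes the encoding quoted above (logic 1 for +1, logic 0 for -1), models the bitwise multiplication as an XNOR, and resolves ties to 0; the tie rule and all function names are hypothetical and are not taken from Ma.

```python
def xnor_products(inputs, weights):
    """Bitwise multiply of +/-1 values encoded as bits (1 -> +1, 0 -> -1).
    Under this encoding the product of two values is the XNOR of their bits."""
    return [1 - (a ^ b) for a, b in zip(inputs, weights)]

def majority_vote(bits):
    """Compare the count of logic 1s with the count of logic 0s.
    Returns 1 if 1s are the majority, else 0 (assumption: ties give 0)."""
    ones = sum(bits)            # summed parameter, as an adder tree would produce
    zeros = len(bits) - ones
    return 1 if ones > zeros else 0

def sign_of_sum(bits):
    """Reference check: decode bits to +/-1, sum them, and return the sign as a bit."""
    total = sum(1 if b else -1 for b in bits)
    return 1 if total > 0 else 0

products = xnor_products([1, 0, 1, 1, 0], [1, 1, 0, 1, 0])
output = majority_vote(products)
```

Here the count comparison and the sign of the decoded sum agree, which is the sense in which a comparison on the summed parameter yields the output parameter.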



Regarding claim 4, the rejection of claim 1 is incorporated and Falcon further discloses wherein if value ranges of the input parameters and the weight parameters respectively comprise two value types, the parameter generation circuit adopts a first encoding method to encode the input parameters and the weight parameters, and ([0094]; “In one embodiment, for a given layer, the maximum and minimum values [example of two value types] of weights 1204 may be determined. In another embodiment and based on such a determination, weights 1204 may be scaled up to meet a defined range. For example, if weights 1204 are given as positive and negative [example of two value types] fractions less than one, then weights 1204 may be scaled up to the range (-1, 1) [disclosing a first encoding method to encode the weight parameters].”)
the parameter generation circuit generates the sub-output parameters through a first look-up table or a first logic circuit according to the encoded input parameters and the encoded weight parameters ([0090]-[0094] and Figure 12; the figure and the paragraphs disclose generating the sub-output parameters through a first logic circuit (calculation circuit 1200), according to the encoded input and weight parameters).

Regarding claim 10, the rejection of claim 1 is incorporated and Falcon further discloses wherein the micro-processor circuit executes a micro-instruction to complete the neural network operation, ([0023]; “The following description describes weight-shifting mechanism for reconfigurable processing units within or in association with a processor, virtual processor, package, computer system, or other processing apparatus. In one embodiment, such a weight-shifting mechanism may be used in convolution neural networks (CNN).”; and [0053]; “in one embodiment, the decoder decodes a received instruction into one or more operations called ‘micro-instructions’ or ‘micro-operations’ (also called micro op or uops) that the machine may execute. In other embodiments, the decoder parses the instruction into an opcode and corresponding data and control fields that may be used by the micro-architecture to perform operations in accordance with one embodiment.”)
a source operand of the micro-instruction comprises the input parameters and the weight parameters, and ([0032]; “For example, in one embodiment, the bits in a 64-bit register may be organized as a source operand containing four separate 16-bit data elements, each of which represents a separate 16-bit value.”; and [0107]; “For example, processing device 1000 may include registers for storing weights or input values as well as multiplexers to route values to appropriate multiplication circuits.”)
a destination operand of the micro-instruction comprises the output parameter of the neural network operation ([0032]; “In one embodiment, a SIMD instruction specifies a single vector operation to be performed on two source vector operands to generate a destination vector operand (also referred to as a result vector operand) of the same or different size, with the same or different number of data elements, and in the same or different data element order.”; and [0123]; “The truncated and scaled results may be stored in memory, a register, or otherwise sent to another calculation circuit.”).

Regarding claim 11, the rejection of claim 1 is incorporated and Falcon further discloses wherein a bit width of each of the input parameters is equal to a bit width of each of the weight parameters, and (Figure 12; the figure discloses that a bit width of the input parameters is equal to a bit width of each of the weight parameters (8 bits = 8 bits or 1.7 format=1.7 format))
a bit width of the micro-processor circuit is greater than a sum of the bit widths of the input parameters and the weight parameters (Figure 12; the figure discloses that a bit width of the micro-processor circuit is greater than a sum of the bit widths of the input parameters and the weight parameters (19 bits>16 bits or 5.14 format>(1.7 format+1.7 format))).
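For illustration only, the bit-width comparison drawn from Figure 12 can be checked with simple fixed-point bookkeeping. This is a sketch under stated assumptions: formats are written as (integer bits, fraction bits), and the count of eight accumulated products is hypothetical, chosen only because it reproduces the 5.14 format; it is not a figure taken from Falcon.

```python
import math

def total_bits(fmt):
    """Total width of a fixed-point format given as (integer_bits, fraction_bits)."""
    return fmt[0] + fmt[1]

def product_format(a, b):
    """Format of a product of two fixed-point values: integer and fraction bits add."""
    return (a[0] + b[0], a[1] + b[1])

def accumulator_format(prod, n_terms):
    """Add enough integer bits to sum n_terms products without overflow."""
    growth = math.ceil(math.log2(n_terms))
    return (prod[0] + growth, prod[1])

input_fmt = (1, 7)     # 1.7 format, 8 bits
weight_fmt = (1, 7)    # 1.7 format, 8 bits
prod_fmt = product_format(input_fmt, weight_fmt)
acc_fmt = accumulator_format(prod_fmt, n_terms=8)   # hypothetical term count

operand_sum = total_bits(input_fmt) + total_bits(weight_fmt)
wider_than_operands = total_bits(acc_fmt) > operand_sum
```

With these assumptions the product occupies 2.14 format (16 bits) and the accumulator 5.14 format (19 bits), consistent with the 19 bits > 16 bits comparison stated above.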


	Regarding claim 13, the rejection of claim 12 is incorporated but Falcon fails to explicitly disclose comparing the number of a first value type and the number of a second value type of the sub-output parameters by the compare logic circuit according to the summed parameter, so as to determine the output parameter.
Ma discloses comparing the number of a first value type and the number of a second value type of the sub-output parameters by the compare logic circuit according to the summed parameter, so as to determine the output parameter ([0082]; “In an embodiment, the sign determination of the activation function may be performed by a majority voter logic block which determines whether a majority of inputs to the logic block are logic 1s (corresponding to values of +1) or logic 0s (corresponding to values of -1). In a particular embodiment, a majority voter logic block may be constructed from an adder tree”, which discloses the compare logic (majority voter logic block) that compares the number of a first value type (+1) to a number of a second value type (-1) using majority voting; and [0088]-[0089]; “FIG. 15 illustrates an example arrangement of computational logic blocks 1100 to perform an activation function operation of a binary neural network in accordance with certain embodiments . . . The results of the bitwise multiplications are provided to a majority voter logic block 1502 that determines the sign of the summation of these results”, the output parameter being a sign of the summation of the results; and Figures 11, 12, and 15; the figures show the compare logic or majority voter block that is coupled to the compute modules that do addition operations).
The motivation to combine Falcon and Ma is the same as discussed above with respect to claim 1.

Regarding claim 15, the rejection of claim 12 is incorporated and Falcon further discloses wherein if value ranges of the input parameters and the weight parameters respectively comprise two value types, the parameter generation circuit adopts a first encoding method to encode the input parameters and the weight parameters, and ([0094]; “In one embodiment, for a given layer, the maximum and minimum values [example of two value types] of weights 1204 may be determined. In another embodiment and based on such a determination, weights 1204 may be scaled up to meet a defined range. For example, if weights 1204 are given as positive and negative [example of two value types] fractions less than one, then weights 1204 may be scaled up to the range (-1, 1) [disclosing a first encoding method to encode the weight parameters].”)
the step of generating in parallel the sub-output parameters according to the input parameters and the weight parameters comprises: generating the sub-output parameters through a first look-up table or a first logic circuit by the parameter generation circuit according to the encoded input parameters and the encoded weight parameters ([0090]-[0094] and Figure 12; the figure and the paragraphs disclose generating the sub-output parameters through a first logic circuit (calculation circuit 1200), according to the encoded input and weight parameters).


Claims 5-9 and 16-20 are rejected under 35 U.S.C. § 103 as being obvious over Falcon in view of Ma and Lin and further in view of Zhu et al., (Zhu et al., “TRAINED TERNARY QUANTIZATION”, Feb. 23, 2017, ICLR 2017, pp. 1-10, hereinafter “Zhu”).

Regarding claim 5, the rejection of claim 1 is incorporated but Falcon fails to explicitly disclose wherein if value ranges of the input parameters and the weight parameters respectively comprise three value types, the parameter generation circuit adopts a second encoding method to encode the input parameters and the weight parameters, and the parameter generation circuit generates the sub-output parameters through a second look-up table or a second logic circuit according to the encoded input parameters and the encoded weight parameters.
Zhu discloses wherein if value ranges of the input parameters and the weight parameters respectively comprise three value types, the parameter generation circuit adopts a second encoding method to encode the input parameters and the weight parameters, and (Pages 3-4, Sections 4 and 4.1; equation (6) discloses that full-resolution weights fall into one of three categories, such that a value range of the weight parameters and the input parameters comprises three value types, and are translated into quantized ternary weights by adopting a second encoding method to encode the weight parameters)
the parameter generation circuit generates the sub-output parameters through a second look-up table or a second logic circuit according to the encoded input parameters and the encoded weight parameters (Abstract; “During inference, only ternary values (2-bit weights) and scaling factors are needed, therefore our models are nearly 16× smaller than full-precision models. Our ternary models can also be viewed as sparse binary weight networks, which can potentially be accelerated with custom circuit”, which discloses a second logic circuit to generate sub-output parameters; and Page 4, Equations 6-8 and Figure 1; the equations disclose generating the sub-output parameters in the form of weights through ternary quantization, and this is accomplished using a second logic circuit as disclosed as the “custom circuit” in the Abstract).
Falcon, Ma, Lin, and Zhu are analogous because all are concerned with configuring weights in neural network operations.  Before the effective filing date of the 

Regarding claim 6, the rejections of claims 1 and 5 are incorporated but Falcon fails to explicitly disclose wherein the compute module comprises: a first sub-compute module, configured to sum values of a first bit of each of the sub-output parameters to generate a first summed parameter; and a second sub-compute module, configured to sum values of a second bit of each of the sub-output parameters to generate a second summed parameter, wherein the compare logic circuit compares the first summed parameter and the second summed parameter to determine the output parameter.
Ma discloses wherein the compute module comprises: a first sub-compute module, configured to sum values of a first bit of each of the sub-output parameters to generate a first summed parameter; and (Figure 15 and [0090]-[0091]; the adder tree in the figure (1100G-1100M) comprises adders, which are sub-compute modules, that sum values of a first or second bit of each of the sub-output parameters to generate a summed parameter)
a second sub-compute module, configured to sum values of a second bit of each of the sub-output parameters to generate a second summed parameter, (Figure 15 and [0090]-[0091]; the adder tree in the figure (1100G-1100M) comprises adders, which are sub-compute modules, that sum values of a first or second bit of each of the sub-output parameters to generate a summed parameter)
wherein the compare logic circuit compares the first summed parameter and the second summed parameter to determine the output parameter ([0091]; “That is, if the sign of the result of the summation is positive, then more +1 values were supplied to the adder than -1 values, and the most significant bit will be a logical 1 (representing a positive sign). However, if the sign of the result of the summation is negative, then more -1 values were supplied to the adder than +1 values, and the most significant bit will be a logical 0 (representing a negative sign). Accordingly, the majority voter produces the majority bit of the input bits”, the output parameter being the most significant bit determined through majority voting).
Falcon and Ma are analogous because both are concerned with hardware neural network circuit operations.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in hardware neural networks to combine the first and second sub-compute modules of Ma with the circuit of Falcon to yield the predictable result of wherein the compute module comprises: a first sub-compute module, configured to sum values of a first bit of each of the sub-output parameters to generate a first summed parameter; and a second sub-compute module, configured to 

Regarding claim 7, the rejections of claims 1 and 5 are incorporated and Falcon further discloses one of the encoded input parameters and one of the encoded weight parameters ([0094]; “In one embodiment, for a given layer, the maximum and minimum values of weights 1204 may be determined. In another embodiment and based on such a determination, weights 1204 may be scaled up to meet a defined range. For example, if weights 1204 are given as positive and negative fractions less than one, then weights 1204 may be scaled up to the range (-1, 1).”; and [0090]; “In one embodiment, calculation circuit 1200 may include a 16-bit arithmetic left shifter 1240 to scale up inputs for computations of calculation circuit 1200. In another embodiment, calculation circuit 1200 may include a right shifter and truncate logic 1232 to scale down resulting calculations of calculation circuit 1200”, which discloses encoded input parameters in the form of scaled-up inputs).
Falcon fails to explicitly disclose wherein the second logic circuit comprises: a first sub-logic circuit, configured to generate a first bit of each of the sub-output parameters according to one of the encoded input parameters and one of the encoded weight parameters of the neural network operation; and a second sub-logic circuit, configured to generate a second bit of each of the sub-output parameters according to one of the encoded input parameters and one of the encoded weight parameters of the neural network operation.
Ma discloses a first sub-logic circuit, configured to generate a first bit of each of the sub-output parameters according to one of the encoded input parameters and one of the encoded weight parameters of the neural network operation; and (Figure 15; the figure discloses, under a broadest reasonable interpretation of the claim language, a first and second sub-logic circuit that generate a first and second bit of each of the sub-output parameters as an output of one of the XNOR gates in the figure)
a second sub-logic circuit, configured to generate a second bit of each of the sub-output parameters according to one of the encoded input parameters and one of the encoded weight parameters of the neural network operation (Figure 15; the figure discloses, under a broadest reasonable interpretation of the claim language, a first and second sub-logic circuit that generate a first and second bit of each of the sub-output parameters as an output of one of the XNOR gates in the figure).
Falcon and Ma are analogous because both are concerned with hardware neural network circuit operations.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in hardware neural networks to combine the first and second sub-logic circuits of Ma with the circuit and input and weight encoding of Falcon to yield the predictable result of wherein the second logic circuit comprises: a first sub-logic circuit, configured to generate a first bit of each of the sub-output parameters according to one of the encoded input parameters and one of the encoded weight parameters of the neural network operation; and a second sub-logic circuit, configured to generate a second bit of each of the sub-output parameters according to 

Regarding claim 8, the rejections of claims 1 and 5 are incorporated but Falcon fails to explicitly disclose wherein the compute module calculates a number of first bits having a first value in the sub-output parameters and a number of second bits having the first value in the sub-output parameters.
Ma discloses wherein the compute module calculates a number of first bits having a first value in the sub-output parameters and a number of second bits having the first value in the sub-output parameters ([0091]; “That is, if the sign of the result of the summation is positive, then more +1 values were supplied to the adder than -1 values, and the most significant bit will be a logical 1 (representing a positive sign). However, if the sign of the result of the summation is negative, then more -1 values were supplied to the adder than +1 values, and the most significant bit will be a logical 0 (representing a negative sign). Accordingly, the majority voter produces the majority bit of the input bits”, the sign of the summation being, under a broadest reasonable interpretation of the claim language, a result of calculating the number of first bits having a first value in the sub-output parameters and the number of second bits having the first value in the sub-output parameters).
Falcon and Ma are analogous because both are concerned with hardware neural network circuit operations.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in hardware neural networks to combine the 


Regarding claim 9, the rejections of claims 1 and 5 are incorporated but Falcon fails to explicitly disclose wherein the compare logic compares a number of first bits having a first value in the sub-output parameters and a number of second bits having the first value in the sub-output parameters to determine the output parameter.
Ma discloses wherein the compare logic compares a number of first bits having a first value in the sub-output parameters and a number of second bits having the first value in the sub-output parameters to determine the output parameter ([0091]; “That is, if the sign of the result of the summation is positive, then more +1 values were supplied to the adder than -1 values, and the most significant bit will be a logical 1 (representing a positive sign). However, if the sign of the result of the summation is negative, then more -1 values were supplied to the adder than +1 values, and the most significant bit will be a logical 0 (representing a negative sign). Accordingly, the majority voter produces the majority bit of the input bits”, the output parameter being the most significant bit determined through majority voting).




Regarding claim 16, the rejection of claim 12 is incorporated but Falcon fails to explicitly disclose wherein if value ranges of the input parameters and the weight parameters respectively comprise three value types, the parameter generation circuit adopts a second encoding method to encode the input parameters and the weight parameters, and the step of generating in parallel the sub-output parameters according to the input parameters and the weight parameters comprises: generating the sub-output parameters by the parameter generation circuit through a second look-up table or a second logic circuit according to the encoded input parameters and the encoded weight parameters.
Zhu discloses wherein if value ranges of the input parameters and the weight parameters respectively comprise three value types, the parameter generation circuit adopts a second encoding method to encode the input parameters and the weight parameters (Pages 3-4, Sections 4 and 4.1; equation (6) discloses that full-resolution weights fall into one of three categories, such that a value range of the weight parameters and the input parameters comprises three value types, and are translated into quantized ternary weights by adopting a second encoding method to encode the weight parameters)
generating the sub-output parameters by the parameter generation circuit through a second look-up table or a second logic circuit according to the encoded input parameters and the encoded weight parameters (Abstract; “During inference, only ternary values (2-bit weights) and scaling factors are needed, therefore our models are nearly 16× smaller than full-precision models. Our ternary models can also be viewed as sparse binary weight networks, which can potentially be accelerated with custom circuit”, which discloses a second logic circuit to generate sub-output parameters; and Page 4, Equations 6-8 and Figure 1; the equations disclose generating the sub-output parameters in the form of weights through ternary quantization, and this is accomplished using a second logic circuit as disclosed as the “custom circuit” in the Abstract).
The motivation to combine Falcon, Ma, Lin, and Zhu is the same as discussed above with respect to claim 5.
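For illustration only, and not as part of the grounds of rejection, the mapping of full-resolution weights onto three value types discussed above may be sketched as follows. This is a minimal sketch assuming a single symmetric threshold; the function names `ternarize` and `encode`, the parameter `delta`, and the particular 2-bit codes are hypothetical and are not taken from Zhu, Falcon, or the claims.

```python
# Hypothetical 2-bit codes for the three value types (an assumption for illustration).
TERNARY_CODE = {-1: 0b10, 0: 0b00, 1: 0b01}

def ternarize(weights, delta):
    """Map each full-resolution weight to -1, 0, or +1 using a threshold delta."""
    result = []
    for w in weights:
        if w > delta:
            result.append(1)    # clearly positive weight
        elif w < -delta:
            result.append(-1)   # clearly negative weight
        else:
            result.append(0)    # weight near zero
    return result

def encode(ternary_values):
    """Translate ternary values into the assumed 2-bit codes."""
    return [TERNARY_CODE[v] for v in ternary_values]

quantized = ternarize([0.8, -0.05, -0.6, 0.02], delta=0.1)
print(quantized)          # [1, 0, -1, 0]
print(encode(quantized))  # [1, 0, 2, 0]
```

Under these assumptions, each weight occupies two bits after encoding, which is consistent with the “2-bit weights” language quoted from the Abstract of Zhu.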

Regarding claim 17, the rejections of claims 12 and 16 are incorporated but Falcon fails to explicitly disclose wherein the compute module comprises a first sub-compute module and a second sub-compute module, and the step of summing the sub-output parameters to generate the summed parameter comprises: summing a value of a first bit of each of the sub-output parameters by the first sub-compute module to generate a first summed parameter; and summing a value of a second bit of each of the sub-output parameters by the second sub-compute module to generate a second summed parameter, wherein the step of performing the comparison operation based on the summed parameter to generate the output parameter of the neural network operation comprises: comparing the first summed parameter and the second summed parameter by the compare logic circuit to determine the output parameter.
Ma discloses wherein the compute module comprises a first sub-compute module and a second sub-compute module, and the step of summing the sub-output parameters to generate the summed parameter comprises: summing a value of a first bit of each of the sub-output parameters by the first sub-compute module to generate a first summed parameter (Figure 15 and [0090]-[0091]; the adder tree in the figure (1100G-1100M) comprises adders, which are sub-compute modules, that sum values of a first or second bit of each of the sub-output parameters to generate a summed parameter)
summing a value of a second bit of each of the sub-output parameters by the second sub-compute module to generate a second summed parameter (Figure 15 and [0090]-[0091]; the adder tree in the figure (1100G-1100M) comprises adders, which are sub-compute modules, that sum values of a first or second bit of each of the sub-output parameters to generate a summed parameter)
wherein the step of performing the comparison operation based on the summed parameter to generate the output parameter of the neural network operation comprises: comparing the first summed parameter and the second summed parameter by the compare logic circuit to determine the output parameter ([0091]; “That is, if the sign of the result of the summation is positive, then more +1 values were supplied to the adder than -1 values, and the most significant bit will be a logical 1 (representing a positive sign). However, if the sign of the result of the summation is negative, then more -1 values were supplied to the adder than +1 values, and the most significant bit will be a logical 0 (representing a negative sign). Accordingly, the majority voter produces the majority bit of the input bits”, the output parameter being the most significant bit determined through majority voting).
The motivation to combine Falcon and Ma is the same as discussed above with respect to claim 6.
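For illustration only, and not as part of the grounds of rejection, the majority-voting behavior quoted from Ma at [0091] may be sketched as follows. This is a minimal sketch assuming that a logical 1 encodes +1 and a logical 0 encodes -1; the function name `majority_bit` and the tie-breaking choice (a sum of zero yields 0) are assumptions and are not taken from Ma.

```python
def majority_bit(bits):
    """Return 1 if more +1 values than -1 values were supplied, else 0.

    bits: iterable of 0/1, where 1 encodes +1 and 0 encodes -1.
    A tie (sum of zero) returns 0; this tie rule is an assumption.
    """
    total = sum(1 if b else -1 for b in bits)  # adder-tree style summation
    return 1 if total > 0 else 0               # sign of the sum

def majority_by_counts(bits):
    """Equivalent view: compare the count of 1 bits with the count of 0 bits."""
    bits = list(bits)
    ones = sum(bits)
    zeros = len(bits) - ones
    return 1 if ones > zeros else 0

print(majority_bit([1, 1, 0, 1, 0]))        # 1
print(majority_by_counts([1, 1, 0, 1, 0]))  # 1
print(majority_bit([0, 0, 1]))              # 0
```

The second function shows that, under the stated encoding, taking the sign of the summation gives the same result as comparing two counts.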

Regarding claim 18, the rejections of claims 12 and 16 are incorporated and Falcon further discloses one of the encoded input parameters and one of the encoded weight parameters ([0094]; “In one embodiment, for a given layer, the maximum and minimum values of weights 1204 may be determined. In another embodiment and based on such a determination, weights 1204 may be scaled up to meet a defined range. For example, if weights 1204 are given as positive and negative fractions less than one, then weights 1204 may be scaled up to the range (-1, 1).”; and [0090]; “In one embodiment, calculation circuit 1200 may include a 16-bit arithmetic left shifter 1240 to scale up inputs for computations of calculation circuit 1200. In another embodiment, calculation circuit 1200 may include a right shifter and truncate logic 1232 to scale down resulting calculations of calculation circuit 1200”, which discloses encoded input parameters in the form of scaled-up inputs).
Falcon fails to explicitly disclose wherein the second logic circuit comprises a first sub-logic circuit and a second sub-logic circuit, and the step of generating the sub-output parameters through the second look-up table or the second logic circuit comprises: generating a first bit of each of the sub-output parameters by the first sub-logic circuit according to one of the encoded input parameters and one of the encoded weight parameters of the neural network operation; and generating a second bit of each of the sub-output parameters by the second sub-logic circuit according to one of the encoded input parameters and one of the encoded weight parameters of the neural network operation.
Ma discloses generating a first bit of each of the sub-output parameters by the first sub-logic circuit according to one of the encoded input parameters and one of the encoded weight parameters of the neural network operation (Figure 15; the figure discloses, under a broadest reasonable interpretation of the claim language, a first and second sub-logic circuit that generate a first and second bit of each of the sub-output parameters as an output of one of the XNOR gates in the figure)
generating a second bit of each of the sub-output parameters by the second sub-logic circuit according to one of the encoded input parameters and one of the encoded weight parameters of the neural network operation (Figure 15; the figure discloses, under a broadest reasonable interpretation of the claim language, a first and second sub-logic circuit that generate a first and second bit of each of the sub-output parameters as an output of one of the XNOR gates in the figure).
The motivation to combine Falcon and Ma is the same as discussed above with respect to claim 7.
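For illustration only, and not as part of the grounds of rejection, the role of an XNOR gate in generating one bit of a sub-output parameter may be sketched as follows. This is a minimal sketch assuming single-bit operands in which a logical 1 encodes +1 and a logical 0 encodes -1; the function names `xnor` and `decode` are hypothetical and are not taken from Ma or the claims.

```python
def xnor(a, b):
    """Single-bit XNOR: 1 when the two bits are equal, 0 otherwise."""
    return 1 - (a ^ b)

def decode(bit):
    """Map the assumed encoding back to a signed value: 1 -> +1, 0 -> -1."""
    return 2 * bit - 1

# Under the assumed encoding, XNOR of the encoded bits matches the product
# of the signed values they represent.
for a in (0, 1):
    for b in (0, 1):
        assert decode(xnor(a, b)) == decode(a) * decode(b)

print(xnor(1, 1), xnor(1, 0), xnor(0, 0))  # 1 0 1
```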

Regarding claim 19, the rejections of claims 12 and 16 are incorporated but Falcon fails to explicitly disclose wherein the compute module calculates a number of first bits having a first value in the sub-output parameters and a number of second bits having the first value in the sub-output parameters.
Ma discloses wherein the compute module calculates a number of first bits having a first value in the sub-output parameters and a number of second bits having the first value in the sub-output parameters ([0091]; “That is, if the sign of the result of the summation is positive, then more +1 values were supplied to the adder than -1 values, and the most significant bit will be a logical 1 (representing a positive sign). However, if the sign of the result of the summation is negative, then more -1 values were supplied to the adder than +1 values, and the most significant bit will be a logical 0 (representing a negative sign). Accordingly, the majority voter produces the majority bit of the input bits”, the sign of the summation being, under a broadest reasonable interpretation of the claim language, a result of calculating the number of first bits having a first value in the sub-output parameters and the number of second bits having the first value in the sub-output parameters).
The motivation to combine Falcon and Ma is the same as discussed above with respect to claim 8.


Regarding claim 20, the rejections of claims 12 and 16 are incorporated but Falcon fails to explicitly disclose wherein the compare logic compares a number of first bits having a first value in the sub-output parameters and a number of second bits having the first value in the sub-output parameters to determine the output parameter.
Ma discloses wherein the compare logic compares a number of first bits having a first value in the sub-output parameters and a number of second bits having the first value in the sub-output parameters to determine the output parameter ([0091]; “That is, if the sign of the result of the summation is positive, then more +1 values were supplied to the adder than -1 values, and the most significant bit will be a logical 1 (representing a positive sign). However, if the sign of the result of the summation is negative, then more -1 values were supplied to the adder than +1 values, and the most significant bit will be a logical 0 (representing a negative sign). Accordingly, the majority voter produces the majority bit of the input bits”, the output parameter being the most significant bit determined through majority voting).
The motivation to combine Falcon and Ma is the same as discussed above with respect to claim 6.

Response to Arguments

Applicant’s arguments and amendments, filed on 7/12/2021, with respect to the 35 USC § 112(a) rejection of claims 1-11 have been fully considered and are persuasive. The 35 USC § 112(a) rejection of claims 1-11 has been withdrawn.

Applicant’s arguments and amendments, filed on 7/12/2021, with respect to the 35 USC § 112(b) rejection of claims 1-11 and 19-20 have been fully considered and are persuasive. The 35 USC § 112(b) rejection of claims 1-11 and 19-20 has been withdrawn.

Applicant’s arguments and amendments, filed on 7/12/2021, as well as the Terminal Disclaimer, filed on 7/11/2021, with respect to the nonstatutory obviousness-type double patenting rejection of claims 1, 3, 10, 11, 12, and 14 have been fully considered and are persuasive.  The nonstatutory obviousness-type double patenting rejection of claims 1, 3, 10, 11, 12, and 14 has been withdrawn.

Applicant’s arguments and amendments, filed on 7/12/2021, with respect to the 35 USC § 103 rejection of claims 1-2, 4-13, and 15-20 have been considered but are moot because the arguments do not apply to any of the references being used in the current rejection to reject independent claims 1 and 12.  Falcon, Ma, and Lin are now being used to render claims 1 and 12 obvious under 35 USC § 103.

Conclusion
                                                                                                                                                                                            
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Brent Hoover whose telephone number is (303)297-4403.  The examiner can normally be reached on Monday - Friday 9-5 MST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar can be reached on 571-270-3169.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access 




/BRENT JOHNSTON HOOVER/Examiner, Art Unit 2127
/ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2127