NOTE:

Examiner’s Note: the copy of the claims filed 27 January 2022 has been entered as it does not appear to introduce any amendments.

Applicant's arguments filed 27 January 2022 have been fully considered but they are not persuasive.

Applicant argues that the cited art does not teach “each convolution engine of the at least one convolution engine comprises a plurality of elements of multiply logic and a plurality of elements of addition logic, the plurality of elements of addition logic forming an adder tree configured to generate a sum of the outputs of the plurality of elements of multiply logic”.
However, Chakradhar teaches that the CNN coprocessor includes a number of convolvers (150) and additional addition logic (224, 226) (figs. 3-5, etc.), where the logic includes multipliers (para. 0069, etc.) and an adder tree (paras. 0013, 0052, etc.). As shown in the rejections of the prior action(s), the convolvers (singly or as a group 141; see fig. 3) and the addition/aggregation logic together are mapped to the convolution engine of the claimed invention: the convolvers include multiply logic (para. 0069; see also paras. 0068 and 0070, which relate the multipliers to their use on the CNN inputs based on the selected precision, and para. 0032, which discusses the convolution operations including multiply-accumulates), and the plurality of adders add the elements received from the multiply logic in the convolvers (para. 0052; see also para. 0013, which discusses the tree structure).
Applicant further argues that Henry teaches away from the claimed convolution engine, pointing to para. 0082.
However, Henry has not been relied upon to teach the convolution engine of the claimed invention.  Furthermore, for the sake of argument, Henry teaches “rather than performing all the multiplies associated with all the connection inputs and then adding all the products together as in a conventional manner, advantageously each neuron is configured to perform, in a given clock cycle, the weight multiply operation associated with one of the connection inputs and then add (accumulate) the product with the accumulated value of the products associated with connection inputs processed in previous clock cycles up to that point. Assuming there are M connections to the neuron, after all M products have been accumulated (which takes approximately M clock cycles), the neuron performs the activation function on the accumulated value to generate the output, or result”.  Henry thus only provides that there is an advantage in its proffered solution, and does not discredit or otherwise discourage the solution claimed. See MPEP § 2141.02.
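
For illustration only, the two summation schemes contrasted in the quoted passage can be sketched as follows. This is a minimal software sketch, not the circuitry of either reference: the function names (`adder_tree_sum`, `multiply_then_tree`, `sequential_accumulate`) are hypothetical, and the pairwise reduction merely assumes one common way an adder tree may be arranged. Both schemes arrive at the same sum of products; they differ only in the order in which the additions are performed.

```python
def adder_tree_sum(products):
    """Sum values pairwise, level by level (assumed adder-tree arrangement)."""
    level = list(products)
    if not level:
        return 0
    while len(level) > 1:
        # Add adjacent pairs; each pass models one level of the tree.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # an odd element passes through unchanged
        level = nxt
    return level[0]


def multiply_then_tree(inputs, weights):
    """Perform all multiplies first, then add all products together."""
    products = [x * w for x, w in zip(inputs, weights)]
    return adder_tree_sum(products)


def sequential_accumulate(inputs, weights):
    """Perform one multiply per step and accumulate the running total."""
    acc = 0
    for x, w in zip(inputs, weights):
        acc += x * w  # one connection input handled per step
    return acc
```

For integer inputs the two functions return identical results, which is consistent with the point that one scheme being described as advantageous does not render the other inoperative.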

Applicant also argues that the cited art does not teach “receiving a second subset of data and storing the second subset of data in the one or more buffers such that the second subset of data replaces at least a portion of the first subset of data in 
However, Chakradhar teaches that a memory subsystem including input memory, output memory, temporary memory, and instruction memory (figs. 3-4, etc.) receives and sends input, kernel, and intermediate/output data from/to the CNN coprocessor template (140) under control of an input switch to implement the CNN layers (paras. 0043-51, etc.), where intermediate data may include data from a prior layer or portions of a layer to be combined (paras. 0024, 0044, 0047, 0053-54, etc.); and Henry teaches that, as each layer of the neural network is to be performed by the processor’s neural network units (NNU), which may perform convolutions, the current row of the input data in the input data RAM is overwritten with the next set of input data, and the weights in the weight RAM are overwritten with the next layer’s weights (paras. 0075, 0117, 0216-219, 0333-335, etc.).  While Applicant further argues that a combination that would overwrite the input would not be obvious to one of ordinary skill in the art because it would result in a loss of data and a non-functioning component, Chakradhar also teaches that, in the absence of sufficient convolvers, intermediate accesses to off-chip memory would be needed (see, e.g., paras. 0038, 0050, etc.).  That the input and intermediate memories are part of an external memory does not mean that the memory is unlimited (it could thus benefit from the space savings described in the rejection), nor does it mean that the memory contains the only possible copy of the input data (see, e.g., paras. 0043, 0053, 0064, 0071 for data between the external memory and the processor core performing the CNN, as well as receiving the data from the host to the co-processor).

/GEORGE GIROUX/ Primary Examiner, Art Unit 2128