Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
Applicant's arguments filed 9/9/22 have been fully considered but they are not persuasive.
In re pgs. 7-8, applicant argues that the examiner relied on official notice in the 35 USC 101 rejection.
In response, the claims were rejected under 35 USC 101 and the Berkheimer memo, not on official notice; see below.
In re pg. 8, applicant argues the claims recite a practical application.
In response, the examiner respectfully disagrees. Performing a scalar operation does not rise to the level of specifying a clear practical application (see the USC 101 rejection below), and it is shown to be well-known in the USC 103 rejection. Guo teaches performing a scalar operation (a weight is a scalar quantity or magnitude, pg. 10, Section: "Algorithm 1". This section showcases pseudocode and mapping for how a process of quantization and weight operations would be carried out in an exemplary neural network. The diagram shows the quantization in the forward pass phase (as seen by binarizing) with the backward pass being completed and the parameter update consisting of transforming back to the normalized form (second bullet point where the weight is being calculated). The citation also teaches the scalar operation being performed with the equation used to calculate the new weight).

In re pgs. 9-10, the applicant argues the applied art fails to teach using the converted result in the normal-precision floating-point format to update the operational parameter stored in the computer-readable memory, where the parameter is stored in normal-precision floating-point format.
In response, Drumond discloses use the converted result in the normal-precision floating-point format to update the operational parameter stored in the computer-readable memory, where the parameter is stored in normal-precision floating-point format (“As such, the rest of the operations can be implemented in traditional floating-point logic with little performance degradation”, section 1 or see Figs. 5-6 which show the process of the hybrid BFP-FP accelerator and shows that after the convolution operations are completed, the result goes into a module named “BFP to FP” which converts the format back to normal-precision floating-point format and then transfers the data to update the parameter as seen with the activation buffer) and
Guo teaches performing a scalar operation (a weight is a scalar quantity or magnitude, pg. 10, Section: "Algorithm 1". This section showcases pseudocode and mapping for how a process of quantization and weight operations would be carried out in an exemplary neural network. The diagram shows the quantization in the forward pass phase (as seen by binarizing) with the backward pass being completed and the parameter update consisting of transforming back to the normalized form (second bullet point where the weight is being calculated). The citation also teaches the scalar operation being performed with the equation used to calculate the new weight).

In re pg. 10 the applicant argues Gou is silent regarding converting FP.
In response, the applicant cannot show non-obviousness by attacking the references individually where, as here, the rejections are based on a combination of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981).
Drumond discloses use the converted result in the normal-precision floating-point format to update the operational parameter stored in the computer-readable memory, where the parameter is stored in normal-precision floating-point format (“As such, the rest of the operations can be implemented in traditional floating-point logic with little performance degradation”, section 1 or see Figs. 5-6 which show the process of the hybrid BFP-FP accelerator and shows that after the convolution operations are completed, the result goes into a module named “BFP to FP” which converts the format back to normal-precision floating-point format and then transfers the data to update the parameter as seen with the activation buffer).

In re pg. 11, applicant argues the applied art fails to teach selecting a bounding box including a first set of values expressed in the normal-precision floating-point format, the first set of values being selected based on dimensions of the input tensor or on a tensor operation to be performed.
In response, Mellempudi teaches selecting a bounding box including a first set of values expressed in the normal-precision floating-point format, the first set of values being selected based on dimensions of the input tensor or on a tensor operation to be performed (Mellempudi describes the dynamic range being selected (by use of adjustment) for the tensor blocks. The dynamic range is equivalent to the bounding box, with the blocks in the citation being the set of variables sharing an exponent: "Additionally, blocking can be performed for tensors data stored in low-precision or custom floating-point formats, with the data being shifted or the floating-point format being adjusted to properly capture the dynamic range of the data within each block.", 0235).

All arguments have been responded to above or in the body of the rejections below.
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

Modha (US 2019/0332925) teaches a scalar (“Two neurons are connected if the output of one is an input to the other. A weight is a scalar value encoding the strength of the connection between the output of one neuron and the input of another neuron.”, 0012).


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-7, 9-11, 13-25 are rejected under §101 as non-eligible subject matter.
In regards to claim 1, the claim is rejected because the claimed invention is directed to mathematical calculation without significantly more. The claim recites receiving data, converting data formats, data processing calculations, and obtaining a result.
2A Prong 1: The following limitations under their broadest reasonable interpretation, cover performance of the limitations in the mind and/or mathematical calculation but for the recitation of generic computer components.
A computing system comprising: 
a computer-readable memory (see below) storing an operational parameter of a given layer of a neural network (see below); and 
a hardware accelerator (see below) in communication with the computer-readable memory for accelerating tensor operations, the hardware accelerator (see below) for accelerating tensor operations configured to: 
receive an input tensor for a given layer of a multi-layer neural network (see below); 
convert the input tensor from a normal-precision floating-point format to a quantized-precision floating-point format; (mathematical calculation for transforming data of one type to another) 
perform a tensor operation using the input tensor converted to the quantized-precision floating-point format; (performing a mathematical calculation from a set of possible mathematical calculations)
convert a result of the tensor operation from the quantized-precision floating-point format to the normal-precision floating-point format; (mathematical calculation reverting the first format change back to the original format) and 
performing a scalar operation using the converted result in the normal-precision floating-point format “to update” the operational parameter stored in the computer-readable memory, (mental process of updating data when new results are available with assistance of pen and paper) where the parameter is stored in normal-precision floating-point format.
2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites additional elements of "a computer-readable memory" and "a hardware accelerator", which applicant's specification describes as follows (¶0021): "the accelerator includes a Tensor Processing Unit (TPU) 182, reconfigurable logic devices 184 (e.g., contained in one or more FPGAs or a programmable circuit fabric), and/or a subgraph accelerator 186, however any suitable hardware accelerator can be used that models neural networks."
As the claim is directed towards these devices with high-levels of generality (any processing unit and logic devices that may model neural networks along with any well-known computer-readable memory to perform mathematical calculations on said components), the claim amounts to instructions to apply the exception using said generic computer components. The claim also recites limitations like receiving data in the form of an input tensor and storing the data after calculations have been run on it as a parameter. These steps are also recited in high levels of generality and amount to mere instructions for data transfer which is a form of insignificant extra-solution activity.
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of using generic computer components to perform generic mathematical calculations and conversions amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Elements such as a neural network or an accelerator are considered to be generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP 2106.05(h)).
Further, the steps which recite receiving data in the form of an input tensor and storing data in the form of an end result parameter are steps which were considered to be insignificant extra-solution activity in Step 2A Prong 2, and thus are re-evaluated in Step 2B to determine if the steps are more than what is well-understood, routine, conventional activity in the field. The court decisions cited in MPEP 2106.05(d)(II) indicate that merely "storing and retrieving information in memory" is a well-understood, routine, conventional function when it is claimed in a merely generic manner (as it is in the present claim). Thereby, a conclusion that the claimed receiving and storing steps are well-understood, routine, conventional activities is supported under Berkheimer. The claim is not patent eligible.
Independent claim 9 is similar to claim 1 but is a method claim instead of a system claim.
The method claim still contains the generic accelerator component and the instructions for generic mathematical calculations to be carried out by said accelerator. Seeing as the claim is similar except for it being directed to a method instead of a system and offers no other significant changes in limitation, it is rejected for similar reasons and judgement as applied as in claim 1.

Independent claim 17 is similar to claims 1 and 9 but is directed towards computer-readable media that contain the instructions to be carried out by an accelerator. The claim is still directed towards instructions for mathematical calculations to be carried out by general computer components. Seeing as the claim is similar except for it being directed to computer-readable media instead of a system or a method and offers no other significant changes in limitation, it is rejected for similar reasons and judgement as applied as in claims 1 and 9.

Dependent claim 2 which relies on claim 1 recites the additional limitation of the floating point format conversion having a plurality of mantissa values that share a common exponent. As this limitation is merely a stipulation for one of the mathematical calculations already being done in the independent claim, it does not provide any additional meaningful limitation. It is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 1.

Dependent claim 3 which relies on claim 1 recites additional limitations in which specific conditions for the mathematical operation must take place. Specifying the data input format and specifying the type of output format are both limitations that only specify how the mathematical calculation should be performed and do not significantly alter the claim or provide any additional meaningful limitation. It is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 1.
Dependent claim 4 which relies on claim 1 recites additional limitations in which specific conditions for the mathematical operation must take place. Specifying the data input format and specifying the type of output format are both limitations that only specify how the mathematical calculation should be performed and do not significantly alter the claim or provide any additional meaningful limitation. It is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 1.
Dependent claim 5 which relies on claim 1 recites additional limitations of when the mathematical calculation is supposed to occur and specifies the data being input to the calculation. Specifying the data input format and when the calculation should occur are conditions to be met for the mathematical calculation and do not significantly alter the claim or provide any additional meaningful limitation. It is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 1.
Dependent claim 6 which relies on claim 1 recites an additional limitation of specifying one of the mathematical calculations to be performed. Specifying one of the mathematical calculations that is to be carried out does not significantly alter the claim or provide any additional meaningful limitation. It is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 1.

Dependent claim 7 which relies on claim 1 recites an additional limitation of specifying one of the mathematical calculations to be performed. Specifying one of the mathematical calculations that is to be carried out does not significantly alter the claim or provide any additional meaningful limitation. It is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 1.

Dependent claim 10 which relies on claim 9 recites loading configuration data onto hardware to perform the operations of a neural network layer. As with the system claim, directing general computer components to perform non-specific computer operations does not add significantly more to the claim as it merely restates that the computations will be done on a computer and adds no further meaningful limitation to the claim. The claim is still rejected as a mathematical calculation with the same reasons and judgement as applied to claim 9.

Dependent claim 11 which relies on claim 9 recites an additional limitation of specifying initialization of weights being carried out in the given layer of a neural network. This limitation generally links the claim to a particular field or technology and specifies the environment in which the mathematical calculation of the independent claim is to take place. The limitation only serves to limit the mathematical calculation to the field of neural networks and to specify a particular layer of a neural network. Please see MPEP §2106.05(h) for more details. The claim is not found to further add significantly more and/or alter the claim in a meaningful manner. The claim is still rejected as a mathematical calculation with the same reasons and judgement as applied to claim 9.

Dependent claim 13 which relies on claim 9 recites an additional limitation of specifying conditions for which bounding box is selected. As this claim only serves as a condition to determine which range will be used for the calculation, it provides no additional meaningful limitations. The claim is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 9.

Dependent claim 14 which relies on claim 9 recites an additional limitation of further specifying a particular operation and range. As with claim 13, claim 14 only serves as a condition and particular embodiment for the mathematical calculation and does not provide an additional meaningful limitation. The claim is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 9.
Dependent claim 15 which relies on claim 9 recites additional limitations of further specifying the steps used to complete the mathematical calculation. Selecting the range for the input data, identifying a variable for the calculation, and other mathematical functions, conversions or calculations like rounding or scaling numbers are claim limitations that are mathematical calculations. The claim limitations fail to transform the claim into something beyond the judicial exception as it does not significantly alter the claim or provide additional meaningful limitations. The claim is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 9.
Dependent claim 16 which relies on claim 9 recites the additional limitations of specifying the type of neural network and specifies that the neural network hardware accelerator comprises programming hardware. Specifying the neural network type is merely linking to a particular field and does not add a meaningful limitation to the claim. Programming hardware is a general term for computer components and also does not contain a meaningful limitation. The claim is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 9.
Dependent claim 18 which relies on claim 17 recites additional limitations in which specific conditions for the mathematical operation must take place. Specifying the data input format and specifying the type of output format are both limitations that only specify how the mathematical calculation should be performed and do not significantly alter the claim or provide any additional meaningful limitation. It is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 17.
Dependent claim 19 which relies on claim 17 recites the additional limitations in which specific conditions for the mathematical operation must take place. Specifying the data input format and specifying the type of output format are both limitations that only specify how the mathematical calculation should be performed and do not significantly alter the claim or provide any additional meaningful limitation. It is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 17.
Dependent claim 20 which relies on claim 17 recites the additional limitation of specifying when the calculation takes place. Specifying when the calculation should occur as a condition to be met does not significantly alter the claim or provide any additional meaningful limitation. It is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 17.

Claims 17-20 are rejected under §101 for being directed to non-statutory subject matter.
The claims do not fall within at least one of the four categories of patent eligible subject matter because they include transitory forms of signal transmission. Applicant's specification further provides no distinction between transitory and non-transitory media. All of the independent claims include hardware that specifies some kind of memory/media and all of the independent claims fail to distinguish between transitory and non-transitory memory. Dependent claims are rejected based on their reliance on the independent claims.

Dependent claim 21 which relies on claim 1 recites that the tensor operation is performed during a forward-propagation phase of training the neural network. This limitation does not significantly alter the claim or provide any additional meaningful limitation. It is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 1.

Dependent claim 22 which relies on claim 1 recites that the tensor operation is performed during a back-propagation phase of training the neural network. This limitation does not significantly alter the claim or provide any additional meaningful limitation. It is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 1.

Dependent claim 23 which relies on claim 1 recites that the scalar operation is performed for a single layer of the neural network. This limitation does not significantly alter the claim or provide any additional meaningful limitation. It is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 1.

Dependent claim 24 which relies on claim 1 recites that the scalar operation comprises adding a bias to the converted result. This limitation does not significantly alter the claim or provide any additional meaningful limitation. It is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 1.

Dependent claim 25 which relies on claim 1 recites that the scalar operation comprises applying an activation function to the converted result. This limitation does not significantly alter the claim or provide any additional meaningful limitation. It is rejected as a mathematical calculation with the same reasons and judgement as applied to claim 1.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-7 and 9-11, 13-20 are rejected under 35 U.S.C. 103 as being unpatentable over Drumond ("End-to-End DNN Training with Block Floating Point Arithmetic", 2018), in view of Mellempudi (US 20180322607 A1) and further in view of Guo ("A Survey on Methods and Theories of Quantized Neural Networks", 2018).

In regards to claim 1, Drumond teaches the following:
the hardware accelerator (abstract) in communication with the memory for accelerating tensor operations configured to: receive an input tensor for a given layer of a multi-layer neural network;
[(Pg. 6, Fig. 6) This figure shows the input tensors (the data representations of x and w) being forwarded to the accelerator (FP to BFP boxes). Examiner notes that (Pg. 5, Col. 2, Section 5.1) explains the process in more detail and explicitly mentions the inputs being fed to the BFP which is equivalent to the accelerator. ]
convert the input tensor from a normal-precision floating-point format to a quantized-precision floating-point format [(Pg. 5, Col. 1, Section 4.4) "The FP-to-BFP units convert tensors by detecting the maximum exponent of the input FP tensors and normalizing the mantissas accordingly"
Here, Drumond teaches the FP (floating-point) to BFP (block floating-point) conversion. Examiner notes that BFP is a quantized-precision format as evidenced by the citation below. (Pg. 2, Col. 2, Section "training with end-to-end low precision") "We use BFP instead, effectively computing quantization points by choosing tensor exponents"];
perform the tensor operation using the input tensor converted to the quantized-precision floating-point format [(Pg. 4, Col. 1, Section 4.2) "BFP should be used for the most demanding, dot product based, computations, with other operations being performed in floating-point-like representations" Drumond teaches that all the operations that are based on dot product calculations will be done in the BFP format.];

convert a result of the tensor operation from the quantized-precision floating-point format to the normal-precision floating-point format;
[ (Pg. 5, Col. 1, Section 4.4) "while the BFP-to-FP unit normalizes the mantissas according to the single given exponent."
This citation teaches the conversion of the data from block floating-point back to floating-point. This conversion takes place after training and other operations have taken place, as seen in the paragraph leading up to the citation. In the alternative, the secondary reference Mellempudi also teaches this limitation as seen below.]; and 
performing a scalar operation using the converted result in the normal-precision floating-point format to update the operational parameter stored in the computer-readable memory, where the parameter is stored in normal-precision floating-point format.
[ (Fig. 5) This figure shows the process of the hybrid BFP-FP accelerator and shows that after the convolution operations are completed, the result goes into a module named "BFP to FP" which converts the format back to normal-precision floating-point format and then transfers the data to update the parameter as seen with the activation buffer.]
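
For illustration only, the data flow mapped to Drumond above (FP-to-BFP conversion, a dot-product tensor operation on the quantized values, BFP-to-FP conversion, then a scalar operation in floating point) can be sketched as follows. This is a minimal sketch under stated assumptions and is not Drumond's implementation: the function names, the 8-bit mantissa width, the bias value, and the use of plain Python lists are all hypothetical.

```python
import math

MANTISSA_BITS = 8  # assumed width; not taken from the cited reference

def fp_to_bfp(values, bits=MANTISSA_BITS):
    """Return (shared_exponent, integer mantissas) for a block of floats."""
    # Detect the maximum exponent among the nonzero inputs.
    shared_exp = max((math.frexp(v)[1] for v in values if v != 0.0), default=0)
    # Normalize every mantissa to that single exponent, clamped to the signed range.
    scale = 2.0 ** (shared_exp - (bits - 1))
    limit = 2 ** (bits - 1) - 1
    mantissas = [max(-limit, min(limit, round(v / scale))) for v in values]
    return shared_exp, mantissas

def bfp_dot_to_fp(x, w, bits=MANTISSA_BITS):
    """Dot product on BFP mantissas, converted back to an ordinary float."""
    ex, mx = fp_to_bfp(x, bits)
    ew, mw = fp_to_bfp(w, bits)
    acc = sum(a * b for a, b in zip(mx, mw))  # integer multiply-accumulate
    # BFP-to-FP: apply both shared exponents to the integer accumulator.
    return math.ldexp(acc, ex + ew - 2 * (bits - 1))

x = [1.0, 0.5, -0.25]  # hypothetical input tensor
w = [0.5, 0.25, 0.5]   # hypothetical weights
bias = 0.125           # hypothetical scalar operand

y = bfp_dot_to_fp(x, w)  # tensor operation in quantized precision
out = y + bias           # scalar operation in normal-precision floating point
```

With these inputs the quantized dot product matches the floating-point result because every value is a power of two; in general the conversion to a shared exponent loses precision for the smaller values in the block.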

Drumond fails to particularly call for a computing system comprising: a computer-readable memory storing an operational parameter of a given layer of a neural network; a hardware accelerator in communication with the computer-readable memory; and performing a scalar operation using the converted result in the normal-precision floating-point format to update the operational parameter stored in the computer-readable memory, where the parameter is stored in normal-precision floating-point format. See Mellempudi below:

Mellempudi teaches a computing system comprising: a computer-readable memory storing an operational parameter of a given layer of a neural network;

[ (¶0337) "The storage device and signals carrying the network traffic respectively represent one or more machine-readable storage media and machine-readable communication media. Thus, the storage devices of a given electronic device typically store code and/or data for execution on the set of one or more processors of that electronic device."]
[ (¶0108) "Application effective address space 482 within system memory 411 stores process elements"
This citation teaches the system memory storing process elements which are equivalent to the operational parameters from the claim limitation. This is showcased in Fig. 22 with reference number 2222 being the data stored on the memory. ]
[ (¶0153) "For example, the input to a convolution layer can be a multidimensional array of data that defines the various color components of an input image."
This citation teaches the operational parameter being data that will be processed in the convolutional network and therefore saved onto the memory.]
and a hardware accelerator in communication with the computer-readable memory, [(¶0095) "In one implementation, an accelerator integration circuit 436 provides cache management, memory access" This citation teaches the accelerator having access to the memory.] 
Mellempudi also teaches conversion of floating point [Mellempudi: (¶0228) "Once a pre-determined number of iterations has been performed by the inner block 1942, the data can be converted from dynamic fixed-point to floating-point via a dynamic fixed-point to floating-point converter unit"
This citation from Mellempudi teaches the same unit that is referenced in the primary reference and more explicitly points out the claim limitation. (See motivation to combine below)].

Therefore, it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings for quantizing tensors in neural networks, as taught by Drumond, with the dynamic precision management system as taught by Mellempudi. The reason it would be obvious is one of ordinary skill in the art would recognize, prior to the effective filing date, that combining the two would provide a performance and efficiency gain when carrying out the operations [Mellempudi (¶0228)]. This would facilitate the recognized benefit of creating a quicker neural network overall and one that would utilize fewer resources or consume the same amount of resources for an increase in work done.

Guo teaches performing a scalar operation using the converted result in the normal-precision floating-point format to update the operational parameter stored in the computer-readable memory, where the parameter is stored in normal-precision floating-point format. [ (Pg. 10, Section: "Algorithm 1")
This section showcases pseudocode and mapping for how a process of quantization and weight operations would be carried out in an exemplary neural network. The diagram shows the quantization in the forward pass phase (as seen by binarizing) with the backward pass being completed and the parameter update consisting of transforming back to the normalized form (second bullet point where Wt is being calculated). The citation also teaches the scalar operation being performed with the equation used to calculate the new weight. ]
Therefore, it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings for quantizing tensors in neural networks, as taught by Drumond/Mellempudi, with the post-quantization methods and operations as taught by Guo. The reason it would be obvious is one of ordinary skill in the art would recognize, prior to the effective filing date, that combining the two would provide a greater performance and efficiency gain than when implementing the BFP quantization alone to carry out all of the operations [ Drumond (Abstract) ]. This would facilitate the recognized benefit of creating a more efficient neural network architecture and one that still maintains accuracy.
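
For illustration only, the kind of training step described above for Guo's Algorithm 1 (binarized weights in the forward pass, a backward pass, then a per-weight scalar update applied to the stored full-precision weights) can be sketched as follows. This is a loose sketch and not a reproduction of Guo's pseudocode: the single-neuron model, the squared-error loss, the learning rate, the straight-through treatment of the gradient, and all names are assumptions made here.

```python
def binarize(w):
    """Map a real-valued weight to +1.0 or -1.0 (assumed quantizer)."""
    return 1.0 if w >= 0 else -1.0

def train_step(weights, x, target, lr=0.25):
    """One update of full-precision weights using a binarized forward pass."""
    # Forward pass: compute the output with binarized copies of the weights.
    wb = [binarize(w) for w in weights]
    y = sum(b * xi for b, xi in zip(wb, x))
    # Backward pass: gradient of 0.5 * (y - target) ** 2 with respect to each
    # binarized weight, passed straight through to the real-valued weight.
    err = y - target
    grads = [err * xi for xi in x]
    # Parameter update: a scalar operation per weight, in full precision.
    return [w - lr * g for w, g in zip(weights, grads)]

# Hypothetical values chosen so the arithmetic is exact in binary floating point.
stored_weights = [0.5, -0.25]
updated_weights = train_step(stored_weights, x=[1.0, 2.0], target=0.0)
```

The stored weights stay in full precision throughout; only the copies used in the forward pass are quantized.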

In regards to claim 2, the computing system of claim 1 is taught by Drumond/Mellempudi as in the rejection for claim 1 above. Drumond continues teaching the following limitations:
wherein the quantized-precision floating-point format is a block floating-point format having a plurality of mantissa values that share a common exponent.
[ (Pg. 3, Col. 2, Paragraph 2) "BFP represents numbers with a mantissa and exponent, like floating-point, but exponents are shared across entire tensors, as shown in Figure 1"
This citation teaches the block floating-point operation with the mantissa and the shared exponent.]
In regards to claim 3, the computing system of claim 1 is taught by Drumond/Mellempudi as in the rejection for claim 1 above. Drumond continues teaching the following limitations:
and the quantized-precision floating-point format is a block floating-point format where a plurality of mantissa values within a given row share a common exponent,
[ (Pg. 3, Col. 2, Paragraph 2) "BFP represents numbers with a mantissa and exponent, like floating-point, but exponents are shared across entire tensors, as shown in Figure 1"

This citation teaches the block floating-point (BFP) and the exponents are shared across the row of a mantissa by being shared to the entire tensor. ]
and mantissa values in different rows have different respective exponents.

[ (Fig. 1a and 1b) This figure shows two examples. 1a demonstrates the exponent being shared across all rows as in the citation above and 1b shows an embodiment where the exponent is different for each respective row of the mantissa. ]
What Drumond does not distinctly disclose and is instead taught by Mellempudi is seen below:
wherein the input tensor is a two-dimensional matrix, [ (¶0237)
In general, this paragraph teaches about the multi-dimensional tensors and points to a specific embodiment shown in Fig. 21B that highlights the 2D matrix. The paragraph specifically points to matrices A and B, each having rows and columns with various variable names and values, which is a 2D matrix. ] Please refer to claim 1 for the motivation to combine.
In regards to claim 4, The computing system of claim 1, is taught by Drumond/Mellempudi as in the rejection for claim 1 above. Drumond continues teaching the following limitations:
wherein the input tensor is a convolution filter,
[ (Pg. 5, Section 5.1) "We train DNNs with the hybrid approach, using BFP in the compute-intensive operations (matrix multiplications, convolutions)"

This citation teaches the BFP format being used to perform matrix multiplications and convolution operations. This shows that convolution filters are used as inputs in at least some embodiments. ]
and the quantized-precision floating-point format is a block floating-point format where a plurality of mantissa values within a spatial pixel share a common exponent. [(Fig.1)
This figure (particularly 1a) shows the tensor sharing a common exponent across a plurality of mantissa values. Further, Section 5.2, which goes over the evaluation setup, explains that the data sets used in the paper are composed of images, which comprise pixel data, and therefore the mantissa values are within a spatial pixel. ]

In regards to claim 5, The computing system of claim 1, is taught by Drumond/Mellempudi as in the rejection for claim 1 above. Drumond continues teaching the following limitations:
wherein the tensor operation is performed during a back-propagation mode of the neural network,
[ (Pg. 5, Section 5.1) "We modified TensorFlow's (Abadi et al., 2016) matrix multiplications and convolution operations to reproduce the behaviour of BFP matrix multipliers in both the forward and backward passes"
This shows the operation being performed on the tensors during the backward pass which is the backward propagation mode of the neural network. ]
the input tensor is an output error term from an adjacent layer to the given layer. [ (Fig. 3) and (Fig. 5) Figure 3 shows the layout of the neural network using block floating point and shows the data is passed between next layer and previous layer to/from the "current" layer. Figure 5 points to the operations within each layer and shows that the "activation loss" which the examiner is interpreting as the output error term is then propagated back to the activation buffer. ]
[ (Pg. 4, Paragraph 1) "When computing weight gradients, the dot products are computed across batches, and therefore, entire batches of activations and gradients must share exponents to take advantage of fixed-point dot products"
The above citation serves as support with the weight gradients being equivalent to the error term. ]

In regards to claim 6, The computing system of claim 1, is taught by Drumond/Mellempudi as in the rejection for claim 1 above. Drumond continues teaching the following limitations:
wherein the tensor operation is a dot product computation.
[ (Pg. 1, Col. 2, Paragraph 2) "We propose a hybrid BFP-FP framework where values float freely between dot product computations in BFP" ]
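For illustration only, a dot product between two blocks that each carry a shared exponent can be computed with integer arithmetic on the mantissas and a single exponent addition at the end. The sketch below is a minimal Python example under that assumption; the function name `bfp_dot` and the argument layout are hypothetical and are not taken from Drumond.

```python
import math

def bfp_dot(mant_a, exp_a, mant_b, exp_b):
    """Dot product of two shared-exponent blocks (assumed representation).

    Each block is a list of integer mantissas plus one exponent, so element i
    of block A has the value mant_a[i] * 2**exp_a.
    """
    # Multiply-accumulate is done purely on integers
    acc = sum(a * b for a, b in zip(mant_a, mant_b))
    # The two shared exponents are applied once, after accumulation
    return math.ldexp(acc, exp_a + exp_b)

# Block A encodes [1.0, 0.5]; block B encodes [0.5, -1.5]
result = bfp_dot([64, 32], -6, [16, -48], -5)
# result == -0.25, matching 1.0*0.5 + 0.5*(-1.5)
```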

In regards to claim 7, The computing system of claim 1, is taught by Drumond/Mellempudi as in the rejection for claim 1 above. Drumond continues teaching the following limitations:
wherein the tensor operation is a convolution.

[ (Pg. 5, Col. 2, Section 5.1) "We train DNNs with the hybrid approach, using BFP in the compute-intensive operations (matrix multiplications, convolutions)" ]

In regards to claim 9, Drumond teaches the following limitations as seen below:
by the neural network accelerator (Drumond discloses DNNs/neural networks, abstract, e.g., “Hybrid BFP-FP representations enable a new class of efficient accelerators transparently implementing dense arithmetic for DNN while maintaining usability”, section 1 or see Fig. 5) converting an input tensor for the given layer from a normal-precision floating-point format to a block floating-point format; [ (Pg. 5, Col. 1, Section 4.4) "The FP-to-BFP units convert tensors by detecting the maximum exponent of the input FP tensors and normalizing the mantissas accordingly" Here, Drumond teaches the FP (floating-point) to BFP (Block floating-point) conversion.] by selecting a bounding box including a first set of values expressed in the normal-precision floating-point format, the first set of values being selected based on dimensions of the input tensor or on a tensor operation to be performed (Mellempudi talks about the dynamic range being selected (by use of adjustment) for the tensor blocks. Dynamic range being equivalent to the bounding box with blocks in the citation being the size of variables sharing an exponent "Additionally, blocking can be performed for tensors data stored in low-precision or custom floating-point formats, with the data being shifted or the floating-point format being adjusted to properly capture the dynamic range of the data within each block.", 0235).
by the neural network accelerator performing a tensor operation using the input tensor converted to the block floating-point format;
[ (Pg. 4, Col. 1, Section 4.2) "BFP should be used for the most demanding, dot product based, computations, with other operations being performed in floating-point-like representations"
Drumond teaches that all the operations that are based on dot product calculations will be done in the BFP format. ]
by the neural network accelerator converting a result of the tensor operation from the block floating-point format to the normal-precision floating-point format [(Pg. 5, Col. 1, Section 4.4) "while the BFP-to-FP unit normalizes the mantissas according to the single given exponent."
This citation teaches converting the data from block floating-point back to floating-point. This conversion takes place after training and the other operations described in the paragraph leading up to the citation. In the alternative, the secondary reference, Mellempudi, also teaches this limitation as seen below. ]

[ Mellempudi: (¶0228) "Once a pre-determined number of iterations has been performed by the inner block 1942, the data can be converted from dynamic fixed-point to floating-point via a dynamic fixed-point to floating-point converter unit"
This citation from Mellempudi teaches the same unit that is referenced in the primary and more explicitly points out the claim limitation. (See motivation to combine below)]
and by the neural network accelerator use the converted result in the normal-precision floating-point format to generate an output tensor of the layer of the neural network, where the output tensor is in normal-precision floating-point format.
[ (Fig. 5) This figure shows the process of the hybrid BFP-FP accelerator and shows that after the convolution operations are completed, the result goes into a module named "BFP to FP" which converts the format back to normal-precision floating-point format and then transfers the data to update the parameter as seen with the activation buffer. Examiner notes that the output being transferred is equivalent to an output tensor. ]

What is not distinctly disclosed by Drumond and is instead taught by Mellempudi is the following:
A method for a neural network accelerator, the method comprising: configuring the neural network accelerator to accelerate a given layer of a multi-layer neural network; [ (¶0152) "The computing architecture provided by embodiments described herein can be configured to perform the types of parallel processing that is particularly suited for training and deploying neural networks for machine learning."
This citation teaches the architecture (equivalent to the accelerator)]
[ (¶0211) "Down conversion can be performed to scale higher-precision dynamic fixed-point data output from one layer of a neural network to a lower precision for input to a subsequent layer."

This citation teaches the particular layer within a neural network being able to be selected.]
Therefore, it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings for quantizing tensors in neural networks, as taught by Drumond, with the dynamic precision management system as taught by Mellempudi. The reason it would be obvious is one of ordinary skill in the art would recognize, prior to the effective filing date, that combining the two would provide a performance and efficiency gain when carrying out the operations [ Mellempudi (¶0228) ]. This would facilitate the recognized benefit of creating a quicker neural network overall and one that would utilize less resources or consume the same amount of resources for an increase in work done.

In regards to claim 10, The method of claim 9, is taught by Drumond/Mellempudi as seen in the rejection for claim 9 seen above. Mellempudi continues teaching the following limitations:
wherein configuring the neural network accelerator to accelerate a given layer of a multi-layer neural network comprises loading configuration data onto programmable hardware so that the programmable hardware performs the operations of the given layer of a multi-layer neural network.
[ (¶0323) In general this paragraph goes over the processor(s) in the various embodiments having instructions (equivalent to configuration data) such that the processor can carry out the operations described within the reference's specification. Examiner notes that paragraphs (¶0139)-(¶0143) (exemplary and not a complete selection) give a basic overview of possible machine learning operations that are ordinary for the art. Further, paragraphs (¶0188) and onward describe quantization for floating point format. These two sets would provide the complete set of instructions for the accelerator. ]
Please refer to the motivation to combine from claim 9.
In regards to claim 11, The method of claim 9, is taught by Drumond/Mellempudi as seen in the rejection for claim 9 seen above. Mellempudi continues teaching the following limitations:
wherein configuring the neural network accelerator to accelerate the given layer of the multi-layer neural network comprises initializing weights of input edges of the given layer of the multi-layer neural network.
[ (¶0169) "To start the training process the initial weights may be chosen randomly or by pre-training using a deep belief network. The training cycle then be performed in either a supervised or unsupervised manner"
This citation from Mellempudi teaches initializing weights for the input of the neural network. Examiner notes that the accelerator providing this step along with the configuration is taught in claim 9. ]
Please refer to the motivation to combine from claim 9.

In regards to claim 13, The method of claim 12, is taught by Drumond/Mellempudi as seen in the rejection for claim 12 seen above. Mellempudi continues teaching the following limitations:
wherein the bounding box is selected based on the tensor operation performed. [ (¶0231) "For example, back propagation may require a larger dynamic range, so the computational logic can be configured to block the tensor data using smaller block sizes. For forward propagation computations, blocking may not be required."
This citation from Mellempudi further shows that different operations (like back propagation in this example) may require a different dynamic range than other operations. The citation teaches the computational logic being able to select based on the needs of the specific operation being performed. ]
Please refer to the motivation to combine from claim 9.
In regards to claim 14, The method of claim 13, is taught by Drumond/Mellempudi as seen in the rejection for claim 13 seen above. Drumond continues teaching the following limitations:
wherein the tensor operation performed is a matrix-matrix multiply and the selected bounding box is a column of a matrix of the input tensor.
[ (Pg. 4, Col. 1, Paragraph 1) "In a DNN's fully-connected layers, this requirement translates to one exponent per activation tensor and weight matrix column in the forward pass"
This citation from Drumond teaches the dynamic range being selected with a column of the matrix. The operation in question is explained to be a dot product which is equivalent to a matrix-matrix multiplication and can be found in the paragraph preceding the citation. ]
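For illustration only, selecting a column (or a row) of a matrix as the group of values that share an exponent can be sketched as follows. This is a minimal Python sketch, not code from Drumond or Mellempudi; the helper names `shared_exponent`, `column_exponents`, and `row_exponents`, and the 8-bit mantissa width, are assumptions made here.

```python
import math

def shared_exponent(block, mantissa_bits=8):
    """Exponent chosen so the largest magnitude in the block fits the mantissa."""
    max_abs = max(abs(v) for v in block)
    if max_abs == 0.0:
        return 0
    _, e = math.frexp(max_abs)  # max_abs = m * 2**e with 0.5 <= m < 1
    return e - (mantissa_bits - 1)

def column_exponents(matrix, mantissa_bits=8):
    """One shared exponent per column (the bounding box is a column)."""
    return [shared_exponent(col, mantissa_bits) for col in zip(*matrix)]

def row_exponents(matrix, mantissa_bits=8):
    """One shared exponent per row (the bounding box is a row)."""
    return [shared_exponent(row, mantissa_bits) for row in matrix]

weights = [[1.0, 0.01],
           [0.5, 0.02]]
# column_exponents(weights) == [-6, -12]
# row_exponents(weights)    == [-6, -7]
```

In this example the second column holds much smaller values than the first, so giving each column its own exponent preserves more of their precision than a single exponent for the whole matrix would.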

In regards to claim 15, The method of claim 9, is taught by Drumond/Mellempudi as seen in the rejection for claim 9 seen above. Mellempudi continues teaching the following limitations:
wherein converting the input tensor for the given layer from the normal-precision floating-point format to the block floating-point format comprises: selecting a bounding box for a plurality of elements of the input tensor;
[ (¶0235) "Additionally, blocking can be performed for tensors data stored in low-precision or custom floating-point formats, with the data being shifted or the floating-point format being adjusted to properly capture the dynamic range of the data within each block."
This citation from Mellempudi talks about the dynamic range being selected (by use of adjustment) for the tensor blocks. Dynamic range being equivalent to the bounding box with blocks in the citation being the size of variables sharing an exponent. ]

identifying a shared exponent for the selected plurality of elements within the bounding box of the input tensor;
[ (¶0191) "The dynamic fixed-point representation enables an 8x8 tensor 1415 of 32-bit floating-point values 1414 to be stored in an 8x8 tensor 1425 of 16-bit integer values, each associated with an 8-bit shared exponent."
This citation shows the shared exponent with the bounding box]
scaling mantissa values of the elements of the input tensor so that integer portions of the scaled mantissas have a selected number of bits for the block floating-point format;
[ (¶0189) "To convert from floating-point to traditional fixed-point, one can multiply the floating-point value by 2^fb, where fb is the number of fractional bits for the target fixed-point representation (e.g., 2^8, for 24.8 fixed-point) and round the result to the nearest integer" ]
removing fractional bits from the scaled integer portions of the mantissas;
[ (¶0194) "To quantize an exemplary floating-point value 1512 (fx=3.4667968) having an exponent 1514A (Ex) and a mantissa 1514B (Mx), the mantissa 1514B is right shifted by the difference between the exponent 1514A and the absolute max value exponent to create a magnitude integer 1524 (Ix), with the implicit leading bit 1513 (LB) stored as an explicit bit 1523 within the magnitude integer 1524. The sign bit 1520 (Sx) is maintained for the quantized fixed-point value."
In this citation, Mellempudi teaches the fractional bits being removed with the right shift operation. ]
and rounding the mantissas to produce block floating-point values.

[ (¶0217) "FIG. 17 illustrates floating-point to dynamic fixed-point biased rounding, according to an embodiment. Quantization with biased rounding as illustrated in FIG. 17 is similar to quantization as illustrated in FIG. 15A. Additionally, a round bit 1740 and a bias bit 1742 are used to capture bits that would otherwise be lost during the right shift to generate the integer magnitude value." ]
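For illustration only, the traditional fixed-point conversion quoted from Mellempudi above (multiply by two raised to the number of fractional bits, then round to the nearest integer) can be sketched as follows. This is a minimal Python sketch and not code from the reference; the function names `float_to_fixed` and `fixed_to_float` are hypothetical, and the biased-rounding truth table of FIG. 17 is not modeled.

```python
def float_to_fixed(x, fractional_bits):
    """Scale by 2**fractional_bits and round to the nearest integer."""
    # Note: Python's round() resolves exact ties to the nearest even integer.
    return round(x * (1 << fractional_bits))

def fixed_to_float(q, fractional_bits):
    """Recover an approximate float from the fixed-point integer."""
    return q / (1 << fractional_bits)

q = float_to_fixed(1.5, 8)      # 1.5 * 256 = 384
approx = fixed_to_float(q, 8)   # 384 / 256 = 1.5
```

With 8 fractional bits, the value recovered by `fixed_to_float` differs from the original by at most half of the smallest representable step, that is, by at most 2^-9.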
Therefore, it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings for quantizing tensors in neural networks, as taught by Drumond, with the floating point operations as taught by Mellempudi.
The reason it would be obvious is one of ordinary skill in the art would recognize, prior to the effective filing date, that combining the two would provide a performance and efficiency gain when carrying out the operations [ Mellempudi (¶0228) ]. This would facilitate the recognized benefit of creating a quicker neural network overall and one that would utilize less resources or consume the same amount of resources for an increase in work done.
In regards to claim 16, The method of claim 9, is taught by Drumond/Mellempudi as seen in the rejection for claim 9 seen above. Mellempudi continues teaching the following limitations:
wherein the multi-layer neural network is a recurrent neural network and configuring the neural network accelerator to accelerate the given layer of the multi-layer neural network comprises programming hardware to perform a function of a layer of the recurrent neural network.
[ (¶0323) In general this paragraph goes over the processor(s) in the various embodiments having instructions (Processor being and/or containing the accelerator) such that the processor can carry out the operations described within the reference's specification (operations including various neural network operations). Examiner notes that paragraphs (¶0139)-(¶0143) (exemplary and non-complete selection) give a basic overview of possible machine learning operations that are ordinary for the art. ]
[ (¶0242) "While CNN training is illustrated, the techniques described herein can also be applied to other types of neural networks, such as RNNs, LSTM, and GANs (generative adversarial networks)."
This citation teaches a possible embodiment where the neural network is a RNN.
Examiner notes that the reference states the techniques described herein also refers to the operations for the neural network and the accelerator. The reference details an embodiment of an RNN in figure 10.]

In regards to claim 17, Drumond teaches the following limitations seen below:
with the neural network accelerator converting an input tensor for a given layer of a multi-layer neural network from a normal-precision floating-point format to a block floating-point format; [(Pg. 5, Col. 1, Section 4.4) "The FP-to-BFP units convert tensors by detecting the maximum exponent of the input FP tensors and normalizing the mantissas accordingly" Here, Drumond teaches the FP (floating-point) to BFP (Block floating-point) conversion.] by selecting a bounding box including a first set of values expressed in the normal-precision floating-point format, the first set of values being selected based on dimensions of the input tensor or on a tensor operation to be performed (Mellempudi talks about the dynamic range being selected (by use of adjustment) for the tensor blocks. Dynamic range being equivalent to the bounding box with blocks in the citation being the size of variables sharing an exponent "Additionally, blocking can be performed for tensors data stored in low-precision or custom floating-point formats, with the data being shifted or the floating-point format being adjusted to properly capture the dynamic range of the data within each block.", 0235).
with the neural network accelerator performing a tensor operation using an operational parameter of the given layer of the neural network and the input tensor converted to the block floating-point format;
[ (Pg. 4, Col. 1, Section 4.2) "BFP should be used for the most demanding, dot product based, computations, with other operations being performed in floating-point-like representations"
Drumond teaches that all the operations that are based on dot product calculations will be done in the BFP format. ]

with the neural network accelerator converting a result of the tensor operation from the block floating-point format to the normal-precision floating-point format;
[ (Pg. 5, Col. 1, Section 4.4) "while the BFP-to-FP unit normalizes the mantissas according to the single given exponent."
This citation teaches converting the data from block floating-point back to floating-point. This conversion takes place after training and the other operations described in the paragraph leading up to the citation. In the alternative, the secondary reference, Mellempudi, also teaches this limitation as seen below. ]
[ Mellempudi: (¶0228) "Once a pre-determined number of iterations has been performed by the inner block 1942, the data can be converted from dynamic fixed-point to floating-point via a dynamic fixed-point to floating-point converter unit"
This citation from Mellempudi teaches the same unit that is referenced in the primary and more explicitly points out the claim limitation. (See motivation to combine below)]
and with the neural network accelerator using the converted result in the normal-precision floating-point format to update the operational parameter stored in the one or more computer-readable media. [ (Fig. 5)
This figure shows the process of the hybrid BFP-FP accelerator and shows that after the convolution operations are completed, the result goes into a module named "BFP to FP" which converts the format back to normal-precision floating-point format and then transfers the data to update the parameter as seen with the activation buffer. Examiner notes that the output being transferred is equivalent to an output tensor.]

What is not distinctly disclosed by Drumond and is instead taught by Mellempudi is seen below:
One or more computer-readable media storing computer-executable instructions, which when executed by a neural network accelerator, cause the neural network accelerator to perform operations, the operations comprising:

[ (¶0323) "For example, the machine-readable medium may include instructions which represent various logic within the processor. When read by a machine, the instructions may cause the machine to fabricate the logic to perform the techniques described herein."]
[ (¶0211) "Down conversion can be performed to scale higher-precision dynamic fixed-point data output from one layer of a neural network to a lower precision for input to a subsequent layer."
This citation teaches the particular layer within a neural network being able to be selected.]
Therefore, it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings for quantizing tensors in neural networks, as taught by Drumond, with the dynamic precision management system as taught by Mellempudi. The reason it would be obvious is one of ordinary skill in the art would recognize, prior to the effective filing date, that combining the two would provide a performance and efficiency gain when carrying out the operations [ Mellempudi (¶0228) ]. This would facilitate the recognized benefit of creating a quicker neural network overall and one that would utilize less resources or consume the same amount of resources for an increase in work done.

In regards to claim 18, The one or more computer-readable media of claim 17, is taught by Drumond/Mellempudi as seen in the rejection for claim 9 seen above. Drumond continues teaching the following limitations:
and converting the input tensor from the normal-precision floating-point format to the block floating-point format comprises selecting a plurality of elements within a column of the two-dimensional matrix to share a common exponent in the block floating-point format.
[ (Pg. 4, Col. 1, Paragraph 1) "In a DNN's fully-connected layers, this requirement translates to one exponent per activation tensor and weight matrix column in the forward pass"
This citation from Drumond teaches the dynamic range being selected with a column of the matrix. The operation in question is explained to be a dot product which is equivalent to a matrix-matrix multiplication and can be found in the paragraph preceding the citation within the reference. Examiner notes that the two-dimensional array is taught by Mellempudi as noted below and Drumond is relied upon to teach the actions performed on the 2D array explicitly. ]
What Drumond does not distinctly disclose and is instead taught by Mellempudi is seen below:
wherein the input tensor is a two-dimensional matrix, [ (¶0237)
In general, this paragraph teaches about the multi-dimensional tensors and points to a specific embodiment shown in Fig. 21B that highlights the 2D matrix. The paragraph specifically points to matrices A and B, each having rows and columns with various variable names and values, which is a 2D matrix. ]
Please refer to the motivation to combine in claim 17.

In regards to claim 19, The one or more computer-readable media of claim 17, is taught by Drumond/Mellempudi as seen in the rejection for claim 9 seen above. Drumond continues teaching the following limitations:

and converting the input tensor from the normal-precision floating-point format to the block floating-point format comprises selecting a plurality of elements within a row of the two-dimensional matrix to share a common exponent in the block floating-point format.
[ (Pg. 4, Col. 1, Paragraph 1) "and one exponent per activation gradient tensor and weight matrix row in the backward pass."
This citation from Drumond teaches the dynamic range being selected with a row of the matrix. The operation in question is explained to be a dot product which is equivalent to a matrix-matrix multiplication and can be found in the paragraph preceding the citation within the reference. Examiner notes that the two-dimensional array is taught by Mellempudi as noted below and Drumond is relied upon to teach the actions performed on the 2D array explicitly. ]
What Drumond does not distinctly disclose and is instead taught by Mellempudi is seen below:
wherein the input tensor is a two-dimensional matrix,
[ (¶0237) In general, this paragraph teaches about the multi-dimensional tensors and points to a specific embodiment shown in Fig. 21B that highlights the 2D matrix. The paragraph specifically points to matrices A and B, each having rows and columns with various variable names and values, which is a 2D matrix. ]
Please refer to the motivation to combine in claim 17.


In regards to claim 20, The one or more computer-readable media of claim 17, is taught by Drumond/Mellempudi as seen in the rejection for claim 9 seen above. Drumond continues teaching the following limitations:
wherein the tensor operation is performed during a back-propagation mode of the neural network.
[ (Pg. 5, Section 5.1) "We modified TensorFlow's (Abadi et al., 2016) matrix multiplications and convolution operations to reproduce the behaviour of BFP matrix multipliers in both the forward and backward passes" This shows the operation being performed on the tensors during the backward pass which is the backward propagation mode of the neural network.]

21. (New) The computing system of claim 1, wherein the tensor operation is performed during a forward-propagation phase of training the neural network (e.g., Drumond: “In a DNN's fully-connected layers, this requirement translates to one exponent per activation tensor and weight matrix column in the forward pass, and one exponent per activation gradient tensor and weight matrix row in the backward pass”, section 4.1; “Figure 3 illustrates the dataflow of the forward and backward passes of a fully connected layer. Weights are stored in BFP format throughout the training process, to take advantage of the compressed nature of BFP representations. This reduces memory bandwidth during both forward and backward passes, as well as the amount of communication during parameter updates”, section 4.2; “We modified TensorFlow's (Abadi et al., 2016) matrix multiplications and convolution operations to reproduce the behaviour of BFP matrix multipliers in both the forward and backward passes. We used TensorFlow's defun function to create a new op that processes the inputs and outputs of both the forward and backward passes of another tensorflow op, to simulate the usage of BFP.”, section 5.1).

22. (New) The computing system of claim 1, wherein the tensor operation is performed during a back-propagation phase of training the neural network (e.g., Drumond: “In a DNN's fully-connected layers, this requirement translates to one exponent per activation tensor and weight matrix column in the forward pass, and one exponent per activation gradient tensor and weight matrix row in the backward pass”, section 4.1; “Figure 3 illustrates the dataflow of the forward and backward passes of a fully connected layer. Weights are stored in BFP format throughout the training process, to take advantage of the compressed nature of BFP representations. This reduces memory bandwidth during both forward and backward passes, as well as the amount of communication during parameter updates”, section 4.2; “We modified TensorFlow's (Abadi et al., 2016) matrix multiplications and convolution operations to reproduce the behaviour of BFP matrix multipliers in both the forward and backward passes. We used TensorFlow's defun function to create a new op that processes the inputs and outputs of both the forward and backward passes of another tensorflow op, to simulate the usage of BFP.”, section 5.1).


23. (New) The computing system of claim 1, wherein the scalar operation is performed for a single layer of the neural network (“In a DNN's fully-connected layers, this requirement translates to one exponent per activation tensor and weight matrix column in the forward pass, and one exponent per activation gradient tensor and weight matrix row in the backward pass.”, section 4 and Fig. 3).

24. (New) The computing system of claim 1, wherein the scalar operation comprises adding a bias to the converted result (“Floating-point tensors converted to BFP lose precision when tensors have a wide range of values. The BFP implementation can minimize the loss of precision during conversions by choosing an appropriate exponent for the tensor and rounding numbers with bias free policy.”, section 4.3;
Gou also teaches using a bias, see Algorithms 1 and 2;
Mellempudi also teaches bias: “Additionally, a round bit 1740 and a bias bit 1742 are used to capture bits that would otherwise be lost during the right shift to generate the integer magnitude value. Based on the round bit 1740 and bias bit 1742 valued, a truth function or truth table, such as the truth table shown in Table 5, is applied based on the round and bias values. A rounded magnitude 1744 can then be generated that is rounded or not rounded based on the truth table.”, 0217; 0127-0128).

25. (New) The computing system of claim 1, wherein the scalar operation comprises applying an activation function to the converted result (“In a DNN's fully-connected layers, this requirement translates to one exponent per activation tensor and weight matrix column in the forward pass, and one exponent per activation gradient tensor and weight matrix row in the backward pass. Since storing the weight matrix in two views (with both per-row and per-column exponent) is not possible, we use a single exponent for the entire weight matrix. The requirements are similar for convolutions: one exponent per activation input and kernel matrix. When computing weight gradients, the dot products are computed across batches, and therefore, entire batches of activations and gradients must share exponents to take advantage of fixed-point dot products”, section 4.1; “BFP should be used for the most demanding, dot product based, computations, with other operations being performed in floating-point-like representations. This configuration enables the bulk of the DNN operations to be performed in efficient fixed-point logic, and facilitates the use of various activation functions or techniques like batch normalization without the restrictions imposed by BFP.”, section 4.2).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

US 2019/0332925 Modha teaches a scalar weight (“Two neurons are connected if the output of one is an input to the other. A weight is a scalar value encoding the strength of the connection between the output of one neuron and the input of another neuron.”, para. 0012).

US 20200193273 A1 - Method for operating neural network accelerator using selective precision of quantized tensors involving performing dot products which teaches mantissa and exponent quantization, dot product operations, floating point format, different precision levels and a secondary output.

US 20200104131 A1 - Method for operating a digital computer to reduce computational complexity associated with dot products which teaches quantization, floating point format, scalar operations and dot product operations.

US 20190205746 A1 - Machine learning sparse computation mechanism for arbitrary neural network arithmetic compute microarchitecture and sparsity for training mechanism which teaches quantization, floating point formats, mixed precision operations, convolutions, backward propagation, and dot product operations.

US 20190035132 A1 - Systems and methods for real time complex character animations and interactivity which teaches bounding boxes, neural networks and various computational operations.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAVID R VINCENT whose telephone number is (571)272-3080. The examiner can normally be reached ~Mon-Fri 12-8:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov, can be reached at (571)270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DAVID R VINCENT/Primary Examiner, Art Unit 2123