Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claims 1, 6, 7, and 12 are objected to because of the following informalities:
In claim 1 “the the format” should read “the format”.
In claim 6 “input tenor” should read “input tensor”.
In claim 7 “flowing point” should read “floating point”.
In claim 12 “one or more computer readable storage media” is not specified as transitory or non-transitory in the claim or the specification.
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 3, 4, 5, 8, and 11-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.



Regarding claim 4: “normal precision format operations” is indefinite.  In the standard case of casting floats to integers for quantization, the term “normal precision format operations” is almost meaningless, because integer math readily scales to a bit representation of any size, such that an 8-bit precision operation is functionally equivalent to a 16-bit operation in that both take the same number of clock cycles.  If the intention is to use floating-point arithmetic at a particular precision, that is not made clear in the claims or the specification.  In the interest of further examination, this is interpreted as any non-block-format operation.

Regarding Claim 5: “The scaled mantissa” and “the mantissa” lack antecedent basis.

Regarding Claim 5: “the mantissa” and “the scaled integer portion of the mantissa” are indefinite.  It is unclear whether the integer portion of the mantissa is scaled or the mantissa itself is scaled.  This confusion arises because the scaled integer portion of the mantissa is introduced immediately after the integer portion of the scaled mantissa is introduced.  If the integer portion of the mantissa is scaled, the claim would also be indefinite, because then the mantissa could either refer to the scaled or 

Regarding claim 13: storing quantized values in an unquantized format is contradictory and indefinite.  This could simply refer to casting a quantized type, such as an integer, into a floating-point format, or could refer to a method of exactly copying memory from a smaller-size pointer to a larger-size pointer.  The conversion is not explicit in the specification.  In the interest of further examination, storing quantized values in a normal-precision floating-point format is simply interpreted as storing values in IEEE 32-bit floating-point format.

Regarding claim 8:  “Wherein the input tensor represents a neural network” is indefinite. The specification only teaches an input tensor representing a portion of a trained neural network, and representing an entire neural network with a single tensor would not be readily understood by one of ordinary skill in the art.  The specification teaches using separate tensors for edge weights and activation weights.  In the interest of further examination the input tensor is being interpreted as representing a portion of a neural network.

Regarding claim 11: the difference between flattening the input tensor to obtain a two-dimensional matrix and converting the input tensor to obtain the matrix is indefinite.  Further, the method of obtaining these converted tensors is not disclosed in the specification.  How the input tensor could simultaneously be output as two matrices with different dimensions is also indefinite.  In the interest of further examination, these claims are interpreted as simply the input matrix being converted into a two-dimensional matrix.

Regarding claim 12: “instructions that cause the system to” is indefinite. With respect to the instant specification instructions are described as comprising further instructions.  This definition is circular and does not give structure to what is intended by instructions. This is further exacerbated by the lack of direction in the machine readable storage media for storing the instructions and whether or not it is intended to be transitory or non-transitory.  Further, with respect to the instant specification the instructions should not be limited to merely computer-executable instructions.  The use of “instructions” in the claims amounts to merely a black box in order to perform a recited function.  Aristocrat, 521 F.3d at 1334, 86 USPQ2d at 1239; Finisar, 523 F.3d at 1340-41, 86 USPQ2d at 1623. In addition, merely referencing a specialized computer (e.g., a "bank computer"), some undefined component of a computer system (e.g., "access control manager"), "logic," "code," or elements that are essentially a black box designed to perform the recited function, will not be sufficient because there must be some explanation of how the computer or the computer component performs the claimed function. Blackboard, Inc. v. Desire2Learn, Inc., 574 F.3d 1371, 1383-85, 91 USPQ2d 1481, 1491-93 (Fed. Cir. 2009); Net MoneyIN, Inc. v. VeriSign, Inc., 545 F.3d 1359, 1366-67, 88 USPQ2d 1751, 1756-57 (Fed. Cir. 2008); Rodriguez, 92 USPQ2d at 1405-06.  In the interest of further examination instructions are being interpreted as computer code stored in a non-transitory computer readable storage medium.

Regarding claim 15: “instructions that cause the system to” is indefinite. With respect to the instant specification instructions are described as comprising further instructions.  This definition is circular and does not give structure to what is intended by instructions. This is further exacerbated by the lack of direction in the machine readable storage media for storing the instructions and whether or not it is intended to be transitory or non-transitory.  Further, with respect to the instant specification the instructions should not be limited to merely computer-executable instructions.  The use of “instructions” in the claims amounts to merely a black box in order to perform a recited function.  Aristocrat, 521 F.3d at 1334, 86 USPQ2d at 1239; Finisar, 523 F.3d at 1340-41, 86 USPQ2d at 1623. In addition, merely referencing a specialized computer (e.g., a "bank computer"), some undefined component of a computer system (e.g., "access control manager"), "logic," "code," or elements that are essentially a black box designed to perform the recited function, will not be sufficient because there must be some explanation of how the computer or the computer component performs the claimed function. Blackboard, Inc. v. Desire2Learn, Inc., 574 F.3d 1371, 1383-85, 91 USPQ2d 1481, 1491-93 (Fed. Cir. 2009); Net MoneyIN, Inc. v. VeriSign, Inc., 545 F.3d 1359, 1366-67, 88 USPQ2d 1751, 1756-57 (Fed. Cir. 2008); Rodriguez, 92 USPQ2d at 1405-06.  In the interest of further examination instructions are being interpreted as computer code stored in a non-transitory computer readable storage medium.


Regarding claim 17:  “a parameter of the quantized precision format” is indefinite.  Among other possibilities a parameter could refer to the size of the mantissa of the format, or could refer to a digit, or could refer to the method in which the format is quantized, or could refer to the location where it’s stored.  A parameter of the quantized precision format is not further disclosed in the specification.  In the interest of further examination a parameter of the quantized precision format is interpreted as any aspect of a quantized precision format including the value.

Regarding claim 17: “instructions that cause the processor to” is indefinite.  With respect to the instant specification instructions are described as comprising further instructions.  This definition is circular and does not give structure to what is intended by instructions. This is further exacerbated by the lack of direction in the machine readable storage media for storing the instructions and whether or not it is intended to be transitory or non-transitory.  Further, with respect to the instant specification the instructions should not be limited to merely computer-executable instructions.  The use of “instructions” in the claims amounts to merely a black box in order to perform a recited function.  Aristocrat, 521 F.3d at 1334, 86 USPQ2d at 1239; Finisar, 523 F.3d at 1340-41, 86 USPQ2d at 1623. In addition, merely referencing a specialized computer (e.g., a "bank computer"), some undefined component of a computer system (e.g., "access control manager"), "logic," "code," or elements that are essentially a black box designed to perform the recited function, will not be sufficient because there must be some explanation of how the computer or the computer component performs the claimed function. Blackboard, Inc. v. Desire2Learn, Inc., 574 F.3d 1371, 1383-85, 91 USPQ2d 1481, 1491-93 (Fed. Cir. 2009); Net MoneyIN, Inc. v. VeriSign, Inc., 545 F.3d 1359, 1366-67, 88 USPQ2d 1751, 1756-57 (Fed. Cir. 2008); Rodriguez, 92 USPQ2d at 1405-06.  In the interest of further examination instructions are being interpreted as computer code stored in a non-transitory computer readable storage medium.

Claims 13-16 are rejected with respect to their dependence on claim 12.

Claims 18-20 are rejected with respect to their dependence on claim 17.

Claim Rejections - 35 USC § 101
101 Rejection
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 USC § 101 because the claimed invention is directed to non-statutory subject matter.

Regarding Claim 1:  Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 1 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 1 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: converting an input tensor of normal-precision floating-point numbers to a set of numbers represented in a quantized-precision format, at least one parameter of the format being selected to emulate a quantized hardware accelerator for processing a neural network comprising the input tensor (selecting type of data to be manipulated), by the processor, performing at least one operation with the set of quantized-precision format number, producing a modified set of quantized-precision format numbers (mathematical calculation), and converting the modified set of quantized-precision format numbers to an output tensor of numbers in the normal-precision floating-point format (selecting type of data to be manipulated).  Therefore, claim 1 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 1 recites the additional element “processor.”  However, this additional element is a computer component recited at a high level of generality, such that it amounts to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 1 is directed to a judicial exception.
Step 2B Analysis:  Claim 1 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 1 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
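For illustration only, the three limitations of claim 1 discussed above (converting to a quantized-precision format, operating on the quantized numbers, and converting back to the normal-precision format) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the applicant's disclosure: the function names `quantize` and `emulated_dot`, the choice of a dot product as the operation, and the use of a reduced mantissa width as the selected format parameter are all hypothetical.

```python
import math

def quantize(x, mantissa_bits):
    """Round x to a float whose mantissa keeps only `mantissa_bits` bits (assumed format)."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)              # x == m * 2**e, with 0.5 <= |m| < 1
    step = 2 ** mantissa_bits
    return math.ldexp(round(m * step) / step, e)

def emulated_dot(a, b, mantissa_bits=4):
    """Convert to the quantized format, operate, and return an ordinary float."""
    qa = [quantize(v, mantissa_bits) for v in a]   # normal precision -> quantized
    qb = [quantize(v, mantissa_bits) for v in b]
    acc = sum(x * y for x, y in zip(qa, qb))       # operation on the quantized values
    return float(acc)                              # quantized -> normal precision

exact = sum(x * y for x, y in zip([0.3, 1.7], [2.2, -0.9]))
approx = emulated_dot([0.3, 1.7], [2.2, -0.9])     # -0.828125 with 4 mantissa bits
```

In this sketch the quantized values are held in ordinary Python floats, so the only effect of the format parameter is the rounding error it introduces relative to `exact`.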

Regarding Claim 2:  Claim 2 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 2 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 2 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: the quantized-precision format is a block floating-point format where at least two elements of the set of quantized-precision format numbers share a common exponent (selecting type of data to be manipulated).  Therefore, claim 2 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 2 recites the additional element “processor.”  However, this additional element is a computer component recited at a high level of generality, such that it amounts to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 2 is directed to a judicial exception.
Step 2B Analysis:  Claim 2 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 2 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
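For illustration only, a block floating-point format in which elements share a common exponent, as recited in claim 2, can be pictured as one exponent stored alongside a list of integer mantissas. The sketch below is an assumption-laden example: the class name `BlockFP`, its field names, and the sample values are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class BlockFP:
    """A group of values stored as integer mantissas with one common exponent."""
    shared_exponent: int
    mantissas: List[int]

    def to_floats(self) -> List[float]:
        # Each element decodes as mantissa * 2**shared_exponent.
        return [m * 2.0 ** self.shared_exponent for m in self.mantissas]

# Three elements sharing the exponent -3 (each mantissa is scaled by 1/8).
block = BlockFP(shared_exponent=-3, mantissas=[5, -12, 7])
values = block.to_floats()   # [0.625, -1.5, 0.875]
```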

Regarding Claim 3:  Claim 3 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 3 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 3 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: the quantized-precision format is a block floating-point format where at least two but not all of two columns, two rows, two tiles, two columns of a tile, or two rows of a tile share a common exponent (selecting type of data to be manipulated).  Therefore, claim 3 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 3 recites the additional element “processor.”  However, this additional element is a computer component recited at a high level of generality, such that it amounts to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 3 is directed to a judicial exception.
Step 2B Analysis:  Claim 3 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 3 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 4:  Claim 4 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 4 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 4 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: generating the input tensor of normal-precision floating-point numbers by training a neural network, the set of normal-precision floating-point numbers representing at least one of edge weights or activation weights for the neural network (outputting data), the performing at least one operations comprises performing normal-precision format operations on the quantized-precision format numbers (mathematical calculation).  Therefore, claim 4 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 4 recites the additional element “processor.”  However, this additional element is a computer component recited at a high level of generality, such that it amounts to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 4 is directed to a judicial exception.
Step 2B Analysis:  Claim 4 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 4 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 5:  Claim 5 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 5 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 5 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: identifying a shared exponent for a selected at least two elements of the input tensor (evaluation and judgement), scaling values of the input tensor so that the integer portion of the scaled mantissas has a selected number of bits for the quantized precision format (mathematical calculation), removing fractional bits from the scaled integer portion of the mantissa (evaluation and judgement), rounding the mantissa to produce a quantized precision value (mathematical calculation).  Therefore, claim 5 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 5 recites the additional element “processor.”  However, this additional element is a computer component recited at a high level of generality, such that it amounts to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 5 is directed to a judicial exception.
Step 2B Analysis:  Claim 5 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 5 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
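For illustration only, the steps recited in claim 5 (identifying a shared exponent, scaling, removing fractional bits, and rounding) can be sketched as below. This is one possible reading under stated assumptions, not the applicant's method: the function name `to_block_fp`, the choice of the largest magnitude to set the shared exponent, the clamping step, and the merging of "removing fractional bits" and "rounding" into a single round-to-nearest operation are all hypothetical.

```python
import math

def to_block_fp(values, mantissa_bits=5):
    """Return (shared_exponent, integer mantissas) for a block of floats."""
    largest = max(abs(v) for v in values)
    if largest == 0.0:
        return 0, [0] * len(values)
    # Step 1: identify a shared exponent from the largest magnitude (assumption).
    _, exp = math.frexp(largest)          # largest == m * 2**exp, 0.5 <= m < 1
    # Step 2: scale so the integer part occupies at most mantissa_bits bits.
    shared_exponent = exp - mantissa_bits
    scaled = [v * 2.0 ** (-shared_exponent) for v in values]
    # Steps 3 and 4: drop the fractional bits by rounding to the nearest integer,
    # clamping in case rounding overflows the mantissa width.
    limit = 2 ** mantissa_bits - 1
    mantissas = [max(-limit, min(limit, round(s))) for s in scaled]
    return shared_exponent, mantissas

shared_exponent, mantissas = to_block_fp([1.5, -0.3, 0.04])
# shared_exponent == -4, mantissas == [24, -5, 1]
```

Decoding each mantissa as `mantissa * 2**shared_exponent` gives 1.5, -0.3125, and 0.0625, which shows how smaller elements lose precision when they share the exponent of the largest element.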

Regarding Claim 6:  Claim 6 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 6 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 6 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: reshaping the input tensor to allow the converting the input tensor to include independent operations on portions of the input tenor (selection of a data type).  Therefore, claim 6 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 6 recites the additional element “processor.”  However, this additional element is a computer component recited at a high level of generality, such that it amounts to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 6 is directed to a judicial exception.
Step 2B Analysis:  Claim 6 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 6 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 7:  Claim 7 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 7 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 7 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: the input tensor represents a portion of a previously-trained neural network (selection of a data type), the performing at least one operation comprises performing inference operations with the quantized neural network (evaluation), comparing output of the neural network based on the inference operations to output of the previously-trained neural network in the flowing point format (observation, evaluation, and judgement).  Therefore, claim 7 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 7 recites the additional element “processor.”  However, this additional element is a computer component recited at a high level of generality, such that it amounts to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 7 is directed to a judicial exception.
Step 2B Analysis:  Claim 7 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 7 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 8:  Claim 8 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 8 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 8 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: wherein the input tensor represents a neural network (selection of a data type), calculating loss of a neural network using the set of quantized-precision format numbers (mathematical calculation), updating the modified set of quantized-precision format numbers based on a gradient calculated based on the calculated loss of the neural network (outputting data).  Therefore, claim 8 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 8 recites the additional element “processor.”  However, this additional element is a computer component recited at a high level of generality, such that it amounts to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 8 is directed to a judicial exception.
Step 2B Analysis:  Claim 8 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 8 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
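For illustration only, the limitations of claim 8 (calculating a loss using quantized-precision numbers and updating those numbers based on a gradient of the loss) can be sketched with a single scalar weight. Everything here is a hypothetical stand-in rather than the applicant's training procedure: the fixed-step `quantize` function, the one-weight model `q * x`, the squared-error loss, and the learning rate are assumptions chosen to keep the arithmetic easy to follow.

```python
def quantize(x, step=0.125):
    """Snap x to the nearest multiple of `step` (a stand-in quantized format)."""
    return round(x / step) * step

def train_step(w, x, target, lr=0.5):
    """One gradient step on the loss (q*x - target)**2 using the quantized weight q."""
    q = quantize(w)
    loss = (q * x - target) ** 2
    grad = 2.0 * (q * x - target) * x      # derivative of the loss with respect to q
    return quantize(q - lr * grad), loss   # updated value stays on the quantized grid

new_w, loss = train_step(w=0.3, x=1.0, target=1.0)
# The weight 0.3 is quantized to 0.25, giving loss 0.5625 and updated weight 1.0.
```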

Regarding Claim 9:  Claim 9 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 9 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 9 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: the normal-precision floating-point format is one of the following: a 16-bit floating-point format, a 32-bit floating-point format, a 64-bit floating-point format, or an 80-bit floating-point format (selection of a data type).  Therefore, claim 9 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 9 recites the additional element “processor.”  However, this additional element is a computer component recited at a high level of generality, such that it amounts to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 9 is directed to a judicial exception.
Step 2B Analysis:  Claim 9 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 9 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 10:  Claim 10 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 10 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 10 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: input tensor has two dimensions X and N, the performing the at least one operation comprises applying a convolution kernel having three dimensions K, N, and P to the input tensor (mathematical calculation), flattening the convolution kernel into a two-dimensional matrix having two dimensions K×N and P (selection of a data type), converting the input tensor into a matrix having two dimensions K×N and X (mathematical calculation).  Therefore, claim 10 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 10 recites the additional element “processor.”  However, this additional element is a computer component recited at a high level of generality, such that it amounts to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 10 is directed to a judicial exception.
Step 2B Analysis:  Claim 10 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 10 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
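For illustration only, the limitations of claim 10 (an input tensor with dimensions X and N, a kernel with dimensions K, N, and P, flattening the kernel to a K×N by P matrix, and converting the input to a K×N by X matrix) correspond to the commonly used technique of expressing a convolution as a matrix multiplication. The sketch below is a hypothetical example and not the applicant's implementation: the function names, the trailing zero padding used to obtain exactly X output positions, and the absence of kernel flipping are assumptions.

```python
def flatten_kernel(kernel):
    """kernel[k][n][p] -> matrix with K*N rows and P columns."""
    return [row for k_slice in kernel for row in k_slice]

def im2col(inp, K):
    """inp[x][n] -> X rows, each holding the K*N input values under the kernel."""
    X, N = len(inp), len(inp[0])
    zero = [0.0] * N                       # assumed zero padding past the end
    rows = []
    for x in range(X):
        window = [inp[x + k] if x + k < X else zero for k in range(K)]
        rows.append([v for r in window for v in r])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def conv1d(inp, kernel):
    """Convolution expressed as (X by K*N) times (K*N by P)."""
    return matmul(im2col(inp, len(kernel)), flatten_kernel(kernel))

inp = [[1.0], [2.0], [3.0]]                # X = 3, N = 1
kernel = [[[1.0]], [[10.0]]]               # K = 2, N = 1, P = 1
out = conv1d(inp, kernel)                  # [[21.0], [32.0], [3.0]]
```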

Regarding Claim 11:  Claim 11 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 11 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 11 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: the input tensor has three dimensions X, Y, and N, the performing the at least one operation comprises applying a convolution kernel having four dimensions K, L, N, and P to the input tensor (mathematical calculation), flattening the input tensor into a two-dimensional matrix having two dimensions N×M and K (selection of a data type), converting the input tensor into a matrix having two dimensions K×L×N and M (mathematical calculation).  Therefore, claim 11 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 11 recites the additional element “processor”. However, this additional feature is a computer component recited at a high level of generality, such that it amounts to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 11 is directed to a judicial exception.
Step 2B Analysis:  Claim 11 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 11 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 12:  Claim 12 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 12 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 12 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation, is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: instructions that cause the system to evaluate the neural network having its node weights and edges stored in the memory as a normal-precision floating-point format (evaluation), instructions that cause the system to convert at least one of the tensors to values expressed in a quantized-precision format (selection of a data type), instructions that cause the system to perform at least one mathematical operation with the at least one of the quantized tensors, producing modified tensors (mathematical calculation), instructions that cause the system to convert the modified tensors to a normal-precision floating-point format (selection of a data type).  Therefore, claim 12 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 12 recites the additional elements “processors”, “memory”, and “computer readable storage medium”. However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 12 is directed to a judicial exception.
Step 2B Analysis:  Claim 12 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 12 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 13:  Claim 13 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 13 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 13 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation, is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: the mathematical operation is performed with the quantized values stored in a normal-precision floating-point format (mathematical calculation).  Therefore, claim 13 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 13 recites the additional elements “processors”, “memory”, and “computer readable storage medium”. However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 13 is directed to a judicial exception.
Step 2B Analysis:  Claim 13 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 13 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 14:  Claim 14 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 14 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 14 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: the mathematical operation is performed by emulating quantized operations with the quantized values (mathematical calculation).  Therefore, claim 14 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 14 recites the additional elements “processors”, “memory”, and “computer readable storage medium”. However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 14 is directed to a judicial exception.
Step 2B Analysis:  Claim 14 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 14 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
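For illustration only, one way to read “emulating quantized operations with the quantized values” (claim 14) together with values “stored in a normal-precision floating-point format” (claim 13) is sketched below. The function names, the 8-bit default, and the choice of a single shared exponent per list are assumptions made for this sketch and are not taken from the claims or the specification.

```python
import math

# Illustrative sketch only: values are rounded onto a quantized grid but kept
# as ordinary Python floats, so later arithmetic runs in floating point.

def fake_quantize(values, mantissa_bits=8):
    """Round each value to a grid set by one shared exponent (assumption)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return [0.0 for _ in values]
    _, exp = math.frexp(max_abs)                 # max_abs = m * 2**exp, 0.5 <= m < 1
    step = math.ldexp(1.0, exp - (mantissa_bits - 1))   # grid spacing
    limit = 2 ** (mantissa_bits - 1) - 1         # largest signed mantissa
    quantized = []
    for v in values:
        q = max(-limit, min(limit, round(v / step)))    # round half to even
        quantized.append(q * step)               # stored back as a float
    return quantized


def emulated_dot(a, b, mantissa_bits=8):
    """Dot product in floating point over values already on the quantized grid."""
    qa = fake_quantize(a, mantissa_bits)
    qb = fake_quantize(b, mantissa_bits)
    return sum(x * y for x, y in zip(qa, qb))
```

For example, `fake_quantize([1.0, 0.3])` leaves 1.0 unchanged and moves 0.3 to 0.296875, the nearest multiple of 1/64.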

Regarding Claim 15:  Claim 15 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 15 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 15 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: wherein the modified tensors represent a quantized neural network (selection of a data type), instructions that cause the system to perform quantized training of the quantized neural network to produce the modified tensors (observation, evaluation, and judgement).  Therefore, claim 15 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 15 recites the additional elements “processors”, “memory”, and “computer readable storage medium”. However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 15 is directed to a judicial exception.
Step 2B Analysis:  Claim 15 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 15 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 16:  Claim 16 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 16 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 16 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: instructions to program a neural network accelerator with quantized values determined based on executing the instructions to convert the tensors, to perform the at least one mathematical operation, and/or to convert the modified tensors to the normal-precision floating-point format (mathematical calculation).  Therefore, claim 16 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 16 recites the additional elements “processors”, “memory”, and “computer readable storage medium”. However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 16 is directed to a judicial exception.
Step 2B Analysis:  Claim 16 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 16 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 17:  Claim 17 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 17 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 17 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation, is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: instructions that cause the processor to specify at least one parameter of the quantized precision format (evaluation and judgement), instructions that cause the processor to convert a normal precision format tensor to the quantized precision format (selection of a data type), instructions that cause the processor to provide at least one tensor operation in the quantized precision format (mathematical calculation), instructions that cause the processor to convert an output of the at least one tensor operation to the normal precision format (outputting data).  Therefore, claim 17 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 17 recites the additional elements “processors” and “computer readable storage medium”. However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 17 is directed to a judicial exception.
Step 2B Analysis:  Claim 17 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 17 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 18:  Claim 18 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 18 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 18 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation, is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: wherein the at least one parameter is for a neural network represented in the quantized precision format, the at least one parameter including at least one of the following: a bit width of node weights, a bit width of activation values, a floating-point format for performing non-quantized operations, a tile size for a shared exponent, a parameter to share an exponent on a per-row basis, a parameter to share an exponent on a per-column basis, and/or a parameter specifying a method of common exponent selection (selection of a data type).  Therefore, claim 18 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 18 recites the additional elements “processors” and “computer readable storage medium”. However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 18 is directed to a judicial exception.
Step 2B Analysis:  Claim 18 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 18 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
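For illustration only, the parameter list recited in claim 18 can be pictured as a simple configuration record. The class name, field names, and default values below are hypothetical; the claim names only the kinds of parameters, not an interface.

```python
from dataclasses import dataclass

# Illustrative sketch only: every name and default here is an assumption.

@dataclass(frozen=True)
class QuantFormatParams:
    weight_bits: int = 8                     # bit width of node weights
    activation_bits: int = 8                 # bit width of activation values
    float_format: str = "float32"            # format for non-quantized operations
    tile_size: int = 16                      # elements per shared-exponent tile
    share_exponent_per_row: bool = False     # share an exponent per row
    share_exponent_per_column: bool = False  # share an exponent per column
    exponent_selection: str = "max"          # method of common exponent selection

    def __post_init__(self):
        # Reject values that cannot describe a usable format.
        if min(self.weight_bits, self.activation_bits, self.tile_size) < 1:
            raise ValueError("bit widths and tile size must be positive")
```

A caller would construct one record, for example `QuantFormatParams(weight_bits=4, share_exponent_per_row=True)`, and pass it to whatever conversion routine consumes it.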

Regarding Claim 19:  Claim 19 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 19 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 19 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: wherein the computer-readable instructions further comprise a parameter to specify flattening a tensor prior to quantization (selection of a data type). Therefore, claim 19 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 19 recites the additional elements “processors” and “computer readable storage medium”. However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 19 is directed to a judicial exception.
Step 2B Analysis:  Claim 19 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 19 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 20:  Claim 20 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 20 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 20 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: instructions to provide a class method defining a quantized matrix multiplication operation (mathematical calculation). Therefore, claim 20 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 20 recites the additional elements “processors” and “computer readable storage medium”. However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 20 is directed to a judicial exception.
Step 2B Analysis:  Claim 20 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 20 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Therefore, when considering the elements separately and in combination, they do not add significantly more to the inventive concept. Accordingly, claims 1-20 are rejected under 35 U.S.C. § 101.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 2, 4, 7, and 17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Drumond (“End-to-End DNN Training with Block Floating Point Arithmetic”, 2018).

Regarding claim 1, Drumond teaches A method comprising: by a processor, converting an input tensor of normal-precision floating-point numbers to a set of numbers represented in a quantized-precision format ([Sec. 4.4] FIG. 5 "The FP-to-BFP units convert tensors by detecting the maximum exponent of the input FP tensors and normalizing the mantissas accordingly") at least one parameter of the the format being selected to emulate a quantized hardware accelerator for processing a neural network comprising the input tensor; ([Sec. 5.1] "We modified TensorFlow’s (Abadi et al., 2016) matrix multiplications and convolution operations to reproduce the behaviour of BFP matrix multipliers in both the forward and backward passes.") by the processor, performing at least one operation with the set of quantized-precision format number, producing a modified set of quantized-precision format numbers; ([Sec. 5.1] "We train DNNs with the hybrid approach, using BFP" See also FIG. 6. Quantized-precision format numbers are passed directly to the layer operation where a modified set of quantized-precision format numbers are produced.) converting the modified set of quantized-precision format numbers to an output tensor of numbers in the normal-precision floating-point format. (FIG. 6 "BFP to FP").
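For illustration only, the conversion that Drumond describes (detecting the maximum exponent and normalizing the mantissas accordingly), followed by a fixed-point dot product and conversion back to floating point, can be sketched as below. This is a minimal sketch and not Drumond's implementation: the function names, the 8-bit mantissa width, and the round-to-nearest step are assumptions.

```python
import math

# Illustrative sketch only: block floating point with one exponent per tensor.

def fp_to_bfp(values, mantissa_bits=8):
    """Return (integer mantissas, shared exponent) for a list of floats."""
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return [0] * len(values), 0
    _, max_exp = math.frexp(max_abs)             # exponent of the largest value
    shared_exp = max_exp - (mantissa_bits - 1)   # scale so mantissas fit the width
    limit = 2 ** (mantissa_bits - 1) - 1
    mantissas = [max(-limit, min(limit, round(math.ldexp(v, -shared_exp))))
                 for v in values]
    return mantissas, shared_exp


def bfp_dot(a, b):
    """Dot product of two BFP tensors using integer arithmetic only."""
    (mant_a, exp_a), (mant_b, exp_b) = a, b
    acc = sum(x * y for x, y in zip(mant_a, mant_b))
    return acc, exp_a + exp_b                    # exponents add once per product


def bfp_to_fp(mantissa, exponent):
    """Convert an integer mantissa and exponent back to a float."""
    return math.ldexp(mantissa, exponent)
```

For inputs that are exactly representable at the chosen width, such as `[1.0, 0.5, -0.25]` and `[0.5, 0.25, 0.125]`, the round trip reproduces the floating-point dot product exactly; other inputs incur rounding error that depends on the mantissa width.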

 Regarding claim 2, Drumond teaches The method of claim 1, wherein: the quantized-precision format is a block floating-point format where at least two elements of the set of quantized-precision format numbers share a common exponent. (FIG. 1 "A n-element tensor in BFP and FP representations. BFP tensors save space and simplify computations by sharing exponents across tensors." The 10-bit exponent is shared for all n of the quantized precision elements in the tensor.).

Regarding claim 4, Drumond teaches The method of claim 1, further comprising: generating the input tensor of normal-precision floating-point numbers by training a neural network, the set of normal-precision floating-point numbers representing at least one of edge weights or activation weights for the neural network, wherein: ([Introduction] "We propose a hybrid BFP-FP framework where values float freely between dot product computations in BFP, resulting in better choice of exponents, and perform the rest of the training in traditional floating point arithmetic. "  [Sec. 2] "these networks require hardware that is orders of magnitude simpler for inference, they are trained in a similar way to traditional neural networks, with both activations and parameters represented with floating-point." Parameter interpreted as synonymous with weight.  Activation interpreted as synonymous with activation weight. ) the performing at least one operations comprises performing normal-precision format operations on the quantized-precision format numbers. ([Sec. 4.1] "BFP represents numbers with a mantissa and exponent, like floating-point, but exponents are shared across entire tensors, as shown in Figure 1, resulting in dot products that can be computed entirely in fixed-point logic." fixed point logic is interpreted as synonymous with normal-precision format operations.).

Regarding claim 7, Drumond teaches The method of claim 1, wherein: the input tensor represents a portion of a previously-trained neural network, ([Sec. 5.1] "In the backward pass, we perform the same pre-/post-processing of the inputs/outputs of the x derivative (Figure 6b), but handle the w derivative differently (Figure 6c) since it performs a reduction across entire batches. Thus, to emulate the behavior of an accelerator with native BFP, we convert inputs to BFP tensors that share exponents across the entire batch. Finally, we re-align weights and their gradients during updates to simulate the update of weights stored in BFP." See also FIG. 6, which shows training operations in TensorFlow.  The backward pass refers to backpropagation, which is an aspect of training.) the performing at least one operation comprises performing inference operations with the quantized neural network; and ([Sec. 4.4] "The activation/loss and the conversion units are capable of processing a single 75-wide tensor per cycle. Weights are kept in BFP throughout the entire training process and during inference.") the method further comprises: comparing output of the neural network based on the inference operations to output of the previously-trained neural network in the flowing point format. ([Sec. 6] "We now evaluate DNN training with the hybrid approach, that is referred to as BFP for simplicity, comparing it to FP32-based training." [Sec. 6.1] "Although 4-bit-mantissa BFP is outperformed by FP32, it still converges, uncovering a quality-performance trade-off: users that can tolerate models with lower quality can achieve better energy-efficiency during training and inference.").

Regarding claim 17, Drumond teaches One or more computer-readable storage media storing computer-readable instructions that when executed by a processor, cause the processor to perform a method of using an application programming interface for performing operations in a quantized precision format, the instructions comprising: ([Introduction] "In this paper, we make the observation that in DNNs, the majority of the arithmetic operations executed are performed as part of dot product calculations, and therefore, limiting dense fixed-point-like arithmetic to only replacing the dot products still allows us to accelerate the majority of the network. As such, the rest of the operations can be implemented in traditional floating-point logic with little performance degradation. We propose a hybrid BFP-FP framework where values float freely between dot product computations in BFP, resulting in better choice of exponents, and perform the rest of the training in traditional floating-point arithmetic.") instructions that cause the processor to specify at least one parameter of the quantized precision format; ([Introduction] "The use of BFP has allowed signal processors to convert common algorithms (e.g., FFT) to dense and parallel integer arithmetic hardware." BFP is interpreted as a quantized precision format. See also FIG. 6 "FP to BFP".) instructions that cause the processor to convert a normal precision format tensor to the quantized precision format; ([Sec. 4.4] FIG. 5 "The FP-to-BFP units convert tensors by detecting the maximum exponent of the input FP tensors and normalizing the mantissas accordingly") instructions that cause the processor to provide at least one tensor operation in the quantized precision format; and ([Sec. 4.1] "BFP represents numbers with a mantissa and exponent, like floating-point, but exponents are shared across entire tensors, as shown in Figure 1, resulting in dot products that can be computed entirely in fixed-point logic." A dot product is interpreted as a tensor operation.) instructions that cause the processor to convert an output of the at least one tensor operation to the normal precision format. (FIG. 6 "BFP to FP").

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 3, 5, 6, 8, 9, 12-16, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Drumond in view of Mellempudi (US 2018/0322607 A1).

 Regarding claim 3, Drumond teaches The method of claim 1. However, Drumond does not explicitly teach wherein: the quantized-precision format is a block floating-point format where at least two but not all of two columns, two rows, two tiles, two columns of a tile, or two rows of a tile share a common exponent.

Mellempudi, who teaches a related neural network accelerator, teaches wherein: the quantized-precision format is a block floating-point format where at least two but not all of two columns, two rows, two tiles, two columns of a tile, or two rows of a tile share a common exponent. ([¶0237] "The metadata can maintain the exponent scaling factor for each block, as well as the block size for each block. In one embodiment, partitioning with variable block sizes can be performed along all dimensions of the tensor. A data representation can be generated that has variable size blocks in all dimensions of the tensor" FIG. 21A shows that at least two rows, but not all of the rows, share a common exponent.)

Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the accelerators in Drumond and Mellempudi. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Mellempudi that performing training at least in part in a dynamic fixed-point precision yields a performance and efficiency gain and reduces overall training time ([¶0228] “additional operations for the tensor data can be performed in floating-point. Logic 1940 can be used to enable training for a dataset to be performed at least in part in a dynamic fixed-point precision, enabling a performance and efficiency gain during the earlier portion of training, reducing the overall training time for a neural network.”).

Regarding claim 5, Drumond teaches The method of claim 1.  However, Drumond does not explicitly teach wherein the converting the input tensor comprises: identifying a shared exponent for a selected at least two elements of the input tensor; scaling values of the input tensor so that the integer portion of the scaled mantissas has a selected number of bits for the quantized precision format; removing fractional bits from the scaled integer portion of the mantissa; and rounding the mantissa to produce a quantized precision value.

Mellempudi, who teaches a related neural network accelerator, teaches wherein the converting the input tensor comprises: identifying a shared exponent for a selected at least two elements of the input tensor; ([¶0191] "The dynamic fixed-point representation enables an 8×8 tensor 1415 of 32-bit floating-point values 1414 to be stored in an 8×8 tensor 1425 of 16-bit integer values, each associated with an 8-bit shared exponent.") scaling values of the input tensor so that the integer portion of the scaled mantissas has a selected number of bits for the quantized precision format; removing fractional bits from the scaled integer portion of the mantissa; and ([¶0189] "To convert from floating-point to traditional fixed-point, one can multiply the floating-point value by 2^fb, where fb is the number of fractional bits for the target fixed-point representation (e.g., 2^8, for 24.8 fixed-point) and round the result to the nearest integer." [¶0194] "To quantize an exemplary floating-point value 1512 (fx=3.4667968) having an exponent 1514A (Ex) and a mantissa 1514B (Mx), the mantissa 1514B is right shifted by the difference between the exponent 1514A and the absolute max value exponent to create a magnitude integer 1524 (Ix), with the implicit leading bit 1513 (LB) stored as an explicit bit 1523 within the magnitude integer 1524. The sign bit 1520 (Sx) is maintained for the quantized fixed-point value. The scaled exponent scale factor 1522 (SF) is computed as shown in equation (2) above." Right shifting is interpreted as synonymous with removing bits.) rounding the mantissa to produce a quantized precision value. ([¶0217] "FIG. 17 illustrates floating-point to dynamic fixed-point biased rounding, according to an embodiment. Quantization with biased rounding as illustrated in FIG. 17 is similar to quantization as illustrated in FIG. 15A. Additionally, a round bit 1740 and a bias bit 1742 are used to capture bits that would otherwise be lost during the right shift to generate the integer magnitude value." FIG. 17 explicitly shows the rounding occurring in the mantissa.)
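For illustration only, the four recited steps (shared exponent, scaling, removal of fractional bits, rounding) can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of either reference: the names quantize_block, dequantize_block, and mantissa_bits are hypothetical, the shared exponent is assumed to come from the largest magnitude in the block, ties are assumed to round up, and overflow is assumed to saturate. The first input value is the exemplary value quoted from ¶0194; the resulting integers are not asserted to match any figure in Mellempudi.

```python
import math

def quantize_block(values, mantissa_bits):
    """Quantize floats to signed integer mantissas that share one exponent."""
    # Step 1: shared exponent taken from the largest magnitude in the block.
    largest = max(abs(v) for v in values)
    shared_exp = math.frexp(largest)[1] if largest else 0
    # Step 2: scale so the integer portion fits in (mantissa_bits - 1) magnitude bits.
    scale = 2.0 ** (mantissa_bits - 1 - shared_exp)
    limit = 2 ** (mantissa_bits - 1) - 1
    mantissas = []
    for v in values:
        scaled = abs(v) * scale
        # Step 3: drop the fractional bits.
        integer_part = math.floor(scaled)
        # Step 4: round to nearest using the dropped fraction (ties round up).
        if scaled - integer_part >= 0.5:
            integer_part += 1
        integer_part = min(integer_part, limit)  # saturate on overflow
        mantissas.append(-integer_part if v < 0 else integer_part)
    return shared_exp, mantissas

def dequantize_block(shared_exp, mantissas, mantissa_bits):
    """Recover approximate floats from the shared exponent and integer mantissas."""
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    return [m * scale for m in mantissas]

exp, mantissas = quantize_block([3.4667968, -0.5, 0.1], mantissa_bits=8)
print(exp, mantissas)                          # prints 2 [111, -16, 3]
print(dequantize_block(exp, mantissas, 8))     # prints [3.46875, -0.5, 0.09375]
```

Under these assumptions every value in the block is recovered to within half of one mantissa step, here 2^-6, unless it saturates.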

Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the accelerators in Drumond and Mellempudi. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Mellempudi that performing training at least in part in a dynamic fixed-point precision yields a performance and efficiency gain and reduces overall training time ([¶0228] “additional operations for the tensor data can be performed in floating-point. Logic 1940 can be used to enable training for a dataset to be performed at least in part in a dynamic fixed-point precision, enabling a performance and efficiency gain during the earlier portion of training, reducing the overall training time for a neural network.”).

Regarding claim 6, Drumond teaches The method of claim 1.  However, Drumond does not explicitly teach further comprising: reshaping the input tensor to allow the converting the input tensor to include independent operations on portions of the input tenor.

Mellempudi, who teaches a related neural network accelerator, teaches The method of claim 1, further comprising: reshaping the input tensor to allow the converting the input tensor to include independent operations on portions of the input tenor. ([¶0217] "FIG. 17 illustrates floating-point to dynamic fixed-point biased rounding, according to an embodiment. Quantization with biased rounding as illustrated in FIG. 17 is similar to quantization as illustrated in FIG. 15A. Additionally, a round bit 1740 and a bias bit 1742 are used to capture bits that would otherwise be lost during the right shift to generate the integer magnitude value." Reshaping the input tensor is interpreted as quantizing the tensor values.  Converting is interpreted as synonymous with casting, and independent operations on portions of the input tensor are interpreted as synonymous with integer-based math or other bit operations such as shifting the fractional bit.)

Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the accelerators in Drumond and Mellempudi. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Mellempudi that performing training at least in part in a dynamic fixed-point precision yields a performance and efficiency gain and reduces overall training time ([¶0228] “additional operations for the tensor data can be performed in floating-point. Logic 1940 can be used to enable training for a dataset to be performed at least in part in a dynamic fixed-point precision, enabling a performance and efficiency gain during the earlier portion of training, reducing the overall training time for a neural network.”).

Regarding claim 8, Drumond teaches The method of claim 1, wherein the input tensor represents a neural network, and wherein the method further comprises: ([Sec. 5.1] "In the backward pass, we perform the same pre-/post-processing of the inputs/outputs of the x derivative (Figure 6b), but handle the w derivative differently (Figure 6c) since it performs a reduction across entire batches. Thus, to emulate the behavior of an accelerator with native BFP, we convert inputs to BFP tensors that share exponents across the entire batch. Finally, we re-align weights and their gradients during updates to simulate the update of weights stored in BFP." See also FIG. 6. FIG. 6 shows operations in training in TensorFlow.  Backward pass refers to backpropagation, which is an aspect of training.) calculating loss of a neural network using the set of quantized-precision format numbers; and ([Sec. 6.1] "Figure 7 shows the training loss of BFP with various rounding and exponent policies and mantissa bit-widths." BFP is interpreted as synonymous with quantized-precision format numbers.)

However, Drumond does not explicitly teach updating the modified set of quantized-precision format numbers based on a gradient calculated based on the calculated loss of the neural network.

Mellempudi, who teaches a related neural network accelerator, teaches updating the modified set of quantized-precision format numbers based on a gradient calculated based on the calculated loss of the neural network. ([¶0158] "The error values are then propagated backwards until each neuron has an associated error value which roughly represents its contribution to the original output. The network can then learn from those errors using an algorithm, such as the stochastic gradient descent algorithm, to update the weights of the neural network." Error and loss are interpreted as synonymous.).

Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the accelerators in Drumond and Mellempudi. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Mellempudi that performing training at least in part in a dynamic fixed-point precision yields a performance and efficiency gain and reduces overall training time ([¶0228] “additional operations for the tensor data can be performed in floating-point. Logic 1940 can be used to enable training for a dataset to be performed at least in part in a dynamic fixed-point precision, enabling a performance and efficiency gain during the earlier portion of training, reducing the overall training time for a neural network.”).

Regarding claim 9, Drumond teaches The method of claim 1, wherein: the normal-precision floating-point format is one of the following: a 16-bit floating-point format, a 32-bit floating-point format, a 64-bit floating-point format, or an 80-bit floating-point format. ([Sec. 5.2] "Evaluation Metric. To evaluate the impact of BFP, we tune the models using only FP32, and then train the same models from scratch with the same hyper-parameters in BFP. We report training loss and best top-1 error.").

Regarding claim 12, Drumond teaches A quantization-enabled system for modeling a neural network comprising tensors representing node weights and edges, the system comprising: ([Introduction] "In this paper, we make the observation that in DNNs, the majority of the arithmetic operations executed are performed as part of dot product calculations, and therefore, limiting dense fixed-point-like arithmetic to only replacing the dot products still allows us to accelerate the majority of the network. As such, the rest of the operations can be implemented in traditional floating-point logic with little performance degradation. We propose a hybrid BFP-FP framework where values float freely between dot product computations in BFP, resulting in better choice of exponents, and perform the rest of the training in traditional floating-point arithmetic.") instructions that cause the system to evaluate the neural network having its node weights and edges stored in the memory as a normal-precision floating-point format; ([Sec. 2] "these networks require hardware that is orders of magnitude simpler for inference, they are trained in a similar way to traditional neural networks, with both activations and parameters represented with floating-point." Node weights and edges are interpreted as synonymous with parameters.) instructions that cause the system to convert at least one of the tensors to values expressed in a quantized-precision format; ([Sec. 4.4] FIG. 5 "The FP-to-BFP units convert tensors by detecting the maximum exponent of the input FP tensors and normalizing the mantissas accordingly") instructions that cause the system to perform at least one mathematical operation with the at least one of the quantized tensors, producing modified tensors; and ([Sec. 5.1] "We train DNNs with the hybrid approach, using BFP" See also FIG. 6. Quantized-precision format numbers are passed directly to the layer operation where a modified set of quantized-precision format numbers are produced.) instructions that cause the system to convert the modified tensors to a normal-precision floating-point format. (FIG. 6 "BFP to FP")
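For illustration only, the convert, operate, convert-back sequence mapped above can be sketched for a single dot product. This is a minimal sketch under stated assumptions and not Drumond's implementation: the names fp_to_bfp, bfp_dot, and bfp_to_fp are hypothetical, one exponent is assumed to be shared across each whole tensor, and an 8-bit signed mantissa is an assumed default.

```python
import math

def fp_to_bfp(values, mantissa_bits=8):
    """Detect the maximum exponent and normalize all mantissas to it."""
    largest = max(abs(v) for v in values)
    exp = math.frexp(largest)[1] if largest else 0
    scale = 2.0 ** (mantissa_bits - 1 - exp)
    limit = 2 ** (mantissa_bits - 1) - 1
    # Round to nearest integer and saturate to the signed mantissa range.
    return exp, [max(-limit, min(limit, round(v * scale))) for v in values]

def bfp_dot(a, b, mantissa_bits=8):
    """Dot product on integer mantissas; exponents are combined once per tensor pair."""
    (exp_a, man_a), (exp_b, man_b) = a, b
    acc = sum(x * y for x, y in zip(man_a, man_b))  # integer arithmetic only
    return acc, exp_a + exp_b - 2 * (mantissa_bits - 1)

def bfp_to_fp(acc, exponent):
    """Convert the integer accumulator back to a normal-precision float."""
    return math.ldexp(float(acc), exponent)

weights = [0.5, -0.25, 1.0]
activations = [2.0, 4.0, -1.0]
acc, exponent = bfp_dot(fp_to_bfp(weights), fp_to_bfp(activations))
print(bfp_to_fp(acc, exponent))  # prints -1.0
```

The inputs here are exactly representable, so the result equals the floating-point dot product; in general the result differs by the quantization error of the mantissas.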

However, Drumond does not explicitly teach memory; one or more processors coupled to the memory; one or more computer readable storage media storing computer-readable instructions that when executed by the at least one processor, cause the system to perform a method of evaluating the neural network.

Mellempudi, who teaches a related neural network accelerator, teaches memory; (FIG. 1) one or more processors coupled to the memory; (FIG. 1) one or more computer readable storage media storing computer-readable instructions that when executed by the at least one processor, cause the system to perform a method of evaluating the neural network ([¶0219] "The register file 1806 can store general-purpose and architectural registers used by the SIMT unit 1809. The thread manager 1808 can distribute and re-distribute threads among the compute units of the SIMT unit 1809. In one embodiment, the SIMT unit 1809 is configured to execute a single instruction as multiple threads, with each thread of the instruction executed by a separate compute unit.").

Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the accelerators in Drumond and Mellempudi. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Mellempudi that performing training at least in part in a dynamic fixed-point precision yields a performance and efficiency gain and reduces overall training time ([¶0228] “additional operations for the tensor data can be performed in floating-point. Logic 1940 can be used to enable training for a dataset to be performed at least in part in a dynamic fixed-point precision, enabling a performance and efficiency gain during the earlier portion of training, reducing the overall training time for a neural network.”).

Regarding claim 13, the combination of Drumond and Mellempudi teaches The system of claim 12, wherein: the mathematical operation is performed with the quantized values stored in a normal-precision floating-point format. (Drumond [Sec. 4.1] "BFP represents numbers with a mantissa and exponent, like floating-point, but exponents are shared across entire tensors, as shown in Figure 1, resulting in dot products that can be computed entirely in fixed-point logic." The quantized values in Drumond are taught as being stored as standard floating point format with the exponent shared across a tensor.).

Regarding claim 14, the combination of Drumond and Mellempudi teaches The system of claim 12, wherein: the mathematical operation is performed by emulating quantized operations with the quantized values. (Drumond [Sec. 5.1] "In the backward pass, we perform the same pre-/post-processing of the inputs/outputs of the x derivative (Figure 6b), but handle the w derivative differently (Figure 6c) since it performs a reduction across entire batches. Thus, to emulate the behavior of an accelerator with native BFP, we convert inputs to BFP tensors that share exponents across the entire batch. Finally, we re-align weights and their gradients during updates to simulate the update of weights stored in BFP." See also FIG. 6. FIG. 6 shows operations in training in TensorFlow.  Backward pass refers to backpropagation, which is an aspect of training.).

Regarding claim 15, the combination of Drumond and Mellempudi teaches The system of claim 12, wherein the modified tensors represent a quantized neural network, and wherein the instructions to perform the at least one mathematical operation further comprise: instructions that cause the system to perform quantized training of the quantized neural network to produce the modified tensors. (Drumond [Sec. 5.1] "We train DNNs with the hybrid approach, using BFP" See also FIG. 6. Quantized-precision format numbers are passed directly to the layer operation where a modified set of quantized-precision format numbers are produced.).

 Regarding claim 16, the combination of Drumond and Mellempudi teaches  The system of claim 12, wherein the instructions further comprise: instructions to program a neural network accelerator with quantized values determined based on executing the instructions to convert the tensors, to perform the at least one mathematical operation, and/or to convert the modified tensors to the normal-precision floating-point format. (Drumond FIG. 6 "BFP to FP").

Regarding claim 18, Drumond teaches The computer-readable storage media of claim 17.  However, Drumond does not explicitly teach wherein the at least one parameter is for a neural network represented in the quantized precision format, the at least one parameter including at least one of the following: a bit width of node weights, a bit width of activation values, a floating-point format for performing non-quantized operations, a tile size for a shared exponent, a parameter to share an exponent on a per-row basis, a parameter to share an exponent on a per-column basis, and/or a parameter specifying a method of common exponent selection.

Mellempudi, who teaches a related neural network accelerator, teaches The computer-readable storage media of claim 17, wherein the at least one parameter is for a neural network represented in the quantized precision format, the at least one parameter including at least one of the following: a bit width of node weights, a bit width of activation values, a floating-point format for performing non-quantized operations, a tile size for a shared exponent, a parameter to share an exponent on a per-row basis, a parameter to share an exponent on a per-column basis, and/or a parameter specifying a method of common exponent selection. (FIG. 21A [¶0235] "In addition to shared exponent or scaling factor data, the metadata can also contain other terms used for data conversions, such as floating-point to fixed point or fixed-point to floating-point conversions" Metadata is interpreted as synonymous with parameter.).

Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the accelerators in Drumond and Mellempudi. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Mellempudi that performing training at least in part in a dynamic fixed-point precision yields a performance and efficiency gain and reduces overall training time ([¶0228] “additional operations for the tensor data can be performed in floating-point. Logic 1940 can be used to enable training for a dataset to be performed at least in part in a dynamic fixed-point precision, enabling a performance and efficiency gain during the earlier portion of training, reducing the overall training time for a neural network.”).

 Regarding claim 19, Drumond teaches The computer-readable storage media of claim 17.  However, Drumond does not explicitly teach wherein the computer-readable instructions further comprise a parameter to specify flattening a tensor prior to quantization.

Mellempudi, who teaches a related neural network accelerator, teaches The computer-readable storage media of claim 17, wherein the computer-readable instructions further comprise a parameter to specify flattening a tensor prior to quantization. ([¶0171] "The untrained neural network 1106 can learn groupings within the unlabeled input and can determine how individual inputs are related to the overall dataset. Unsupervised training can be used to generate a self-organizing map, which is a type of trained neural network 1107 capable of performing operations useful in reducing the dimensionality of data." [¶0231] "In a training scenario, some tensors can be blocked" Mellempudi explicitly teaches reducing the dimensionality of the input data prior to training and furthermore teaches quantizing the input tensor for training.  While Mellempudi does not explicitly teach flattening the tensor prior to quantization, this simply amounts to a change in sequence, which is prima facie obvious under In re Burhans, 154 F.2d 690, 69 USPQ 330 (CCPA 1946) (selection of any order of performing process steps is prima facie obvious in the absence of new or unexpected results).  Quantizing the tensor prior to flattening would be expected to yield similar or even more performant results; therefore, this limitation is considered obvious and to yield expected results.).

Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the accelerators in Drumond and Mellempudi. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Mellempudi that performing training at least in part in a dynamic fixed-point precision yields a performance and efficiency gain and reduces overall training time ([¶0228] “additional operations for the tensor data can be performed in floating-point. Logic 1940 can be used to enable training for a dataset to be performed at least in part in a dynamic fixed-point precision, enabling a performance and efficiency gain during the earlier portion of training, reducing the overall training time for a neural network.”).

Regarding claim 20, Drumond teaches The computer-readable storage media of claim 17.  However, Drumond does not explicitly teach wherein the computer-readable instructions further comprise: instructions to provide a class method defining a quantized matrix multiplication operation.

Mellempudi, who teaches a related neural network accelerator, teaches The computer-readable storage media of claim 17, wherein the computer-readable instructions further comprise: instructions to provide a class method defining a quantized matrix multiplication operation. ([¶0234] "One solution is to split the tensor into smaller blocks with independent shared exponent while maintaining the integer data at lower precision. This technique is useful for expressing large matrix multiplication and convolution operations.").

Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the accelerators in Drumond and Mellempudi. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Mellempudi that performing training at least in part in a dynamic fixed-point precision yields a performance and efficiency gain and reduces overall training time ([¶0228] “additional operations for the tensor data can be performed in floating-point. Logic 1940 can be used to enable training for a dataset to be performed at least in part in a dynamic fixed-point precision, enabling a performance and efficiency gain during the earlier portion of training, reducing the overall training time for a neural network.”).

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Drumond in view of Yang (US 2018/0157940 A1).

Regarding claim 10, Drumond teaches the method of claim 1.  However, Drumond does not explicitly teach wherein: the input tensor has two dimensions X and N, the performing the at least one operation comprises applying a convolution kernel having three dimensions K, N, and P to the input tensor, the method further comprising: flattening the convolution kernel into a two-dimensional matrix having two dimensions K×N and P; and converting the input tensor into a matrix having two dimensions K×N and X.

Yang, who teaches a related art of using convolutional neural networks, teaches wherein: the input tensor has two dimensions X and N, ([¶0064] "An input image generally contains a large amount of imagery data. In order to perform image processing operations. The input image 1100 is partitioned into M-pixel by M-pixel blocks 1111-1112 as shown in FIG. 11A." [¶0065] "In another embodiment, the input image is a rectangular shape with dimensions of (2I×M)-pixel and (2J×M)-pixel, where I and J are positive integers." 2I*M is interpreted as synonymous with X, and 2J*M is interpreted as synonymous with N.) the performing the at least one operation comprises applying a convolution kernel having three dimensions K, N, and P to the input tensor, the method further comprising: flattening the convolution kernel into a two-dimensional matrix having two dimensions K×N and P; and ([¶0060] "After 3×3 convolutions for each group of imagery data are performed for predefined number of filter coefficients, convolution operations results Out(m, n) are sent to the first set of memory buffers via another multiplex" [¶0047] "m, n are corresponding row and column numbers for identifying which imagery data (pixel) within the (M+2)-pixel by (M+2)-pixel region the convolution is performed;" 3x3 convolution is interpreted as referring to a three-dimensional convolution kernel.  Outputting as two parameters is interpreted as flattening the results. See also formula 1, ¶0046.) converting the input tensor into a matrix having two dimensions K×N and X. ([¶0062] "If a 2×2 pooling operation is required, the M×M output results are reduced to (M/2)×(M/2)" See also FIG. 10).
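For illustration only, the flattening recited in claim 10 can be sketched as a convolution expressed as a matrix product. This is a minimal sketch of the claim language under stated assumptions, not a method taken from Yang or Drumond: the function names are hypothetical, the kernel is assumed to slide along the X dimension with stride 1 and no padding (so the unrolled input has X-K+1 rows rather than X), and nested lists stand in for tensors.

```python
def flatten_kernel(kernel):
    """kernel[k][n][p] with shape (K, N, P) -> matrix with shape (K*N, P)."""
    return [list(kernel[k][n]) for k in range(len(kernel)) for n in range(len(kernel[0]))]

def unroll_input(inp, K):
    """inp[x][n] with shape (X, N) -> one row of length K*N per output position."""
    N = len(inp[0])
    return [[inp[x + k][n] for k in range(K) for n in range(N)]
            for x in range(len(inp) - K + 1)]

def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a[i][j] * b[j][c] for j in range(len(b))) for c in range(len(b[0]))]
            for i in range(len(a))]

def conv_direct(inp, kernel):
    """Reference convolution computed directly from the (K, N, P) kernel."""
    K, N, P = len(kernel), len(kernel[0]), len(kernel[0][0])
    return [[sum(inp[x + k][n] * kernel[k][n][p] for k in range(K) for n in range(N))
             for p in range(P)] for x in range(len(inp) - K + 1)]

inp = [[1, 2], [3, 4], [5, 6], [7, 8]]             # X = 4, N = 2
kernel = [[[1, 0, 1], [0, 1, 1]],                  # K = 2, N = 2, P = 3
          [[1, 0, -1], [0, 1, -1]]]
out = matmul(unroll_input(inp, 2), flatten_kernel(kernel))
print(out)  # prints [[4, 6, -4], [8, 10, -4], [12, 14, -4]]
```

The matrix product and the direct convolution give the same result, which is why the two-dimensional form can be handed to a matrix-multiplication unit.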

Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the CNN taught in Yang with the accelerator in Drumond. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Yang that a large number of parameters and high computation complexity lower runtime performance ([¶0004] “The FCN layer is therefore required to project the high dimensional vector to a relatively low dimensional space, e.g., 4096, 1024, or smaller number (e.g., 128). Disadvantage of such a feature extraction is that the huge number of parameters (e.g. more than 100 million (i.e., 25088×4096) for the FCN layer connecting to convolutional layer). As a result, runtime performance is low due to such a high computation complexity.”).

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Drumond in view of Jin (US 2018/0089562 A1).

Regarding claim 11, Drumond teaches The method of claim 1.  However, Drumond does not explicitly teach wherein: the input tensor has three dimensions X, Y, and N, the performing the at least one operation comprises applying a convolution kernel having four dimensions K, L, N, and P to the input tensor, the method further comprising: converting the input tensor into a matrix having two dimensions K×L×N and M; and flattening the input tensor into a two-dimensional matrix having two dimensions N×M and K.

Jin, who teaches a related art of using convolutional neural networks, teaches wherein: the input tensor has three dimensions X, Y, and N, the performing the at least one operation comprises ([¶0046] "Referring to FIG. 2A, input data 200 on the leftmost side may include a plurality of channels. FIG. 2 shows an example in which the input data 200 includes three channels. The input data 200 may be expressed in width, height and depth.") applying a convolution kernel having four dimensions K, L, N, and P to the input tensor, the method further comprising: (FIG. 2B shows that the convolution operation is dependent on four dimensional variables a, b, c, and d. [¶0047] "A convolution layer may perform a convolution operation on two weight sets 220 having a size of C×C×D and each of the input data...A convolution operation may be identically performed on all of depths Di (d=0,1,2)") converting the input tensor into a matrix having two dimensions K×L×N and M. (FIG. 2B 250 shows a pooling layer outputting a two-dimensional representation of the input. FIG. 2B explicitly shows that the output of one pooling layer can be used for a subsequent pooling layer; see also FIG. 1.) flattening the input tensor into a two-dimensional matrix having two dimensions N×M and K; and (FIG. 2B 250 shows a pooling layer outputting a two-dimensional representation of the input. FIG. 2B explicitly shows that the output of one pooling layer can be used for a subsequent pooling layer; see also FIG. 1).

Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the CNN from Jin with the accelerator in Drumond. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Jin that the CNN architecture can reduce operation latency ([¶0040] “In accordance with an embodiment, the CNN architecture can reduce an operation latency using a drop-out method, e.g., a drop-out method, that is, a regularization method, for improving performance of an algorithm in a fully connected layer.”).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Dipankar (“MIXED PRECISION TRAINING OF CONVOLUTIONAL NEURAL NETWORKS USING INTEGER OPERATIONS”, 2018).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720.  The examiner can normally be reached on M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571)270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/SB/Examiner, Art Unit 2124
/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124