Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
The instant application having Application No. 16/221187 filed on 12/14/2018 is presented for examination by the examiner.

Examiner Notes
Examiner cites particular columns and line numbers in the references as applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the applicant fully consider the references in entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner.

Drawings
The applicant’s drawings submitted are acceptable for examination purposes.

Information Disclosure Statement
As required by M.P.E.P. 609, the applicant’s submissions of the Information Disclosure Statement dated 04/24/2020, 05/02/2019 and 04/26/2019 are acknowledged by the examiner and the cited references have been considered in the examination of the claims now pending.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claim 1 is directed to an abstract idea without significantly more. The independent claim recites converting input vectors from floating-point numbers to a common exponent and mantissa values; decomposing the mantissa values into quantized values and residual values; for each quantized value and residual value: performing a dot product using the quantized value and weights to provide a first output; selectively choosing whether to perform a dot product on the residual value depending on a desired level of precision needed; if a lower level of precision is needed, using the first output as a node output; if a higher level of precision is needed, performing a dot product using the residual value and weights to obtain a second output; combining the first output to the second output to generate the node output.
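For readers following the steps recited above, they can be sketched in a few lines of Python. This is an illustrative sketch only, not the applicant's implementation: the function names (to_shared_exponent, decompose, node_output), the 8-bit mantissa width, the 4-bit residual width, and the use of truncation are all assumptions chosen for the example.

```python
import math

def to_shared_exponent(values, mantissa_bits=8):
    """Convert floats to one shared exponent and signed integer mantissas."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return 0, [0] * len(values)
    # frexp gives max_abs = m * 2**exp with 0.5 <= m < 1
    _, exp = math.frexp(max_abs)
    scale = 2.0 ** (mantissa_bits - exp)
    # Truncation toward zero keeps every |mantissa| below 2**mantissa_bits
    return exp, [int(v * scale) for v in values]

def decompose(mantissas, residual_bits=4):
    """Split each mantissa m into (q, r) with m == q * 2**residual_bits + r."""
    pairs = [divmod(m, 2 ** residual_bits) for m in mantissas]
    return [q for q, _ in pairs], [r for _, r in pairs]

def node_output(values, weights, high_precision,
                mantissa_bits=8, residual_bits=4):
    exp, mantissas = to_shared_exponent(values, mantissa_bits)
    quantized, residual = decompose(mantissas, residual_bits)
    # First output: dot product on the quantized (upper) part only
    total = sum(q * w for q, w in zip(quantized, weights)) * 2 ** residual_bits
    if high_precision:
        # Second output: dot product on the residual (lower) part, then combine
        total += sum(r * w for r, w in zip(residual, weights))
    return total * 2.0 ** (exp - mantissa_bits)
```

With inputs [0.7, -0.3, 0.2] and weights [1.0, 2.0, -1.0], the exact dot product is -0.1. The sketch returns -0.125 when only the quantized part is used and -0.09375 when the residual part is added, which shows how the second dot product recovers precision lost in the first.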
The limitations, as drafted, describe a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than reciting “in a neural network executing on hardware,” nothing in the claim elements precludes the steps from practically being performed in the mind. For example, but for the noted language, all of the limitations including “converting input vectors from floating-point numbers to a common exponent and mantissa values; decomposing the mantissa values into quantized values and residual values; for each quantized value and residual value: performing a dot product using the quantized value and weights to provide a first output; selectively choosing whether to perform a dot product on the residual value depending on a desired level of precision needed; if a lower level of precision is needed, using the first output as a node output; if a higher level of precision is needed, performing a dot product using the residual value and weights to obtain a second output; combining the first output to the second output to generate the node output.” are pre/post-activity solutions for converting/decomposing/performing/choosing/using/performing without significantly more. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the hardware executing a neural network is recited at a high level of generality (i.e., as generic hardware) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using hardware to execute a neural network amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claims are not patent eligible. Additionally, the dependent claims comprise insignificant extra-solution activity, and thus do not add limitations that would render the claims eligible. Dependent claims 2-7 and 9 are rejected on the same basis as independent claim 1.
Claim 8 is rejected on the same basis as independent claim 1. Furthermore, this judicial exception is not integrated into a practical application. In particular, the limitation “the selectively choosing is controlled by a user through an Application Programming Interface (API) to the neural network” amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea.
Claim 10 is directed to an abstract idea without significantly more.  The independent claim recites hardware designed to perform a dot product on tensors of a first size, which is a portion of normal-precision floating-point number of a second size; hardware for converting the normal-precision floating point number so as to reduce the normal-precision floating-point number to a quantized tensor of the first size and a residual tensor of the first size; and hardware for selectively determining whether to perform the dot product on the quantized tensor only or the quantized tensor and the residual tensor based on a desired level of precision.
The limitations, as drafted, describe a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than reciting “hardware designed to perform …”, “hardware for converting …”, and “hardware for selectively determining …”, nothing in the claim elements precludes the steps from practically being performed in the mind. For example, but for the noted language, all of the limitations including “hardware designed to perform a dot product on tensors of a first size, which is a portion of normal-precision floating-point number of a second size; hardware for converting the normal-precision floating point number so as to reduce the normal-precision floating-point number to a quantized tensor of the first size and a residual tensor of the first size; and hardware for selectively determining whether to perform the dot product on the quantized tensor only or the quantized tensor and the residual tensor based on a desired level of precision.” are pre/post-activity solutions for performing/converting/determining without significantly more. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the hardware executing a neural network is recited at a high level of generality (i.e., as generic hardware) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using hardware to execute a neural network amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claims are not patent eligible. Additionally, the dependent claims comprise insignificant extra-solution activity, and thus do not add limitations that would render the claims eligible. Dependent claims 11-16 are rejected on the same basis as independent claim 10.
Claim 17 is directed to an abstract idea without significantly more. The independent claim recites instructions that cause the system to evaluate the neural network having its node weights and edges stored in the memory as a normal-precision floating-point format having a first size; instructions that cause the system to convert the tensors to primary values expressed in a quantized-precision format, of a second size, smaller than the first size, and residual values in the quantized-precision format of the second size; instructions that cause the system to perform at least one mathematical operation on the primary values in the quantized-precision format, producing modified primary tensors; instructions that selectively perform the at least one mathematical operation on the residual values in the quantized-precision format, producing modified residual tensors; and instructions that add the modified primary tensors and the modified residual tensors to produce output tensors.
The limitations, as drafted, describe a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than reciting “memory”, “one or more processor coupled to the memory” and “one or more computer readable storage media storing computer-readable instructions that when executed by the at least one processor”, nothing in the claim elements precludes the steps from practically being performed in the mind. For example, but for the noted language, all of the limitations including “instructions that cause the system to evaluate the neural network having its node weights and edges stored in the memory as a normal-precision floating-point format having a first size; instructions that cause the system to convert the tensors to primary values expressed in a quantized-precision format, of a second size, smaller than the first size, and residual values in the quantized-precision format of the second size; instructions that cause the system to perform at least one mathematical operation on the primary values in the quantized-precision format, producing modified primary tensors; instructions that selectively perform the at least one mathematical operation on the residual values in the quantized-precision format, producing modified residual tensors; and instructions that add the modified primary tensors and the modified residual tensors to produce output tensors.” are pre/post-activity solutions for evaluate/convert/perform at least one mathematical operation without significantly more. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the hardware executing a neural network is recited at a high level of generality (i.e., as generic hardware such as a memory and a processor) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using hardware to execute a neural network amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claims are not patent eligible. Additionally, the dependent claims comprise insignificant extra-solution activity, and thus do not add limitations that would render the claims eligible. Dependent claims 18-20 are rejected on the same basis as independent claim 17.


Allowable Subject Matter
Claims 1-20 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 101 set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.
Prior art:
US 2019/0324748 to Fowers
[0041] FIG. 3 is a block diagram of a vector register file (VRF) 300 in accordance with one example. VRF 300 may be used to implement at least a portion of VRF 112 of FIG. 1. VRF 300 may also be used to implement at least a portion of VRF 210 of FIG. 2. In this example, the read/write data interfaces to the VRF 300 may consist of a set of ITensorRead and ITensorWrite interfaces, each of which may read/write LANES of data elements per cycle. Each interface may follow a tensor protocol and may be independently back-pressured. VRF 300 read/write interfaces may include input buffers 302, 304, and 306 and output buffers 352, 354, and 356. In response to a write operation, the input buffers may be used to receive tensors (e.g., input vector data corresponding to a layer of a neural network). In this example, VRF 300 may process vector data in a block-floating point (BFP) format. The exponent (e.g., a shared exponent) may be stored in exponent registers (e.g., EXP REGs 314, 316, and 318) and the mantissa may be stored in shift registers (e.g., SHIFT REGs 308, 310, and 312). The outputs of the exponent registers may be coupled to a multiplexer 322 and the outputs of the shift registers may be coupled to a multiplexer 320, which may be written to a memory (e.g., a block RAM (BRAM)). Thus, the exponent may be written to EXP BRAM 352 and the mantissa may be written to MANTISAA BRAM 330. In response to a read, data from the memory may be output to output interfaces. Thus, the exponents may be output to exponent registers (e.g., EXP REGs 344, 346, and 348) and the mantissa may be output to shift registers (e.g., SHIFT REGs 338, 340, and 342). From these registers the BFP vector data may be provided to output buffers (e.g., buffers 352, 354, and 356). Control logic 370 may control the movement of the vector data through the various components of VRF 300, including for example, multiplexers 320 and 322. 
Control logic 370 may include address decoding (e.g., ADDRESS DECODE 372), command decoding (e.g., via COMMAND DECODE 374), and read/write control (e.g., via READ FSM 378 and WRITE FSM 376). Thus, upon receipt of a command and address via INfuCmd bus, the command and the address may be decoded to determine the address access pattern and port selection. Table 4 below shows command fields or parameters that may be decoded by control logic 370.
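The block-floating point (BFP) format mentioned in the Fowers passage, with one shared exponent stored apart from the per-element mantissas, can be illustrated as follows. This is a minimal sketch under stated assumptions and does not reproduce Fowers' hardware: the names bfp_encode and bfp_decode, the 8-bit mantissa width, and round-to-nearest with clamping are hypothetical choices made for the example.

```python
import math

def bfp_encode(values, mantissa_bits=8):
    """Return (shared_exponent, mantissas) for one block of floats."""
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return 0, [0] * len(values)
    # max_abs = m * 2**shared_exp with 0.5 <= m < 1
    _, shared_exp = math.frexp(max_abs)
    step = 2.0 ** (shared_exp - mantissa_bits)
    limit = 2 ** mantissa_bits - 1
    # Round to the nearest step; clamp so the largest element still fits
    mantissas = [max(-limit, min(limit, round(v / step))) for v in values]
    return shared_exp, mantissas

def bfp_decode(shared_exp, mantissas, mantissa_bits=8):
    """Rebuild approximate floats from the shared exponent and mantissas."""
    step = 2.0 ** (shared_exp - mantissa_bits)
    return [m * step for m in mantissas]
```

Encoding [3.0, -1.5, 0.1] yields shared exponent 2 and mantissas [192, -96, 6]; decoding returns [3.0, -1.5, 0.09375]. The small element loses precision because every element in the block shares the exponent of the largest one.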

US 2013/0246496 to Craske
[0081] FIG. 2 schematically illustrates a vector normalisation operation. The normalised vector a is given by the input vector components a.sub.i each divided by the magnitude of the sum of the input vector. If the numerator and the denominator of FIG. 2 are both multiplied by a scaling factor k, then there is no overall effect upon the size or direction of the normalised vector a. The normalised vector has unit length and a direction which is the same as the input vector. In order to avoid changes in the input vector direction introduced by rounding errors and other calculation inaccuracies when manipulating the floating point numbers so that they are scaled as illustrated in FIG. 2, the scaling vector k may be selected such that it has a mantissa of value 1 and an exponent chosen to avoid overflows and underflows when performing the calculations as part of the normalisation operation. In particular, the scaling value is chosen such that the sum of the squares of the plurality of scaled components is less than a predetermined limit value where this limit value is the maximum size floating point number that can be represented with the exponent value and the mantissa value of the floating point format being utilised.
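The scaled normalisation described in the Craske passage can be sketched as below. The sketch is illustrative only: the function name normalise and the particular rule for picking the exponent of k (making every scaled component smaller than 1 in magnitude) are assumptions, not Craske's stated selection rule.

```python
import math

def normalise(vec):
    """Unit vector in the direction of vec, using a power-of-two prescale."""
    max_abs = max(abs(c) for c in vec)
    if max_abs == 0.0:
        raise ValueError("cannot normalise the zero vector")
    # max_abs = m * 2**exp with 0.5 <= m < 1
    _, exp = math.frexp(max_abs)
    # k = 2**(-exp) has a mantissa of 1, so scaling only changes exponents
    scaled = [math.ldexp(c, -exp) for c in vec]
    # Every |scaled component| < 1, so the sum of squares cannot overflow
    norm = math.sqrt(sum(s * s for s in scaled))
    return [s / norm for s in scaled]
```

For an input such as [3e200, 4e200], squaring the unscaled components overflows double precision, while the prescaled version returns approximately [0.6, 0.8]. Because k is a power of two, the prescale itself introduces no rounding error in the normal range.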

US 2005/0156766 to Melanson
[0061] "c.sub.t-y.sub.t" represent a quantization error value for time t, t=1 through M, in the quantization error vector C.sub.i-Y.sub.i, w.sub.t represents a weight for time t in the weight cost function vector W. Vector W represents one embodiment of a non-uniform weight vector. "W.multidot.[C.sub.i-Y.sub.i].sup.(2).sub.min" represents the minimum weighted, quantizer input/output difference power. The y(n) selector module 808 selects y(n) as the leading bit of output candidate vector Y.sub.i from W.multidot.[C.sub.iY.sub.i].sup.(2).sub.min. The non-uniform weight vector W includes, for example, at least one non-zero weight element that is different from another weight element value or at least one non-unity, non-zero weight value. Weighting the quantization error vector can be accomplished by determining the dot product of the quantization error vector and the time-domain weight vector.
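The weighted quantization error in the Melanson passage, a dot product of the weight vector W with the squared error vector C.sub.i-Y.sub.i, can be sketched as follows. This is a hedged illustration: the function names and the representation of candidates as (C, Y) pairs are assumptions introduced for the example and are not taken from Melanson.

```python
def weighted_error_power(weights, c, y):
    """Dot product of the weight vector with the squared error vector."""
    return sum(w * (ct - yt) ** 2 for w, ct, yt in zip(weights, c, y))

def select_output(weights, candidates):
    """Pick the (C, Y) pair with minimum weighted error power.

    Returns the leading element of the winning candidate vector Y,
    which plays the role of y(n) in the passage above.
    """
    _, best_y = min(
        candidates,
        key=lambda pair: weighted_error_power(weights, pair[0], pair[1]),
    )
    return best_y[0]
```

With a non-uniform weight vector such as [1.0, 0.5], errors at the first time step count twice as much as errors at the second, so the selected candidate can differ from the one an unweighted comparison would choose.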

US 2019/0294413 to Vantrease
[0100] As described above, the shifted quantized inputs X.sub.adj and Y.sub.adj may be represented by 9-bit signed integer (INT9) and thus the multiplication may be done using INT9 operands, where 9 is not a power of 2. Storage and management of data-types with a number of bits that is not a power of 2 in memory may not be very efficient. Storing the INT9 data in 16-bit format may waste memory space. According to certain embodiments, the quantized inputs X.sub.q and w.sub.q may be saved in and retrieved from the memory in the UINT8 format. A pre-processing module that includes subtraction engines may be used to shift the quantized inputs X.sub.q and w.sub.q by subtracting the zero-point integer X.sub.qz or W.sub.qz from the quantized inputs according to equations (17) and (18) before the matrix multiplication. The results of the matrix multiplication using the shifted quantized inputs X.sub.adj and Y.sub.adj may be scaled by a floating point scaling factor S.sub.XS.sub.W to convert the results of the matrix multiplication to the more precise real value in floating point format as inputs to a subsequent layer.
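The zero-point shift and floating point rescale described in paragraph [0100] of Vantrease can be sketched as below. Equations (17) and (18) are not reproduced in this action, so the sketch assumes the common affine scheme q = round(v / scale) + zero_point; the function names and the example scales are likewise hypothetical.

```python
def quantize(values, scale, zero_point):
    """Map real values to UINT8 codes (assumed affine quantization)."""
    return [max(0, min(255, round(v / scale) + zero_point)) for v in values]

def quantized_dot(x_q, w_q, x_zp, w_zp, s_x, s_w):
    """Dot product on zero-point-shifted integers, rescaled to a float."""
    # Subtracting the zero point from a UINT8 code gives a value in
    # [-255, 255], which needs 9 signed bits (INT9)
    x_adj = [q - x_zp for q in x_q]
    w_adj = [q - w_zp for q in w_q]
    acc = sum(a * b for a, b in zip(x_adj, w_adj))  # integer accumulation
    # One floating point multiply by S_X * S_W recovers the real-valued result
    return s_x * s_w * acc
```

For example, quantizing x = [0.5, -0.25] with scale 1/64 and zero point 128, and w = [1.0, -2.0] with scale 1/32 and zero point 100, gives codes [160, 112] and [132, 36]. The shifted integer dot product is 2048, and multiplying by the combined scale 1/2048 returns 1.0, matching the real-valued dot product.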

[0124] In some implementations, techniques disclosed above can be used to reduce the storage space, transportation bandwidth, and computing power used to perform convolution operations or other matrix multiplications for data in high-precision format, such as 16-bit, 32-bit, 64-bit, or more floating point or decimal point data, or 16-bit, 32-bit, 64-bit or more integer data. In some implementations, the data in high-precision data can be quantized to a low-precision data, such as, for example, 16-bit floating point data or 16-bit or 8-bit integer data. The high-precision data and low-precision data may represent data in the same range but in different precisions or resolutions. For example, when 16-bit data is used to represent values between 0.0 to 1.0, the precision or resolution may be 2.sup.−16, while the precision or resolution of 8-bit data that represents values between 0.0 to 1 is 2.sup.−8. As discussed above, in neural network, high-precision data may be used for training, while low-precision data may be sufficient for at least some operations during inference. In various implementations, the number of bit in each element of the low-precision data may be a multiple of 8 (e.g., byte-aligned) such that the low-precision data can be more efficiently stored in a storage device and managed by a controller or software.

The prior art of record (Fowers in view of Craske, Melanson and Vantrease) does not disclose and/or fairly suggest at least the limitations as recited in independent claim 1: “... decomposing the mantissa values into quantized values and residual values; for each quantized value and residual value: performing a dot product using the quantized value and weights to provide a first output; selectively choosing whether to perform a dot product on the residual value depending on a desired level of precision needed; if a lower level of precision is needed, using the first output as a node output; if a higher level of precision is needed, performing a dot product using the residual value and weights to obtain a second output; combining the first output to the second output to generate the node output.”
The prior art of record (Fowers in view of Craske, Melanson and Vantrease) does not disclose and/or fairly suggest at least the limitations as recited in independent claim 10: “hardware designed to perform a dot product on tensors of a first size, which is a portion of normal-precision floating-point number of a second size; hardware for converting the normal-precision floating point number so as to reduce the normal-precision floating-point number to a quantized tensor of the first size and a residual tensor of the first size; and hardware for selectively determining whether to perform the dot product on the quantized tensor only or the quantized tensor and the residual tensor based on a desired level of precision.”
The prior art of record (Fowers in view of Craske, Melanson and Vantrease) does not disclose and/or fairly suggest at least the limitations as recited in independent claim 17: “… instructions that cause the system to evaluate the neural network having its node weights and edges stored in the memory as a normal-precision floating-point format having a first size; instructions that cause the system to convert the tensors to primary values expressed in a quantized-precision format, of a second size, smaller than the first size, and residual values in the quantized-precision format of the second size; instructions that cause the system to perform at least one mathematical operation on the primary values in the quantized-precision format, producing modified primary tensors; instructions that selectively perform the at least one mathematical operation on the residual values in the quantized-precision format, producing modified residual tensors; and instructions that add the modified primary tensors and the modified residual tensors to produce output tensors.”

Conclusion
The following prior art made of record and not relied upon is cited to establish the level of skill in the applicant’s art and those arts considered reasonably pertinent to applicant’s disclosure. See MPEP 707.05(c).
Any inquiry concerning this communication should be directed to examiner Tuan Dao, whose telephone and fax numbers are (571) 270 3387 and (571) 270 4387, respectively. The examiner can normally be reached Monday through Thursday, and on the second Friday of each biweekly period, from 7:30 AM to 5:00 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chat Do, can be reached at (571) 272 3721.
The fax phone number for the organization where this application or proceeding is assigned is (571) 273 8300.
Any inquiry of a general nature or relating to the status of this application or proceeding should be directed to the TC 2100 Group receptionist, whose telephone number is (571) 272 2100.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/TUAN C DAO/            Primary Examiner, Art Unit 2193