DETAILED ACTION

Priority

Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Information Disclosure Statement

The information disclosure statements (IDS) submitted on October 11, 2018, July 21, 2020, and June 7, 2022 were filed after the mailing date of the application on January 15, 2018.  The submissions are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.

Allowable Subject Matter

Claims 1-20 are allowed.

The following is an examiner’s statement of reasons for allowance:  Prior art includes McGuire et al. (US 2014/0176575) (cited in the Information Disclosure Statement (IDS) filed June 7, 2022) which disclose an accelerator (Figure 2, parallel processing unit (PPU) 200, where [0032] notes PPU 200 may be a graphics processing unit (GPU)) comprising: a multiprocessor (streaming multiprocessors (SMs) 250) to execute parallel threads of an instruction stream ([0024] notes each streaming multiprocessor 250 is configured to execute a plurality of threads concurrently, each thread is an instantiation of a set of instructions executing within a particular SM 250, [0029] notes the PPU 200 implements a SIMT (single-instruction, multiple-thread) architecture), the compute unit including a set of functional units, each functional unit to execute at least one of the parallel threads of the instruction stream (Figure 3 further illustrates each SM 250 comprises a plurality of functional units including (L) processing cores 350, (M) double precision units (DPUs) 351, (N) special function units (SFUs) 352, and (P) load/store units (LSUs) 353, where [0037] notes threads are scheduled as groups of parallel threads (e.g. warps) to the various functional units) as similarly recited by claims 1 and 9.  McGuire et al. also disclose a data processing system (system of Figure 2) comprising: a memory device (memory devices 204, where [0030] notes memory devices 204 may include graphics double-data-rate, version 5, synchronous dynamic random access memory (GDDR5 SDRAM)); and a general-purpose graphics processing unit (Figure 2, parallel processing unit (PPU) 200, where [0032] notes PPU 200 may be a graphics processing unit (GPU)) including an instruction decoder decode logic (host interface unit 210) to decode a single instruction ([0024] notes each streaming multiprocessor 250 is configured to execute a plurality of threads concurrently, each thread is an instantiation of a set of instructions executing within a particular SM 250, [0026] notes host interface unit 210 decodes the commands, [0029] notes the PPU 200 implements a SIMT (single-instruction, multiple-thread) architecture) including multiple operands into a single decoded instruction ([0039] notes each SM includes a register file for storing operands connected to the data paths of the functional units), and a compute unit (Figure 3 further illustrates each SM 250 comprises a plurality of functional units including (L) processing cores 350, (M) double precision units (DPUs) 351, (N) special function units (SFUs) 352, and (P) load/store units (LSUs) 353) as recited by independent claim 16.  Chetlur et al. (US 2016/0062947) (cited in the Information Disclosure Statement (IDS) filed October 11, 2018) disclose a similar system to that of McGuire above, where the method is directed to performing multi-convolution operations in the parallel processing system, e.g. the PPU described above.  Additional prior art includes Koster et al. (US 2017/0316307) (cited in the Information Disclosure Statement (IDS) filed June 7, 2022) which disclose a system implemented as a multiprocessor architecture for executing tensor instructions, such as instructions for performing a neural network computation; and Bruestle et al. (US 2017/0200094) (cited in the Information Disclosure Statement (IDS) filed June 7, 2022) which disclose a system comprising a hardware accelerator including a plurality of operational units supporting instruction sets specific to machine learning, including optimizations for performing tensor operations and convolutions.  However, the prior art cited fails to specifically teach or suggest, singly or combined, the limitations, “…wherein the compute unit includes compute logic configured to execute a single instruction to scale an input tensor associated with a layer of a neural network according to a scale factor, the input tensor stored in a floating-point data type, the compute logic to scale the input tensor to enable a data distribution of data of the input tensor to be represented by a 16-bit floating point data type…” as recited by independent claim 1; “…in response to the single instruction, scaling data of an input tensor associated with a layer of a neural network according to a scale factor, executing one or more compute operations on scaled data of the input tensor, and re-scaling computed data of the compute operations to generate re-scaled computed data; down-converting the re-scaled computed data; and storing down-converted re-scaled computed data to a 16-bit floating-point data type…” as recited by independent claim 9; and “…at least one of the multiple operands associated with input tensor data of a layer of a neural network, and a compute unit including compute logic configured to execute the single instruction to scale data of the input tensor to enable a data distribution of data of the input tensor to be represented by a 16-bit floating point data type…” as recited by independent claim 16.
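
For illustration only, and not as a characterization of applicant's disclosed implementation or of any cited reference, the general idea of scaling tensor data so that its distribution can be represented by a 16-bit floating-point data type may be sketched in software as follows.  The function names, the sample values, and the scale factor of 2^16 are hypothetical assumptions chosen for the example; the claims recite compute logic executing a single instruction, which this sketch does not model.

```python
import struct


def to_fp16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 value.

    Uses the standard-library struct format "e" (half precision).
    Magnitudes too small for binary16 round to zero.
    """
    return struct.unpack("<e", struct.pack("<e", value))[0]


def scale_tensor(values, scale_factor):
    """Multiply every element of a flat tensor by a scale factor (hypothetical helper)."""
    return [v * scale_factor for v in values]


# Hypothetical input tensor whose values are too small for binary16.
tensor = [1.0e-9, 3.0e-9, -2.0e-9]
scale_factor = 2.0 ** 16  # assumed power-of-two scale factor

# Direct down-conversion: every element underflows to zero.
unscaled_fp16 = [to_fp16(v) for v in tensor]

# Scale first, then down-convert: the scaled values fall inside the binary16 range.
scaled_fp16 = [to_fp16(v) for v in scale_tensor(tensor, scale_factor)]

# Dividing by the scale factor in higher precision recovers an approximation
# of the original values.
recovered = [v / scale_factor for v in scaled_fp16]
```

In this sketch the unscaled values are lost entirely when stored in the 16-bit type, whereas the scaled values survive down-conversion and can be approximately recovered by dividing out the same factor.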

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACINTA M CRAWFORD whose telephone number is (571)270-1539. The examiner can normally be reached 9:00 a.m. to 5:00 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Mehmood, can be reached on (571)272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/JACINTA M CRAWFORD/Primary Examiner, Art Unit 2612