Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1-3, 14-15, 18, 20-22, and 24-26 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Yang et al. (WO 2021138842).
With respect to claim 1, Yang et al. disclose a computer-implemented method, comprising:
partitioning inputs to a neural network model into portions that are each sized for execution in a processing core by a single thread block (paragraph 42, with reference to FIG. 2, CU 208 can transmit a plurality of inputs to a plurality of cores 202 (e.g., cores 202a-202d) of HAPU 200 respectively. In the case that the number of the cores is N, the HAPU may perform an initial round of loading of the inputs to respective cores of the HAPU and (N-1) rounds of communications of the current inputs in the cores to other cores of the HAPU in sequence);
loading weights for the neural network model into a register file within a processor once to process the inputs (paragraph 44, At step 403, at each of the plurality of cores, a computation is repeatedly performed using the part of a weight matrix corresponding to the core and the input received at the core, during the initial loading of the inputs or each round of communication of the inputs from other cores, each of the plurality of cores can perform a computation using the part of the weight matrix corresponding to the core and an input received (e.g., loaded from an external memory or communicated from another core) at the core);
independently processing the portions in parallel by a set of processing cores within the processor, wherein weights for a first layer of the neural network model are applied to each of the portions to generate intermediate results for each portion (paragraph 44, With reference to FIG. 2, each core 202 (e.g., core 202a, core 202b, core 202c or core 202d) can perform a computation using the part of the weight matrix corresponding to the core and each input loaded or communicated to the core by CU 208; paragraph 47, with reference to FIG. 2, communication of input_a from core 202a to core 202d can be performed in parallel with computation on core 202a using input_a and corresponding part_a of the weight matrix, communication of input_b from core 202b to 202a can be performed in parallel with computation on core 202b using input_b and corresponding part_b of the weight matrix, and so on);
storing the intermediate results for each portion into a memory that is shared between the set of processing cores (paragraph 45, Each core can store a corresponding part of the weight matrix in its local memory. For example, with reference to FIG. 2, CU 208 can load a plurality of parts (e.g., part_a, part_b, part_c and part_d) of a weight matrix into local memories 2022 of a plurality of cores 202 (e.g., core 202a, core 202b, core 202c and core 202d)); and
processing the intermediate results for each portion by a subsequent layer of the neural network model to produce subsequent intermediate results until a last layer of the neural network model generates outputs (paragraph 48, At step 405, results of computations using an input received from another core can be communicated to the core which the input is initially loaded to, transmission engine 2026 can perform the communication by, e.g., reading the result from local memory and transmitting it to CU 208, paragraph 52, At step 407, results of the computations can be output. The results can include computation results using all inputs and all parts of the weight matrix).
	With respect to claim 2, Yang et al. disclose the computer-implemented method of claim 1, wherein the set of processing cores and the memory are included within a graphics processing unit (paragraph 36, a HAPU (e.g., HAPU 200 of FIG. 2 or HAPU 308 of FIG. 3A-3B) can be a computing device for accelerating neural network processing tasks, e.g., neural network training or inference. In some embodiments, HAPU 308 can be configured to be used as a co-processor of host unit 302; paragraph 78, embodiments of the disclosure can be applied to Ali-NPU (e.g., Hanguang NPU), Ali-Cloud, Ali-DAU (Database Acceleration Unit), Ali-AI platform, GPU).
	With respect to claim 3, Yang et al. disclose the computer-implemented method of claim 1, wherein the memory comprises at least one of low-level caches, shared on-chip memory (paragraph 32, With local memory 2022, part or all of data access can be performed within each core 202a-202d, reducing the latency caused by data access), and registers.
	With respect to claim 14, Yang et al. disclose the computer-implemented method of claim 1, wherein the neural network model is trained on a server or in a data center and the outputs are streamed to a user device (paragraph 36, a HAPU (e.g., HAPU 200 of FIG. 2 or HAPU 308 of FIG. 3A-3B) can be a computing device for accelerating neural network processing tasks, e.g., neural network training or inference; paragraph 39, computing server 312 can, for example, include the machine learning system 300, which includes HAPU 308).
	With respect to claim 15, Yang et al. disclose the computer-implemented method of claim 1, wherein one or more of the steps of partitioning, loading, independently storing, and processing are performed within a cloud computing environment (paragraph 39, The cloud system 310 can include a plurality of computing servers (e.g., computing servers 312 and 314). In some embodiments, computing server 312 can, for example, include the machine learning system 300, which includes HAPU 308. The cloud system 310 may be connected to user devices via a network. With the assistance of HAPU 308, cloud system 310 can provide extended AI capabilities of image recognition, facial recognition, translations, 3D modeling, and the like).
	With respect to claim 18, Yang et al. disclose the computer-implemented method of claim 1, wherein the outputs are used for training, testing, or certifying a neural network employed in a machine (paragraph 48, Results of computations using input_a and a part of the weight matrix stored at core 202d can be communicated by CU 208 to core 202a, results of computations using input_b and a part of the weight matrix stored at core 202a can be communicated by CU 208 to core 202b, and so on. In some embodiments, transmission engine 2026 can perform the communication by, e.g., reading the result from local memory and transmitting it to CU 208), robot, or autonomous vehicle. The CU 208 is in HAPU 200, which comprises a machine.
	With respect to claim 20, Yang et al. disclose a system (paragraph 35, FIG. 3A illustrates an exemplary machine learning system 300), comprising: a global memory storing weights (paragraph 35, host memory 306); a processor that is connected to the global memory and executes a neural network model (paragraph 35, host unit 302) by executing the method of claim 1; see rationale for rejection of claim 1.
	With respect to claim 21, Yang et al. disclose the system of claim 20 for executing the method of claim 2; see rationale for rejection of claim 2.
	With respect to claim 22, Yang et al. disclose the system of claim 20 for executing the method of claim 3; see rationale for rejection of claim 3.
	With respect to claim 24, Yang et al. disclose a non-transitory computer-readable media storing computer instructions (paragraph 79, The various example embodiments described herein are described in the general context of method steps or processes, which may be implemented in one aspect by a computer program product, embodied in a computer readable medium, including computer executable instructions, such as program code, executed by computers in networked environments) that, when executed by one or more processors, cause the one or more processors to perform the steps of claim 1; see rationale for rejection of claim 1.
	With respect to claim 25, Yang et al. disclose the non-transitory computer-readable media of claim 24 for implementing the method of claim 2; see rationale for rejection of claim 2.
	With respect to claim 26, Yang et al. disclose the non-transitory computer-readable media of claim 24 for implementing the method of claim 3; see rationale for rejection of claim 3.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 4, 23, and 27 is/are rejected under 35 U.S.C. 103 as being unpatentable over Yang et al. (WO 2021138842) in view of Dahm et al. (U.S. PGPUB 20180018814).
	With respect to claim 4, Yang et al. disclose the computer-implemented method of claim 1. However, Yang et al. do not expressly disclose the neural network model implements a radiance cache for performing light transport path tracing.
	Dahm et al., who also deal with neural networks, disclose a method wherein the neural network model implements a radiance cache for performing light transport path tracing (paragraph 23, a data structure is initialized that is configured to provide an importance value for each incident sample in a three-dimensional (3D) scene. The data structure stores incident radiance values and can be queried given a position in a three-dimensional (3D) scene and a direction).
	Yang et al. and Dahm et al. are in the same field of endeavor, namely neural networks.
	Before the effective filing date of the claimed invention, it would have been obvious to apply the method wherein the neural network model implements a radiance cache for performing light transport path tracing, as taught by Dahm et al., to the Yang et al. system, because a data structure would allow fast retrieval of radiance values for performing lighting calculations.
With respect to claim 23, Yang et al. as modified by Dahm et al. disclose the system of claim 20 for executing the method of claim 4; see rationale for rejection of claim 4.
With respect to claim 27, Yang et al. as modified by Dahm et al. disclose the non-transitory computer-readable media of claim 24 for implementing the method of claim 4; see rationale for rejection of claim 4.

Claim(s) 16-17 and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Yang et al. (WO 2021138842) in view of Hoppert et al. (U.S. PGPUB 20180218473).
With respect to claim 16, Yang et al. disclose the computer-implemented method of claim 1. However, Yang et al. do not expressly disclose one or more of the steps of partitioning, loading, independently storing, and processing are performed on a server or in a data center and the image is streamed to a user device.
Hoppert et al., who also deal with neural networks, disclose a method wherein one or more of the steps of partitioning, loading, independently storing, and processing are performed on a server or in a data center and the image is streamed to a user device (paragraph 45, the GPU-input data 314 may represent a stream of video data received over the network 108 that is to be processed for output via the virtual machine and ultimately the client device 102).
	Yang et al. and Hoppert et al. are in the same field of endeavor, namely neural networks.
	Before the effective filing date of the claimed invention, it would have been obvious to apply the method wherein one or more of the steps of partitioning, loading, independently storing, and processing are performed on a server or in a data center and the image is streamed to a user device, as taught by Hoppert et al., to the Yang et al. system, because the functionality provided by the desktop interface may be furnished largely using the processing and storage resources of the enterprise's servers, rather than resources of the computing devices the individuals interact with directly (paragraph 2 of Hoppert et al.).
With respect to claim 17, Yang et al. as modified by Hoppert et al. disclose the computer-implemented method of claim 1, wherein one or more of the steps of partitioning, loading, independently storing, and processing are performed on a server or in a data center and the neural network model is streamed to a user device (Hoppert et al.: paragraph 45, the GPU-input data 314 may represent a stream of video data received over the network 108 that is to be processed for output via the virtual machine and ultimately the client device 102. Examples of GPU-input data include a streaming television data stream, a streaming movie stream, data for a cloud-hosted video game environment, raw video, and a deep learning data set).
	With respect to claim 19, Yang et al. as modified by Hoppert et al. disclose the computer-implemented method of claim 1, wherein one or more of the steps of partitioning, loading, independently storing, and processing are performed on a virtual machine comprising a portion of a graphics processing unit (Hoppert et al.: paragraph 29, the respective driver enables the virtual machine to manipulate the allocated partition of the GPU to process the data and produce GPU-processed data (e.g., rendered scenes, encoded video, decoded video, learned models and neural networks)). It would have been obvious to apply the method wherein one or more of the steps of partitioning, loading, independently storing, and processing are performed on a virtual machine comprising a portion of a graphics processing unit, as taught by Hoppert et al., to the Yang et al. system, because virtual machines may leverage GPUs of a host device to provide rendered scenes (e.g., for cloud-hosted video games, high-definition (HD) three-dimensional (3D) images, or virtual reality environments), video encoding and decoding, data processing, massive-scale computing, and so on (paragraph 14 of Hoppert et al.).
Allowable Subject Matter
Claims 5-13 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter: Bond (U.S. PGPUB 20220284662) discloses spatial location and viewing direction as inputs and view-dependent emitted radiance as output. Lu et al. (CN 111583371) disclose inputting spatial position and outputting multiple scattering radiance values. However, none of the cited art teaches or suggests the claimed inputs and outputs of the neural network model, i.e., the inputs are three-dimensional (3D) positions associated with light transport paths through a scene and the outputs are radiance predictions at the 3D positions.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
U.S. Patent No. 10,909,442 to Szarvas et al. for a method of running a neural network model on a virtual machine instantiated at hosts comprising graphics processing units.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANDREW GUS YANG whose telephone number is (571)272-5514. The examiner can normally be reached M-F 9 AM - 5:30 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kent Chang, can be reached at (571)272-7667. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ANDREW G YANG/
Primary Examiner, Art Unit 2619
11/4/22