DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 05/19/2022 was filed after the mailing date of the application.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 2-7 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Chinya et al. (U.S. Patent App. Pub. No. 2014/0208042, hereinafter “Chinya”) in view of Blinzer et al. (U.S. Patent App. Pub. No. 2012/0162234, hereinafter “Blinzer”).
As per claim 2, as shown in Fig. 1, Chinya teaches an apparatus comprising: 
a local memory comprising a plurality of stacked dynamic random access memory, DRAM, dies (¶ [19], local memory 180, which can be a DRAM); 
a graphics processor (¶ [19], accelerator card 150, which is a GPU) comprising: 
a processing cluster array comprising a plurality of compute units (Fig. 8, accelerators 630s) to execute parallel compute instructions and process data stored in the local memory (addressed below with reference to Blinzer); 
a memory interconnect (SoC fabric 170, ¶ [19]) to couple the processing cluster array to the local memory; 
a host processor interconnect to couple the processing cluster array to a host processor (CPU 120, via interconnect 140 and bridge 155);
wherein at least a portion of the local memory is to be shared with a host processor, the compute units and the host processor to access data from the portion of the local memory using a shared virtual memory (SVM) address space (¶ [24]); and 
coherency hardware logic to maintain coherency of the data stored in the portion of local memory (¶ [45]), the coherency hardware logic to maintain a directory to track the portion of the local memory in accordance with access by the host processor and the compute units (¶ [40] and [42], by using attribute bit), 
wherein the cache coherency logic is to perform state updates associated with the data in response to both updates received from the host processor over the host processor interconnect and updates received from the compute units (impliedly taught with reference to Fig. 6, ¶ [44], i.e. updating the page table entry that includes the number of accesses from remote (CPU) or local (accelerator)).
Chinya does not expressly teach the processing cluster array comprising a plurality of compute units to execute parallel compute instructions and process data stored in the local memory.
However, in a very similar method of maintaining coherence of local memory between a GPU and a host CPU, as shown in Fig. 2 and ¶ [98-99], Blinzer teaches a graphics processor that includes a processing cluster comprising a plurality of compute units to execute parallel compute instructions (¶ [47], [50]) and process data stored in the local memory (¶ [55]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize the GPU as taught by Blinzer and incorporate it into the apparatus taught by Chinya, as addressed above, the advantage of which is the ability to perform graphics-intensive applications, such as three-dimensional processing.
As per claim 3, the combined teachings of Chinya and Blinzer also include address translation hardware logic coupled to the compute units, the address translation hardware logic to translate a virtual address received from a compute unit to a physical address within the portion of local memory, wherein the physical address is to match a physical address provided by an external address translation hardware logic of the host processor to access the portion of the local memory (see Chinya, ¶ [46], “…  the allocation and management of memory on the accelerator can be carried out in coordination with the memory manager of the OS which allocates and manages the system memory pages that are given to the application and manages the page tables which are utilized by the CPU to translate virtual addresses to physical addresses”. See further, Chinya, ¶ [33]).
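For illustration only, the shared-translation arrangement discussed for claim 3 can be sketched as follows. The sketch assumes a single OS-managed page table that both the host and the accelerator consult, so that the same virtual address resolves to the same physical address from either side. The names SharedPageTable, map_page, and translate, and the 4 KiB page size, are hypothetical and are not drawn from Chinya, Blinzer, or the claims.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class SharedPageTable:
    """Hypothetical page table consulted by both host and accelerator."""

    def __init__(self):
        # Maps virtual page number -> physical frame number.
        self._entries = {}

    def map_page(self, vpn, pfn):
        """Record that virtual page `vpn` is backed by physical frame `pfn`."""
        self._entries[vpn] = pfn

    def translate(self, vaddr):
        """Translate a virtual address to a physical address.

        A missing entry raises KeyError, which stands in for a page fault.
        """
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        pfn = self._entries[vpn]
        return pfn * PAGE_SIZE + offset


table = SharedPageTable()
table.map_page(vpn=5, pfn=42)

vaddr = 5 * PAGE_SIZE + 0x10
host_pa = table.translate(vaddr)   # translation requested on the host side
accel_pa = table.translate(vaddr)  # translation requested on the accelerator side
```

Because both sides read the same table, host_pa and accel_pa are equal for any mapped address, which is the matching-physical-address property recited in the claim.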
As per claim 4, the combined Chinya-Blinzer further teaches wherein the directory comprises a table to track memory pages within the portion of the local memory accessed by the host processor and the compute units (Chinya, ¶ [24], and Fig. 6, ¶ [42] and [44]).
As per claim 5, as addressed, the combined Chinya-Blinzer also teaches wherein the table comprises a plurality of entries, each entry indicating a state value associated with a corresponding memory page, the state value to be used by the coherency hardware logic to manage coherency of the page (Chinya, Fig. 6, ¶ [44]).  
As per claim 6, the combined Chinya-Blinzer impliedly teaches wherein the state value comprises a bias value indicating whether the corresponding memory page is biased in favor of the compute units or the host processor (see Chinya, ¶ [40-42], i.e. by examining the attribute bit: “…each TLB entry can include an attribute bit to indicate if the corresponding entry is in remote or local system memory”, and “the OS page table format can include an attribute bit to indicate whether the corresponding page is stored in local or remote memory”). Thus, based on this attribute bit, the memory page to be accessed is determined to be in remote memory (CPU side) or local memory (accelerator side).
As per claim 7, as addressed in claim 6 above, the combined Chinya-Blinzer also impliedly teaches wherein if the state value is in a first state (i.e. the attribute bit indicates local memory), the compute units are to be permitted to access the corresponding memory page without first sending a request to the host processor (because the memory page is indicated to be local).
As per claim 9, as addressed in claim 2 with reference to Fig. 1 of Chinya, the combined Chinya-Blinzer also teaches wherein the coherency hardware logic is to transmit and receive coherency transactions over the host processor interconnect to maintain coherency of at least some of the data stored in the portion of local memory (Chinya, ¶ [19-20]).
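For illustration only, the reading applied to claims 6 and 7 can be sketched as a per-page state table in which one attribute marks each page as local or remote. The sketch assumes that a page with no entry is treated as remote. The names Bias, PageDirectory, set_bias, and accelerator_access are hypothetical and are not drawn from Chinya, Blinzer, or the claims.

```python
from enum import Enum


class Bias(Enum):
    LOCAL = 0   # page is in accelerator-local memory
    REMOTE = 1  # page is in host (CPU-side) system memory


class PageDirectory:
    """Hypothetical table holding one state value per memory page."""

    def __init__(self):
        self._state = {}        # page number -> Bias
        self.host_requests = 0  # count of requests sent to the host

    def set_bias(self, page, bias):
        self._state[page] = bias

    def accelerator_access(self, page):
        """Return True if the access proceeds without a host request."""
        # Assumption: unknown pages default to REMOTE.
        if self._state.get(page, Bias.REMOTE) is Bias.LOCAL:
            return True
        self.host_requests += 1  # remote page: the host must be consulted
        return False


directory = PageDirectory()
directory.set_bias(1, Bias.LOCAL)
local_ok = directory.accelerator_access(1)   # local page, no host request
remote_ok = directory.accelerator_access(2)  # unmapped page, host request sent
```

In this sketch the "first state" of claim 7 corresponds to Bias.LOCAL: the access returns immediately and the host request counter is not incremented.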

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Chinya et al. (U.S. Patent App. Pub. No. 2014/0208042) in view of Blinzer et al. (U.S. Patent App. Pub. No. 2012/0162234), further in view of Blinzer (U.S. Patent App. Pub. No. 2018/0181341, hereinafter “Blinzer ’341”).
As per claim 8, the combined Chinya-Blinzer does not expressly teach wherein the local memory comprises a high bandwidth memory (HBM).
However, it is well known in the art for high bandwidth memory (HBM) to be used as the local memory of a graphics processor. One such example is disclosed in Blinzer ’341 at paragraph [0010].
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize the HBM as local memory as taught by Blinzer ’341 and incorporate it into the local memory taught by the combined Chinya-Blinzer, as addressed above, the benefit of which is to obtain a memory with higher bandwidth while using less power for use in high-performance graphics accelerators.




Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Hau H. Nguyen, whose telephone number is 571-272-7787. The examiner can normally be reached Monday through Friday from 8:30 to 5:30.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached at (571) 272-7794.
The fax number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/HAU H NGUYEN/Primary Examiner, Art Unit 2611