Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
EXAMINER'S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.

Authorization for this examiner’s amendment was given in an interview with Daniel Yeates on 3/8/2022.

The application has been amended as follows: 
1. (Currently Amended) A computer-implemented method for rerouting device allocation commands to a GPU having GPU memory through a driver and a software stack, wherein the rerouting further comprises 
updating the software stack, wherein the updated software stack allows for adding drivers to monitor for relevant device allocation commands to be redirected to an address translation service;  
subsequent to the updating, performing the steps of:   
receiving a call from an application on a host having a host memory;
determining that the call includes a relevant device allocation command, wherein the device allocation command is configured to allocate a set of data to the graphical processing unit (GPU);
dynamically intercepting the call to prevent the call from being passed to the driver on the host; 
initiating an alternate data allocation command, wherein the alternate data allocation command allocates data to a coherent memory and the coherent memory uses a portion of the GPU memory and is physically located on the GPU, and further the data stored in the coherent memory is accessible to the host through the address translation service located on the host without mirroring the data to the host memory, wherein the address translation service allows the host to access the data in the coherent memory by translating a virtual GPU address into a physical address usable by the host;
completing the alternate data allocation command; and
returning the completed alternate data allocation command to the application.
2. (Cancelled) 
3. (Original) The method of claim 1, wherein the first state and the second state are the same.
4. (Original) The method of claim 1 where the alternate data allocation command includes a first command and a second command.
5. (Original) The method of claim 4, wherein the first command allocates data on the host, and the second command allocates data on the GPU.

7. (Original) The method of claim 5, wherein the first command includes a glibc malloc command, and the second command includes a cudaMemPrefetchAsync command.
8. (Previously Canceled)
9. (Original) The method of claim 1, wherein the call is a cudaMalloc call.
10. (Original) The method of claim 1, wherein the call is a cudaMallocManaged call.
11. (Original) The method of claim 1, wherein the host transfers data to the GPU via an NVLink.
12. (Currently Amended) A system for rerouting device allocation commands to a GPU having GPU memory through a driver and a software stack, wherein the system comprises 
a processor; and 
a computer-readable storage medium communicatively coupled to the processor and storing program instructions which, when executed by the processor, are configured to cause the processor to: 
update the software stack, wherein the updated software stack allows for adding drivers to monitor for relevant device allocation commands to be redirected to an address translation service;
subsequent to the updating, performing the steps of:
receive a call from an application on a host having a host memory;
determine that the call includes a relevant device allocation command, wherein the device allocation command is configured to allocate a set of data to the graphical processing unit (GPU);
dynamically intercept the call to prevent the call from being passed to the driver on the host; 
initiate an alternate data allocation command, wherein the alternate data allocation command allocates data to a coherent memory and the coherent memory uses a portion of the GPU memory and is physically located on the GPU, and further the data stored in the coherent memory is accessible to the host through the address translation service located on the host without mirroring the data to the host memory, wherein the address translation service allows the host to access the data in the coherent memory by translating a virtual GPU address into a physical address usable by the host;
complete the alternate data allocation command; and 
return the completed alternate data allocation command to the application.
13. (Cancelled)

15. (Original) The system of claim 14, wherein the first command allocates data on the host, and the second command allocates data on the GPU.
16. (Previously Canceled).
17. (Currently Amended) A computer program product for rerouting device allocation commands to a GPU having GPU memory through a driver and a software stack, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processing unit to cause the processing unit to: 
	update the software stack, wherein the updated software stack allows for adding drivers to monitor for relevant device allocation commands to be redirected to an address translation service;
	subsequent to the updating, performing the steps of:
receive a call from an application on a host having a host memory;
determine that the call includes a relevant device allocation command, wherein the device allocation command is configured to allocate a set of data to the graphical processing unit (GPU);
dynamically intercept the call to prevent the call from being passed to the driver on the host;
initiate an alternate data allocation command, wherein the alternate data allocation command allocates data to a coherent memory and the coherent memory uses a portion of the GPU memory and is physically located on the GPU, and further the data stored in the coherent memory is accessible to the host through the address translation service located on the host without mirroring the data to the host memory, wherein the address translation service allows the host to access the data in the coherent memory by translating a virtual GPU address into a physical address usable by the host;
complete the alternate data allocation command; and
return the completed alternate data allocation command to the application.
18. (Cancelled)
19. (Previously Presented) The computer program product of claim 17, wherein the alternate data allocation command includes a first command and a second command, the first command allocates data on the host, and the second command allocates data on the GPU.
20. (Previously Canceled).
21. (Previously Canceled).
22. (Cancelled)
23. (Cancelled)



Reasons for Allowance
The following is an examiner’s statement of reasons for allowance: 
After a thorough search, the examiner has not been able to find art that teaches or suggests the invention as claimed. In particular, the examiner was unable to find art that adequately discloses:
updating the software stack, wherein the updated software stack allows for adding drivers to monitor for relevant device allocation commands to be redirected to an address translation service;  
subsequent to the updating, performing the steps of:   
receiving a call from an application on a host having a host memory;
determining that the call includes a relevant device allocation command, wherein the device allocation command is configured to allocate a set of data to the graphical processing unit (GPU);
dynamically intercepting the call to prevent the call from being passed to the driver on the host; 
initiating an alternate data allocation command, wherein the alternate data allocation command allocates data to a coherent memory and the coherent memory uses a portion of the GPU memory and is physically located on the GPU, and further the data stored in the coherent memory is accessible to the host through 
completing the alternate data allocation command; and
returning the completed alternate data allocation command to the application.
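
For illustration only, the sequence of limitations listed above (determine, intercept, allocate to coherent memory, translate, return) can be sketched as a small host-side simulation. This is a minimal sketch under stated assumptions and is not the applicant's implementation: the names Call, Result, Driver, CoherentMemory, AddressTranslationService, and Interceptor are hypothetical, the address values are arbitrary, and no real GPU or driver is involved. Only the command names cudaMalloc and cudaMallocManaged are taken from the claims.

```python
from dataclasses import dataclass

# Hypothetical model of the claimed steps; none of these classes is a real driver API.
RELEVANT_COMMANDS = {"cudaMalloc", "cudaMallocManaged"}  # command names from claims 9 and 10


@dataclass
class Call:
    command: str
    size: int


@dataclass
class Result:
    handled_by: str  # "driver" or "coherent"
    address: int     # virtual GPU address (0 when the driver stand-in handles the call)
    size: int


class Driver:
    """Stand-in for the host driver that would otherwise receive the call."""

    def __init__(self):
        self.calls_seen = []

    def handle(self, call):
        self.calls_seen.append(call)
        return Result("driver", 0, call.size)


class CoherentMemory:
    """Stand-in for coherent memory occupying a portion of GPU memory."""

    def __init__(self, base):
        self.next_free = base
        self.allocations = {}

    def allocate(self, size):
        address = self.next_free
        self.allocations[address] = size
        self.next_free += size
        return address


class AddressTranslationService:
    """Maps a virtual GPU address to a physical address usable by the host (assumed linear)."""

    def __init__(self, virtual_base, physical_base):
        self.virtual_base = virtual_base
        self.physical_base = physical_base

    def translate(self, virtual_address):
        return self.physical_base + (virtual_address - self.virtual_base)


class Interceptor:
    """Checks each call and reroutes relevant allocation commands away from the driver."""

    def __init__(self, driver, coherent):
        self.driver = driver
        self.coherent = coherent

    def dispatch(self, call):
        # Calls that are not relevant allocation commands pass through unchanged.
        if call.command not in RELEVANT_COMMANDS:
            return self.driver.handle(call)
        # Relevant calls are intercepted: the driver never sees them.
        address = self.coherent.allocate(call.size)
        return Result("coherent", address, call.size)
```

In this sketch the interceptor returns the completed allocation to the caller, and the host would reach the data by passing the returned virtual address to translate() rather than by copying the data into host memory.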

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARK A GOORAY whose telephone number is (571)270-7805. The examiner can normally be reached Monday - Friday 10:00am - 6:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Lewis Bullock, can be reached at 571-272-3759. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/LEWIS A BULLOCK  JR/Supervisory Patent Examiner, Art Unit 2199