DETAILED ACTION
Claims 1-33 are pending.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5, 13-21, 24-26, 31, and 32 are rejected under 35 U.S.C. 103 as being unpatentable over Carey et al. (US 2019/0163447 A1) in view of Marshall et al. (US 6,353,841 B1), in further view of Dube et al. (US 10,503,551 B2).

Regarding claim 1, Carey teaches the invention substantially as claimed including a system comprising: 
a host processor operable to offload tasks (Fig. 1, Processor 110; [0040]: off-load compute-intensive processing from processor 310); and 
a[n] coprocessor [accelerator] coupled to the host processor via a host interface ([0033]: computer system 300 comprises one or more processors 310, a programmable device 312 (Accelerator(s) 314), a main memory 320, a mass storage interface 330, a display interface 340, and a network interface 350. These system components are interconnected through the use of a system bus 360), wherein the coprocessor [accelerator] is operable to receive the offloaded tasks and to provide hardware acceleration for the host processor ([0040]: an accelerator deployment tool as described herein may be practiced using a computer system that has multiple processors and/or multiple buses. In addition, the interfaces that are used preferably each include separate, fully programmed microprocessors that are used to off-load compute-intensive processing from processor 310; Abstract: A code portion in the computer program is identified that will be improved from being deployed to a hardware accelerator), and wherein the coprocessor [accelerator] comprises: 
a reconfiguration region loaded with an accelerator function unit (AFU), wherein the AFU is subdivided into a plurality of accelerator function unit contexts (AFCs) (Referring to FIG. 2, a programmable device 200 represents any suitable programmable device. For example, the programmable device 200 could be an FPGA or an ASIC… accelerator 1 220A, accelerator 2 220B, . . . , accelerator N 220N (i.e., AFCs)… The accelerator image (i.e., AFU), once loaded into the programmable device such as 200 in FIG. 2, creates an accelerator in the programmable device that may be called as needed by one or more computer programs to provide the hardware accelerator(s); Fig. 22, Accelerator for Loop Portion, Accelerator for Branching Tree Portion, Accelerator for Lengthy Serial Portion; [0035]: The accelerator image generator 327 dynamically generates an accelerator image corresponding to the code portion 326 in the computer program 323 identified by the code analyzer 325. The accelerator image generator 327 may generate an accelerator image from code portion 326 using any suitable method. For example, the accelerator image generator 327 could generate an equivalent hardware description language (HDL) representation of the code portion 326, then synthesize the HDL representation into a suitable accelerator image for the programmable device 312. The accelerator implementer 328 preferably takes an accelerator image generated by the accelerator image generator 327, and uses the accelerator image to program the programmable device 312, thereby generating a hardware accelerator 314 in programmable device 312 that corresponds to the code portion 326.); and 
an interface circuit operable to map at least one of the plurality of AFCs to a corresponding host-assignable interface at least partially spanning the host interface between the host processor and the coprocessor [accelerator] ([0002]; [0029]: the Open Coherent Accelerator Processor Interface (OpenCAPI) is a specification that defines an interface that allows any processor to attach to coherent user-level accelerators and I/O devices. Referring to FIG. 1, a sample computer system 100 is shown to illustrate some of the concepts related to the OpenCAPI interface 150. A processor 110 is coupled to a standard memory 140 or memory hierarchy, as is known in the art. The processor is coupled via a PCIe interface 120 to one or more PCIe devices 130. The processor 110 is also coupled via an OpenCAPI interface 150 to one or more coherent devices, such as accelerator 160, coherent network controller 170, advanced memory 180, and coherent storage controller 190 that controls data stored in storage 195. While the OpenCAPI interface 150 is shown as a separate entity in FIG. 1 for purposes of illustration, instead of being a separate interface as shown in FIG. 1, the OpenCAPI interface 150 can be implemented within each of the coherent devices. Thus, accelerator 160 may have its own OpenCAPI interface, as may the other coherent devices 170, 180 and 190. One of the significant benefits of OpenCAPI is that virtual addresses for the processor 110 can be shared with coherent devices that are coupled to or include an OpenCAPI interface, permitting them to use the virtual addresses in the same manner as the processor 110.).

	Carey teaches configuring different portions of an accelerator with different functionalities but does not expressly teach a coprocessor and a partial reconfiguration region, wherein the plurality of AFCs are associated with a virtual machine, and wherein respective AFCs of the plurality of AFCs within the AFU are reconfigurable together based at least in part on the association with the virtual machine.

	However, Marshall, in a similar field of endeavor, discusses programmable integrated circuits and teaches a coprocessor and a partial reconfiguration region (Col. 1, line 47 through Col. 2, line 8: A commercially successful form of reconfigurable device is the field-programmable gate array (FPGA). These devices consist of a collection of configurable processing elements embedded in a configurable interconnect network. Configuration memory is provided to describe the interconnect configuration--often SRAM is used. These devices have a very fine-grained structure: typically each processing element of an FPGA is a configurable gate. Rather than being concentrated in a central ALU, processing is thus distributed across the device and the silicon area of the device is used more effectively. An example of a commercially available FPGA series is the Xilinx 4000 series. Such reconfigurable devices can in principle be used for any computing application for which a processor or an ASIC is used. However, a particularly suitable use for such devices is as a coprocessor to handle tasks which are computationally intensive, but which are not so common as to merit a purpose built ASIC. A reconfigurable coprocessor could thus be programmed at different times with different configurations, each adapted for execution of a different computationally intensive task, providing greater efficiency than for a general purpose processor alone without a huge increase in overall cost. In recent FPGA devices, scope is provided for dynamic reconfiguration, wherein partial or total reconfiguration can be provided during the execution of code so that time-multiplexing can be used to provide configurations optimized for different subtasks at different stages of execution of a piece of code).

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Marshall with the teachings of Carey to further define coprocessors as accelerators comprising cores programmable to execute specific functions to aid in the execution of a host processor. The modification would have been motivated by the desire to design and implement dynamically configurable many-core systems that increase the efficiency, flexibility, and scalability of parallel processing tasks for certain applications.
	
	Carey and Marshall do not expressly teach wherein the plurality of AFCs are associated with a virtual machine, and wherein respective AFCs of the plurality of AFCs within the AFU are reconfigurable together based at least in part on the association with the virtual machine.
	
	However, Dube teaches wherein the plurality of AFCs are associated with a virtual machine, and wherein respective AFCs of the plurality of AFCs within the AFU are reconfigurable together based at least in part on the association with the virtual machine (Col. 10, line 58 through Col. 11, line 3: In the illustrated embodiment, information handling system 100 also includes an FPGA 190, which may be programmed to provide services to software applications running on any of the virtual machines 105 by loading a bitstream file for each such service into a respective reconfigurable region of FPGA 190. The bitstream file, which may include a binary image or an object file that is consumed by the FPGA in order to program its gates, may be referred to herein as simply a “bitstream.” In various embodiments, a bitstream loaded on FPGA 109 may implement an accelerator program or another type of specialized routine that is called by application software on one or more of the virtual machines 105.; Col. 12, lines 48-51: In certain embodiments, a guest OS 108 or software application running on guest OS 108 on a virtual machine 105 may access services implemented on FPGA 190. In certain embodiments, FPGA 190 may be installed in hypervisor 104 as a virtual PCIe device.; Col. 12, line 65 through Col. 13, line 12: The systems and method disclosed herein may allow one or more FPGAs in a server to be used as a shared resource for virtual machines. For example, the techniques disclosed herein may be used to manage an FPGA card that is being used as a shared resource between virtual machines, various ones of which might or might not reside on the same physical computer hardware. 
More specifically, these techniques allow an FPGA service manager, such as FPGA service manager 220, to control or coordinate the programming, by users of virtual machines (e.g., guest operating systems, software applications, or other clients of the virtual machines) of reconfigurable regions of an FPGA with services that they need to perform their functions.; Col. 13, lines 23-44: The techniques described herein may allow virtual machines to load or reconfigure respective regions of an FPGA themselves. In certain embodiments of the present disclosure, a virtual machine may provide a bitstream to the FPGA service manager for a service to be loaded into an FPGA for the use of a guest OS or software application running on the virtual machine. Alternatively, if a bitstream for a requested service is stored in a catalog maintained by the hypervisor, the FPGA service manager may obtain the bitstream from the catalog for loading into the FPGA. If a bitstream for a requested shared service has already been loaded into an FPGA, it may be accessed by one or more virtual machines other than the virtual machine at whose request the bitstream was loaded into the FPGA. In certain embodiment, a single virtual machine may access multiple services implemented by bitstreams loaded into an FPGA. This may include both a bitstream (or multiple bitstreams) provided by the same virtual machine (or a different virtual machine) and a bitstream (or multiple bitstreams) obtained from the catalog maintained by the hypervisor, in some embodiments.).
	
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Dube with the teachings of Carey and Marshall to utilize FPGA regions to provide different services to at least one virtual machine. The modification would have been motivated by the desire of incorporating the benefits of FPGAs as computing accelerators into virtualization environments (See at least Background).
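For illustration only, the arrangement described in the quoted Dube passages (a virtual machine supplies a bitstream or the manager obtains one from a catalog, and an already-loaded shared service may be accessed by other virtual machines) can be sketched as follows. This is a minimal sketch under stated assumptions: the class `FpgaServiceManager`, its method `request_service`, the region numbering, and the service and virtual machine names are all hypothetical and are not taken from Dube.

```python
from typing import Dict, List, Optional, Set


class FpgaServiceManager:
    """Hypothetical stand-in for a manager that coordinates programming
    of reconfigurable FPGA regions on behalf of virtual machines."""

    def __init__(self, catalog: Dict[str, bytes], num_regions: int) -> None:
        self.catalog: Dict[str, bytes] = dict(catalog)   # service -> bitstream
        self.free_regions: List[int] = list(range(num_regions))
        self.loaded: Dict[str, int] = {}                 # service -> region
        self.users: Dict[str, Set[str]] = {}             # service -> VMs using it

    def request_service(self, vm: str, service: str,
                        bitstream: Optional[bytes] = None) -> int:
        """Return the region that implements `service` for `vm`."""
        # A service that is already loaded is shared with the requester.
        if service in self.loaded:
            self.users[service].add(vm)
            return self.loaded[service]
        # Otherwise use the VM-provided bitstream, else fall back to the catalog.
        if bitstream is None:
            bitstream = self.catalog.get(service)
        if bitstream is None:
            raise KeyError(f"no bitstream available for service {service!r}")
        if not self.free_regions:
            raise RuntimeError("no free reconfigurable region")
        region = self.free_regions.pop(0)
        # Assumption: a real manager would program `region` with `bitstream` here.
        self.loaded[service] = region
        self.users[service] = {vm}
        return region
```

In this sketch a second virtual machine requesting an already-loaded service receives the same region rather than triggering a new load, which mirrors the shared-service behavior described in the quoted passage at Col. 13, lines 23-44.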

Regarding claim 2, Carey teaches wherein the host interface is a selected one of a Peripheral Component Interconnect Express (PCIe) interface, Cache Coherent Interconnect for Accelerators (CCIX) interface, Gen-Z interface, Open Coherent Accelerator Processor Interface (OpenCAPI) interface ([0029]: the Open Coherent Accelerator Processor Interface (OpenCAPI) is a specification that defines an interface that allows any processor to attach to coherent user-level accelerators and I/O devices. Referring to FIG. 1, a sample computer system 100 is shown to illustrate some of the concepts related to the OpenCAPI interface 150. A processor 110 is coupled to a standard memory 140 or memory hierarchy, as is known in the art. The processor is coupled via a PCIe interface 120 to one or more PCIe devices 130. The processor 110 is also coupled via an OpenCAPI interface 150 to one or more coherent devices, such as accelerator 160, coherent network controller 170, advanced memory 180, and coherent storage controller 190 that controls data stored in storage 195. While the OpenCAPI interface 150 is shown as a separate entity in FIG. 1 for purposes of illustration, instead of being a separate interface as shown in FIG. 1, the OpenCAPI interface 150 can be implemented within each of the coherent devices. Thus, accelerator 160 may have its own OpenCAPI interface, as may the other coherent devices 170, 180 and 190. One of the significant benefits of OpenCAPI is that virtual addresses for the processor 110 can be shared with coherent devices that are coupled to or include an OpenCAPI interface, permitting them to use the virtual addresses in the same manner as the processor 110.), Intel Accelerator Link (IAL) interface, and NVLink interface.

Regarding claim 3, Dube teaches wherein the host interface is a Peripheral Component Interconnect Express (PCIe) interface that supports single-root input/output virtualization (SR-IOV) or scalable input/output virtualization (Scalable IOV) (Col. 12, lines 48-51: In certain embodiments, a guest OS 108 or software application running on guest OS 108 on a virtual machine 105 may access services implemented on FPGA 190. In certain embodiments, FPGA 190 may be installed in hypervisor 104 as a virtual PCIe device. For example, information handling system 100 may include multiple processors connected to various devices, such as Peripheral Component Interconnect (PCI) devices and PCI express (PCIe) devices, including FPGA 190. The operating system (or BIOS) may include one or more drivers configured to facilitate the use of these devices.).

Regarding claim 5, Carey teaches wherein the host-assignable interface is associated with a task offloading module selected from the group consisting of: a virtual machine on the host processor, a container on the host processor, and a process on the host processor (Fig. 3, Accelerator Deployment Tool hosted within the host system; [0033]).

Regarding claim 13, Carey teaches wherein the coprocessor maintains a device feature list that allows the host processor to enumerate the plurality of AFCs ([0071] FIG. 22 shows a programmable device 2220 that has an OpenCAPI interface 2230 and includes an accelerator for loop portion 2240, an accelerator for branching tree portion 2250, and an accelerator for lengthy serial portion 2260. While these three accelerators are shown to be implemented in the same programmable device 2220 in FIG. 22).

Regarding claim 14, Carey teaches wherein the interface circuit is managed by the host-assignable interface, and wherein the host-assignable interface has a base address register that points to the device feature list (Fig. 10; [0060]: location).

Regarding claim 15, Carey teaches wherein the device feature list comprises a linked list of device feature headers, wherein a first of the device feature headers exposes identifier and location information associated with the AFU, and wherein a series of the device feature headers expose identifier and location information associated with the plurality of AFCs (Fig. 10; [0060]: FIG. 10 shows a sample accelerator catalog 1000, which is one suitable implementation for the accelerator catalog 329 shown in FIG. 3. An accelerator catalog may include any suitable data or information that may be needed for an accelerator or the corresponding code portion. For the specific example shown in FIG. 10, accelerator catalog includes each of the following fields: Name, Location, Least Recently Used (LRU), Most Recently Used (MRU), Dependencies, Capabilities, Latency, and Other Characteristics. The Name field preferably includes a name for the accelerator. The name field may also include a name for a code portion that corresponds to the accelerator.).
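For illustration only, the claimed structure of claims 13-15 (a linked list of device feature headers, the first exposing identifier and location information for the AFU and the following ones for the AFCs) can be pictured with a minimal sketch. The names `DeviceFeatureHeader` and `enumerate_features`, the field layout, and every offset below are assumptions made for the example and appear in neither the claims nor Carey.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Optional


@dataclass
class DeviceFeatureHeader:
    """Hypothetical header: an identifier, a location, and a link to the next header."""
    feature_id: str
    location: int                 # hypothetical offset of the feature's registers
    next_offset: Optional[int]    # offset of the next header; None ends the list


def enumerate_features(headers: Dict[int, DeviceFeatureHeader],
                       base: int) -> Iterator[DeviceFeatureHeader]:
    """Walk the linked list starting at `base`, yielding each header in order."""
    offset: Optional[int] = base
    seen = set()
    while offset is not None:
        if offset in seen:
            raise ValueError("cycle detected in device feature list")
        seen.add(offset)
        header = headers[offset]
        yield header
        offset = header.next_offset


# Hypothetical layout: one AFU header followed by two AFC headers.
HEADERS = {
    0x000: DeviceFeatureHeader("AFU", 0x1000, 0x100),
    0x100: DeviceFeatureHeader("AFC0", 0x2000, 0x200),
    0x200: DeviceFeatureHeader("AFC1", 0x3000, None),
}

if __name__ == "__main__":
    for h in enumerate_features(HEADERS, 0x000):
        print(h.feature_id, hex(h.location))
```

The starting offset passed to `enumerate_features` plays the role that claim 14 assigns to the base address register, namely pointing at the head of the list.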

Regarding claim 16, Carey teaches wherein the identifier and location information associated with the AFU and the plurality of AFCs is stored in programmable registers within the interface circuit ([0032]: a catalog of previously-generated accelerators is maintained, and when a previously-generated accelerator can be used, a hardware accelerator is dynamically generated in a programmable device using the previously-generated accelerator image specified in the catalog, and the identified code portion of the computer program is replaced with a call to the hardware accelerator; [0034] Main memory 320 preferably contains data 321, an operating system 322, a computer program 323, an accelerator deployment tool 324, and an accelerator catalog 329.).

Regarding claim 17, Carey teaches further comprising an external memory coupled to the host processor, wherein the identifier and location information associated with the AFU and the plurality of AFCs is stored in the external memory ([0037] Computer system 300 utilizes well known virtual addressing mechanisms that allow the programs of computer system 300 to behave as if they only have access to a large, contiguous address space instead of access to multiple, smaller storage entities such as main memory 320 and local mass storage device 355. Therefore, while data 321, operating system 322, computer program 323, accelerator deployment tool 324, and accelerator catalog 329 are shown to reside in main memory 320, those skilled in the art will recognize that these items are not necessarily all completely contained in main memory 320 at the same time. It should also be noted that the term “memory” is used herein generically to refer to the entire virtual memory of computer system 300, and may include the virtual memory of other computer systems coupled to computer system 300.).

Regarding claim 18, Carey teaches further comprising a memory coupled to the host processor, wherein the identifier and location information associated with the AFU is stored in programmable registers within the interface circuit, and wherein the identifier and location information associated with the plurality of AFCs is stored in the memory ([0037] Computer system 300 utilizes well known virtual addressing mechanisms that allow the programs of computer system 300 to behave as if they only have access to a large, contiguous address space instead of access to multiple, smaller storage entities such as main memory 320 and local mass storage device 355. Therefore, while data 321, operating system 322, computer program 323, accelerator deployment tool 324, and accelerator catalog 329 are shown to reside in main memory 320, those skilled in the art will recognize that these items are not necessarily all completely contained in main memory 320 at the same time. It should also be noted that the term “memory” is used herein generically to refer to the entire virtual memory of computer system 300, and may include the virtual memory of other computer systems coupled to computer system 300.).

Regarding claim 19, Carey teaches wherein the device feature headers are implemented as programmable registers within the interface circuit, wherein the host-assignable interface comprises a privileged host-assignable interface, and wherein the programmable registers are programmed using the privileged host-assignable interface ([0029]; [0030]: For example, the programmable device 200 could be an FPGA or an ASIC. An OpenCAPI interface 210 can be implemented within the programmable device. In addition, one or more accelerators can be implemented in the programmable device 200… The accelerator image, once loaded into the programmable device such as 200 in FIG. 2, creates an accelerator in the programmable device that may be called as needed by one or more computer programs to provide the hardware accelerator(s).).

Regarding claim 20, Marshall teaches wherein at least some of the device feature headers are implemented as programmable registers within the partial reconfiguration region of the AFU (Col. 1, line 59 through Col. 2, line 7: A reconfigurable coprocessor could thus be programmed at different times with different configurations, each adapted for execution of a different computationally intensive task, providing greater efficiency than for a general purpose processor alone without a huge increase in overall cost. In recent FPGA devices, scope is provided for dynamic reconfiguration, wherein partial or total reconfiguration can be provided during the execution of code so that time-multiplexing can be used to provide configurations optimized for different subtasks at different stages of execution of a piece of code.).

Regarding claim 21, Davis teaches wherein the host-assignable interface comprises a privileged host-assignable interface, and wherein each device feature header in the series of device feature headers can be accessed using the base address register of the privileged host-assignable interface ([0066]: In a PCIe context, this can be implemented by assigning different user host partitions to different memory address ranges by configuring the base address registers (BARs) to reserve certain memory address ranges for certain combinations of host partitions and configurable logic partitions.).
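For illustration only, the mechanism described in the passage cited for claim 21 (reserving memory address ranges for particular combinations of host partitions and configurable logic partitions) can be sketched as a simple range lookup. The type `BarRange`, the function `resolve`, and all addresses and partition names are hypothetical and are not taken from the cited reference.

```python
from typing import List, NamedTuple, Optional


class BarRange(NamedTuple):
    """Hypothetical address window reserved for one host/logic partition pair."""
    base: int
    size: int
    host_partition: str
    logic_partition: str


def resolve(ranges: List[BarRange], address: int) -> Optional[BarRange]:
    """Return the window containing `address`, or None if it is unmapped."""
    for r in ranges:
        if r.base <= address < r.base + r.size:
            return r
    return None


# Hypothetical reservation: two windows, one per host/logic partition pair.
RANGES = [
    BarRange(0x0000, 0x1000, "host_A", "logic_0"),
    BarRange(0x1000, 0x1000, "host_B", "logic_1"),
]
```

An access that falls outside every reserved window resolves to `None` in this sketch, which is one simple way to model an address that no partition pair is permitted to use.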

Regarding claim 24, Carey teaches a method for operating a system that includes a host processor and a programmable accelerator device, the method comprising: 
offloading tasks from the host processor to the programmable accelerator device (Fig. 1, Processor 110; Abstract: A computer program is monitored as it executes. A code portion in the computer program is identified that will be improved from being deployed to a hardware accelerator; [0040]: off-load compute-intensive processing from processor 310); 
configuring a slot on the programmable accelerator device to implement an accelerator function unit (AFU), wherein the AFU is subdivided into a plurality of accelerator function unit contexts (AFCs) (Referring to FIG. 2, a programmable device 200 represents any suitable programmable device. For example, the programmable device 200 could be an FPGA or an ASIC… accelerator 1 220A, accelerator 2 220B, . . . , accelerator N 220N (i.e., AFCs)… The accelerator image (i.e., AFU), once loaded into the programmable device such as 200 in FIG. 2, creates an accelerator in the programmable device that may be called as needed by one or more computer programs to provide the hardware accelerator(s); Fig. 22, Accelerator for Loop Portion, Accelerator for Branching Tree Portion, Accelerator for Lengthy Serial Portion; [0035]: The accelerator image generator 327 dynamically generates an accelerator image corresponding to the code portion 326 in the computer program 323 identified by the code analyzer 325. The accelerator image generator 327 may generate an accelerator image from code portion 326 using any suitable method. For example, the accelerator image generator 327 could generate an equivalent hardware description language (HDL) representation of the code portion 326, then synthesize the HDL representation into a suitable accelerator image for the programmable device 312. The accelerator implementer 328 preferably takes an accelerator image generated by the accelerator image generator 327, and uses the accelerator image to program the programmable device 312, thereby generating a hardware accelerator 314 in programmable device 312 that corresponds to the code portion 326.); and 
using an interface circuit in the programmable accelerator device to map the plurality of AFCs to corresponding host-assignable interfaces ([0002]; [0029]: the Open Coherent Accelerator Processor Interface (OpenCAPI) is a specification that defines an interface that allows any processor to attach to coherent user-level accelerators and I/O devices. Referring to FIG. 1, a sample computer system 100 is shown to illustrate some of the concepts related to the OpenCAPI interface 150. A processor 110 is coupled to a standard memory 140 or memory hierarchy, as is known in the art. The processor is coupled via a PCIe interface 120 to one or more PCIe devices 130. The processor 110 is also coupled via an OpenCAPI interface 150 to one or more coherent devices, such as accelerator 160, coherent network controller 170, advanced memory 180, and coherent storage controller 190 that controls data stored in storage 195. While the OpenCAPI interface 150 is shown as a separate entity in FIG. 1 for purposes of illustration, instead of being a separate interface as shown in FIG. 1, the OpenCAPI interface 150 can be implemented within each of the coherent devices. Thus, accelerator 160 may have its own OpenCAPI interface, as may the other coherent devices 170, 180 and 190. One of the significant benefits of OpenCAPI is that virtual addresses for the processor 110 can be shared with coherent devices that are coupled to or include an OpenCAPI interface, permitting them to use the virtual addresses in the same manner as the processor 110.).
 
Carey does not expressly teach performing a context-level reset operation on a selected AFC in the plurality of AFCs, wherein the plurality of AFCs are associated with a virtual machine, and wherein respective AFCs of the plurality of AFCs within the AFU are reconfigurable together based at least in part on the association with the virtual machine.

However, Marshall teaches performing a context-level reset operation on a selected AFC in the plurality of AFCs (Col. 1, line 47 through Col. 2, line 8: A commercially successful form of reconfigurable device is the field-programmable gate array (FPGA). These devices consist of a collection of configurable processing elements embedded in a configurable interconnect network. Configuration memory is provided to describe the interconnect configuration--often SRAM is used. These devices have a very fine-grained structure: typically each processing element of an FPGA is a configurable gate. Rather than being concentrated in a central ALU, processing is thus distributed across the device and the silicon area of the device is used more effectively. An example of a commercially available FPGA series is the Xilinx 4000 series. Such reconfigurable devices can in principle be used for any computing application for which a processor or an ASIC is used. However, a particularly suitable use for such devices is as a coprocessor to handle tasks which are computationally intensive, but which are not so common as to merit a purpose built ASIC. A reconfigurable coprocessor could thus be programmed at different times with different configurations, each adapted for execution of a different computationally intensive task, providing greater efficiency than for a general purpose processor alone without a huge increase in overall cost. In recent FPGA devices, scope is provided for dynamic reconfiguration, wherein partial or total reconfiguration can be provided during the execution of code so that time-multiplexing can be used to provide configurations optimized for different subtasks at different stages of execution of a piece of code).

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Marshall with the teachings of Carey to further define coprocessors as accelerators comprising cores programmable to execute specific functions to aid in the execution of a host processor. The modification would have been motivated by the desire to design and implement dynamically configurable many-core systems that increase the efficiency, flexibility, and scalability of parallel processing tasks for certain applications.

	Carey and Marshall do not expressly teach wherein the plurality of AFCs are associated with a virtual machine, and wherein respective AFCs of the plurality of AFCs within the AFU are reconfigurable together based at least in part on the association with the virtual machine.

However, Dube teaches wherein the plurality of AFCs are associated with a virtual machine, and wherein respective AFCs of the plurality of AFCs within the AFU are reconfigurable together based at least in part on the association with the virtual machine (Col. 10, line 58 through Col. 11, line 3: In the illustrated embodiment, information handling system 100 also includes an FPGA 190, which may be programmed to provide services to software applications running on any of the virtual machines 105 by loading a bitstream file for each such service into a respective reconfigurable region of FPGA 190. The bitstream file, which may include a binary image or an object file that is consumed by the FPGA in order to program its gates, may be referred to herein as simply a “bitstream.” In various embodiments, a bitstream loaded on FPGA 109 may implement an accelerator program or another type of specialized routine that is called by application software on one or more of the virtual machines 105.; Col. 12, lines 48-51: In certain embodiments, a guest OS 108 or software application running on guest OS 108 on a virtual machine 105 may access services implemented on FPGA 190. In certain embodiments, FPGA 190 may be installed in hypervisor 104 as a virtual PCIe device.; Col. 12, line 65 through Col. 13, line 12: The systems and method disclosed herein may allow one or more FPGAs in a server to be used as a shared resource for virtual machines. For example, the techniques disclosed herein may be used to manage an FPGA card that is being used as a shared resource between virtual machines, various ones of which might or might not reside on the same physical computer hardware. 
More specifically, these techniques allow an FPGA service manager, such as FPGA service manager 220, to control or coordinate the programming, by users of virtual machines (e.g., guest operating systems, software applications, or other clients of the virtual machines) of reconfigurable regions of an FPGA with services that they need to perform their functions.; Col. 13, lines 23-44: The techniques described herein may allow virtual machines to load or reconfigure respective regions of an FPGA themselves. In certain embodiments of the present disclosure, a virtual machine may provide a bitstream to the FPGA service manager for a service to be loaded into an FPGA for the use of a guest OS or software application running on the virtual machine. Alternatively, if a bitstream for a requested service is stored in a catalog maintained by the hypervisor, the FPGA service manager may obtain the bitstream from the catalog for loading into the FPGA. If a bitstream for a requested shared service has already been loaded into an FPGA, it may be accessed by one or more virtual machines other than the virtual machine at whose request the bitstream was loaded into the FPGA. In certain embodiment, a single virtual machine may access multiple services implemented by bitstreams loaded into an FPGA. This may include both a bitstream (or multiple bitstreams) provided by the same virtual machine (or a different virtual machine) and a bitstream (or multiple bitstreams) obtained from the catalog maintained by the hypervisor, in some embodiments.).
	
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Dube with the teachings of Carey and Marshall to utilize FPGA regions to provide different services to at least one virtual machine. The modification would have been motivated by the desire of incorporating the benefits of FPGAs as computing accelerators into virtualization environments (See at least Background).

Regarding claim 25, Carey teaches wherein the context-level reset operation is initiated via a function-level reset directed at the AFU or via management registers associated with the interface circuit ([0030] Deploying accelerators to programmable devices is well-known in the art. Referring to FIG. 2, a programmable device 200 represents any suitable programmable device. For example, the programmable device 200 could be an FPGA or an ASIC. An OpenCAPI interface 210 can be implemented within the programmable device. In addition, one or more accelerators can be implemented in the programmable device 200. FIG. 1 shows by way of example accelerator 1 220A, accelerator 2 220B, . . . , accelerator N 220N. In the prior art, a human designer would determine what type of accelerator is needed based on a function that needs to be accelerated by being implemented in hardware. The accelerator function could be represented, for example, in a hardware description language (HDL). Using known tools, the human designer can then generate an accelerator image that corresponds to the HDL. The accelerator image, once loaded into the programmable device such as 200 in FIG. 2, creates an accelerator in the programmable device that may be called as needed by one or more computer programs to provide the hardware accelerator(s).).

Regarding claim 26, Carey teaches further comprising: using the interface circuit to filter transactions targeted to the selected AFC (Fig. 11, Steps 1120 Analyze Run-time Performance of Computer Program, Step 1130 Identify Codes Portion in Computer Program that will be improved by use of a hardware accelerator, 1140 Select a code portion, 1150 Previously Generated accelerator in catalog for selected code portion? 1162 revise computer program to replace selected code portion with call to accelerator; Fig. 22).

Regarding claim 31, it is a method type claim having similar limitations as claim 1 above. Therefore, it is rejected under the same rationale above.

Regarding claim 32, Carey teaches further comprising: updating device feature header registers on the acceleration device with new identifier information associated with the AFU and the AFCs (Fig. 2, Fig. 10, Fig. 22; Fig. 10; [0060]: FIG. 10 shows a sample accelerator catalog 1000, which is one suitable implementation for the accelerator catalog 329 shown in FIG. 3. An accelerator catalog may include any suitable data or information that may be needed for an accelerator or the corresponding code portion. For the specific example shown in FIG. 10, accelerator catalog includes each of the following fields: Name, Location, Least Recently Used (LRU), Most Recently Used (MRU), Dependencies, Capabilities, Latency, and Other Characteristics. The Name field preferably includes a name for the accelerator. The name field may also include a name for a code portion that corresponds to the accelerator.).

Claims 4, 6-11, 22, 23, and 33 are rejected under 35 U.S.C. 103 as being unpatentable over Carey, Marshall, and Dube, as applied to claim 1, in further view of Davis (US 2018/0088174 A1).

Regarding claim 4, Dube teaches a PCIe connection to the FPGA, but neither Carey, Marshall, nor Dube expressly teaches wherein the host-assignable interface is a selected one of a PCIe physical function, a PCIe SR-IOV virtual function, and a PCIe Scalable IOV assignable device interface.
However, Davis teaches wherein the host-assignable interface is a selected one of a PCIe physical function, a PCIe SR-IOV virtual function, and a PCIe Scalable IOV assignable device interface ([0072]: The application logic 530 can be used to communicate with drivers of the user host partitions 560. In, for example, a PCIe context, user lanes 535-537 can be implemented as a physical or virtual function mapped to an address range during an enumeration of devices connected to the physical interconnect. The application drivers can communicate with the application logic 530 by addressing transactions to the address range assigned to a certain one of the user lanes 535-537.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Davis with the teachings of Carey, Marshall, and Dube to utilize PCIe to establish communication between a host processor and an accelerator. The modification would have been motivated by the desire to offload compute-intensive tasks to the accelerators.

Regarding claim 6, Davis teaches wherein the plurality of AFCs are provided with unique context identifiers, and wherein transactions between the interface circuit and the AFU are tagged with the unique context identifiers to provide address space isolation ([0045] The CPU 144 is coupled to the configurable hardware 142 via an interface 350. The interface 350 can be implemented with any suitable interconnect technology, including, but not limited to: PCIe, Ethernet, and Infiniband. Each of the application logic portions uses a different reserve portion of the interface 350 in order to communicate to its associated user mode process. For example, each of the user mode processes may be allowed access to a different range of memory addresses, and the host logic 310 in turn couples each of the individual application logic portions to only the memory address ranges associated with their corresponding process.).

Regarding claim 7, Davis teaches wherein the interface circuit uses a context mapping table to map the unique context identifiers to platform-specific identifiers for upstream and downstream memory requests between the host processor and the plurality of AFCs and for requests initiated by the host processor to the plurality of AFCs ([0045]: For example, each of the user mode processes may be allowed access to a different range of memory addresses, and the host logic 310 in turn couples each of the individual application logic portions to only the memory address ranges associated with their corresponding process. Thus, the application logic is further independent because data cannot be sent to or from user mode processes other than those user mode processes associated with the application logic unit; [0061] The control plane software stack can include a management driver 556 for communicating over the physical interconnect connecting the server computer 540 to the configurable hardware platform 510. The management driver 556 can encapsulate commands, requests, responses, messages, and data originating from the management partition 550 for transmission over the physical interconnect. Additionally, the management driver 556 can de-encapsulate commands, requests, responses, messages, and data sent to the management partition 550 over the physical interconnect. Specifically, the management driver 556 can communicate with the host logic 520 of the configurable hardware platform 510 via one or more of the supervisor lanes 525-527. For example, the supervisor lanes can access a physical or virtual function mapped to an address range during an enumeration of devices connected to the physical interconnect. 
The management driver 556 can communicate with the host logic 520 by addressing transactions to the address range assigned to one or more of the supervisor lanes 525-527; [0066]; [0067]: The application logic 530 can be partitioned into two or more portions, and each of the portions can be assigned to one or more of the user host partitions. Each of the configurable logic partitions are excluded from accessing other partitions of the configurable hardware platform by the host logic 520, which manages partitioning of the application logic 530 resources, and communications between the application logic 530 and user host partitions 560. As shown, the host logic 520 allocates a number of different lanes of interconnect supervisor lanes 525-527. In a PCIe context, each of the lanes can be associated with a user host partition/configurable logic partition pair; [0070] In alternative examples, the partitions within the application logic 530 are configured to communicate their respective associated user host partitions 560 without communicating through the host logic 520. In such examples, the user configurable logic partitions are coupled to a respective one of the user host partitions via one of the user lanes 535-537. Each of the user lanes is configured to transmit data between the host partitions and the configurable logic partitions. For example, in a PCIe context, each lane is associated with a different memory address range; [0072]: The application logic 530 can be used to communicate with drivers of the user host partitions 560. In, for example, a PCIe context, user lanes 535-537 can be implemented as a physical or virtual function mapped to an address range during an enumeration of devices connected to the physical interconnect. The application drivers can communicate with the application logic 530 by addressing transactions to the address range assigned to a certain one of the user lanes 535-537. 
Specifically, the application logic 530 can communicate with an application logic management driver 562 to exchange commands, requests, responses, messages, and data over the control plane. The application logic 530 can communicate with an application logic data plane driver 563 to exchange commands, requests, responses, messages, and data over the data plane; [0073]).

Regarding claim 8, Davis teaches wherein the platform-specific identifiers comprise Peripheral Component Interconnect Express (PCIe) bus, device, and function numbers and optionally a process address space identifier (PASID) ([0066]: In a PCIe context, this can be implemented by assigning different user host partitions to different memory address ranges by configuring the base address registers (BARs) to reserve certain memory address ranges for certain combinations of host partitions and configurable logic partitions.).

Regarding claim 9, Davis teaches wherein: the host-assignable interface comprises a PCIe physical function, and all of the AFCs in the AFU are associated with and are accessed through the PCIe physical function during a physical function (PF) mode; 
the host-assignable interface comprises a PCIe virtual function, and all of the AFCs in the AFU are associated with and are accessed through the PCIe virtual function during a virtual function (VF) mode ([0072] The application logic 530 can be used to communicate with drivers of the user host partitions 560. In, for example, a PCIe context, user lanes 535-537 can be implemented as a physical or virtual function mapped to an address range during an enumeration of devices connected to the physical interconnect. The application drivers can communicate with the application logic 530 by addressing transactions to the address range assigned to a certain one of the user lanes 535-537.); or 
at least a first portion of the AFCs in the AFU are associated with and are accessed through the PCIe physical function, and at least a second portion of the AFCs in the AFU are associated with and are accessed through the PCIe virtual function during a mixed mode.

Regarding claim 10, Davis teaches wherein the interface circuit further comprises an internal table for saving the unique context identifiers, and wherein the internal table is indexed by PCIe tags automatically associated with the upstream memory requests and returned with the downstream memory requests ([0045]: For example, each of the user mode processes may be allowed access to a different range of memory addresses, and the host logic 310 in turn couples each of the individual application logic portions to only the memory address ranges associated with their corresponding process. Thus, the application logic is further independent because data cannot be sent to or from user mode processes other than those user mode processes associated with the application logic unit; [0061] The control plane software stack can include a management driver 556 for communicating over the physical interconnect connecting the server computer 540 to the configurable hardware platform 510. The management driver 556 can encapsulate commands, requests, responses, messages, and data originating from the management partition 550 for transmission over the physical interconnect. Additionally, the management driver 556 can de-encapsulate commands, requests, responses, messages, and data sent to the management partition 550 over the physical interconnect. Specifically, the management driver 556 can communicate with the host logic 520 of the configurable hardware platform 510 via one or more of the supervisor lanes 525-527. For example, the supervisor lanes can access a physical or virtual function mapped to an address range during an enumeration of devices connected to the physical interconnect. 
The management driver 556 can communicate with the host logic 520 by addressing transactions to the address range assigned to one or more of the supervisor lanes 525-527; [0066]; [0067]: The application logic 530 can be partitioned into two or more portions, and each of the portions can be assigned to one or more of the user host partitions. Each of the configurable logic partitions are excluded from accessing other partitions of the configurable hardware platform by the host logic 520, which manages partitioning of the application logic 530 resources, and communications between the application logic 530 and user host partitions 560. As shown, the host logic 520 allocates a number of different lanes of interconnect supervisor lanes 525-527. In a PCIe context, each of the lanes can be associated with a user host partition/configurable logic partition pair; [0070] In alternative examples, the partitions within the application logic 530 are configured to communicate their respective associated user host partitions 560 without communicating through the host logic 520. In such examples, the user configurable logic partitions are coupled to a respective one of the user host partitions via one of the user lanes 535-537. Each of the user lanes is configured to transmit data between the host partitions and the configurable logic partitions. For example, in a PCIe context, each lane is associated with a different memory address range; [0072]: The application logic 530 can be used to communicate with drivers of the user host partitions 560. In, for example, a PCIe context, user lanes 535-537 can be implemented as a physical or virtual function mapped to an address range during an enumeration of devices connected to the physical interconnect. The application drivers can communicate with the application logic 530 by addressing transactions to the address range assigned to a certain one of the user lanes 535-537. 
Specifically, the application logic 530 can communicate with an application logic management driver 562 to exchange commands, requests, responses, messages, and data over the control plane. The application logic 530 can communicate with an application logic data plane driver 563 to exchange commands, requests, responses, messages, and data over the data plane; [0073]).

Regarding claim 11, Davis teaches wherein the coprocessor further comprises an address decoder configured to decode the unique context identifiers based on a memory-mapped input-output (MMIO) address associated with the host-assignable interface ([0045]; [0061]: Additionally, the management driver 556 can de-encapsulate commands, requests, responses, messages, and data sent to the management partition 550 over the physical interconnect. Specifically, the management driver 556 can communicate with the host logic 520 of the configurable hardware platform 510 via one or more of the supervisor lanes 525-527. For example, the supervisor lanes can access a physical or virtual function mapped to an address range during an enumeration of devices connected to the physical interconnect. The management driver 556 can communicate with the host logic 520 by addressing transactions to the address range assigned to one or more of the supervisor lanes 525-527).

Regarding claim 22, Davis teaches wherein the host-assignable interface comprises an unprivileged host-assignable interface, and wherein only a subset of device feature headers in the series of device feature headers can be accessed using a base address register associated with the unprivileged host-assignable interface ([0066]: In a PCIe context, this can be implemented by assigning different user host partitions to different memory address ranges by configuring the base address registers (BARs) to reserve certain memory address ranges for certain combinations of host partitions and configurable logic partitions; [0072]: The application logic 530 can be used to communicate with drivers of the user host partitions 560. In, for example, a PCIe context, user lanes 535-537 can be implemented as a physical or virtual function mapped to an address range during an enumeration of devices connected to the physical interconnect. The application drivers can communicate with the application logic 530 by addressing transactions to the address range assigned to a certain one of the user lanes 535-537.).

Regarding claim 23, Davis teaches wherein the host-assignable interface comprises a privileged host-assignable interface that is operable to reprogram the device feature list or at least some of the device feature headers in the device feature list ([0072]: The application logic 530 can be used to communicate with drivers of the user host partitions 560. In, for example, a PCIe context, user lanes 535-537 can be implemented as a physical or virtual function mapped to an address range during an enumeration of devices connected to the physical interconnect. The application drivers can communicate with the application logic 530 by addressing transactions to the address range assigned to a certain one of the user lanes 535-537.).

Regarding claim 33, Davis teaches further comprising: setting up a context mapping table to include the new identifier information, wherein the context mapping table maps unique context identifiers associated with the plurality of AFCs to platform-specific identifiers ([0045]: For example, each of the user mode processes may be allowed access to a different range of memory addresses, and the host logic 310 in turn couples each of the individual application logic portions to only the memory address ranges associated with their corresponding process. Thus, the application logic is further independent because data cannot be sent to or from user mode processes other than those user mode processes associated with the application logic unit; [0061] The control plane software stack can include a management driver 556 for communicating over the physical interconnect connecting the server computer 540 to the configurable hardware platform 510. The management driver 556 can encapsulate commands, requests, responses, messages, and data originating from the management partition 550 for transmission over the physical interconnect. Additionally, the management driver 556 can de-encapsulate commands, requests, responses, messages, and data sent to the management partition 550 over the physical interconnect. Specifically, the management driver 556 can communicate with the host logic 520 of the configurable hardware platform 510 via one or more of the supervisor lanes 525-527. For example, the supervisor lanes can access a physical or virtual function mapped to an address range during an enumeration of devices connected to the physical interconnect. 
The management driver 556 can communicate with the host logic 520 by addressing transactions to the address range assigned to one or more of the supervisor lanes 525-527; [0066]; [0067]: The application logic 530 can be partitioned into two or more portions, and each of the portions can be assigned to one or more of the user host partitions. Each of the configurable logic partitions are excluded from accessing other partitions of the configurable hardware platform by the host logic 520, which manages partitioning of the application logic 530 resources, and communications between the application logic 530 and user host partitions 560. As shown, the host logic 520 allocates a number of different lanes of interconnect supervisor lanes 525-527. In a PCIe context, each of the lanes can be associated with a user host partition/configurable logic partition pair; [0070] In alternative examples, the partitions within the application logic 530 are configured to communicate their respective associated user host partitions 560 without communicating through the host logic 520. In such examples, the user configurable logic partitions are coupled to a respective one of the user host partitions via one of the user lanes 535-537. Each of the user lanes is configured to transmit data between the host partitions and the configurable logic partitions. For example, in a PCIe context, each lane is associated with a different memory address range; [0072]: The application logic 530 can be used to communicate with drivers of the user host partitions 560. In, for example, a PCIe context, user lanes 535-537 can be implemented as a physical or virtual function mapped to an address range during an enumeration of devices connected to the physical interconnect. The application drivers can communicate with the application logic 530 by addressing transactions to the address range assigned to a certain one of the user lanes 535-537. 
Specifically, the application logic 530 can communicate with an application logic management driver 562 to exchange commands, requests, responses, messages, and data over the control plane. The application logic 530 can communicate with an application logic data plane driver 563 to exchange commands, requests, responses, messages, and data over the data plane; [0073]).

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Carey, Marshall, and Dube, as applied to claim 1, in further view of Albot (US 2017/0123690 A1).

Regarding claim 12, Carey, Marshall, and Dube do not expressly teach wherein a given AFC in the plurality of AFCs is operable to issue an interrupt to the host processor, and wherein the interrupt is tagged with a unique context identifier associated with only the given AFC.

	However, Albot teaches wherein a given AFC in the plurality of AFCs is operable to issue an interrupt to the host processor, and wherein the interrupt is tagged with a unique context identifier associated with only the given AFC (Abstract: a temporary effective address associated with a virtual segment identifier (VSID), wherein the VSID is received by a processor in an asynchronous interrupt generated by a coherent accelerator in response to a page fault generated by the coherent accelerator in executing an instruction).

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Albot with the teachings of Carey, Marshall, and Dube to notify a processor of a page fault occurrence in an accelerator. The modification would have been motivated by the desire to utilize hardware accelerators such as FPGAs to handle compute-intensive tasks.

Claims 27 and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Carey, Marshall, and Dube, as applied to claim 24 above, in further view of Jiang et al. (US 2019/0004842 A1).

Regarding claim 27, Carey, Marshall, and Dube do not expressly teach further comprising: using the interface circuit to send a context-level reset (CLR) message to the selected AFC to direct the AFU to stop issuing requests associated with the selected AFC.

	However, Jiang teaches further comprising: using the interface circuit to send a context-level reset (CLR) message to the selected AFC to direct the AFU to stop issuing requests associated with the selected AFC ([0051]: The virtualization context switch also includes performing a “reset” on the work engine 117, which causes work for the current virtual function to stop in the hardware accelerator(s) 234 and microcontroller(s) 230 and causes the hardware accelerator(s) 234 and microcontroller 230 to restart.).

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Jiang with the teachings of Carey, Marshall, and Dube to restart/reset an accelerator. The modification would have been motivated by the desire of re-initializing and/or reprogramming the hardware accelerator. 

Regarding claim 28, Jiang teaches further comprising: in response to receiving the CLR message from the interface circuit, using the selected AFC to return a context-level reset (CLR) acknowledgement to the interface circuit ([0051]: The virtualization context switch also includes performing a “reset” on the work engine 117, which causes work for the current virtual function to stop in the hardware accelerator(s) 234 and microcontroller(s) 230 and causes the hardware accelerator(s) 234 and microcontroller 230 to restart. With the restart, the virtualization scheduler 236 performs a initialization of the hardware accelerator(s) 234 and reprograms the firmware location for the microcontroller 230. Once the re-initialization and reprogramming are done, the virtualization scheduler 236 causes the hardware accelerator(s) 234 and microcontroller 230 to start processing any pending jobs. (i.e., acknowledgement)).

Claim 29 is rejected under 35 U.S.C. 103 as being unpatentable over Carey, Marshall, Dube, and Jiang, as applied to claim 28, in further view of Hugosson et al. (US 2011/0191539 A1).

Regarding claim 29, Carey, Marshall, Dube, and Jiang do not expressly teach further comprising: 
after sending the CLR message to the selected AFC, waiting for all outstanding upstream and downstream requests associated with the selected AFC to be flushed out before terminating the function-level reset operation for the selected AFC.

However, Hugosson teaches further comprising: 
after sending the CLR message to the selected AFC, waiting for all outstanding upstream and downstream requests associated with the selected AFC to be flushed out before terminating the function-level reset operation for the selected AFC ([0006]: FIG. 1 schematically illustrates such a data processing system 100, comprising host processor 105, coprocessor 110 and system memory 115. The coprocessor 110 can be seen to comprise a set of coprocessor engines 120 and an MMU 125. Coprocessor engine 120 comprise a hardware accelerator 130, a microcontroller 135 and a DMA unit 140. The coprocessor engine 120 accesses system memory 115 via MMU 125. The configuration of MMU 125 (defined in configuration registers 145) is determined by host processor 105, which writes the required configuration data into the configuration registers 145. This enables the host processor to maintain control over the view of system memory 115 that each of the coprocessor engines 120 has. In particular, host processor 105 writes a different page table base address into one of the configuration registers 145, depending on the processing session being carried out by coprocessor 110. MMU 125 further comprises a translation look aside buffer (TLB) 150, which caches translations between virtual addresses and physical addresses previously made by the MMU 125. The MMU 125 has previously retrieved those translations from the page table in system memory 115 indicated by the page table base address stored in configuration registers 145. When coprocessor 110 completes a processing session, and will be required to start another processing session, host processor 105 issues a reset signal ("RESET"), which causes the coprocessor engines 120 to flush their local caches (and any other session specific information) and causes MMU 125 to flush TLB 150. 
Host processor 105 writes a new set of MMU configuration data into the configuration registers 145, in particular writing a new page table base address appropriate for the next processing session. Coprocessor 110 can then begin the next processing session).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Hugosson with the teachings of Carey, Marshall, Dube, and Jiang to allow the coprocessor engines to flush their caches prior to beginning the next processing session. The modification would have been motivated by the desire of ensuring that the coprocessor is configured for the next session.

Allowable Subject Matter
Claim 30 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Response to Arguments
Applicant’s arguments with respect to claims 1-33 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
Applicant's amendment necessitated the new grounds of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JORGE A CHU JOY-DAVILA whose telephone number is (571)270-0692. The examiner can normally be reached Monday-Friday, 9:00am-5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Meng-Ai T An, can be reached at (571) 272-3756. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JORGE A CHU JOY-DAVILA/Primary Examiner, Art Unit 2195