DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending in this application.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 19-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  
With regard to claims 19-20, the claims are drawn to “a computer-readable recording medium”. The specification recites at ¶ [0083] “The term computer readable media as used herein may include computer storage media. Computer storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, or program tools. The system memory 704, the removable storage device 709, and the non-removable storage device 710 are all computer storage media examples (e.g., memory storage). Computer storage media may include RAM, ROM, electrically erasable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other article of manufacture which can be used to store information and which can be accessed by the computing device 700. Any such computer storage media may be part of the computing device 700. Computer storage media does not include a carrier wave or other propagated or modulated data signal.”, but does not describe or define what the claimed “a computer-readable recording medium” is limited to. Thus, applying the broadest reasonable interpretation in light of the claim language, and taking into account the meaning of the words in their ordinary usage as they would be understood by one of ordinary skill in the art (MPEP 2111), the claimed “computer-readable recording medium” covers both transitory and non-transitory media. A transitory medium does not fall into any of the four categories of invention (process, machine, manufacture, or composition of matter).


Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
As per claims 1, 13 and 19 (line# refers to claim 1):
Lines 9-10 recite “a set of heterogeneous accelerators”. However, lines 1-2 earlier recite “a plurality of accelerators”. Thus, it is unclear whether the second recitation, “a set of heterogeneous accelerators”, is the same as or different from the first recitation, “a plurality of accelerators”. If they are the same, the same name should be used.

Line 10 recites “one or more RAN servers”. However, line 2 earlier recites “one or more radio access network (RAN) servers”. Thus, it is unclear whether the second recitation, “one or more RAN servers”, is the same as or different from the first recitation, “one or more radio access network (RAN) servers”. If they are the same, “the” or “said” should be used.

As per claims 2, 4 and 14 (line# refers to claim 2):
Lines 1-2 recite “the generating the accelerator-agnostic task instruction code”, which lacks antecedent basis. It is unclear whether this phrase is intended to refer to “determining accelerator-agnostic task instruction code” as recited in claim 1, line 7.

As per claims 3 and 5 (line# refers to claim 3):
Lines 1-2 recite “the generating the accelerator-specific executable”, which lacks antecedent basis. It is unclear whether this phrase is intended to refer to “translating…into an accelerator-specific executable” as recited in claim 1, lines 11-12. If they are the same, the same term should be used (i.e., the translated accelerator-specific executable).

As per claims 6, 15 and 20 (line# refers to claim 6):
Line 2 recites “a central processor”. However, claim 1, at line 15, already recites “a central processor”. Thus, it is unclear whether the second recitation of “a central processor” is the same as or different from the first recitation. If they are the same, “the” or “said” should be used.

As per claims 7-12 and 16-18:
They are computer-implemented method and system claims that depend on claims 1 and 13, respectively. Therefore, they have the same deficiencies as claims 1 and 13 above.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 9-13 and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Bernat et al. (US Pub. 2019/0141120 A1) in view of Sun et al. (US Patent 10,262,390 B1), and further in view of YAMATO (US Pub. 2022/0188086 A1) and Ko (US Pub. 2013/0322270 A1).
Bernat was cited in the IDS filed on 08/19/2022.

As per claim 1, Bernat teaches the invention substantially as claimed including A computer-implemented method for generating instruction code for a plurality of accelerators for execution on one or more radio access network (RAN) servers in a RAN, a method comprising (Bernat, Fig. 1, 100 (as RAN), 140 service provider A-C (as RAN servers), 160, 162, 164, 166, 168, 170 compute/accel devices (as a plurality of accelerators), [0025] lines 1-13, The client compute device 110, edge resources 150, 152, 154 (e.g., the compute devices 160, 162, 164, 166, 168, 170), the edge gateway device 130, the fog nodes 180, the core data center 190, and the compiler compute device 120 are illustratively in communication via a network, which may be embodied as any type of wired or wireless communication network, including…a radio access network (RAN); [0026] lines 1-13, the compiler compute device 120, in operation, may execute a method 300 for compiling source code for an application (e.g., the application 114) that may be selectively offloaded (e.g., from the client compute device 110) to one or more edge resource(s) (e.g., one or more of the edge resources 150, 152, 154). The method 300 begins with block 302, in which the compiler compute device 120 determines whether to compile source code; [0033] lines 6-9, capable of being performed by) each edge resource (e.g., each device available in the edge resources 150, 152, 154, such as FPGAs, GPUs, VPUs, processors, etc. (as accelerators for execution on one or more radio access network (RAN) servers in a RAN)): 
identifying, based on the set of application code, one or more an annotation indicative for performing a task (Bernat, [0026] lines 15-27, determines whether a set of source code (e.g., a portion of the source code to be compiled) for an application (e.g., the application 114) is associated with a target performance objective (e.g., a goal to be satisfied in the execution of that portion of the application). In doing so, in the illustrative embodiment, the compiler compute device 120 determines whether an annotation indicative of a target performance objective is present in the set of the source code (as based on the set of application code), as indicated in block 306. The annotation may be any data (e.g., a language construct, such as a pragma or directive, a keyword, a predefined character sequence followed by instructions, etc.), that specifies to a compiler how the associated source code is to be processed (as identifying, based on the set of application code (i.e., the set of source code), one or more an annotation indicative for performing a task (i.e., processing the source code of the application (as the task))));
determining accelerator-agnostic task instruction code, wherein the accelerator-agnostic task instruction code corresponds to the identified one or more an annotation indicative (Bernat, Fig. 3, 306, determine whether an annotation indicative of a target performance objective is present; [0027] lines 1-31, in determining whether an annotation indicative of a target performance objective is present, the compiler compute device 120 may determine whether the set of source code is associated with a target performance objective for latency (e.g., the annotation indicates to prioritize reducing the latency with which the operations are performed). Additionally or alternatively, the compiler compute device 120 may determine whether the set of source code is associated with a target performance objective for cost (e.g., the annotation indicates to prioritize reducing the monetary cost of performing the operations)…As indicated in block 314, the compiler compute device 120 may identify subsets of the source code to be executed in parallel (e.g., to reduce latency) (as determining accelerator-agnostic task instruction code (i.e., the identified set of source code that is associated with a target performance objective (e.g., latency, cost, etc.), which corresponds to the identified one or more an annotation indicative)));
identifying an accelerator for performing the task from a set of heterogeneous accelerators associated with one or more RAN servers (Bernat, Fig. 1, 100 (as RAN), 140 service provider A-C (as RAN servers), 150 edge resources including 160, 162, 164, 166, 168, 170 compute/accel devices (as set of heterogeneous accelerators associated with the RAN servers); Fig. 6, 532 determine available Edge resources; [0028] lines 2-17, determines target architecture(s) to satisfy the target performance objective(s) from block 306. As indicated in block 318, the compiler compute device 120 may determine accelerator device architecture(s) (e.g., GPU, VPU, FPGA, etc.) to reduce the latency in the execution of the operations defined in the annotated set of source code. For example, if the operations are primarily matrix multiply and accumulate operations, the compiler compute device 120 may determine that a GPU is a target architecture for performing the operations. If, on the other hand, the operations are primarily machine-learning or other artificial intelligence related operations, the compiler compute device 120 may determine that a VPU is a target architecture. Additionally or alternatively, the compiler compute device 120 may determine that a target architecture is an FPGA or other accelerator device); 
translating the accelerator-agnostic task instruction code into an accelerator-specific executable, wherein the accelerator-specific executable is executable on the identified accelerator (Bernat, Fig. 4, 324 compile the source code, 326 produce object code, 328 produce separate sections of object code for sets of the source code having target performance objectives; [0010] line 10, translating the instructions to a different format; [0029] lines 3-24, the compiler compute device 120 produces object code (e.g., a sequence of statements or instructions in a computer language, such as a machine code language (i.e., binary) or an intermediate language such as register transfer language (RTL)) (as translating the accelerator-agnostic task instruction code into an accelerator-specific executable), as indicated in block 326. As indicated in block 328, the compiler compute device 120, in the illustrative embodiment, produces one or more separate sections (e.g., binary files) of object code for each set of the source code that has a corresponding set (e.g., one or more) of target performance metrics. Further, as indicated in block 330, the compiler compute device 120 may produce multiple sets of object code (e.g., one for each of multiple architectures) for the same set of source code having a target performance objective. For example, a set (e.g., a portion) of the source code may have an annotation indicating that the set of the source code is to be executed with low latency. As such, the compiler compute device 120 may produce a section of object code with an instruction set usable by a GPU, another section of object code with an instruction set usable by a general purpose processor, and another section of object code defining a configuration of gates for an FPGA (as translated accelerator-specific executable is executable (usable) on the identified accelerator)); 
scheduling the accelerator-specific executable to offload processing of the set of application code from a central processor to the identified accelerator (Bernat, Fig. 9, 596 Offload sections of the application to target edge resource for execution, 602 send object code…to be executed by different target edge resources; [0014] lines 14-28, determine whether a section 116 of an application to be executed by the client compute device 110 is available to be offloaded to one or more of the edge resources 150, 152, 154, determine one or more characteristics of an edge resource 150, 152, 154 (e.g., a latency, a power usage, a cost of usage) available to execute the section 116 (e.g., by sending a request, to the edge gateway device 130, for the characteristics), determine, as a function of the one or more characteristics and a target performance objective associated with the section 116, whether to offload the section 116 to the one or more edge resources 150, 152, 154, and offload, in response to a determination to offload the section 116 (taken as a whole, as scheduling for offloading); [0015] lines 23-26, The processor 212 may be embodied as any type of processor capable of performing the functions described herein (e.g., executing one or more sections of the application 114); [0030] lines 10-13, the client compute device 110 may determine to enable selective edge offloading if the client compute device 110 has been requested to execute an application; also see [0034] lines 1-19 [Examiner notes: the application code is offloaded from the processor 212 in the client device to the edge resources (accelerators) for execution]); and
causing the identified accelerator to execute the accelerator-specific executable for performing the task (Bernat, Fig. 9, 614, 616; [0036] lines 19-24, the client compute device 110 may send data that identifies the edge resource that is to execute a corresponding version of the object code for each of multiple different architectures (e.g., data that indicates that a bit stream defining the section is to be executed by a particular FPGA)).

Bernat fails to specifically teach receiving a set of application code via an application programming interface (API).

However, Sun teaches receiving a set of application code via an application programming interface (API) (Sun, Col 5, lines 40-45, the GPU API 114 is configured to transmit blocks of application code (e.g., compute kernels) of the GPU-accelerated application 112 and any associated data, which are to be processed by one or more GPU server nodes within the server cluster 150 that have been allocated by the GPU service platform 130 to handle the service request).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of Bernat with Sun because Sun’s teaching of receiving the application code via the GPU API would have provided Bernat’s system with the advantage and capability of using an application programming interface (API) to transmit the application code, thereby improving communication speed and system efficiency.

Bernat and Sun fail to specifically teach the annotation indicative is common functional blocks.

However, YAMATO teaches the annotation indicative is common functional blocks (YAMATO, [0080] lines 1-7, The application code analysis section 112 executes an application code analysis step of analyzing an application code. The application code analysis section 112 analyzes the source code of the processing function to understand the code structure such as loop statements, reference relationships of variables, and functional blocks for processing (FFT: Fast Fourier Transform process) (as determining, based on the application code, common functional blocks); [0090] lines 1-8, The off-loadable area extraction section 114a identifies processing off-loadable to a GPU or FPGA, examples of which processing include loop statements and FFT processing, and extracts an intermediate language according to the off-load processing. The off-loadable area extraction section 114a identifies the off-loadable processing of the application code with reference to the code pattern DB 132; [0152] lines 1-6, the off-load processing designation section 114 identifies off-loadable processes of the application, which off-loadable processes include loop statements that can be processed in parallel, functional blocks of specific processing, and library calls. For each loop statement, the off-load processing designation section 114 specifies a directive specifying a parallel process by the accelerator, and performs compilation).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of Bernat and Sun with YAMATO because YAMATO’s teaching of determining the functional blocks by analyzing the source code would have provided Bernat and Sun’s system with the advantage and capability of easily determining the code structure for processing the tasks, thereby improving system efficiency and performance.

Bernat, Sun and YAMATO fail to specifically teach wherein the task is associated with processing data traffic in the RAN.

However, Ko teaches wherein the task is associated with processing data traffic in the RAN (Ko, Abstract, lines 1-2, A computer-based method for processing user data traffic in a RAN; [0005] lines 1-4, regulating data flow in a Radio Access Network (RAN) of wireless cellular networks. In one aspect of the invention, a computer-based method for processing user data traffic in a radio access network).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of Bernat, Sun and YAMATO with Ko because Ko’s teaching of processing user data traffic in a RAN would have provided Bernat, Sun and YAMATO’s system with the advantage and capability of allowing the accelerator resources within the RAN to process the user data traffic, thereby improving system efficiency and performance.

As per claim 9, Bernat, Sun, YAMATO and Ko teach the invention according to claim 1 above. Bernat further teaches wherein the task corresponds to processing one or more operational partitions associated with processing in the RAN (Bernat, [0025] lines 1-13, The client compute device 110, edge resources 150, 152, 154 (e.g., the compute devices 160, 162, 164, 166, 168, 170), the edge gateway device 130, the fog nodes 180, the core data center 190, and the compiler compute device 120 are illustratively in communication via a network, which may be embodied as any type of wired or wireless communication network, including…a radio access network (RAN); [0029] lines 16-19, a set (e.g., a portion) of the source code may have an annotation indicating that the set of the source code is to be executed with low latency. As such, the compiler compute device 120 may produce a section of object code with an instruction set usable by a GPU, another section of object code with an instruction set usable by a general purpose processor, and another section of object code defining a configuration of gates for an FPGA (as processing one or more operational partitions associated with processing)). In addition, Ko teaches processing data traffic in the RAN (Ko, Abstract, lines 1-2, A computer-based method for processing user data traffic in a RAN; [0005] lines 1-4, regulating data flow in a Radio Access Network (RAN) of wireless cellular networks. In one aspect of the invention, a computer-based method for processing user data traffic in a radio access network).

As per claim 10, Bernat, Sun, YAMATO and Ko teach the invention according to claim 1 above. Bernat further teaches wherein the task corresponds to execution of a service application in the one or more RAN servers (Bernat, [0025] lines 1-13, The client compute device 110, edge resources 150, 152, 154 (e.g., the compute devices 160, 162, 164, 166, 168, 170), the edge gateway device 130, the fog nodes 180, the core data center 190, and the compiler compute device 120 are illustratively in communication via a network, which may be embodied as any type of wired or wireless communication network, including…a radio access network (RAN); [0026] lines 15-27, determines whether a set of source code (e.g., a portion of the source code to be compiled) for an application (e.g., the application 114) (as service application) is associated with a target performance objective (e.g., a goal to be satisfied in the execution of that portion of the application). In doing so, in the illustrative embodiment, the compiler compute device 120 determines whether an annotation indicative of a target performance objective is present in the set of the source code (as based on the set of application code), as indicated in block 306. The annotation may be any data (e.g., a language construct, such as a pragma or directive, a keyword, a predefined character sequence followed by instructions, etc.), that specifies to a compiler how the associated source code is to be processed). In addition, Ko teaches wherein the service application includes one of: video streaming, location tracking, or network monitoring (Ko, [0002] voice call, video call data) and the packet-switched data are carried over by the same network resources (NodeB, RNC, UE, and the interfaces between them) inside RAN (Radio Access Network); [0018] lines 5-10, Namely, 3G cellular network 100 supports two types of network flows or traffic: circuit-switched (CS) traffic (e.g., voice, real-time video streaming) (as video streaming application) and packet-switched (PS) traffic (e.g., web contents, youtube videos); also see [0005] lines 1-4, regulating data flow in a Radio Access Network (RAN) of wireless cellular networks. In one aspect of the invention, a computer-based method for processing user data traffic in a radio access network).

As per claim 11, Bernat, Sun, YAMATO and Ko teach the invention according to claim 1 above. Bernat further teaches wherein the one or more annotation indicative include one or more of: a state store, a caching, a forward error correction (FEC), a data encryption, a data decryption, or a time synchronization (Bernat, [0029] lines 40-46, as indicated in block 334. Similarly, the compiler compute device 120 may convert any annotations associated with security requirements (e.g., an annotation indicating that the section should be executed in a trusted execution environment (TEE), that the section may only be executed by one of an identified set of parties; [0035] lines 12-17, as indicated in block 584, the client compute device 110 may determine whether an available type of security enhanced environment in the edge resources satisfies a set of secure environment parameters (e.g., require hardware-based memory encryption (as data decryption), require the ability to allocate private regions of memory as enclaves, etc.) defined in association with the section 116 of the application 114 (e.g., in an API call from block 336)).

As per claim 12, Bernat, Sun, YAMATO and Ko teach the invention according to claim 1 above. Bernat further teaches where the RAN is associated with a far edge data center of a cloud RAN infrastructure (Bernat, [0013] lines 32-35, the edge network may form a portion of or otherwise provide an ingress point into a fog network (e.g., fog nodes 180), which may be embodied as a system-level horizontal architecture that distributes resources and services of computing, storage, control and networking anywhere between a core data center 190 (e.g., a data center that is further away (as far edge data center of a cloud RAN infrastructure) from and in a higher level of a hierarchy of the system 100 than the edge resources 150, 152, 154, and that includes multiple compute devices capable of executing one or more services; also see [0012] lines 9-11, resources, such as compute devices and the components thereof, owned and/or operated by one or more service providers, such as cellular network operators) or other compute devices located in a cloud; [0025] lines 1-13, The client compute device 110, edge resources 150, 152, 154 (e.g., the compute devices 160, 162, 164, 166, 168, 170), the edge gateway device 130, the fog nodes 180, the core data center 190, and the compiler compute device 120 are illustratively in communication via a network, which may be embodied as any type of wired or wireless communication network, including…a radio access network (RAN)).

As per claim 13, it is a system claim of claim 1 above. Therefore, it is rejected for the same reason as claim 1 above. In addition, Bernat further teaches  a processor; and a memory storing computer-executable instructions that when executed by the processor cause the system to (Bernat, [0010] lines 1-7, The disclosed embodiments may be implemented, in some cases, in hardware, firmware, software, or any combination thereof. The disclosed embodiments may also be implemented as instructions carried by or stored on a transitory or non-transitory machine-readable (e.g., computer-readable) storage medium, which may be read and executed by one or more processors; last line of [0010], memory).

As per claims 17-18, they are system claims of claims 10 and 11 respectively above. Therefore, they are rejected for the same reasons as claims 10 and 11 respectively above.

As per claim 19, it is a computer-readable recording medium claim of claim 1 above. Therefore, it is rejected for the same reason as claim 1 above. 


Claims 2 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Bernat, Sun, YAMATO and Ko, as applied to claims 1 and 13 respectively above, and further in view of Kundu et al. (US Pub. 2022/0276914 A1).

As per claim 2, Bernat, Sun, YAMATO and Ko teach the invention according to claim 1 above. Bernat teaches wherein the determining the accelerator-agnostic task instruction code interfaced with the identified one or more annotation indicative (Bernat, Fig. 3, 306, determine whether an annotation indicative of a target performance objective is present; [0027] lines 1-31, in determining whether an annotation indicative of a target performance objective is present, the compiler compute device 120 may determine whether the set of source code is associated with a target performance objective for latency (e.g., the annotation indicates to prioritize reducing the latency with which the operations are performed). Additionally or alternatively, the compiler compute device 120 may determine whether the set of source code is associated with a target performance objective for cost (e.g., the annotation indicates to prioritize reducing the monetary cost of performing the operations)…As indicated in block 314, the compiler compute device 120 may identify subsets of the source code to be executed in parallel (e.g., to reduce latency) (as determining accelerator-agnostic task instruction code (i.e., the identified set of source code that is associated with a target performance objective (e.g., latency, cost, etc.), which corresponds to the identified one or more an annotation indicative))). In addition, YAMATO teaches the annotation indicative is common functional blocks (YAMATO, [0080] lines 1-7, The application code analysis section 112 executes an application code analysis step of analyzing an application code. The application code analysis section 112 analyzes the source code of the processing function to understand the code structure such as loop statements, reference relationships of variables, and functional blocks for processing (FFT: Fast Fourier Transform process) (as determining, based on the application code, common functional blocks); [0090] lines 1-8, The off-loadable area extraction section 114a identifies processing off-loadable to a GPU or FPGA, examples of which processing include loop statements and FFT processing, and extracts an intermediate language according to the off-load processing. The off-loadable area extraction section 114a identifies the off-loadable processing of the application code with reference to the code pattern DB 132; [0152] lines 1-6, the off-load processing designation section 114 identifies off-loadable processes of the application, which off-loadable processes include loop statements that can be processed in parallel, functional blocks of specific processing, and library calls. For each loop statement, the off-load processing designation section 114 specifies a directive specifying a parallel process by the accelerator, and performs compilation).

Bernat, Sun, YAMATO and Ko fail to specifically teach generating the accelerator-agnostic task instruction code comprises building the API.

However, Kundu teaches generating the accelerator-agnostic task instruction code comprises building the API (Kundu, [0071] lines 1-9,  an example of an interface to pair of accelerators where each accelerator is accessed by an application via a corresponding logical device, according to at least one embodiment. In at least one embodiment, an application 102 hosted on a computer system utilizes a plurality of accelerators to perform various workloads. In at least one embodiment, a logical device is an application programming interface created of executable instructions; [0076] lines 16-20, a library/driver is a set of executable instructions that is specific to an accelerator that, as a result of being executed, creates an interface between manufacturer-specific hardware and another piece of software; [0113] lines 7-10, data center 1400 can implement an API as described above so that applications hosted within data center 1400 can use acceleration resources effectively in a simple way; [0426]  an application generates instructions (e.g., in form of API calls) that cause driver kernel to generate one or more tasks for execution by PPU 3500 (as generate the accelerator-agnostic task instruction code and building the API for processing)).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of Bernat, Sun, YAMATO and Ko with Kundu because Kundu’s teaching of building the API, which allows the tasks to be processed on different accelerators, would have provided Bernat, Sun, YAMATO and Ko’s system with the advantage and capability of using acceleration resources effectively in a simple way via the API (see Kundu, [0113] lines 7-10).

As per claim 14, it is a system claim of claim 2 above. Therefore, it is rejected for the same reason as claim 2 above.


Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Bernat, Sun, YAMATO, Ko and Kundu, as applied to claim 2 above, and further in view of Pasternak (US Pub. 2012/0096527 A1).

As per claim 3, Bernat, Sun, YAMATO, Ko and Kundu teach the invention according to claim 2 above. Bernat teaches generating the accelerator-specific executable (Bernat, Fig. 4, 324 compile the source code, 326 produce object code, 328 produce separate sections of object code for sets of the source code having target performance objectives; [0010] line 10, translating the instructions to a different format; [0029] lines 3-24, the compiler compute device 120 produces object code (e.g., a sequence of statements or instructions in a computer language, such as a machine code language (i.e., binary) or an intermediate language such as register transfer language (RTL)) (as translating the accelerator-agnostic task instruction code into an accelerator-specific executable), as indicated in block 326. As indicated in block 328, the compiler compute device 120, in the illustrative embodiment, produces one or more separate sections (e.g., binary files) of object code for each set of the source code that has a corresponding set (e.g., one or more) of target performance metrics. Further, as indicated in block 330, the compiler compute device 120 may produce multiple sets of object code (e.g., one for each of multiple architectures) for the same set of source code having a target performance objective. For example, a set (e.g., a portion) of the source code may have an annotation indicating that the set of the source code is to be executed with low latency. As such, the compiler compute device 120 may produce a section of object code with an instruction set usable by a GPU, another section of object code with an instruction set usable by a general purpose processor, and another section of object code defining a configuration of gates for an FPGA); 

Bernat, Sun, YAMATO, Ko and Kundu fail to specifically teach that generating the accelerator-specific executable comprises accessing the API.

However, Pasternak teaches that generating the accelerator-specific executable comprises accessing the API (Pasternak, [0017] lines 5-7, Web services 140 are typically application programming interfaces (API) or web APIs that may be accessed; [0028] lines 1-8, The cmdlets code generator 209 can invoke a client proxy (e.g., proxy dll 221) and automatically create a snapin dll 223, that contains, for example, cmdlets for the methods in the proxy. The cmdlets code generator 209 can run a proxy dll 221 for an identified Web service, connect to and login to the Web service proxy, identify the Web service methods in the proxy, and create the object driven shell commands code (as accessing the API when generating the object driven shell commands code); also see [0018] lines 8-12, use a client proxy to communicate with a Web service 140 and a software framework 110 compiler (e.g., .NET compiler) generates Common Intermediate Language (CIL) code).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of Bernat, Sun, YAMATO, Ko and Kundu with Pasternak because Pasternak’s teaching of accessing the API when generating the code (as the accelerator-specific executable; please note that generating the executable was taught by Bernat) would have provided Bernat, Sun, YAMATO, Ko and Kundu’s system with the advantage and capability of accessing the API to generate the code, which improves system efficiency and performance (see Pasternak, [0005] Traditionally…one or more software developers to manually write code to communicate to a Web service…Such projects can take a significant amount of time to develop and manage a team of software developers).


Claims 4-5 are rejected under 35 U.S.C. 103 as being unpatentable over Bernat, Sun, YAMATO and Ko, as applied to claim 1 above, and further in view of Pasternak (US Pub. 2012/0096527 A1).

As per claim 4, Bernat, Sun, YAMATO and Ko teach the invention according to claim 1 above. Bernat teaches wherein the determining the accelerator-agnostic task instruction code interfaced with the identified one or more annotation indicative (Bernat, Fig. 3, 306, determine whether an annotation indicative of a target performance objective is present; [0027] lines 1-31, in determining whether an annotation indicative of a target performance objective is present, the compiler compute device 120 may determine whether the set of source code is associated with a target performance objective for latency (e.g., the annotation indicates to prioritize reducing the latency with which the operations are performed). Additionally or alternatively, the compiler compute device 120 may determine whether the set of source code is associated with a target performance objective for cost (e.g., the annotation indicates to prioritize reducing the monetary cost of performing the operations)…As indicated in block 314, the compiler compute device 120 may identify subsets of the source code to be executed in parallel (e.g., to reduce latency) (as determining accelerator-agnostic task instruction code (i.e., the identified set of source code that is associated with a target performance objective (e.g., latency, cost, etc.), which corresponds to the identified one or more annotation indicative))). In addition, YAMATO teaches the annotation indicative is common functional blocks (YAMATO, [0080] lines 1-7, The application code analysis section 112 executes an application code analysis step of analyzing an application code. 
The application code analysis section 112 analyzes the source code of the processing function to understand the code structure such as loop statements, reference relationships of variables, and functional blocks for processing (FFT: Fast Fourier Transform process) (as determining, based on the application code, common functional blocks); [0090] lines 1-8, The off-loadable area extraction section 114a identifies processing off-loadable to a GPU or FPGA, examples of which processing include loop statements and FFT processing, and extracts an intermediate language according to the off-load processing. The off-loadable area extraction section 114a identifies the off-loadable processing of the application code with reference to the code pattern DB 132; [0152] lines 1-6, the off-load processing designation section 114 identifies off-loadable processes of the application, which off-loadable processes include loop statements that can be processed in parallel, functional blocks of specific processing, and library calls. For each loop statement, the off-load processing designation section 114 specifies a directive specifying a parallel process by the accelerator, and performs compilation).

Bernat, Sun, YAMATO and Ko fail to specifically teach building a dynamic link library (DLL).

However, Pasternak teaches building a dynamic link library (DLL) (Pasternak, [0027] lines 1-8, The cmdlets code generator 209 can automatically create object driven shell commands code (cmdlets code), such as a shell commands container dll 223 (snapin dll 223) in response to receiving the user input (e.g., URL) and using the proxy code (e.g., proxy dll 221). The cmdlets code generator 209 can create an assembly (e.g., .NET assembly). One type of assembly is library assemblies (dll). The cmdlets code generator 209 can create an empty dll file (e.g., snapin dll 223) as the assembly for the Web service and a module for the assembly).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of Bernat, Sun, YAMATO and Ko with Pasternak because Pasternak’s teaching of accessing the API when generating the code (as the accelerator-specific executable; please note that generating the executable was taught by Bernat) and building a DLL (as the dynamic link library) would have provided Bernat, Sun, YAMATO and Ko’s system with the advantage and capability of using fewer resources by building the DLL: when multiple programs use the same library of functions, a DLL can reduce the duplication of code that is loaded on the disk, which improves system efficiency (i.e., the advantage of using a DLL).

As per claim 5, Bernat, Sun, YAMATO, Ko and Pasternak teach the invention according to claim 4 above. Bernat teaches generating the accelerator-specific executable (Bernat, Fig. 4, 324 compile the source code, 326 produce object code, 328 produce separate sections of object code for sets of the source code having target performance objectives; [0010] line 10, translating the instructions to a different format; [0029] lines 3-24, the compiler compute device 120 produces object code (e.g., a sequence of statements or instructions in a computer language, such as a machine code language (i.e., binary) or an intermediate language such as register transfer language (RTL)) (as translating the accelerator-agnostic task instruction code into an accelerator-specific executable), as indicated in block 326. As indicated in block 328, the compiler compute device 120, in the illustrative embodiment, produces one or more separate sections (e.g., binary files) of object code for each set of the source code that has a corresponding set (e.g., one or more) of target performance metrics. Further, as indicated in block 330, the compiler compute device 120 may produce multiple sets of object code (e.g., one for each of multiple architectures) for the same set of source code having a target performance objective. For example, a set (e.g., a portion) of the source code may have an annotation indicating that the set of the source code is to be executed with low latency. As such, the compiler compute device 120 may produce a section of object code with an instruction set usable by a GPU, another section of object code with an instruction set usable by a general purpose processor, and another section of object code defining a configuration of gates for an FPGA). 
In addition, Pasternak teaches accessing the DLL (Pasternak, [0037] lines 1-6, the tool automatically generates invocation infrastructure code (e.g., invocation dll) for communicating with the Web service client proxy to invoke a Web service method hosted by a server. The invocation infrastructure code can be contained in an invocation dll and is to be accessible by an object driven shell runtime, such as PowerShell™ runtime).
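For illustration only, the per-architecture object code generation cited from Bernat (blocks 326-330) might be sketched as follows. This is a simplified stand-in, not Bernat's compiler: the function `compile_sections`, the annotation string `"low_latency"`, and the choice to emit only a general-purpose processor section for unannotated code are assumptions made for the sketch.

```python
from typing import Optional

# Hypothetical target list: GPU, general purpose processor, FPGA.
TARGETS = ("gpu", "cpu", "fpga")


def compile_sections(
    source_sets: dict[str, tuple[str, Optional[str]]],
) -> dict[str, dict[str, str]]:
    """Return one object-code section per target for each set of source code.

    source_sets maps a set name to (source text, annotation or None).
    Annotated sets get a section for every target architecture;
    unannotated sets get a single "cpu" section (an assumption).
    """
    sections: dict[str, dict[str, str]] = {}
    for name, (code, annotation) in source_sets.items():
        archs = TARGETS if annotation == "low_latency" else ("cpu",)
        # A real compiler would emit binaries or gate configurations here;
        # the string below only marks which target the section is for.
        sections[name] = {arch: f"{arch}-object({code})" for arch in archs}
    return sections
```

A runtime could then pick whichever section matches the accelerator that is actually available, without recompiling the source.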


Claims 6-8, 15-16 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Bernat, Sun, YAMATO and Ko, as applied to claims 1, 13 and 19 respectively above, and further in view of Nachimuthu et al. (US Pub. 2019/0253518 A1).

As per claim 6, Bernat, Sun, YAMATO and Ko teach the invention according to claim 1 above. Bernat further teaches receiving status information indicating availability of a central processor of one of the one or more RAN servers (Bernat, Fig. 6;  532 determine available edge resources,  538 receive data indicative of characteristics of available edge resources, 548 receive data indicative of available processor architecture; [0032] lines 25-30, The client compute device 110 may also receive data indicative of available processor architectures (e.g., processors (as include central processor) with hardware support for encryption, compression, or other extended feature sets, processors designed for low power consumption and/or low cost, with reduced feature sets, etc.), as indicated in block 548); 
receiving status information indicating availability of the plurality of accelerators associated with the one or more RAN servers (Bernat, Fig. 6;  532 determine available edge resources,  538 receive data indicative of characteristics of available edge resources, 544 receive data indicative of available graphics processing unit (GPU) devices; [0032] lines 8-17, the client compute device 110 may receive data indicative of available field programmable gate array (FPGA) devices, as indicated in block 542. The client compute device 110 may also receive data indicative of available graphics processing unit (GPU) devices, as indicated in block 544. Further, the client compute device 110 may receive data indicative of available visual processing units (VPU) devices) and 
identifying, based on the received status information, the accelerator from the plurality of accelerators (Bernat, Fig. 6 to Fig. 7, 568, determine, as function of the characteristics of the available edge resources, whether a target performance objective would be satisfied by offloading execution of the corresponding section of the application to one or more available edge resources; [0034] lines 1-15, Referring now to FIG. 7, in determining whether a target performance objective would be satisfied by offloading execution to the edge, the client compute device 110 may determine whether an available edge resource 150, 152, 154 is capable of executing the section 116 with lower latency than the local compute device (e.g., the client compute device 110), as indicated in block 570. For example, the client compute device 110 may compare the latency data obtained in block 552 to latency data for the client compute device 110 (e.g., an average number of operations per second, etc.) to determine whether any of the edge resources 150, 152, 154 is capable of executing the section 116 faster. Similarly, the client compute device 110 may determine whether an available edge resource 150, 152, 154 is capable of executing the section 116 at a lower cost than the client compute device 110; also see [0028] lines 2-17, determines target architecture(s) to satisfy the target performance objective(s) from block 306. As indicated in block 318, the compiler compute device 120 may determine accelerator device architecture(s) (e.g., GPU, VPU, FPGA, etc.) to reduce the latency in the execution of the operations defined in the annotated set of source code. For example, if the operations are primarily matrix multiply and accumulate operations, the compiler compute device 120 may determine that a GPU is a target architecture for performing the operations. 
If, on the other hand, the operations are primarily machine-learning or other artificial intelligence related operations, the compiler compute device 120 may determine that a VPU is a target architecture. Additionally or alternatively, the compiler compute device 120 may determine that a target architecture is an FPGA or other accelerator device).

Bernat, Sun, YAMATO and Ko fail to specifically teach that the received status information indicates a first workload of a central processor and a second workload of the plurality of accelerators.

However, Nachimuthu teaches that the received status information indicates a first workload of a central processor and a second workload of the plurality of accelerators (Nachimuthu, Fig. 16, 1620 processor, 1622, 1624 accelerator devices; [0083] lines 5-17, In addition to receiving status data indicative of the health of each resource, the orchestrator server 1602 may additionally receive status data that is indicative of the present utilization of each resource (as workload of processor and plurality of accelerators) (e.g., a percentage of the total compute capacity of the processor 1620 that is presently being used, a percentage of the total acceleration capacity being used in each accelerator device 1622, 1624, the number of AFUs 1640, 1642, 1644, 1646 that are presently performing operations of workload(s), etc.), as indicated in block 1726. Subsequently, the method 1700 advances to block 1728 of FIG. 18, in which the orchestrator server 1602 may obtain a request to compose a node from a set of resources to execute a workload; also see Abstract, corresponding resource to be utilized in the execution of a workload).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of Bernat, Sun, YAMATO and Ko with Nachimuthu because Nachimuthu’s teaching of receiving status data that indicates the utilization (workload utilization) of each different resource would have provided Bernat, Sun, YAMATO and Ko’s system with the advantage and capability of easily determining the workload utilization of each resource and selecting the resources for execution based on that utilization, which improves system efficiency and performance.
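For illustration only, selecting an accelerator from received status information that reports both health and present utilization might be sketched as follows. The sketch is not drawn from Bernat or Nachimuthu; the `ResourceStatus` fields, the function `select_accelerator`, and the least-utilized selection rule are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceStatus:
    """Hypothetical status record for one resource."""

    name: str
    kind: str           # "cpu", "gpu", "fpga", ...
    utilization: float  # fraction of capacity presently in use, 0.0 to 1.0
    healthy: bool = True


def select_accelerator(statuses: list[ResourceStatus]) -> Optional[ResourceStatus]:
    """Pick the least-utilized healthy accelerator, or None if there is none.

    The central processor ("cpu") is reported in the same status list but is
    not itself a candidate accelerator in this sketch.
    """
    candidates = [s for s in statuses if s.kind != "cpu" and s.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.utilization)
```

Other selection rules (for example, weighting utilization against latency or cost) would fit the same status record.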

As per claim 7, Bernat, Sun, YAMATO, Ko and Nachimuthu teach the invention according to claim 6 above. Bernat further teaches wherein the plurality of accelerators are heterogeneous accelerators (Bernat, [0033] lines 6-9, capable of being performed by) each edge resource (e.g., each device available in the edge resources 150, 152, 154, such as FPGAs, GPUs, VPUs, processors, etc. (as heterogeneous accelerators)).

As per claim 8, Bernat, Sun, YAMATO, Ko and Nachimuthu teach the invention according to claim 7 above. Bernat further teaches wherein the plurality of heterogeneous accelerators includes two or more of: an ASIC-based network interface card, an FPGA-based network interface card, an NPU-based network interface card, a GPU, an FPGA-based accelerator, or an NPU-based accelerator (Bernat, [0033] lines 6-9, capable of being performed by) each edge resource (e.g., each device available in the edge resources 150, 152, 154, such as FPGAs, GPUs, VPUs, processors, etc).

As per claims 15-16, they are system claims of claims 6 and 8 respectively above. Therefore, they are rejected for the same reasons as claims 6 and 8 respectively above.

As per claim 20, it is a computer-readable recording medium claim of claim 6 above. Therefore, it is rejected for the same reason as claim 6 above. 


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZUJIA XU whose telephone number is (571)272-0954. The examiner can normally be reached M-F 9:00-5:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Meng-Ai An, can be reached on (571) 272-3756. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Z.X./Examiner, Art Unit 2195
/MENG AI T AN/Supervisory Patent Examiner, Art Unit 2195