DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

I. ALLOWABLE SUBJECT MATTER
Claim 21 is allowed.

II. REJECTIONS BASED ON PRIOR ART
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 8-20 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Zelda et al. (US Patent 11,036,827).

As per claim 1, Zelda teaches/suggests a method for operating an embedded system, the embedded system performing an accelerated processing capability using a Lightweight Intelligent Software Framework (LISF), comprising: initializing and configuring (e.g. associated with programming of hardware accelerator), by a parallelization managing Function Entity (FE), entities present within resources for performing mathematical operations in parallel (e.g. associated with resources of the hardware accelerator for performing GEMM parallel computing); and processing in parallel, by an acceleration managing FE, the mathematical operations using the configured entities (e.g. associated with carrying out GEMM parallel computing) (Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48).

As per claim 8, Zelda teaches/suggests all the claimed features of claim 1 above, where Zelda further teaches/suggests the method comprising wherein the mathematical operations include operations in a neural network (Fig. 1-4; Fig. 7-8; col. 1, ll. 50-53; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48; and col. 14, ll. 24-26).

As per claim 9, Zelda teaches/suggests all the claimed features of claim 1 above, where Zelda further teaches/suggests the method comprising wherein the parallelization managing FE allocates a device memory, copies data from a host to a device, sets a kernel, and again copies results of an operation (Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48), wherein it would have been obvious to implement the above claimed features for properly carrying out parallel GEMM.

As per claim 10, Zelda teaches/suggests all the claimed features of claim 9 above, where Zelda further teaches/suggests the method comprising wherein instances of the kernel are executed in parallel while each of the instances is processing a single work item (Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48), wherein it would have been obvious to implement the above claimed features for properly carrying out parallel GEMM.

As per claim 11, Zelda teaches/suggests all the claimed features of claim 9 above, where Zelda further teaches/suggests the method comprising wherein instances of the kernel are executed together as multiple work items as a part of a work group (Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48), wherein it would have been obvious to implement the above claimed features for properly carrying out parallel GEMM.

As per claim 12, Zelda teaches/suggests all the claimed features of claim 11 above, where Zelda further teaches/suggests the method comprising wherein an instance of each kernel in the work group communicates with an additional instance (Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48), wherein it would have been obvious to implement the above claimed features for properly carrying out parallel GEMM.

As per claim 13, Zelda teaches/suggests all the claimed features of claim 9 above, where Zelda further teaches/suggests the method comprising wherein the parallelization managing FE manages a parallel-processing queue for performing parallel processing depending on a number of devices in the embedded system (Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48), wherein it would have been obvious to implement the above claimed features for properly carrying out parallel GEMM.

As per claim 14, Zelda teaches/suggests all the claimed features of claim 9 above, where Zelda further teaches/suggests the method comprising wherein the parallelization managing FE divides a matrix with weights and bias values taking a parallel processing performance of the device into consideration to maximize parallelism in multiple device environments, and the parallel processing capability of the device is determined by the number of kernel instances that are executed at a time, a maximum work group size of the device, or a maximum work item size (Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48), wherein it would have been obvious to implement the above claimed features for properly carrying out parallel GEMM.

As per claim 15, Zelda teaches/suggests all the claimed features of claim 1 above, where Zelda further teaches/suggests the method comprising wherein the acceleration managing FE controls the resources so that a corresponding device performs a General Matrix Multiply (GEMM) operation on the divided matrix and input data depending on the divided matrix (Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48), wherein it would have been obvious to implement the above claimed features for properly carrying out parallel GEMM.

As per claim 16, Zelda teaches/suggests all the claimed features of claim 15 above, where Zelda further teaches/suggests the method comprising wherein: the GEMM operation is represented by an equation of C=αAB+βC, where A, B, and C are matrices and α and β are scalar values, sizes of matrices A, B, and C are indicated by M, N, and K, the size of matrix A is M*K, the size of matrix B is K*N, and the size of matrix C is M*N (Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48), wherein it would have been obvious to implement the above claimed features.
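For illustration only, the GEMM relationship recited in claim 16 (C=αAB+βC, with A of size M*K, B of size K*N, and C of size M*N) may be sketched in plain Python as shown below. The function name gemm and the list-of-rows matrix representation are assumptions made for this sketch; neither is taken from Zelda or from the application.

```python
def gemm(alpha, A, B, beta, C):
    """Return alpha*A*B + beta*C for matrices stored as lists of rows.

    A is M x K, B is K x N, and C is M x N.
    Hypothetical helper for illustration only.
    """
    M, K = len(A), len(A[0])
    N = len(B[0])
    # Reject operands whose shapes do not match the M, N, K convention.
    if len(B) != K or len(C) != M or any(len(row) != N for row in C):
        raise ValueError("expected A: M x K, B: K x N, C: M x N")
    return [
        [alpha * sum(A[i][k] * B[k][j] for k in range(K)) + beta * C[i][j]
         for j in range(N)]
        for i in range(M)
    ]


A = [[1, 2], [3, 4]]   # M = 2, K = 2
B = [[5, 6], [7, 8]]   # K = 2, N = 2
C = [[1, 1], [1, 1]]   # M = 2, N = 2
result = gemm(2, A, B, 3, C)
```

With alpha set to 1 and beta set to 0, the routine reduces to an ordinary matrix product of A and B.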

As per claim 17, Zelda teaches/suggests all the claimed features of claim 16 above, where Zelda further teaches/suggests the method comprising wherein the parallelization managing FE divides rows of matrix A by a number of OpenCL devices, and a size of a sub-matrix resulting from division is determined by a number of corresponding OpenCL devices and a number of usable OpenCL devices (Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48), wherein it would have been obvious to implement the above claimed features for properly carrying out parallel GEMM.
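For illustration only, one possible way of dividing the rows of matrix A among a number of devices, as recited in claim 17, is sketched below. The function name split_rows and the near-even split with leftover rows assigned to the first devices are assumptions made for this sketch; the claim language does not specify how a remainder is handled, and the sketch is not taken from Zelda or from the application.

```python
def split_rows(num_rows, num_devices):
    """Return one (start, stop) row range per device.

    The ranges are contiguous and together cover rows 0..num_rows-1.
    Hypothetical helper; the remainder policy is an assumption.
    """
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    base, extra = divmod(num_rows, num_devices)
    ranges = []
    start = 0
    for device in range(num_devices):
        # The first `extra` devices each take one additional row.
        size = base + (1 if device < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# Example: M = 10 rows of matrix A divided among 3 devices.
ranges = split_rows(10, 3)
```

Each device would then perform the GEMM operation on its own block of rows of A against the full matrix B, producing the corresponding rows of C.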

As per claim 18, Zelda teaches/suggests all the claimed features of claim 15 above, where Zelda further teaches/suggests the method comprising wherein the acceleration managing FE shares a memory between a host and devices to minimize the cost of the mathematical operations, and each device performs mathematical routines without copying data between the host and the device by accessing the host's vector and matrix using a memory address (Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48), wherein it would have been obvious to implement the above claimed features for properly carrying out parallel GEMM.

As per claim 19, Zelda teaches/suggests all the claimed features of claim 15 above, where Zelda further teaches/suggests the method comprising wherein the acceleration managing FE groups the matrix into vectors to maximize a workload for each kernel (Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48), wherein it would have been obvious to implement the above claimed features for properly carrying out parallel GEMM.

As per claim 20, Zelda teaches/suggests all the claimed features of claim 19 above, where Zelda further teaches/suggests the method comprising wherein the acceleration managing FE determines a size of a work group to allow each device to perform parallel processing (Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48), wherein it would have been obvious to implement the above claimed features for properly carrying out parallel GEMM.

As per claim 22, claim 22 is rejected in accordance with the same rationale and reasoning as the above rejection of claim 1, where Zelda further teaches/suggests the embedded system comprising: resources; and an acceleration unit, where the acceleration unit is configured to operate accordingly (Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48).

Claims 2-7 are rejected under 35 U.S.C. 103 as being unpatentable over Zelda et al. (US Patent 11,036,827) as applied to claim 1 above, and further in view of SINGH et al. (US Pub.: 2018/0189981).

As per claim 2, Zelda teaches/suggests all the claimed features of claim 1 above, where Zelda further teaches/suggests the method comprising wherein the LISF corresponds to a system comprising a platform, a device, a context, and a kernel (Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48), but Zelda does not expressly teach/suggest a command queue.
SINGH teaches/suggests a method comprising a command queue (e.g. associated with command buffer/queue: [0058] and [0109]).
It would have been obvious for one of ordinary skill in this art, before the effective filing date of the claimed invention, to include SINGH’s command queuing architecture into Zelda’s method for the benefit of preserving memory bus bandwidth and reducing system memory access power requirements (SINGH, [0032]) to obtain the invention as specified in claim 2.

As per claim 3, Zelda and SINGH teach/suggest all the claimed features of claim 2 above, where Zelda and SINGH teach/suggest the method comprising wherein the platform corresponds to a heterogeneous platform using at least one Central Processing Unit (CPU) (Zelda, col. 4, l. 39) and one Graphics Processing Unit (GPU) (Zelda, Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48; and SINGH, [0035]; [0058]; [0109]).

As per claim 4, Zelda and SINGH teach/suggest all the claimed features of claim 2 above, where Zelda and SINGH teach/suggest the method comprising wherein the device comprises actual processors for performing the mathematical operations (Zelda, Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48; and SINGH, Fig. 4; [0035]; [0063]; [0109]).

As per claim 5, Zelda and SINGH teach/suggest all the claimed features of claim 2 above, where Zelda and SINGH teach/suggest the method comprising wherein the context comprises an entity for managing the resources in a device set (Zelda, Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48; and SINGH, Fig. 4; [0035]; [0096]; [0109]).

As per claim 6, Zelda and SINGH teach/suggest all the claimed features of claim 2 above, where Zelda and SINGH teach/suggest the method comprising wherein the command queue comprises an entity for executing a kernel and performing memory mapping/unmapping and synchronization (Zelda, Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48; and SINGH, Fig. 4; [0032]; [0035]; [0051]; [0109]; [0121]).

As per claim 7, Zelda and SINGH teach/suggest all the claimed features of claim 2 above, where Zelda teaches/suggests the method comprising wherein the kernel comprises a code running on the device (Zelda, Fig. 1-4; Fig. 7-8; col. 3, l. 15 to col. 5, l. 54; col. 9, l. 19 to col. 10, l. 55; col. 11, ll. 21-48).

III. CLOSING COMMENTS

CONCLUSION
STATUS OF CLAIMS IN THE APPLICATION
The following is a summary of the treatment and status of all claims in the application as recommended by M.P.E.P. § 707.07(i):
CLAIMS REJECTED IN THE APPLICATION
Per the instant office action, claims 1-22 have received a first action on the merits and are the subject of a first non-final action.
    
DIRECTION OF FUTURE CORRESPONDENCE
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHUN KUAN LEE, whose telephone number is (571) 272-0671. The examiner can normally be reached Monday-Friday.
IMPORTANT NOTE
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Idriss Alrobaye, can be reached at (571) 270-1023. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CHUN KUAN LEE/
Primary Examiner
Art Unit 2181
May 27, 2022