Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
1. This Office Action is in response to the application filed on 03/27/2020. Claims 1-25 are pending in this application. Claims 1, 10 and 19 are independent claims. 


Claim Rejections - 35 USC § 101
2. 35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

3. Claims 1-25 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent-eligible subject matter. Claims 1-9 and 19-25 are directed to software per se because the apparatus in claim 1 and the system in claim 19 are each interpreted, under the broadest reasonable interpretation (BRI) in light of the specification, as a software program rather than a hardware system. The apparatus in claim 1 comprises one or more processors, each processor further comprising a plurality of processing elements; under BRI, the examiner interprets the processor as a software processor. The processor in claim 19 likewise comprises a plurality of processing elements, and the examiner also interprets it as a software processor under BRI. The examiner suggests amending claims 1 and 19 to recite specific hardware, such as a hardware processor, a memory, or a storage. The machine-readable storage medium in claim 10 is interpreted under BRI, in light of the specification, as transitory, such as a signal, where the specification does not clearly define nor describe machine-readable storage medium as non-

Claim Rejections - 35 USC § 112
4. The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


5. Independent claims 1, 10 and 19 are interpreted as not requiring all of the recited limitations, because the four different, non-shared features of the compiler in those claims are connected with “or”, and the examiner selected the use of cyclic buffering for rejection purposes. Consequently, any dependent claims that refer to the other features lack antecedent basis, as explained and discussed in a phone interview with the applicant on 12/15/2020. The affected limitations are: “the memory accesses” and “the performance” in claims 6, 15 and 25; “the rolling window” in claims 7-8, 16-17 and 23; and “the memory operations” and “the allowance of the memory operations” in claims 9, 18 and 24. There is insufficient antecedent basis for these limitations in the claims. Thus, dependent claims 6-9, 15-18 and 23-25 are rejected.


Claim Rejections - 35 USC § 103
6. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the

7. The following is a quotation of pre-AIA 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.

8. Claims 1, 10 and 19 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Maor (US PGPub 20190004980), in view of Ahn (US PGPub 20120102496).

As per Claim 1, Maor teaches an apparatus comprising: the one or more processors operable to cause a compiler to: use cyclic buffering for one or more [nested] loops to allow reuse of values from one or more streams associated with the one or more processing elements; or perform one or more memory accesses in an inner loop (nested loop) via the one or more streams based on a rolling window of values; or allow one or more memory operations associated with an unrolled loop to occur at a later time with at least one unit-stride access via the one or more streams; or perform an unroll-and-squash procedure via the one or more streams. (Par 39, Further, the buffers 230 that are defined on the SRAM 225 as part of the compilation process of the CVE may be cyclic (e.g., to save the need for copying, such as in implementations of a finite impulse response (FIR) filter using the CVE, where using cyclic buffers 230 allows a respective CBB (e.g., 276, 290a, 290b, etc.) to reuse the history data between tiles of data, among other example uses). Par 41, In this example, the developer-user is forced to understand the nature of the cyclic buffer to read the physical base address and size of every buffer and then to explicitly check and fix the pointer inside the inner-loop every time the pointer moves around the barrier.)
Ahn teaches one or more processors, each processor comprising a plurality of processing elements, … for one or more nested loops (Par 3 and 48, The following description relates to a technique for processing a nested loop and additionally, to a technique for processing a nested loop by allocating commands included in a nested loop to a plurality of processing elements and processing the allocated commands. Par 9, Typically, an inner loop and an outer loop which are included in a nested loop are processed in series in reconfigurable architecture. However, the series processing may substantially lengthen the processing time of the loop operation. Par 86, Examples of program instructions include machine code, such as produced by a compiler, and files containing higher level code that may be executed by the computer using an interpreter.)
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to add one or more processors, each processor comprising a plurality of processing elements, … for one or more nested loops, as conceptually seen from the teaching of Ahn, into that of Maor because this modification can help optimize the compilation by reusing the data in the nested loop for multiple processing elements.
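
For illustration only, the kind of cyclic-buffer reuse described in the cited passage of Maor (an FIR filter that reuses history data between tiles) might be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Maor, Ahn, or the application; the names fir_tiled, tiles, taps, history and head are hypothetical.

```python
def fir_tiled(tiles, taps):
    """Apply an FIR filter across consecutive tiles of samples.

    Hypothetical sketch: a cyclic buffer keeps the most recent len(taps)
    samples, so history carries over from one tile to the next without
    copying.
    """
    n = len(taps)
    history = [0.0] * n   # cyclic buffer of the last n input samples
    head = 0              # slot that receives the next sample
    out = []
    for tile in tiles:
        for x in tile:
            history[head] = x
            acc = 0.0
            for k in range(n):
                # taps[k] multiplies the sample that arrived k steps ago;
                # the modulo wraps the index around the buffer boundary
                acc += taps[k] * history[(head - k) % n]
            out.append(acc)
            head = (head + 1) % n   # advance and wrap instead of shifting data
    return out


if __name__ == "__main__":
    # Splitting the input into tiles does not change the result,
    # because the history survives the tile boundary.
    print(fir_tiled([[1, 2], [3, 4]], [1, 1]))
    print(fir_tiled([[1, 2, 3, 4]], [1, 1]))
```

The modulo on the index plays the role of the pointer check at the buffer barrier that the cited paragraph 41 describes.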

Re Claim 10, it is a product claim having limitations similar to those of claim 1. Thus, claim 10 is also rejected under the same rationale as cited in the rejection of claim 1.

Re Claim 19, it is a system claim having limitations similar to those of claim 1. Thus, claim 19 is also rejected under the same rationale as cited in the rejection of claim 1.


9. Claims 2-3, 11-12 and 20-21 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Maor (US PGPub 20190004980), in view of Ahn (US PGPub 20120102496), and further in view of Fleming (US PGPub 20190018815).

As per Claim 2, Fleming teaches the apparatus of claim 1, wherein the one or more processors are configurable spatial accelerators (CSA) and wherein the compiler is a CSA optimizing compiler. (Par 108, Certain embodiments herein are directed to a spatial array of processing elements (e.g., a configurable spatial accelerator (CSA)) that targets high performance computing (HPC), for example, of a processor. Par 368, FIG. 53 illustrates a compilation toolchain 5300 for an accelerator according to embodiments of the disclosure. This toolchain compiles high-level languages (such as C, C++, and Fortran) into a combination of host code (LLVM) intermediate representation (IR) for the specific regions to be accelerated. The CSA-specific portion of this compilation toolchain takes LLVM IR as its input, optimizes and compiles this IR into a CSA assembly, e.g., adding appropriate buffering on latency-insensitive channels for performance. It then places and routes the CSA assembly on the hardware fabric, and configures the PEs and network for execution. In one embodiment, the toolchain supports the CSA-specific compilation as a just-in-time (JIT), incorporating potential runtime feedback from actual executions. One of the key design characteristics of the framework is compilation of (LLVM) IR for the CSA, rather than using a higher-level language as input.)
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to add configurable spatial accelerators (CSA) and wherein the compiler is a CSA optimizing compiler, as conceptually seen from the teaching of Fleming, into that of Maor and Ahn because this modification can help optimize the compilation by configurable spatial accelerators.

As per Claim 3, Fleming teaches the apparatus of claim 2, wherein the one or more streams is a latency insensitive channel (LIC) associated with communication among the one or more processing elements of each of the one or more CSAs. (Par 370, This pass takes in a function represented in control flow form, e.g., a control-flow graph (CFG) with sequential machine instructions operating on virtual registers, and converts it into a dataflow function that is conceptually a graph of dataflow operations (instructions) connected by latency-insensitive channels (LICs). Par 372, To support this model, the CSA assembly code supports multiple uses of the same LIC (e.g., data2), with the simulator implicitly creating the necessary copies of the LICs.)
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to add the one or more streams is a latency insensitive channel (LIC) associated with communication among the one or more processing elements of each of the one or more CSAs, as conceptually seen from the teaching of Fleming, into that of Maor and Ahn because this modification can help optimize the compilation by configurable spatial accelerators.

Re Claim 11, it is a product claim having limitations similar to those of claim 2. Thus, claim 11 is also rejected under the same rationale as cited in the rejection of claim 2.

Re Claim 12, it is a product claim having limitations similar to those of claim 3. Thus, claim 12 is also rejected under the same rationale as cited in the rejection of claim 3.

Re Claim 20, it is a system claim having limitations similar to those of claim 2. Thus, claim 20 is also rejected under the same rationale as cited in the rejection of claim 2.

Re Claim 21, it is a system claim having limitations similar to those of claim 3. Thus, claim 21 is also rejected under the same rationale as cited in the rejection of claim 3.

10. Claims 4, 13 and 22 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Maor (US PGPub 20190004980), in view of Ahn (US PGPub 20120102496), and further in view of Muthukumar (US PGPub 20040015934).

As per Claim 4, Muthukumar teaches the apparatus of claim 1, wherein the cyclic buffering comprises using the one or more streams to circulate data without any redundant trips to memory and wherein the one or more nested loops includes an inner loop with a short trip-count. (Par 36, FIG. 2B illustrates application of the present invention to a nested loop 200' having a short trip count inner loop (TC=3<SC=5). Par 39, For the short trip count inner loop of FIG. 2B, the epilog phase of inner loop 210 for the first outer loop iteration is reached before the prolog phase of the loop for the second iteration of the outer loop ends, and the two phases overlap during II(4).)
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to add using the one or more streams to circulate data without any redundant trips to memory and wherein the one or more nested loops includes an inner loop with a short trip-count, as conceptually seen from the teaching of Muthukumar, into that of Maor and Ahn because this modification can help optimize the compilation by reusing the data in the nested loop for multiple processing elements.

Re Claim 13, it is a product claim having limitations similar to those of claim 4. Thus, claim 13 is also rejected under the same rationale as cited in the rejection of claim 4.

Re Claim 22, it is a system claim having limitations similar to those of claim 4. Thus, claim 22 is also rejected under the same rationale as cited in the rejection of claim 4.

11. Claims 5 and 14 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Maor (US PGPub 20190004980), in view of Ahn (US PGPub 20120102496), in view of Muthukumar (US PGPub 20040015934), and further in view of Tokumaru (US Patent 4839839).

As per Claim 5, Tokumaru teaches the apparatus of claim 4, wherein the data is circulated via a rotate operation. (Col 3, lines 14-19, FIGS. 1 and 2 are diagrams for showing a barrel shifter for implementing a rotation operation such that input data is circulated together with a carry bit by unit of bits (referred to as carry-including rotation operation), in which FIG. 1 shows a shift section thereof and FIG. 2 shows a rotate section thereof.)
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to add data is circulated via a rotate operation, as conceptually seen from the teaching of Tokumaru, into that of Maor, Ahn and Muthukumar because this modification can help optimize the compilation by rotation operation for multiple processing elements.
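
For illustration only, a carry-including rotation of the general kind named in the cited passage of Tokumaru (data circulated together with a carry bit) might be sketched in software as follows. This is a minimal sketch and not Tokumaru's barrel-shifter circuit; the function name, the one-bit step and the default 8-bit width are assumptions made for the example.

```python
def rotate_left_through_carry(value, carry, width=8):
    """Rotate `value` left by one bit through a carry bit.

    Hypothetical sketch: the carry is treated as an extra top bit, so the
    rotation circulates width + 1 bits in total.
    Returns (new_value, new_carry).
    """
    mask = (1 << width) - 1
    full_mask = (1 << (width + 1)) - 1
    # Place the carry above the data bits to form a (width + 1)-bit word
    combined = ((carry & 1) << width) | (value & mask)
    # Rotate the combined word left by one position
    rotated = ((combined << 1) | (combined >> width)) & full_mask
    return rotated & mask, rotated >> width


if __name__ == "__main__":
    v, c = 0b10000001, 0
    v, c = rotate_left_through_carry(v, c)
    print(bin(v), c)  # old top bit moves into the carry, old carry enters bit 0
```

Applying the rotation width + 1 times returns the original value and carry, which is the sense in which the data is circulated rather than discarded.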


Re Claim 14, it is a product claim having limitations similar to those of claim 5. Thus, claim 14 is also rejected under the same rationale as cited in the rejection of claim 5.

12. Claims 6, 15 and 25 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Maor (US PGPub 20190004980), in view of Ahn (US PGPub 20120102496), and further in view of Bittel (US Patent 7420568).

As per Claim 6, Bittel teaches the apparatus of claim 1, wherein the memory accesses associated with the performance of the one or more memory accesses includes stencil-based memory accesses. (Claim
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to add the memory accesses associated with the performance of the one or more memory accesses includes stencil-based memory accesses, as conceptually seen from the teaching of Bittel, into that of Maor and Ahn because this modification can help optimize the memory access by stencil-based accesses for multiple processing elements.

Re Claim 15, it is a product claim having limitations similar to those of claim 6. Thus, claim 15 is also rejected under the same rationale as cited in the rejection of claim 6.

Re Claim 25, it is a system claim having limitations similar to those of claim 6. Thus, claim 25 is also rejected under the same rationale as cited in the rejection of claim 6.

13. Claims 7, 16 and 23 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Maor (US PGPub 20190004980), in view of Ahn (US PGPub 20120102496), and further in view of Li (US PGPub 20150288965).

As per Claim 7, Li teaches the apparatus of claim 1, wherein the rolling window of values comprises, at each output position, an input value used in a previous position is dropped and a new value is loaded from memory. (Par 90, As can be seen, a long-term rolling window component 610 can receive as inputs, a coded picture size (for the current frame) and a coded picture quantization value (for the current frame). These values can be received from the encoder 518 (e.g., from the general encoding control (420) in the encoder of FIG. 4), as shown in FIG. 5 (there is no dependency from the decoder side), which generates these values substantially simultaneously. The long-term rolling window component 610 can generate, from these inputs, an encoded bits output that depends on coded picture sizes for pictures within the rolling window (e.g., an average or weighted average).)
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to add at each output position, an input value used in a previous position is dropped and a new value is loaded from memory, as conceptually seen from the teaching of Li, into that of Maor and Ahn because this modification can help optimize the compilation and the memory process by rolling window in the nested loop for multiple processing elements.
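
For illustration only, the rolling-window behavior recited in claim 7 (at each output position, one previously used input value is dropped and one new value is loaded from memory) might be sketched as follows. This is a minimal sketch and is not taken from Li or the application; the function name, the use of a sum as the per-position output, and the load counter are assumptions made for the example.

```python
from collections import deque


def rolling_sums(memory, width):
    """Compute the sum over each window of `width` consecutive values.

    Hypothetical sketch: each element of `memory` is loaded exactly once.
    A bounded deque drops the oldest value automatically when a new one
    is appended to a full window.
    Returns (outputs, number_of_loads).
    """
    window = deque(maxlen=width)
    loads = 0
    out = []
    for i in range(len(memory)):
        window.append(memory[i])   # one new load; oldest value falls out
        loads += 1
        if len(window) == width:
            out.append(sum(window))
    return out, loads


if __name__ == "__main__":
    print(rolling_sums([1, 2, 3, 4, 5], 3))
```

With five inputs and a window of three, the sketch performs five loads for three outputs, whereas reloading every window from memory would take nine.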

Re Claim 16, it is a product claim having limitations similar to those of claim 7. Thus, claim 16 is also rejected under the same rationale as cited in the rejection of claim 7.

Re Claim 23, it is a system claim having limitations similar to those of claim 7. Thus, claim 23 is also rejected under the same rationale as cited in the rejection of claim 7.

14. Claims 8 and 17 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Maor (US PGPub 20190004980), in view of Ahn (US PGPub 20120102496), in view of Li (US PGPub 20150288965), and further in view of Tsou (US PGPub 20180189667).

As per Claim 8, Tsou teaches performing a shift operation. (Par 83, The window may be a rolling window such that, as new data is generated (e.g., at t.sub.r,1), the window shifts to include the new data and decisions and exclude the oldest data and decision (e.g., to form a new window corresponding to data collected over t.sub.2 to t.sub.n+1, etc.).)
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to add performing a shift operation, as conceptually seen from the teaching of Tsou, into that of Maor, Ahn and Li because this modification can help optimize the compilation by performing a shift operation for multiple processing elements.

Re Claim 17, it is a product claim having limitations similar to those of claim 8. Thus, claim 17 is also rejected under the same rationale as cited in the rejection of claim 8.

15. Claims 9, 18 and 24 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Maor (US PGPub 20190004980), in view of Ahn (US PGPub 20120102496), and further in view of Farmahini-Farahani (US PGPub 20170048320).

As per Claim 9, Farmahini-Farahani teaches the apparatus of claim 1, wherein the one or more memory operations include stores and loads and the allowance of the one or more memory operations is based at least in part on a merge operation or a scatter operation. (Par 39, As discussed further herein however, an application uses the logic layer of the source and/or destination memory node to “gather” data from a memory array of the destination node, for example, for storage into a memory vector, or to “scatter” data into a memory array of a destination node from a memory vector.)
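
For illustration only, the “gather” and “scatter” operations named in the cited passage of Farmahini-Farahani might be sketched as follows. This is a minimal sketch and is not taken from the reference or the application; a plain Python list stands in for the memory array, and the function and parameter names are hypothetical.

```python
def gather(memory, indices):
    """Load the values at `indices` from `memory` into a new vector."""
    return [memory[i] for i in indices]


def scatter(memory, indices, vector):
    """Store each element of `vector` into `memory` at the matching index."""
    for i, v in zip(indices, vector):
        memory[i] = v
    return memory


if __name__ == "__main__":
    mem = [10, 20, 30, 40, 50]
    vec = gather(mem, [4, 0, 2])      # loads from non-contiguous locations
    print(vec)
    scatter(mem, [1, 3], [-1, -2])    # stores to non-contiguous locations
    print(mem)
```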


Re Claim 18, it is a product claim having limitations similar to those of claim 9. Thus, claim 18 is also rejected under the same rationale as cited in the rejection of claim 9.

Re Claim 24, it is a system claim having limitations similar to those of claim 9. Thus, claim 24 is also rejected under the same rationale as cited in the rejection of claim 9.


Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAE UK JEON whose telephone number is (571)270-3649. The examiner can normally be reached on 9am-6pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chat Do, can be reached on 571-272-3721. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-






/JAE U JEON/Primary Examiner, Art Unit 2193