Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Detailed Action
This office action is responsive to the application filed on 21 March 2019.  Claims 1-20 are pending in the application.

Information Disclosure Statement
The information disclosure statements (IDSs) submitted on 21 March 2019, 30 September 2019, 15 June 2021, and 29 November 2021 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements have been considered by the Examiner.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-6 and 11-18 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Aitken et al. (US 11,308,361, hereinafter “Aitken”).

Regarding claim 1, Aitken discloses [a]n electronic apparatus, comprising: a memory (Aitken, Fig. 9, showing memory and storage elements 904, 906, 918, and 928) which stores input data (Aitken, Fig. 2, element 210, input image) and a plurality of second kernel data (Aitken, Fig. 2, elements 232, 234, 236, and 238) obtained from first kernel data (Aitken, Fig. 2, elements 222, 224, 226, and 228, which collectively make up kernel 220) such that each of the plurality of second kernel data comprises a different first kernel element from among a plurality of first kernel elements in the first kernel data (Aitken, Fig. 2, convolutional output element 232 is generated from kernel element 222, output element 234 is generated from kernel element 224, and so on, as denoted by the shading of the blocks); and a processor (Aitken, Fig. 9, processing device 902) which performs a convolution operation on each of the plurality of second kernel data with input data and obtains data in which at least a portion of the input data is upscaled by the first kernel data based on the performed convolution operation (Aitken, Fig. 2, element 240 is the convolutional subpixel output, which is an 8x8 image generated by convolving each element of the input image (4x4, padded to 5x5) with a 2x2 kernel (elements 222, 224, 226, and 228) (corresponds to claimed “in which at least a portion of the input data is upscaled by the first kernel data based on the performed convolution operation”);
Aitken, 3:1-11, “In one implementation, for example, FIG. 2 shows an input image 210, convolution kernel 220, output of convolution 230, and/or output of sub-pixel convolution 240. The sub-pixel convolution, as illustrated in FIG. 2, may include convolving the input image 210 with convolution subkernels (222, 224, 226, and 228) of the convolution kernel 220 to generate convoluted images 230 (e.g., 232, 234, 236, and/or 238). The sub-pixel convolution may further include using the pixels of the generated convoluted images (e.g., 232, 234, 236, and/or 238) to generate the output of subpixel convolution 240, for example, a larger image.”).
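
For illustration only, the sub-pixel convolution described in the passage of Aitken quoted above may be sketched as follows. This sketch is not taken from Aitken. The function names, the use of zero padding on the bottom and right edges, and the row-major ordering of the four subkernels are assumptions made solely to show how four 4x4 convolution outputs can be interleaved into one 8x8 output.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no flipping) and return the valid-region result."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def subpixel_upscale(image, subkernels, r=2):
    """Convolve `image` with r*r subkernels and interleave the results.

    Assumption: the image is zero-padded on the bottom and right so that
    each convolution output has the same size as the unpadded image.
    """
    kh, kw = len(subkernels[0]), len(subkernels[0][0])
    padded = [row + [0] * (kw - 1) for row in image]
    padded += [[0] * len(padded[0]) for _ in range(kh - 1)]

    maps = [conv2d_valid(padded, k) for k in subkernels]
    h, w = len(maps[0]), len(maps[0][0])

    out = [[0] * (w * r) for _ in range(h * r)]
    for idx, fmap in enumerate(maps):
        dy, dx = divmod(idx, r)  # assumed row-major position inside each r x r block
        for y in range(h):
            for x in range(w):
                out[y * r + dy][x * r + dx] = fmap[y][x]
    return out


# Hypothetical data: a 4x4 input and four 2x2 subkernels.
image = [[y * 4 + x + 1 for x in range(4)] for y in range(4)]
subkernels = [[[c, 0], [0, 0]] for c in (1, 2, 3, 4)]
upscaled = subpixel_upscale(image, subkernels)  # 8x8 result
```

In this sketch, output pixels produced by the same subkernel are spaced two positions apart in each dimension, which mirrors the shading pattern the rejection relies on in Fig. 2.
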

Claims 11 and 13 recite similar limitations as claim 1 and are rejected under the same rationale as applied to claim 1 above.

Regarding claim 2, Aitken as applied to claim 1 above discloses [t]he electronic apparatus as claimed in claim 1.  Aitken further discloses wherein the each of the plurality of second kernel data is obtained from an expanded first kernel data based on the plurality of first kernel elements spaced apart at intervals of a multiple (r) of upscaling, where r is a natural number, (Aitken, Fig. 2, the 2x2 convolution kernels (elements 222, 224, 226, and 228) yield 4x4 convolutional results (elements 232, 234, 236, and 238), yielding the subpixel convolution output 240, where the similarly-shaded blocks are spaced at intervals of two, yielding an image that has been enlarged by a scaling factor of 2 (a natural number) in both the horizontal and vertical dimensions) and wherein the expanded first kernel data is obtained by expanding the first kernel data based on a size of the first kernel data and the multiple (r). (Aitken, Fig. 2, the first kernel data 2x2 matrices 222, 224, 226, and 228 are expanded to 4x4 matrices 232, 234, 236, and 238 based on the size of the first kernel and the input image 210)

Claims 12 and 14 recite similar limitations as claim 2 and are rejected under the same rationale as applied to claim 2 above.

Regarding claim 3, Aitken as applied to claim 1 above discloses [t]he electronic apparatus as claimed in claim 1.  Aitken further discloses further comprising: a communicator comprising circuitry, wherein the processor is further configured to receive the plurality of second kernel data from a server via the communicator, and control the memory to store the received plurality of second kernel data. (Aitken, Figure 9 showing a central communication Bus 930, a Processing Device 902, Network 920, Network Interface Device 908, Memories 904 and 906, and a Data Storage Device 918 containing Machine-Readable Storage Medium 928.)

Claim 15 recites similar limitations as claim 3 and is rejected under the same rationale as applied to claim 3 above.

Regarding claim 4, Aitken as applied to claim 1 above discloses [t]he electronic apparatus as claimed in claim 1.  Aitken further discloses further comprising: a communicator comprising circuitry, wherein the processor is further configured to receive the first kernel data from a server via the communicator, obtain the plurality of second kernel data from the first kernel data, and control the memory to store the plurality of second kernel data.  (Aitken, Figure 9 showing a central communication Bus 930, a Processing Device 902, Network 920, Network Interface Device 908, Memories 904 and 906, and a Data Storage Device 918 containing Machine-Readable Storage Medium 928.)

Claim 16 recites similar limitations as claim 4 and is rejected under the same rationale as applied to claim 4 above.

Regarding claim 5, Aitken as applied to claim 1 above discloses [t]he electronic apparatus as claimed in claim 1.  Aitken further discloses wherein the processor is further configured to perform the convolution operation on each of the plurality of second kernel data with an element to be upscaled and a plurality of peripheral elements that surround the element to be upscaled in the portion of the input data, to obtain a plurality of upscaling elements with respect to the element to be upscaled, wherein a first sum of the element to be upscaled and the plurality of peripheral elements is the same as a second sum of a plurality of second elements respectively included in the plurality of second kernel data. (Aitken, Fig. 2 showing convolution performed by processing device 902, wherein the dimensionality of the input image (ignoring padding) and the kernel data 230 are both 4x4)

Claim 17 recites similar limitations as claim 5 and is rejected under the same rationale as applied to claim 5 above.

Regarding claim 6, Aitken as applied to claim 1 above discloses [t]he electronic apparatus as claimed in claim 1.  Aitken further discloses wherein the processor is further configured to determine positions of the plurality of upscaling elements with respect to the element to be upscaled based on a position of the element to be upscaled with respect to the input data.  (Aitken, Fig. 2 and 2:62 – 3:11 the position of a pixel within input image 210 (corresponds to claimed “position of the element to be upscaled” and “input data”) determines which 2x2 subkernel (222, 224, 226, or 228) (corresponds to claimed “upscaling elements”) of kernel 220 the pixel is to be convolved with.)

Claim 18 recites similar limitations as claim 6 and is rejected under the same rationale as applied to claim 6 above.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 7-8 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Aitken in view of He et al., "A Configurable SIMD Architecture with Explicit Datapath for Intelligent Learning," In 2016 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS) 2016 Jul 17 (pp. 156-163). IEEE. (hereinafter “He”).

Regarding claim 7, Aitken as applied to claim 1 above discloses [t]he electronic apparatus as claimed in claim 1.
Aitken does not explicitly disclose wherein the processor comprises: a convolution array comprising circuitry and which performs the convolution operation on the each of the plurality of second kernel data with the input data; and a line memory comprising circuitry and which stores upscaled data.  
He teaches wherein the processor comprises: a convolution array comprising circuitry and which performs the convolution operation on the each of the plurality of second kernel data with the input data; and a line memory comprising circuitry and which stores upscaled data (He, Fig. 1, “Proposed configurable SIMD [Single Instruction / Multiple Data] architecture” showing instruction memory, processing elements, and vector data memory (corresponds to claimed “line memory comprising circuitry and which stores upscaled data”);
He, pg. 156, last paragraph, “In this paper, a MAC unit equipped with configurable number of accumulator registers are proposed to exploit the benefits of merged operation and register tiling. The proposed MAC unit is integrated into our configurable SIMD architecture with explicit datapath [31]. A CNN-based intelligent learning application is chosen to demonstrate the efficiency of the proposed architecture.” [The system of He may be used to perform convolution operations.]).
He is analogous art, as it is directed to the tasks of convolution and matrix/vector multiplication.
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to utilize the SIMD architecture of He to perform the convolutions of Aitken, the benefit being that “[a] Multiply-Accumulate (MAC) unit, which exploits register tiling, is thus expected to be beneficial to both performance and energy efficiency,” as recited by He on pg. 156, second to last paragraph.
Claim 19 recites similar limitations as claim 7 and is rejected under the same rationale as applied to claim 7 above.

Regarding claim 8, the combination of references as applied to claim 7 above teaches [t]he electronic apparatus as claimed in claim 7.  Further, He teaches wherein the convolution array comprises a plurality of processing elements (He, Fig. 1 showing multiple processing elements PE, each having an associated vector memory) respectively including a plurality of register files, (He, Table 1 “Key Features of the Proposed SIMD ISA,” listing N 32-bit/16-bit register files) and wherein each of the plurality of processing elements performs a multiplying operation of a second kernel element input to the plurality of processing elements, from among the plurality of second kernel data, and accumulates and stores a result of the multiplying operation in a register file corresponding to the second kernel element from among the plurality of register files.  (He, pg. 156, last paragraph, “In this paper, a MAC unit equipped with configurable number of accumulator registers are proposed to exploit the benefits of merged operation and register tiling. The proposed MAC unit is integrated into our configurable SIMD architecture with explicit datapath [31]. A CNN-based intelligent learning application is chosen to demonstrate the efficiency of the proposed architecture.” [“MAC” in this context refers to “Multiply-ACcumulate” circuitry that performs convolution multiplication operations as claimed and utilizes a number of accumulator register files to store the results, as claimed.])
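
For illustration only, the multiply-accumulate behavior relied upon above may be sketched as follows. This sketch does not reproduce the instruction set or datapath of He. The class name, method names, and the choice of one accumulator register per kernel element are hypothetical and are used only to show a product being accumulated into the register that corresponds to a given kernel element.

```python
class ProcessingElement:
    """Hypothetical processing element with a small bank of accumulator registers."""

    def __init__(self, num_accumulators):
        # One accumulator per kernel element (an assumption of this sketch).
        self.acc = [0] * num_accumulators

    def mac(self, reg, data, kernel_element):
        """Multiply, then accumulate the product into register `reg`."""
        self.acc[reg] += data * kernel_element
        return self.acc[reg]


def run_mac(inputs, kernel_elements):
    """Feed every input value to every kernel element's accumulator."""
    pe = ProcessingElement(len(kernel_elements))
    for value in inputs:
        for reg, k in enumerate(kernel_elements):
            pe.mac(reg, value, k)
    return pe.acc


# Hypothetical data: three input values and two kernel elements.
totals = run_mac([1, 2, 3], [10, 100])
```

Each accumulator ends up holding the running sum of products for its own kernel element, so partial results for different kernel elements never overwrite one another.
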

Claim 20 recites similar limitations as claim 8 and is rejected under the same rationale as applied to claim 8 above.


Claims 9-10 are rejected under 35 U.S.C. 103 as being unpatentable over Aitken in view of He and further in view of Xie, R.Z., "A flexible memory shuffling unit for image processing accelerators," Master's Thesis, Faculty of Electrical Engineering, Eindhoven University of Technology, November 11, 2013. (hereinafter “Xie”)

Regarding claim 9, the combination of references as applied to claim 8 above teaches [t]he electronic apparatus as claimed in claim 8. 
The above combination does not explicitly disclose wherein the processor further comprises: a shuffler comprising circuitry and positioned between the convolution array and the line memory, wherein the shuffler shuffles a plurality of operation results output from the plurality of processing elements and outputs the plurality of operation results shuffled by the shuffler to the line memory.  
Xie teaches wherein the processor further comprises: a shuffler comprising circuitry and positioned between the convolution array and the line memory, wherein the shuffler shuffles a plurality of operation results output from the plurality of processing elements and outputs the plurality of operation results shuffled by the shuffler to the line memory (Xie, Abstract, “To improve bandwidth for locality optimized access patterns, this work proposes a memory shuffling unit that provides an interface between DMA and the accelerator IP (Intellectual Property). To provide enough flexibility in the access patterns, the DMA controller and the shuffling unit are programmable.”; Xie, § 1.2 “Problem description”, “When focusing[sic] on applications which use image processing, a wide range of them are based on the same basic operation, namely 2D convolution. A few well known examples are photo filtering and augmented reality applications. There are accelerators specialized in convolution operations having a small storage (for area, energy and flexibility), that can process data efficiently when data is provided in certain patterns. On the other hand, there is a DMA that can efficiently transfer large blocks of data. The problem is that these large blocks of data do not match with the patterns that are favorable for the accelerator, there is no efficient interface available between these two units. To provide a flexible and efficient interface this work proposes a memory shuffling unit.” [The shuffling unit is designed to work with processors performing image processing and convolution.]; Xie, pg. 23 Fig. 21 and § 5.1 “Architecture”, ¶ 2, “In this setup the shuffling in is responsible for receiving the data from the DMA and supply the data to the accelerator in the correct pattern. The shuffling out receives the results from the accelerator, reorders the data and sends it back to the SDRAM through the DMA.” [The shuffler takes outputs from the accelerator circuitry, reorders them, and sends them to memory via a “Data out” line.])
Xie is analogous art, as it is directed to the task of using hardware circuitry to perform convolutions, e.g. in the context of image processing.
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the shuffling units of Xie with the MAC unit of He, the benefit being an increase in memory bandwidth and reduction in energy consumption, as stated by Xie in the Abstract: “Our shuffling unit increases memory bandwidth by 300x and decreasing the energy consumption by 300x compared to supplying the data in patterns directly from the on-chip memory.”
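
For illustration only, the reordering role attributed to the shuffler in the combination above may be sketched as follows. This sketch does not reproduce the shuffling unit of Xie. The function name, the assumption of one result list per processing element, and the row-major interleaving order are hypothetical and are used only to show per-element results being rearranged into output lines before being written to a line memory.

```python
def shuffle_to_lines(pe_results, r=2):
    """Interleave per-processing-element results into r output lines.

    pe_results: list of r*r equal-length lists, one per processing element.
    Assumption: element index dy * r + dx maps to sub-pixel position
    (dy, dx) inside each r x r output block.
    """
    width = len(pe_results[0])
    lines = []
    for dy in range(r):
        line = []
        for x in range(width):
            for dx in range(r):
                line.append(pe_results[dy * r + dx][x])
        lines.append(line)
    return lines


# Hypothetical results from four processing elements for one input row.
lines = shuffle_to_lines([[1, 2], [10, 20], [100, 200], [1000, 2000]])
```

Here each output line alternates between the results of two processing elements, which is the kind of pre-set ordering a line memory would need in order to hold an upscaled row contiguously.
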
Regarding claim 10, the combination of references as applied to claim 9 above teaches [t]he electronic apparatus as claimed in claim 9.  Further, Xie teaches wherein the shuffler comprises: a plurality of buffer sets comprising circuitry (Xie, pg. 10, ¶ 1, “An SDRAM consists of multiple memory banks, where each bank consists multiple rows and a row buffer (Figure 5).  For every SDRAM access, the address of the request is first decoded into a bank, row and column addresses using a memory map [1]. Using the bank and row addresses a bank can be selected and the corresponding row can be requested. This row will then be loaded into a row buffer, which stores the most recently activated row.”); and a first in first out (FIFO) memory comprising circuitry and which receives at least two processing elements from among the plurality of processing elements, output from corresponding register files from among the plurality of register files, and outputs the plurality of operation results to the plurality of buffer sets, wherein the plurality of buffer sets store each of the plurality of operation results in a buffer corresponding to each of the plurality of buffer sets, and wherein, based on the plurality of operation results being stored in all buffers included in the plurality of buffer sets, outputs the plurality of processing elements stored in one of the plurality of buffer sets to the line memory in a pre-set order. (Xie, pg. 23, Figure 21 and § 5.1 “Architecture”, “Furthermore there are FIFO buffers placed between the shuffling units and the accelerator, these are placed to facilitate for the speed differences between these components. For instance, if the shuffling unit wants to send data to the accelerator while it is busy, instead of being stalled, the shuffling unit can continue processing after writing the data to the FIFO. In this setup the shuffling in is responsible for receiving the data from the DMA and supply the data to the accelerator in the correct pattern. The shuffling out receives the results from the accelerator, reorders the data and sends it back to the SDRAM through the DMA.” [The system of Xie may transfer data in both directions between the Accelerator and the SDRAM (including the SDRAM buffers described above) via FIFO circuitry in both directions.])

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SCOTT R GARDNER whose telephone number is (469)295-9128. The examiner can normally be reached 8:00am - 5:00pm M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann J Lo, can be reached at 571-272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SCOTT R GARDNER/Examiner, Art Unit 2126 
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126