DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 17 recites the limitation "the set of sub-kernel rules".  There is insufficient antecedent basis for this limitation in the claim.


Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.



Claims 1-6, 8-16 and 18-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Mills (US20190340498).
Regarding claim 1, Mills discloses a computer-implemented method of performing a convolution between an input data array and a kernel to generate an output data array (fig. 9B, para. [0015]), the method comprising: 
decomposing the kernel (Kernel Decompose of 924 in fig. 9B, para. [0091]) into a plurality of sub-kernels (934-940 in fig. 9B, para. [0091]) each having a respective position relative to the kernel (each sub-kernel 934-940 has a respective position in kernel 924 according to the pattern) and respective in-plane dimensions less than or equal to a target kernel dimension (the dimensions of sub-kernels 934-940 in fig. 9B. A target kernel dimension is not defined and is interpreted to be an arbitrary value. Also, Mills teaches zero-padding a kernel to obtain kernel data having a desired spatial shape in para. [0098]); and 
for each of the plurality of sub-kernels: 
determining a respective portion of the input data array on the basis of the respective in-plane dimensions of the sub-kernel and the respective position of the sub-kernel relative to the kernel (926-932 in fig. 9B are the respective portions of the input data array 922 corresponding to sub-kernels 934-940. Each of the respective portions and sub-kernel are part of the same sub-channel as taught in para. [0091] and have the same pattern in fig. 9B); 

performing a convolution between the retrieved respective portion of the input data array and the sub-kernel (para. [0091], The at least one neural engine 314 convolves each sub-channel 926, 928, 930, 932 and kernel coefficients from a corresponding sub-kernel 934, 936, 938, 940) to generate a respective intermediate data array (output of 412 in fig. 4, para. [0063], [0091], processed values 412 generated by each sub-channel convolution), 
the method further comprising summing the generated intermediate data arrays to generate at least a portion of the output data array (942 in fig. 9B, para. [0091], Then, processed values 412 generated by each sub-channel convolution are accumulated by accumulators 414 to generate (after pre-processing in post-processor 428) processed values 417 and output data 328 (e.g., output data 942 of FIG. 9B)).
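For illustration only, the sequence of steps recited in claim 1 (decompose the kernel, determine an input portion per sub-kernel, convolve, and sum the intermediate arrays) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the implementation of Mills or of the applicant: the function names, the stride-1 "valid" convolution in cross-correlation form, and the edge-tiling scheme in `split` are all hypothetical choices made for the example.

```python
# Illustrative sketch only. All names and the tiling scheme are assumptions.

def conv2d_valid(x, k):
    """Stride-1 'valid' convolution (cross-correlation form) of 2-D lists."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * k[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(ow)] for i in range(oh)]

def split(size, target):
    """Cut [0, size) into (start, length) pieces with length <= target."""
    return [(s, min(target, size - s)) for s in range(0, size, target)]

def decomposed_conv2d(x, k, target):
    """Same result as conv2d_valid, computed one sub-kernel at a time."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r0, h in split(kh, target):
        for c0, w in split(kw, target):
            # Sub-kernel at position (r0, c0) with in-plane dimensions h x w.
            sub = [row[c0:c0 + w] for row in k[r0:r0 + h]]
            # Input portion chosen from the sub-kernel's position and size.
            portion = [row[c0:c0 + ow + w - 1]
                       for row in x[r0:r0 + oh + h - 1]]
            partial = conv2d_valid(portion, sub)  # intermediate data array
            for i in range(oh):
                for j in range(ow):
                    out[i][j] += partial[i][j]    # sum intermediate arrays
    return out

x = [[i * 7 + j for j in range(7)] for i in range(7)]  # 7x7 input
k = [[a - 2 * b for b in range(5)] for a in range(5)]  # 5x5 kernel
assert decomposed_conv2d(x, k, 4) == conv2d_valid(x, k)
```

The final assertion checks that summing the per-sub-kernel results reproduces the direct convolution, which follows from the linearity of convolution in the kernel.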


Regarding claim 2, Mills discloses a method wherein: 
at least one in-plane dimension of the kernel is greater than and indivisible by the target kernel dimension (as previously stated, the target kernel dimension is interpreted to be an arbitrary value. With that said, kernel 924 in fig. 9B has dimensions 5 x 5 which would be greater than a target kernel dimension of 4x4); and 
at least one of the in-plane dimensions of at least one of the sub-kernels is less than the target kernel dimension (sub-kernels 934-940 in fig. 9B each have dimensions less than an arbitrary target kernel dimension of 4x4).


Regarding claim 3, Mills discloses a method wherein the respective in-plane dimensions of at least one of the sub-kernels are different from the respective in-plane dimensions of at least one other of the sub-kernels (each sub-kernel 934-940 in fig. 9B has different dimensions).
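As a purely illustrative aid to the dimension language of claims 2 and 3, the sketch below tiles a 5x5 kernel against a target kernel dimension of 4. The helper `sub_kernel_tiles` and its edge-tiling scheme are hypothetical and are not drawn from Mills or from the claims; they only show how a kernel dimension that is greater than and indivisible by the target yields sub-kernels of differing sizes, at least one of which is smaller than the target.

```python
# Illustrative sketch only. The tiling scheme is an assumption.

def sub_kernel_tiles(kh, kw, target):
    """Return (row, col, height, width) for each tile of a kh x kw kernel."""
    rows = [(r, min(target, kh - r)) for r in range(0, kh, target)]
    cols = [(c, min(target, kw - c)) for c in range(0, kw, target)]
    return [(r, c, h, w) for r, h in rows for c, w in cols]

tiles = sub_kernel_tiles(5, 5, 4)
print(tiles)  # [(0, 0, 4, 4), (0, 4, 4, 1), (4, 0, 1, 4), (4, 4, 1, 1)]
```

Under this assumed scheme the four tiles cover all 25 kernel elements exactly once, and no two tiles share the same in-plane dimensions.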


Regarding claim 4, Mills discloses a method wherein for each sub-kernel, performing the convolution between the respective portion of the input data array and the sub-kernel (para. [0091]) to generate the respective intermediate data array (output of 412 in fig. 4, para. [0063], [0091]) comprises performing a plurality of multiply-accumulate operations in parallel (para. [0053], [0069]).


Regarding claim 5, Mills discloses a method comprising, for each of the plurality of sub-kernels: 
determining a respective target data offset (Sx in fig. 9A, para. [0089]-[0090]) and perimeter data configuration (Kh and Kw in fig. 9A, para. [0089]) corresponding to the determined respective portion of the input data array (904 in fig. 9A); and 
retrieving the respective portion of the input data array in accordance with the determined respective target data offset and perimeter data configuration (para. [0089], kernel data 904 in fig. 9A is interpreted to correspond to kernel data 924 in fig. 9B).
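For illustration only, one hypothetical reading of the "target data offset" and "perimeter data configuration" of claim 5 is sketched below. The reading is an assumption made for the example and is confirmed neither by the claims nor by Mills: the offset is taken as the sub-kernel's position within the kernel, and the perimeter as the extra rows and columns of input needed beyond an output tile of size `out_h` x `out_w`. The function name `input_window` and its tuple layout are likewise hypothetical.

```python
# Illustrative sketch only. The meaning given to each term is an assumption.

def input_window(tile, out_h, out_w):
    """tile = (row, col, height, width) of a sub-kernel within the kernel."""
    r, c, h, w = tile
    offset = (r, c)                          # assumed "target data offset"
    perimeter = (h - 1, w - 1)               # assumed extra rows / columns
    extent = (out_h + h - 1, out_w + w - 1)  # size of the input portion
    return offset, perimeter, extent

print(input_window((0, 4, 4, 1), 3, 3))  # ((0, 4), (3, 0), (6, 3))
```

With a stride-1 "valid" convolution, retrieving a window of this extent at this offset provides exactly the input elements that the sub-kernel touches when producing a 3x3 output tile.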


Regarding claim 6, Mills discloses a method comprising determining the respective dimensions of each sub-kernel and the respective position of each sub-kernel relative to the kernel using a set of decomposition rules (para. [0091], The kernel data 326 (kernel data 924) may be decomposed offline (e.g., by compiler) into sub-kernels 934, 936, 938, 940 of smaller spatial support than that of the original 5x5 kernel data 326).


Regarding claim 8, Mills fails to explicitly disclose a method wherein the target kernel dimension is programmable. However, as stated above, the target kernel dimension is not defined and is interpreted to be an arbitrary value. Therefore one of ordinary skill in the art would be able to select any desired dimension, making it programmable.


Regarding claim 9, the limitations have been analyzed and rejected with respect to claim 1. Mills further discloses a memory circuit to store an input data array and a kernel (para. [0004]) and processing circuitry (para. [0004], neural engine circuit).





Regarding claim 11, the limitations have been analyzed and rejected with respect to claim 3.


Regarding claim 12, the limitations have been analyzed and rejected with respect to claim 5.


Regarding claim 13, Mills discloses a system wherein: 
the memory circuitry comprises first memory circuitry and second memory circuitry (230 and 318 in fig. 3); and 
the processing circuitry is arranged to: 
transfer at least part of the input data array from the first memory circuitry and the second memory circuitry (para. [0056], [0058], buffer 318 may store input data, Buffer DMA 320 includes a read circuit that receives a portion (e.g., tile) of the input data from a source (e.g., system memory 230) for storing in data buffer 318); and 
update, for each of the plurality of sub-kernels, the respective target data offset and perimeter data configuration (dimensions of 934-940 in fig. 9B, para. [0089], it 


Regarding claim 14, Mills discloses a system wherein the first memory circuitry comprises static random-access memory (SRAM) and the second memory circuitry comprises dynamic random-access memory (DRAM) (para. [0038]). Both SRAM and DRAM are well-known types of memory components in computing systems.


Regarding claim 15, Mills discloses a system wherein the processing circuitry (fig. 3) comprises a multiply-accumulate array (314 in fig. 3, 404 in fig. 4, para. [0044]) comprising a plurality of accumulators arranged to generate elements of the intermediate data arrays in parallel (each component 314 in fig. 3 comprises an accumulator 414 shown in fig. 4).


Regarding claim 16, Mills discloses a system wherein the accumulators are configured to sum respective corresponding elements of the intermediate data arrays to generate respective elements of the at least portion of the output data array (para. [0091], Then, processed values 412 generated by each sub-channel convolution are accumulated by accumulators 414 to generate (after pre-processing in post-processor 428) processed values 417 and output data 328).


Regarding claim 18, the limitations have been analyzed and rejected with respect to claim 8.


Regarding claim 19, the limitations have been analyzed and rejected with respect to claim 1.


Regarding claim 20, the limitations have been analyzed and rejected with respect to claim 2.



Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Mills (US20190340498) in view of Lee et al (US20190065896).
Regarding claim 7, Mills teaches a method wherein the input data array corresponds to an image or an input feature map (para. [0057], [0094], image data de-interleave pixels of portion 1020 of input data 322).
 
Mills fails to teach wherein the output data array corresponds to an output feature map. However, Lee teaches an input data array corresponding to an image or an input feature map (410 in fig. 4) and an output data array corresponding to an output feature map (430 in fig. 4).
Therefore, taking the combined teachings of Mills and Lee as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the step of Lee into the method of Mills. The motivation to combine Lee and Mills would be to efficiently reduce an operation count of a convolution operation (para. [0070] of Lee) and more quickly perform convolution (para. [0154]).


Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Mills (US20190340498).
Regarding claim 17, Mills teaches a system wherein: 

Mills fails to explicitly teach wherein the memory circuitry is arranged to store a set of decomposition rules. However, one of ordinary skill in the art would have found it obvious to store the decomposition rules used to decompose kernel 924 in fig. 9B into sub-kernels 934-940 in a local memory. The motivation to do so would be to decrease processing time compared with retrieving the decomposition rules from another device.


Related Art
The following prior art is deemed relevant by the examiner but not relied upon in this office action:
Son et al (US20180181858)
Brothers et al (US20160358070)


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEON VIET Q NGUYEN whose telephone number is (571)270-1185.  The examiner can normally be reached on Mon-Fri 10AM-6PM.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Claire Wang, can be reached on 571-270-1051. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/LEON VIET Q NGUYEN/
Primary Examiner, Art Unit 2663