Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION


Reasons for Allowance

1.	Claims 1-11 and 13-20 are allowed over the prior art.

2.	The following is an examiner’s statement of reasons for allowance: 
The prior art made of record fails to teach the limitations underlined within the independent claims reproduced below:


Regarding Claim 1,
A method, comprising: loading a first weight data element of an array of weight data elements from a memory into a systolic array, the first weight data element being at first coordinates within the array of weight data elements; receiving a selection of a first subset of input data elements of an array of input data elements, the first subset being selected based on the first coordinates of the first weight data element, a stride of a dilated convolution operation, and a rate of the dilated convolution operation, wherein the first subset is further selected based on a set of parameters that include a first address of the first subset in the memory, a gap between input data elements of the first subset of input data elements, or a count of the first subset of input data elements; streaming each input data element of the selected first subset starting from the first address from the memory into the systolic array to multiply with the first weight data element to compute first partial sums; loading a second weight data element from the streaming each input data element of the selected second subset starting from a second address from the memory into the systolic array to multiply with the first weight data element to compute second partial sums; and generating an output data array of the dilated convolution operation based on the first partial sums and the second partial sums.
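For illustration only, the per-weight selection recited above can be sketched in one dimension. The sketch assumes no padding and a single input channel, and it models the systolic array and memory as plain Python lists. The function names and the formula first_address = k * rate with gap = stride are assumptions made for this example and are not taken from the claim language or from the application.

```python
def dilated_conv1d_weight_stationary(inputs, weights, stride, rate):
    """1-D dilated convolution computed one weight element at a time.

    Hypothetical model: each weight is "loaded" once, then a subset of the
    inputs described by (first_address, gap, count) is "streamed" past it.
    """
    span = (len(weights) - 1) * rate + 1        # extent of the dilated kernel
    if len(inputs) < span:
        return []
    count = (len(inputs) - span) // stride + 1  # number of outputs, no padding
    summation_buffer = [0] * count
    for k, w in enumerate(weights):             # k is the weight coordinate
        first_address = k * rate                # assumed: offset set by coordinate and rate
        gap = stride                            # assumed: spacing set by stride
        for i in range(count):                  # stream the selected subset
            summation_buffer[i] += w * inputs[first_address + i * gap]
    return summation_buffer


def dilated_conv1d_reference(inputs, weights, stride, rate):
    """Direct per-output evaluation, used only to check the sketch above."""
    span = (len(weights) - 1) * rate + 1
    count = (len(inputs) - span) // stride + 1 if len(inputs) >= span else 0
    return [
        sum(w * inputs[o * stride + k * rate] for k, w in enumerate(weights))
        for o in range(count)
    ]
```

With inputs 0 through 9, weights [1, 2, 3], a stride of 2, and a rate of 2, both functions return [16, 28, 40]. Each pass over a weight element contributes one set of partial sums to the buffer.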

Regarding Claim 6,
A non-transitory computer readable medium storing instructions that, when executed by one or more hardware processors, cause the one or more hardware processors to: load a first weight data element of an array of weight data elements from a memory into a systolic array; select a first subset of input data elements to be loaded from the memory into the systolic array, the first subset being selected based on information indicating a rate of a dilated convolution operation and coordinates of the first weight data element within the array of weight data elements; control the systolic array to perform  first computations based on the first weight data element and the first subset to generate first partial sums; control a summation buffer to accumulate the first partial sums; load a second weight data element of the array of weight data elements from the memory into the systolic array; select a second subset of the input data elements to be loaded from the memory into the systolic array; control the systolic array to perform second computations based on the second weight data element and the second subset to generate second partial sums; control the summation buffer to accumulate the second partial sums; and generate output data elements of an output data array based on the first partial sums and the second partial sums.





Regarding Claim 17,
An apparatus comprising: a memory that stores a set of instructions; one or more hardware processors configured to execute the set of instructions to: receive first information indicative of a rate and a stride of a dilated convolution operation to be performed by a systolic array based on a weight data array and an input data array to generate an output data array; receive second information indicative of a dimension of a summation buffer; determine, for each weight data element of the weight data array, a subset of input data elements of the input data array to multiply with the each weight data element to compute partial sums, the subset of input data elements being determined based on a projection operation from the summation buffer and based on the rate of the dilated convolution, the stride of the dilated convolution, the dimension of the summation buffer, and coordinates of the each weight data element in the weight data array; determine, for the each weight data element of the weight data array, destination addresses of the summation buffer to receive the partial sums based on the subset of input data elements; and generate a computation instruction for each weight data element of the weight data array to include third information indicative of the destination addresses and the subset of input data elements and to include fourth information indicating the address, a number of input data elements included in the subset of the input data array, or a step size in the computation instruction to enable the systolic array to load the subset of input data elements from the memory.
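For illustration only, the instruction generation recited in the apparatus claim above can be sketched in one dimension as follows. The sketch assumes zero padding, so that the projection from the summation buffer back onto the input array yields a different valid range for each weight element. The ComputeInstruction layout, the field names, and the projection formula o * stride + k * rate - padding are hypothetical and are not taken from the claim language or from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeInstruction:
    """Hypothetical instruction layout for one weight element."""
    weight_index: int   # coordinate of the weight element
    src_address: int    # address of the first input element to load
    step: int           # step size between loaded input elements
    count: int          # number of input elements in the subset
    dst_address: int    # first summation-buffer entry to receive partial sums


def generate_instructions(input_len, kernel_len, stride, rate, padding, buffer_len):
    """Project each summation-buffer slot back onto the input array."""
    instructions = []
    for k in range(kernel_len):
        # Buffer slots whose projected input index falls inside the input array.
        valid = [o for o in range(buffer_len)
                 if 0 <= o * stride + k * rate - padding < input_len]
        if not valid:
            continue
        first = valid[0]  # valid slots are contiguous because stride > 0
        instructions.append(ComputeInstruction(
            k, first * stride + k * rate - padding, stride, len(valid), first))
    return instructions


def execute(instructions, inputs, weights, buffer_len):
    """Software stand-in for the systolic array and summation buffer."""
    buf = [0] * buffer_len
    for ins in instructions:
        for i in range(ins.count):
            buf[ins.dst_address + i] += (
                weights[ins.weight_index] * inputs[ins.src_address + i * ins.step])
    return buf
```

For a five-element input, a three-element kernel, a stride of 1, a rate of 2, a padding of 2, and a five-entry summation buffer, the sketch produces three instructions whose destination addresses start at entries 2, 0, and 0, respectively. Executing them on inputs [1, 2, 3, 4, 5] with weights [1, 1, 1] leaves [4, 6, 9, 6, 8] in the buffer.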


Regarding Claim 1: Claim 1 was rejected over Delaye et al. (USPUB 20190114499) in view of Dongseok Im (NPL Doc: “DT-CNN: Dilated and Transposed Convolution Neural Network Accelerator for Real-time Image Segmentation on Mobile Devices”, 01/05/2019, 2019 IEEE International Symposium on Circuits and Systems (ISCAS), Pages 1-5), which teaches
A method, comprising: loading a first weight data element of an array of weight data elements from a memory into a systolic array, the first weight data element being at first coordinates within the array of weight data elements; receiving a selection of a first subset of input data elements of an array of input data elements, the first subset being selected based on the first “wherein the first subset is further selected based on a set of parameters that include a first address of the first subset in the memory, a gap between input data elements of the first subset of input data elements,… streaming each input data element of the selected second subset starting from a second address from the memory into the systolic array to multiply with the first weight data element to compute second partial sums; and generating an output data array of the dilated convolution operation based on the first partial sums and the second partial sums.”


Regarding Claim 6: Claim 6 was rejected over Whatmough et al. (USPUB 20190311243) in view of Delaye et al. (USPUB 20190114499), which teaches a non-transitory computer readable medium storing instructions that, when executed by one or more hardware processors, cause the one or more hardware processors to: load a first weight data element of an array of weight data elements from a memory into a systolic array; select a first subset of input data elements to be loaded from the memory into the systolic array, the first subset being selected based on information indicating a rate of a dilated convolution operation and coordinates of the first weight data element within the array of weight data elements; control the systolic array to perform first computations based on the first weight data element and the first subset to “control a summation buffer to accumulate the first partial sums; load a second weight data element of the array of weight data elements from the memory into the systolic array; select a second subset of the input data elements to be loaded from the memory into the systolic array; control the systolic array to perform second computations based on the second weight data element and the second subset to generate second partial sums; control the summation buffer to accumulate the second partial sums; and generate output data elements of an output data array based on the first partial sums and the second partial sums.”


Regarding Claim 17: Claim 17 was rejected over Delaye et al. (USPUB 20190114499) in view of Franco-Neto (USPUB 20190244106), which teaches an apparatus comprising: a memory that stores a set of instructions; one or more hardware processors configured to execute the set of instructions to: receive first information indicative of a rate and a stride of a dilated convolution operation to be performed by a systolic array based on a weight data array and an input data array to generate an output data array; receive second information indicative of a dimension of a summation buffer; determine, for each weight data element of the weight data array, a subset of input data elements of the input data array to multiply with the each weight data element to compute partial sums, the subset of input data elements being determined based on a projection operation from the summation buffer and based on the rate of the dilated convolution, the stride of the dilated convolution, the dimension of the summation buffer, and coordinates of the each weight data element in the weight data array; determine, for the each weight data element of the weight data array, destination addresses of the summation buffer to receive the partial sums based on the subset of input data elements (see the detailed rejection of the claim in the Office Action dated 12/03/2021), but does not teach the following limitations of claim 17: “generate a computation instruction for each weight data element of the weight data array to include third information indicative of the destination addresses and the subset of input data elements and to include fourth information indicating the address, a number of input data elements included in the subset of the input data array, or a step size in the computation instruction to enable the systolic array to load the subset of input data elements from the memory.”




3.	The examiner found no suggestion or motivation to combine the teachings of the prior art made of record to arrive at the limitations discussed above.

4.	Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion

5. 	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Refer to PTO-892, Notice of References Cited, for a listing of analogous art.
6. 	Any inquiry concerning this communication or earlier communications from the examiner should be directed to OMAR S ISMAIL whose telephone number is (571) 272-9799 and FAX number (571) 273-9799.  The examiner can normally be reached on M-F 9:00am-6:00pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David C. Payne, can be reached on (571) 272-3024.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-3024.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/OMAR S ISMAIL/
Primary Examiner, Art Unit 2637