DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claims 1-20 are objected to because of the following informalities.  
Claim 1 line 9, claim 14 line 9, claim 19, and claim 20 recite “the convolution transpose”. For antecedent basis reasons, this should recite “the convolution transpose operation”.  Claims 2-13 and 17-18 inherit the same deficiency as claim 1 by reason of dependence.  Claims 15-16 inherit the same deficiency as claim 14 by reason of dependence.
Claim 15 recites “forming a filter”.  For further clarity, Examiner suggests amending to recite “forming a second filter” or equivalent to further distinguish from “the filter” recited in claim 14 line 4.
Appropriate correction is required.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. § 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.

Regarding claim 1, under the Alice framework Step 1, the claim recites a machine.
Under the Alice framework Step 2A prong 1, claim 1 recites an abstract idea in the grouping of Mathematical Concepts. The claim recites mathematical calculations to perform a convolution transpose operation, wherein a convolution transpose operation between an input tensor and a plurality of filter weights is described in the specification by equation (2): B_V = A_V * C^T at [0007].  The claim further recites mathematical relationships and calculations to perform the convolution transpose operation by performing a direct convolution between the input tensor and a plurality of sub-filters (weights), wherein a convolution operation is described in the specification by equation (1): B_V = A_V * C at [0006].  These steps recite mathematical steps to perform a convolution transpose by performing convolutions of smaller portions of the weights, and reassembling the full convolution transpose by interleaving results.
Under the Alice framework Step 2A prong 2 analysis, claim 1 recites the following additional elements: one or more convolution engines, and an interleave engine. However, these elements are recited at a high level of generality. For instance, the claims fail to include limitations that detail the structure of the claimed engines, or how they function beyond merely arranging the structures that perform the math, the structures including one or more convolution engines to perform convolution and an interleave engine to interleave. Accordingly, the elements discussed above fail to provide a meaningful limitation on the claimed steps, and amount to no more than mere instructions to apply the exception using generic convolution engines and a generic interleave engine in a manner that flows as a natural consequence of the math.
Moreover, under the Alice Framework Step 2B analysis, the claim, considered individually and as an ordered combination does not include additional elements that are sufficient to amount to significantly more than the abstract idea.  Any purported innovation is in the math, i.e., how the transpose convolution is ordered into smaller convolutions, and a final result is from interleaving results of the convolutions.  It is the math that drives the selection of a generic convolution engine(s) and interleave engine in a manner such that the combination of generic circuits flows as a natural consequence of the math. For these reasons claim 1 does not amount to significantly more than the abstract idea.

Claims 2-13, and 17-18 are rejected for at least the reasons provided with respect to claim 1. Claims 2-6, and 10 merely recite further mathematical limitations to perform the convolution transpose.  Claims 2-6, and 10 recite no further additional elements that would require further analysis under Step 2A prong 2 or Step 2B.
Claim 7 recites the following additional element: a storage unit and reading the output elements of the plurality of sub-output tensors from the storage unit in a predetermined order.  A storage unit and reading output elements of the plurality of sub-output tensors from the storage unit comprises an insignificant extra-solution activity under Step 2A prong 2.  Under Step 2B this element comprises a well understood, routine, and conventional activity.  See D.A. Patterson et al., Computer Organization and Design: The Hardware/Software Interface, Elsevier Science & Technology, 2007, (hereinafter “Patterson”).
Claims 8, 9, 11, and 13 recite additional elements of performing mathematical steps over a plurality of hardware passes (claims 8, 9, and 13), or in first and second hardware passes (claim 11).  Performing mathematical steps over a plurality of hardware passes comprises an insignificant extra-solution activity under Step 2A prong 2.  Under Step 2B this element comprises a well understood, routine, and conventional activity.  See e.g., R.E. Bryant, Pipelined Implementation, Part I, CS:APP Ch 4, Computer Architecture, found at http://www.cs.cmu.edu/afs/cs/academic/class/15349-s02/lectures/class4-pipeline-a.4up.pdf, 2018, (hereinafter “Bryant”), which discloses performing hardware passes in a pipeline configuration, slides 13-18.
Claim 12 recites the additional element of writing the first and second sets of blocks to memory.   Writing the first and second sets of blocks to memory comprises an insignificant extra solution activity under step 2A prong 2.  Under Step 2B this element comprises a well understood, routine, and conventional activity.  See Patterson Ch 1, fig 1.5 which depicts use of memory and writing first and second sets of blocks to memory.
Claim 17 recites the additional element comprising the system of claim 1 in a deep neural network hardware accelerator.  Under the Step 2A prong 2 analysis, implementing the convolution transpose operation in a deep neural network hardware accelerator merely comprises generally linking the mathematical concepts to the technological field of a deep neural network hardware accelerator.  For instance, the claims fail to include limitations that detail the structure of the deep neural network hardware accelerator, or how it functions beyond merely performing convolution steps.
Claim 18 recites the additional element comprising the system of claim 1 embodied in hardware on an integrated circuit.  Under the Step 2A prong 2 analysis, implementing the additional element merely comprises generally linking the mathematical concepts to hardware on an integrated circuit.  For instance, the claims fail to include limitations that detail the structure of the integrated circuit, or how it functions beyond merely performing convolution steps.  Accordingly, the element discussed above fails to provide a meaningful limitation on the claimed steps.  Under Step 2B, the analysis under Step 2A prong 2 applies equally.  There is no innovative concept claimed in the embodiment of an integrated circuit.

Regarding claim 14, under the Alice framework Step 1, the claim recites a method.
Under the Alice framework Step 2A prong 1, claim 14 recites an abstract idea in the grouping of Mathematical Concepts. The claim recites mathematical calculations to perform a convolution transpose operation, wherein a convolution transpose operation between an input tensor and a plurality of filter weights is described in the specification by equation (2): B_V = A_V * C^T at [0007].  The claim further recites mathematical relationships and calculations to perform the convolution transpose operation by dividing the input tensor into a plurality of sub-filters (weights), wherein a convolution operation is described in the specification by equation (1): B_V = A_V * C at [0006].  These steps recite mathematical steps to perform a convolution transpose by performing convolutions of smaller portions of the weights, and reassembling the full convolution transpose by interleaving results.
Under the Alice framework Step 2A prong 2 analysis, claim 14 recites the following additional element: hardware logic. However, this element is recited at a high level of generality. For instance, the claim fails to include limitations that detail the structure of the claimed hardware logic, or how it functions beyond merely arranging the structures that perform the math, the structures including one or more convolution engines to perform convolution and an interleave engine to interleave. Accordingly, the element discussed above fails to provide a meaningful limitation on the claimed steps, and amounts to no more than mere instructions to apply the exception using generic hardware logic in a manner that flows as a natural consequence of the math.
Moreover, under the Alice framework Step 2B analysis, the claim, considered individually and as an ordered combination, does not include additional elements that are sufficient to amount to significantly more than the abstract idea.  Any purported innovation is in the math, i.e., how the transpose convolution is ordered into smaller convolutions, and a final result is from interleaving results of the convolutions.  It is the math that drives the selection of hardware logic in a manner such that the combination of generic circuits flows as a natural consequence of the math. For these reasons claim 14 does not amount to significantly more than the abstract idea.


Regarding claims 19 and 20, under the Alice framework Step 1, the claims recite a machine.
Under the Alice framework Step 2A prong 1, claims 19 and 20 recite an abstract idea in the grouping of Mathematical Concepts. The claims recite mathematical calculations to perform a convolution transpose operation, wherein a convolution transpose operation between an input tensor and a plurality of filter weights is described in the specification by equation (2): B_V = A_V * C^T at [0007].  The claims further recite mathematical relationships and calculations to perform the convolution transpose operation by performing a direct convolution between the input tensor and a plurality of sub-filters (weights), wherein a convolution operation is described in the specification by equation (1): B_V = A_V * C at [0006].  These steps recite mathematical steps to perform a convolution transpose by performing convolutions of smaller portions of the weights, and reassembling the full convolution transpose by interleaving results.
Under the Alice framework Step 2A prong 2 analysis, claims 19 and 20 recite the following additional elements: one or more convolution engines, an interleave engine, a computer readable storage medium having stored thereon a computer readable dataset description (claims 19 and 20), and a layout processing system
Moreover, under the Alice framework Step 2B analysis, the claims, considered individually and as an ordered combination, do not include additional elements that are sufficient to amount to significantly more than the abstract idea.  Any purported innovation is in the math, i.e., how the transpose convolution is ordered into smaller convolutions, and a final result is from interleaving results of the convolutions.  It is the math that drives the selection of a generic convolution engine(s) and interleave engine in a manner such that the combination of generic circuits flows as a natural consequence of the math. For these reasons claims 19 and 20 do not amount to significantly more than the abstract idea.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 3-4, 7-15, and 17-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by US 20180032857 A1, Lele et al., (hereinafter “Lele”).

Regarding claim 1, Lele teaches the following:
A system to perform a convolution transpose operation between an input tensor comprising a plurality of input elements and a filter comprising a plurality of filter weights (abstract, CNN accelerator for system, deconvolution for convolution transpose, fig 3 310 input feature map for input tensor, fig 6A 600 filter comprising a plurality of filter weights) the system comprising: 
one or more convolution engines configured to perform a direct convolution between the input tensor and each of a plurality of sub-filters to generate a plurality of sub-output tensors comprising a plurality of output elements, each sub-filter comprising a subset of the filter weights of
an interleave engine configured to interleave the output elements of the plurality of sub-output tensors to generate a final output tensor for the convolution transpose (Fig 6C, 11A,B, [0061] last sentence,  interlace for interleave by sequencer unit or in another embodiment by transformation elements 1180 as in Fig 11B [0090]).
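
For illustration only, the decomposition characterized in the claim 1 analysis (a direct convolution with each sub-filter, followed by interleaving of the sub-output tensors) can be sketched in one dimension. The sketch is not taken from Lele or from the specification. The function names, the zero-padding convention, and the use of plain Python lists are assumptions made for the example.

```python
def conv_transpose_1d(x, w, stride):
    """Reference result: scatter each input element, scaled by the filter, into the output."""
    out = [0] * ((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):
        for k, wk in enumerate(w):
            out[i * stride + k] += xi * wk
    return out


def direct_conv_1d(x, f):
    """Stride-1 sliding dot product (no filter flip) over a zero-padded input."""
    pad = len(f) - 1
    xp = [0] * pad + list(x) + [0] * pad
    return [sum(xp[m + j] * f[j] for j in range(len(f)))
            for m in range(len(xp) - len(f) + 1)]


def conv_transpose_via_subfilters(x, w, stride):
    """Same result, built from one direct convolution per sub-filter plus interleaving."""
    out = [0] * ((len(x) - 1) * stride + len(w))
    for r in range(stride):
        # Sub-filter r: every stride-th weight starting at offset r, in reverse order.
        sub = w[r::stride][::-1]
        if not sub:
            continue  # filter shorter than the stride: this phase stays zero
        sub_out = direct_conv_1d(x, sub)
        # Interleave: element m of sub-output r lands at position m * stride + r.
        for m, v in enumerate(sub_out):
            out[m * stride + r] = v
    return out
```

Under these assumptions, both functions return the same list for the same arguments, which is the sense in which the interleaved sub-outputs reassemble the full convolution transpose.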

Regarding claim 3, in addition to the teachings addressed in the claim 1 analysis, Lele teaches:
wherein the plurality of sub-filters comprises non-overlapping subsets of the filter weights of the filter ([0055], Fig 6A).

Regarding claim 4, in addition to the teachings addressed in the claim 1 analysis, Lele teaches:
wherein the filter weights of a sub-filter are in a reverse order in the sub-filter with respect to the filter ([0056-0057], Fig 6B).


Regarding claim 7, in addition to the teachings addressed in the claim 1 analysis, Lele teaches:
wherein interleaving the output elements of the plurality of sub-output tensors to form the final output tensor for the convolution transpose comprises storing the output elements of each of the plurality of sub-output tensors in a storage unit and reading the output elements of the plurality of sub-output tensors from the storage unit in a predetermined order to form the final output tensor (Fig 11A,B convolution output stored in buffer before output to sequencer, [0087] convolution kernels in the PE arrays are directly connected to the buffers without routing through any other kernel, [0061] last sentence sequencer coordinates interlacing).

Regarding claim 8, in addition to the teachings addressed in the claim 1 analysis, Lele teaches:
wherein the system is configured to perform the convolution transpose over a plurality of hardware passes of the system (Fig 11B, Fig 17 1707,  [0084] PE arrays time multiplex computations).

Regarding claim 9, in addition to the teachings addressed in the claim 8 analysis, Lele teaches:
wherein the one or more convolution engines are configured to generate a different subset of the plurality of sub-output tensors in each of the plurality of 

Regarding claim 10, in addition to the teachings addressed in the claim 9 analysis, Lele teaches:
wherein each subset of the plurality of sub-output tensors comprise output elements in a same row of the final output tensor (Fig 6C color coding shows each having output elements in a same row of the final output tensor).

Regarding claim 11, in addition to the teachings addressed in the claim 9 analysis, Lele teaches the following:
wherein in a first hardware pass of the system the one or more convolution engines are configured to generate a first subset of the sub-output tensors from a first subset of the plurality of sub-filters and the interleave engine is configured to generate a first set of blocks of the final output tensor by interleaving the output elements of the first subset of the sub-output tensors (Fig 6C upper left subfilter convolution, 17-1707 first of multiple convolutions, Fig 17-1707 interleave results [0129]); and
in a second hardware pass of the system the one or more convolution engines are configured to generate a second subset of the sub-output tensors from a second subset of the plurality of sub-filters and the interleave engine is configured to generate a second set of blocks of the final output tensor by interleaving the output elements of the second subset of the sub-output tensors 

Regarding claim 12, in addition to the teachings addressed in the claim 11 analysis, Lele teaches the following:
wherein the interleave engine is further configured to write the first and second sets of blocks to memory such that the first and second sets of blocks together form at least a portion of the final output tensor  ([0105]).

Regarding claim 13, in addition to the teachings addressed in the claim 8 analysis, Lele teaches the following:
wherein in each of a plurality of hardware passes of the system the one or more convolution engines are configured to generate a portion of one or more of the sub-output tensors based on a portion of the input elements of the input tensor ([0129] sequential for one or more hardware passes in the Fig 11B CONV engine, Fig 6C upper left subfilter convolution, 17-1707 for generate a portion of one or more of the sub-output tensors). 

	Claim 14 is directed to a method that would be practiced by the system of claim 1.  All operation steps performed by the system of claim 1 are performed by the method of claim 14.  The claim 1 analysis applies equally to claim 14.


Regarding claim 15, in addition to the teachings addressed in the claim 14 analysis, Lele teaches:
wherein the convolution transpose is performed in one or more dimensions according to a stride in that direction ([0054]), and dividing the filter into a plurality of sub-filters comprises: 
forming a base block of filter weights from an origin of the filter, the base block having dimensions equal to the stride in each of the one or more dimensions ([0055], stride 2, Fig 6A); and 
forming each sub-filter by: 
forming a filter from the filter weights of the filter at the stride increments in each of the one or more dimensions starting from one of the filter weights in the base block (Fig 6A 611-614 result from increment of 2 weight positions); and 
generating a reflected version of the formed filter (fig 6B).
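
As a hedged illustration of the steps mapped above (a base block sized by the stride, sub-filters formed at stride increments from each base-block weight, then reflection), the following sketch splits a two-dimensional filter. It is an assumption-laden example rather than Lele's circuitry or the applicant's implementation. The function name `split_filter_2d`, the nested-list filter layout, and the `(row offset, column offset)` dictionary keys are hypothetical.

```python
def split_filter_2d(filt, stride_y, stride_x):
    """Return {(oy, ox): sub_filter} for every weight position (oy, ox) in the base block."""
    subs = {}
    for oy in range(stride_y):          # base block rows: 0 .. stride_y - 1
        for ox in range(stride_x):      # base block columns: 0 .. stride_x - 1
            # Take weights at stride increments starting from base-block weight (oy, ox).
            formed = [row[ox::stride_x] for row in filt[oy::stride_y]]
            # Reflect the formed filter in both dimensions.
            subs[(oy, ox)] = [row[::-1] for row in formed[::-1]]
    return subs
```

For a 4 x 4 filter and a stride of 2 in each dimension, this sketch yields four 2 x 2 sub-filters whose weights do not overlap and together cover the whole filter.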

Regarding claim 17, in addition to the teachings addressed in the claim 1 analysis, Lele teaches:
	a deep neural network hardware accelerator comprising the system as set forth in claim 1 (Fig 10A,B a CNN multilayers deep and including convolutional and deconvolutional layers for deep neural network).

Regarding claim 18, in addition to the teachings addressed in the claim 1 analysis, Lele teaches:


Claim 19 is directed to a computer readable storage medium having stored thereon a computer readable dataset description of the system as set forth in claim 1, that when processed in an integrated circuit manufacturing system, causes the integrated circuit manufacturing system to manufacture an integrated circuit embodying said system.  
	The claim 1 mapping applies to the system set forth in claim 1 as in claim 19.  Lele further discloses the system as in claim 1 in a target integrated circuit embodying the system of claim 1 ([0064], fig 8) including a non-transitory computer readable medium having stored thereon a computer readable dataset description of the system ([0149], fig 8).

Claim 20 is directed to an integrated circuit manufacturing system ([0064], fig 8) comprising a computer readable storage medium having stored thereon a computer readable dataset description of the system as set forth in claim 1 (see claim 1 mapping, [0149], fig 8); a layout processing system configured to process the computer readable description so as to generate a circuit layout description of an integrated circuit embodying the system (fig 8); and an integrated circuit generation system configured to manufacture the system according to the circuit layout description (fig 8).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 2, 5-6, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Lele in view of US 20190138898 A1 Song et al., (hereinafter “Song”).

Regarding claim 2, in addition to the teachings addressed in the claim 1 analysis, Lele teaches:
wherein: the input tensor is multi-dimensional (fig 3 310, [0047-0048], k x k x D).
the convolution transpose is performed in a first dimension of the input tensor at a first stride and a second dimension of the input tensor at a second stride ([0045], [0054]); 
Lele does not, however, explicitly disclose the plurality of sub-filters comprises q sub-filters, wherein q is equal to the product of the first stride and the second stride.  However, in the same field of endeavor Song discloses:
the plurality of sub-filters comprises q sub-filters, wherein q is equal to the product of the first stride and the second stride (Fig 6 4 sub-kernels resulting from a stride of 2, 2, stride of 2 in the x direction, stride of 2 in the y direction; [0087-0089]).


Regarding claim 5, in addition to the teachings addressed in the claim 1 analysis, Lele teaches:
wherein the convolution transpose is performed in a first dimension according to a first stride and the interleave engine is configured to interleave the output elements of the plurality of sub-output tensors to form the final output tensor for the convolution transpose by, for each row of the final output tensor, selecting output elements from a set of k sub-output tensors in a round-robin manner, wherein k is equal to the first stride (Fig 6 4 sub-kernels resulting from a stride of 2 in the x direction [0087-0089], fig 5 2 subkernels in x direction forms two sub outputs in the x direction of the sub output tensor, fig 5 showing round robin, fig 7 showing interleave).
It would have been obvious to one of ordinary skill in the art before the effective filing date to substitute Song’s method for dividing the filter into a plurality of sub-filters for Lele’s method of dividing the filter into a plurality of subfilters, and technique for interleaving. It is obvious to one of ordinary skill in the art to perform a simple substitution of one known element for another to obtain predictable results. See MPEP 2141.III.(B).

Regarding claim 6, in addition to the teachings addressed in the claim 5 analysis, Lele teaches:
wherein the convolution transpose is further performed in a second dimension according to a second stride and the interleave engine (2212) is further configured to interleave the output elements of the plurality of sub-output tensors to form the final output tensor for the convolution transpose by, for every jth row of the final output tensor selecting output elements from a same set of k sub-output tensors, wherein j is equal to the second stride (Fig 6 4 sub-kernels resulting from a stride of 2 in the y direction [0087-0089], fig 5 2 subkernels in y direction forms two sub outputs in the y direction of the sub output tensor, fig 5 showing round robin, fig 7 showing interleave).
It would have been obvious to one of ordinary skill in the art before the effective filing date to substitute Song’s method for dividing the filter into a plurality of sub-filters for Lele’s method of dividing the filter into a plurality of subfilters and technique for interleaving. It is obvious to one of ordinary skill in the art to perform a simple substitution of one known element for another to obtain predictable results. See MPEP 2141.III.(B).
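
The row-wise round-robin selection discussed for the interleaving limitations can be sketched as follows, for illustration only. The sketch assumes that every sub-output tensor has the same shape and is keyed by a hypothetical `(row offset, column offset)` pair. Neither assumption, nor the function name `interleave_2d`, comes from Lele, Song, or the specification.

```python
def interleave_2d(sub_outputs, stride_y, stride_x):
    """Interleave equally shaped sub-output tensors into one final output tensor."""
    height = len(sub_outputs[(0, 0)])
    width = len(sub_outputs[(0, 0)][0])
    final = []
    for row in range(height * stride_y):
        # Rows that are stride_y apart draw from the same set of stride_x sub-outputs.
        oy, m = row % stride_y, row // stride_y
        # Within a row, pick from the stride_x sub-outputs in round-robin order.
        final.append([sub_outputs[(oy, col % stride_x)][m][col // stride_x]
                      for col in range(width * stride_x)])
    return final
```

With a stride of 2 in each dimension, each output row alternates between two sub-output tensors, and every second row reuses the same pair.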

Regarding claim 16, in addition to the teachings addressed in the claim 14 analysis, Lele teaches:
wherein the convolution transpose is performed in one or more dimensions according to a stride in that direction (Fig 3, [0047-0048]).
Lele does not, however, explicitly disclose forming the sub-filters from a reflected version of the filter.  However, in the same field of endeavor Song discloses:
dividing the filter into a plurality of sub-filters comprises: 
generating a reflected version of the filter (Fig 5 530); 
forming a base block of filter weights from an origin of the reflected version of the filter, the base block having dimensions equal to the stride in each of the one or more dimensions (Fig 5 530 with dimensions and stride as split in sub-kernel 540, [0079]); and 
forming each sub-filter from the filter weights of the reflected version of the filter at the stride increments in each of the one or more dimensions starting from one of the filter weights in the base block (Fig 5 540).
It would have been obvious to one of ordinary skill in the art before the effective filing date to substitute Song’s method for dividing the filter into a plurality of sub-filters for Lele’s method of dividing the filter into a plurality of subfilters. It is obvious to one of ordinary skill in the art to perform a simple substitution of one known element for another to obtain predictable results. See MPEP 2141.III.(B).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 

US 20200110986 A1 Michiels et al., (hereinafter “Michiels”) discloses an apparatus to perform deconvolution implemented via a set of convolutions applied to segments (abstract, fig 32).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to EMILY E LAROCQUE whose telephone number is (469)295-9289.  The examiner can normally be reached on 10:00am - 12:00pm, 2:00pm - 8:00pm ET M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor Jyoti Mehta can be reached on 571-270-3995.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov.



/EMILY E LAROCQUE/Primary Examiner, Art Unit 2182