DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-5, 8, 10 and 16-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Ferdman et al., U.S. Pub. No. 20190220734 A1.
As per claim 1, Ferdman et al. teach a method for adapting feature data in a convolutional neural network, comprising: selecting a plurality of consecutive layers in the convolutional neural network (see para. [0066] for ... FIGs. 2-3, the convolutional layers 110, 112, . . . , and 114 can be decomposed into two pyramids 203, 205. This organization of the convolutional neural network 100 will require greater off-chip transfer to external memory 106 because the output of convolutional layer 112 of the first pyramid 203 must be written off chip to external memory 106 and then read back on-chip in order to process output feature maps 210 through convolution layer 114 of the second pyramid 205), a sum of sizes of output feature data and related parameters of a last layer in the plurality of consecutive layers being less than or equal to a capacity of a cache for caching data relating to operations of the convolutional neural network (see figs. 1C and 2-3, element 110, and para. [0067] for ... The benefit of the multi-pyramid approach is that the sizes of the feature maps and sizes of three-dimensional tile structures associated with the feature maps of the convolutional layers 110, 112, . . . , and 114 will require decreased on-chip storage); determining an expected number of subdata blocks and a layout position (see para. [0056] for ... the fused layer convolutional neural network 100 exploits inter-layer data locality among feature-map data (e.g., the three-dimensional tile structures), and para. [0058] for ... More specifically, a three-dimensional tile structure of the input feature maps 102 is an input data region from which other data regions, which can also be three-dimensional tile structures. Examiner note: one of ordinary skill in the art would know that data locality or a data region is equivalent to the claimed layout position), a width and a height of each subdata block in the output feature data of the last layer (see figs. 3C-D, elements 322 and 332; the abstract for ... The tile structure of each iteration represents a different subset of data values in the input feature maps; and para. [0088] for ... a certain number of the same intermediate data values can be used to process the output three-dimensional tile structures 322, 332 in the output feature maps 314. Examiner note: one of ordinary skill in the art would know that each three-dimensional tile base pyramid is equivalent to the claimed expected number of subdata blocks having a width and a height; see figs. 3B, 3C and 3D); determining, for each current layer in the plurality of layers starting from the last layer, a layout position, a width, and a height of each subdata block of an input feature data for the current layer according to the layout position, the width, and the height of each subdata block of the output feature data of the current layer, until the layout position, the width, and the height of each subdata block of the input feature data for a first layer in the plurality of layers are determined, each subdata block of the input feature data for the first layer being capable of entirely stored in the cache (see para. [0076] for ... In reasoning out the convolutional processing, an output data region (e.g., three-dimensional tile structure 322) is selected in the output feature maps 314, and then traced backwards through an intermediate data region of the intermediate feature maps 308 (e.g., three-dimensional tile structure 320), to an input data region (e.g., a three-dimensional tile structure 318) in the input feature maps 302 on which the output region of the output feature maps 314 depends. Examiner note: in figs. 3C-D, the process starts from the final output layer (314) and works backwards to the input layer (302) to find the dimensions of the pyramid, which is moved so as to perform the convolutions over the entire input feature map); and determining an actual position of each subdata block of the input feature data for the first layer in the input feature data of the first layer (see para. [0084] for ... The output three-dimensional tile structure 332 is considered the tip of the pyramid, includes 1×1×P output data values, and extends through all P output feature maps. Examiner note: one of ordinary skill in the art would know that the method depicted in figs. 3C-D would imply determining the actual location or position of the three-dimensional tile base pyramid structure until all pyramids of the input feature maps for the first convolutional layer have been processed, because it would otherwise not be possible to perform the convolutions over the entire input feature map).
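By way of illustration only, the backward tracing relied upon in para. [0076] (from the tip of the pyramid to its base) can be sketched as below. This is a minimal sketch and not code from Ferdman or from the application: the function name trace_tile_back is hypothetical, and square tiles, square kernels, a stride of 1, and no padding are assumptions made for the example. With two fused layers using 3×3 filters, a 1×1 output tile traces back to a 5×5 input tile, which agrees with the tile sizes quoted from the reference.

```python
def trace_tile_back(out_size, layers):
    """Return the tile side length needed at each stage, first input to last output.

    layers: list of (kernel, stride) pairs ordered from first layer to last layer.
    Assumes square tiles, square kernels, and no padding (illustrative only).
    """
    sizes = [out_size]
    for kernel, stride in reversed(layers):
        # Producing n outputs per side needs (n - 1) * stride + kernel inputs per side.
        sizes.append((sizes[-1] - 1) * stride + kernel)
    sizes.reverse()  # sizes[0] is now the input tile of the first layer
    return sizes


# Two fused 3x3, stride-1 layers and a 1x1 output tile (the pyramid tip).
print(trace_tile_back(1, [(3, 1), (3, 1)]))  # [5, 3, 1]
```

The list is built from the last layer backwards, mirroring the order in which the claim determines the subdata block dimensions.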
As per claims 2 and 18, Ferdman et al. inherently teach obtaining each subdata block of the input feature data for the first layer according to the actual position, the width and the height of the subdata block (see para. [0084] for ... The output three-dimensional tile structure 332 is considered the tip of the pyramid, includes 1×1×P output data values, and extends through all P output feature maps. Examiner note: one of ordinary skill in the art would know that the method depicted in fig. 3 would imply determining the actual location or position of the three-dimensional tile base pyramid structure, because it would otherwise not be possible to perform the convolutions over the entire input feature map); and storing the subdata block entirely in the cache for a convolutional operation of the first layer (see fig. 1C, element 106; para. [0079] for ... The data values of the output three-dimensional tile structure 322 are stored to one or more on-chip buffers; and para. [0080] for ... When the tip of the pyramid is reached, the data values of the output three-dimensional tile structure 322 for the output feature maps 314 can be written from the one or more on-chip buffers to external memory 106).
As per claim 3, Ferdman et al. inherently teach wherein the sum of sizes (see para. [0067] for ... The benefit of the multi-pyramid approach is that the sizes of the feature maps and sizes of three-dimensional tile structures associated with the feature maps of the convolutional layers. Examiner note: one of ordinary skill in the art would know that addition pyramids are pyramids of numbers in which each number in the upper rows is the sum of the two numbers below it. Therefore, Ferdman does teach the sum of sizes) of the output feature data and related parameters of the last layer is less than or equal to 2/3 of the capacity of the cache (see para. [0067] for ... the benefit of the multi-pyramid approach is that the sizes of the feature maps and sizes of three-dimensional tile structures associated with the feature maps of the convolutional layers 110, 112, . . . , and 114 will require decreased on-chip storage).
As per claim 4, Ferdman et al. inherently teach wherein the output feature data of the last layer has a width and a height equal to those of data obtained by splicing (Examiner note: one of ordinary skill in the art would know that splicing is equivalent to fusing, combining or joining together) all subdata blocks of the output feature data of the last layer together according to the layout position (see figs. 3A-D) of each subdata block without overlapping each other (see para. [0015] for ... The operations further include computing intermediate non-overlapping convolved data values that are associated with the subset of the data values in the current three-dimensional tile structure).
As per claim 5, Ferdman et al. inherently teach wherein the expected number depends on a reference value and a size of the input feature data for each layer of the plurality of layers (see para. [0071] for ... The input feature maps 302 include N different feature maps (e.g., N=3) of example size R×C (e.g., R=7 and C=7). More specifically, there are N-number of feature maps 302, with each of the feature maps 302 including R×C data elements 304. There are associated with the first convolutional layer 307).
As per claim 8, Ferdman et al. inherently teach further comprising: determining an overlapping width and an overlapping height of an overlapping portion between adjacent subdata blocks (see para. [0058] for ... the key to accelerating processing of the convolutional neural network 100 is restructuring the conventional CNN layer-by-layer processing with fusion of adjacent CNN layers and iterative processing across all of the fused layers using data regions of feature-map data (e.g., three-dimensional tile structures)) of the input feature data for the first layer according to the layout position, the width and the height of each subdata block of the input feature data for the first layer and the width and height of the input feature data for the first layer (see fig. 3D and para. [0083] for ... The convolutional layer 307 reuses intermediate overlapping data 330 that is associated with overlapping data 325 and convolves only the new data values (1×5×N) with M filters 306 (3×3×N) across the three-dimensional tile structure 324, producing the intermediate three-dimensional tile structure 328 (e.g., dashed box labeled Intermediate Tile 2). As particularly illustrated in FIG. 3D, the convolutional layer 307 reuses already computed overlapping data 330 in intermediate three-dimensional tile structure 328 that is associated with the overlapping data 325 in intermediate three-dimensional tile structure 324).
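By way of illustration only, the overlap between adjacent input tiles discussed above can be sketched along one axis as follows. This is a minimal sketch under stated assumptions and is not taken from Ferdman or the application: the function name tile_overlap is hypothetical, and the slide step of 1 between adjacent tiles is inferred from the quoted statement that only 1×5×N new data values are convolved for the second 5×5×N tile.

```python
def tile_overlap(tile_size, step):
    """Return (overlap, new) element counts along one axis when a tile slides by `step`.

    overlap: elements shared with the previous tile (reusable data).
    new:     elements not covered by the previous tile (must be fetched/computed).
    """
    if step <= 0:
        raise ValueError("step must be positive")
    overlap = max(tile_size - step, 0)
    new = min(step, tile_size)
    return overlap, new


# A 5-wide input tile sliding by one column shares 4 columns with its
# neighbour and brings in 1 new column (of height 5).
print(tile_overlap(5, 1))  # (4, 1)
```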
As per claim 10, Ferdman et al. inherently teach further comprising: performing operations of the plurality of layers on each subdata block obtained for the first layer to obtain a corresponding output subdata block; and combining all of the obtained output subdata blocks together to obtain an actual output feature data for the last layer (see figs. 3A-3D and para. [0070] for ... example convolutional layers 307, 309 are fused together. Examiner note: one of ordinary skill in the art would know that fused together is functionally equivalent to the claimed subject matter combiner).
As per claim 16, Ferdman et al. inherently teach an apparatus for adapting feature data in a convolutional neural network, comprising: a memory having instructions stored thereon (see figs. 1C or 6, element 106 or 606); and one or more processors (see fig. 6, element 602) configured to execute the instructions, execution of the instructions causing the one or more processors to perform the method (see para. [0012] for ... The system includes a processor device, and a memory device to storing instructions that, when executed by the processing device, cause the processing device to perform the following operations).
As per claim 17, Ferdman et al. teach an apparatus for adapting feature data in a convolutional neural network, comprising: a selector to select a plurality of consecutive layers in the convolutional neural network (see para. [0066] for ... FIG. 2B, the convolutional layers 110, 112, . . . , and 114 can be decomposed into two pyramids 203, 205. This organization of the convolutional neural network 100 will require greater off-chip transfer to external memory 106 because the output of convolutional layer 112 of the first pyramid 203 must be written off chip to external memory 106 and then read back on-chip in order to process output feature maps 210 through convolution layer 114 of the second pyramid 205), a sum of sizes of output feature data and related parameters of a last layer in the plurality of consecutive layers being less than or equal to a capacity of a cache for caching data relating to operations of the convolutional neural network (see figs. 1C and 2-3, element 110, and para. [0067] for ... The benefit of the multi-pyramid approach is that the sizes of the feature maps and sizes of three-dimensional tile structures associated with the feature maps of the convolutional layers 110, 112, . . . , and 114 will require decreased on-chip storage); a splitter (see para. [0063] for ... it should be noted that the convolutional neural network 100 can be partitioned such that pyramid 205 includes multiple convolution layers that are fused. Examiner note: one of ordinary skill in the art would know that a splitter is functionally equivalent to a partition. Hence, Ferdman inherently teaches a splitter) to determine an expected number of subdata blocks and a layout position (see para. [0056] for ... the fused layer convolutional neural network 100 exploits inter-layer data locality among feature-map data (e.g., the three-dimensional tile structures), and para. [0058] for ... More specifically, a three-dimensional tile structure of the input feature maps 102 is an input data region from which other data regions, which can also be three-dimensional tile structures. Examiner note: one of ordinary skill in the art would know that data locality or a data region is equivalent to the claimed layout position), a width and a height of each subdata block in the output feature data of the last layer (see fig. 3, elements 318 or 320 or 324 or 326; the abstract for ... The tile structure of each iteration represents a different subset of data values in the input feature maps; and para. [0077] for ... the convolutional layer 307 reads the input three-dimensional tile structure 318 (e.g., dashed box labeled Input Tile 1) of its input feature maps 302. The input three-dimensional tile structure 318 is considered as the base of the pyramid, includes 5×5×N input data values, and extends through all N input feature maps 302. Examiner note: one of ordinary skill in the art would know that the three-dimensional tile base pyramid is equivalent to the claimed subdata blocks having a width and a height; see figs. 3B, 3C and 3D); determine, for each current layer in the plurality of layers starting from the last layer, a layout position, a width, and a height of each subdata block of an input feature data for the current layer according to the layout position, the width, and the height of each subdata block of the output feature data of the current layer, until the layout position, the width, and the height of each subdata block of the input feature data for a first layer in the plurality of layers are determined, each subdata block of the input feature data for the first layer being capable of entirely stored in the cache (see para. [0076] for ... In reasoning out the convolutional processing, an output data region (e.g., three-dimensional tile structure 322) is selected in the output feature maps 314, and then traced backwards through an intermediate data region of the intermediate feature maps 308 (e.g., three-dimensional tile structure 320), to an input data region (e.g., a three-dimensional tile structure 318) in the input feature maps 302 on which the output region of the output feature maps 314 depends. Examiner note: in fig. 3, the process starts from the final layer and works backwards to find the dimensions of the pyramid, which is moved so as to perform the convolutions over the entire input feature map); and determine an actual position of each subdata block of the input feature data for the first layer in the input feature data of the first layer (see para. [0084] for ... The output three-dimensional tile structure 332 is considered the tip of the pyramid, includes 1×1×P output data values, and extends through all P output feature maps. Examiner note: one of ordinary skill in the art would know that the method depicted in fig. 3 would imply determining the actual location or position of the three-dimensional tile base pyramid structure, because it would otherwise not be possible to perform the convolutions over the entire input feature map).
As per claim 19, Ferdman et al. inherently teach further comprising: an operator configured to perform operations of the plurality of layers for each subdata block of the first layer to obtain a corresponding output subdata block (see fig. 4 and para. [0091] for ... At operation 404, a number of convolutional layers in a pyramid of fused layers is determined for the convolutional neural network being processed).
As per claim 20, Ferdman et al. inherently teach further comprising: a combiner (see figs. 3A-3D and para. [0070] for ... example convolutional layers 307, 309 are fused together. Examiner note: one of ordinary skill in the art would know that fused together is functionally equivalent to the claimed subject matter combiner) configured to combine each output subdata block output from the operator to obtain an actual output feature data of the last layer (see fig. 4, element 420 or 426, and paras. [0097] and [0099] for ... If it is determined at operation 420 that there are more convolutional layers in the pyramid of the convolutional neural network being processed, then at operation 422 the buffered intermediate data are selected as the current set of input feature maps ... If it is determined at operation 426 that there are more three-dimensional structures in the input feature maps of the first convolution layer for processing, then the method 400 iterates through operations 406-426 for each subsequent three dimensional structure in the set of input feature maps in order to process the next pyramid of fused layers, until all pyramids of the input features maps for the first convolutional layer have been processed).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Ferdman et al., U.S. Pub. No. 20190220734 A1, in view of Lele et al., US 20180032857 A1.
As per claim 7, Ferdman et al. do not teach wherein the width and height of each subdata block of the input feature data for the current layer further depends on a width and a height of a kernel of the related parameters for the current layer, strides of the kernel of the related parameters in width and height, and one or more padding quantities for padding the subdata blocks in one or more of width and height at the current layer.
Lele et al. teach wherein the width and height of each subdata block of the input feature data for the current layer further depends on a width and a height of a kernel of the related parameters for the current layer (see para. [0048] for ... In the relationship above, D represents an input depth, and k represents a height and width of a region in an input feature map; and para. [0136] for ... The characteristics of the CNN algorithm may also include sizes and coefficients of filters, and sizes and strides of images to be processed. The CNN feature identification unit 1920 also identifies parameters of the CNN accelerator by identifying parameters for the one or more CNN algorithms that the CNN accelerator is desired to support. The parameters of a CNN algorithm may include a number of kernels to instantiate for each layer identified), strides of the kernel of the related parameters in width and height, and one or more padding quantities for padding the subdata blocks in one or more of width and height at the current layer (see para. [0045] for ... Each layer in a CNN algorithm may include one or more filters, stride, and other parameters. The characteristics of the CNN algorithm may also include sizes and coefficients of filters, and sizes, strides, and padding of images to be processed).
It would have been obvious to one of ordinary skill in the art, at the time of filing or before the effective filing date of the claimed invention, to modify Ferdman to include Lele's teaching that the width and height of each subdata block of the input feature data for the current layer further depend on a width and a height of a kernel of the related parameters for the current layer, by performing convolution and noise filtering to adjust a number of output results and to further identify resources available on a target to implement the CNN accelerator. Such a modification would be utilized to match input and output images and to optimize performance utilizing resources available on a target used to implement the CNN accelerator, as taught by Lele.
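By way of illustration only, the dependence of an input subdata block on kernel size, stride, and padding recited in claim 7 can be sketched along one axis as follows. This is a minimal sketch of the standard convolution index arithmetic, not code from Ferdman, Lele, or the application: the function name input_span and its parameter names are hypothetical, and symmetric zero padding is an assumption. The example uses an input length of 7, matching the example size R=7, C=7 quoted from Ferdman.

```python
def input_span(out_start, out_len, kernel, stride, pad, in_len):
    """Map an output span to the input span it depends on, along one axis.

    Returns (start, length, pad_before, pad_after): start/length index the
    unpadded input, and pad_* count the padding elements the block needs.
    """
    lo = out_start * stride - pad                            # first index, may be negative
    hi = (out_start + out_len - 1) * stride - pad + kernel   # one past the last index
    pad_before = max(0, -lo)
    pad_after = max(0, hi - in_len)
    start = max(lo, 0)
    end = min(hi, in_len)
    return start, end - start, pad_before, pad_after


# 3-wide kernel, stride 1, padding 1, input length 7, output split into 4 + 3.
print(input_span(0, 4, 3, 1, 1, 7))  # (0, 5, 1, 0)
print(input_span(4, 3, 3, 1, 1, 7))  # (3, 4, 0, 1)
```

Applying the same function to the width and to the height gives the two-dimensional block; only the blocks at the border of the feature data receive a non-zero padding quantity.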

Claims 11-12 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Ferdman et al., U.S. Pub. No. 20190220734 A1, in view of JagaNathan et al., US 20190114391 A1.
As per claim 11, Ferdman et al. do not teach wherein the operations of the plurality of layers include an elementwise add operation performed on the output feature data of a prior layer in the plurality of layers and the output feature data of a hind layer after the prior layer in the plurality of layers.
JagaNathan et al. teach wherein the operations of the plurality of layers include an elementwise add operation performed on the output feature data of a prior layer in the plurality of layers and the output feature data of a hind layer after the prior layer in the plurality of layers (see figs. 9 and 10 and paras. [0144]-[0145] for ... a residual connection comprises reinjecting previous representations into the downstream flow of data by adding a past output tensor to a later output tensor ... at the end of the block, the two streams are merged using an element-wise sum).
It would have been obvious to one of ordinary skill in the art, at the time of filing or before the effective filing date of the claimed invention, to modify Ferdman to include JagaNathan's operations of the plurality of layers, which include an elementwise add operation performed on the output feature data of a prior layer in the plurality of layers and the output feature data of a hind layer after the prior layer in the plurality of layers, in order to make the output of an earlier layer available as input to a later layer, effectively creating a shortcut in a sequential network. Such a modification would prevent information loss along the data-processing flow and further enhance the CNN to allow the gradient to flow through the network more easily, as taught by JagaNathan.
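By way of illustration only, the element-wise sum of a prior layer's output with a later layer's output, as described in the quoted passage of JagaNathan, can be sketched as follows. This is a minimal sketch, not code from either reference or the application: the function name elementwise_add is hypothetical, and a single-channel feature map represented as a list of rows is an assumption made to keep the example self-contained.

```python
def elementwise_add(prior, hind):
    """Add two equally shaped 2-D feature maps element by element."""
    same_rows = len(prior) == len(hind)
    if not same_rows or any(len(a) != len(b) for a, b in zip(prior, hind)):
        raise ValueError("feature maps must have the same shape")
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(prior, hind)]


prior_out = [[1, 2], [3, 4]]      # output feature data of the prior layer
hind_out = [[10, 20], [30, 40]]   # output feature data of the hind layer
print(elementwise_add(prior_out, hind_out))  # [[11, 22], [33, 44]]
```

The shape check reflects that an element-wise sum is only defined when both operands cover the same data range.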
As per claim 12, Ferdman and JagaNathan in combination would teach determining a position, a width, and a height of repetitively calculated data in each subdata block of the output feature data of the prior layer, in order to make the output of an earlier layer available as input to a later layer, effectively creating a shortcut in a sequential network. Such a modification would prevent information loss along the data-processing flow and further enhance the CNN to allow the gradient to flow through the network more easily, as taught by JagaNathan.
As per claim 15, Ferdman and JagaNathan in combination would teach, when performing the elementwise add operation, determining an actual data range used for the elementwise add operation in the output feature data of the prior layer according to the position, width and height of the repetitively calculated data in the output feature data of the prior layer, in order to make the output of an earlier layer available as input to a later layer, effectively creating a shortcut in a sequential network. Such a modification would prevent information loss along the data-processing flow and further enhance the CNN to allow the gradient to flow through the network more easily, as taught by JagaNathan.

Allowable Subject Matter
Claims 6, 9 and 13-14 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
           US 20200117993 A1.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EMMANUEL BAYARD whose telephone number is (571)272-3016. The examiner can normally be reached 6-9.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Ahn K Sam, can be reached at 571-272-3044. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/EMMANUEL BAYARD/Primary Examiner, Art Unit 2633