DETAILED ACTION
Claims 1-20 are currently pending in application 16/797210, filed 21 February 2020, which is a CIP of application 16/748375, filed 21 January 2020. It is noted that an IDS has not been filed with this application.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
The term “may be” in each of claim 1 at line 17 and claim 19 at line 16 renders the respective claim indefinite because the claim does not definitely recite whether the output work batch data arrays are combined to form an output feature map. Claims 2-18 are also rejected because they depend from claim 1.

Double Patenting
A rejection based on double patenting of the “same invention” type finds its support in the language of 35 U.S.C. 101 which states that “whoever invents or discovers any new and useful process... may obtain a patent therefor...” (Emphasis added). Thus, the term “same invention,” in this context, means an invention drawn to identical subject matter. See Miller v. Eagle Mfg. Co., 151 U.S. 186 (1894); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Ockert, 245 F.2d 467, 114 USPQ 330 (CCPA 1957).
A statutory type (35 U.S.C. 101) double patenting rejection can be overcome by canceling or amending the claims that are directed to the same invention so they are no longer coextensive in scope. The filing of a terminal disclaimer cannot overcome a double patenting rejection based upon 35 U.S.C. 101.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-20 of this application are patentably indistinct from claims 1-20, respectively, of pending Application No. 16/748375 (reference application). Pursuant to 37 CFR 1.78(f), when two or more applications filed by the same applicant or assignee contain patentably indistinct claims, elimination of such claims from all but one application may be required in the absence of good and sufficient reason for their retention during pendency in more than one application. Applicant is required to either cancel the patentably indistinct claims from all but one application or maintain a clear line of demarcation between the applications. See MPEP § 822.
Claims 1-20 are provisionally rejected on the ground of statutory double patenting as being unpatentable over claims 1-20 of copending Application No. 16/748375 (reference application) because of the following analysis:


Co-Pending Application 16/748375
Instant Application
Claim 1
Claim 1
A computer-implemented method, performed in a neural processing system comprising control processor circuitry and arithmetic logic circuitry, of performing a convolution between an input feature map and convolutional filter data, resulting in an output feature map, the method comprising: obtaining, in the control processor circuitry: one or more dimensional characteristic parameters relating to dimensions of each of a plurality of input work batch data arrays corresponding to a convolution to be performed; and one or more positional characteristic parameters relating to positions of feature map content within the plurality of input work batch data arrays; and performing, in the arithmetic logic processing circuitry, convolutions between: the plurality of input work batch data arrays, generated from the input feature map based at least in part on the one or more dimensional characteristic parameters and the one or more positional characteristic parameters; and one or more work batch filter data arrays corresponding to the convolutional filter data, to produce a plurality of output work batch data arrays which may be combined to generate an output feature map.  
A computer-implemented method, performed in a neural processing system comprising control processor circuitry and arithmetic logic circuitry, of performing a convolution between an input feature map and convolutional filter data, resulting in an output feature map, the method comprising: obtaining, in the control processor circuitry: one or more dimensional characteristic parameters relating to dimensions of each of a plurality of input work batch data arrays corresponding to a convolution to be performed; and one or more positional characteristic parameters relating to positions of feature map content within the plurality of input work batch data arrays; and performing, in the arithmetic logic processing circuitry, convolutions between: the plurality of input work batch data arrays, generated from the input feature map based at least in part on the one or more dimensional characteristic parameters and the one or more positional characteristic parameters; and one or more work batch filter data arrays corresponding to the convolutional filter data, to produce a plurality of output work batch data arrays which may be combined to generate an output feature map.  
Claim 1 of the instant application and claim 1 of co-pending application ‘375 are identical.
Claim 2/1
Claim 2/1
wherein the method comprises: determining one or more locational characteristic parameters relating to the relative locations of the feature map content in each of the plurality of input work batch data arrays; and combining the plurality of output work batch data arrays based on the one or more locational characteristic parameters to generate the output feature map.  

wherein the method comprises: determining one or more locational characteristic parameters relating to the relative locations of the feature map content in each of the plurality of input work batch data arrays; and combining the plurality of output work batch data arrays based on the one or more locational characteristic parameters to generate the output feature map.  
Claim 2/1 of the instant application and claim 2/1 of co-pending application ‘375 are identical.
Claim 3/1
Claim 3/1
wherein the method comprises: determining a plurality of array position data from the input feature map based on the one or more positional characteristics; and determining the positions of feature map content within the plurality of input work batch data arrays based on the plurality of array position data.
wherein the method comprises: determining a plurality of array position data from the input feature map based on the one or more positional characteristics; and determining the positions of feature map content within the plurality of input work batch data arrays based on the plurality of array position data.
Claim 3/1 of the instant application and claim 3/1 of co-pending application ‘375 are identical.
Claim 4/1
Claim 4/1
wherein the method comprises, in the control processor circuitry: receiving convolution configuration data relating to the convolution to be performed; and determining from the convolution configuration data the one or more dimensional characteristic parameters and the one or more positional characteristic parameters.
wherein the method comprises, in the control processor circuitry: receiving convolution configuration data relating to the convolution to be performed; and determining from the convolution configuration data the one or more dimensional characteristic parameters and the one or more positional characteristic parameters.  

Claim 4/1 of the instant application and claim 4/1 of co-pending application ‘375 are identical.
Claim 5/4
Claim 5/4
wherein the method comprises one or more of: determining the one or more positional characteristic parameters at least in part based on a dimension of the one or more work batch filter data arrays; and adjusting the one or more positional characteristic parameters at least in part based on a dimension of the input feature map.  
wherein the method comprises one or more of: determining the one or more positional characteristic parameters at least in part based on a dimension of the one or more work batch filter data arrays; and adjusting the one or more positional characteristic parameters at least in part based on a dimension of the input feature map.  
Claim 5/4 of the instant application and claim 5/4 of co-pending application ‘375 are identical.
Claim 6/1
Claim 6/1
wherein the plurality of input work batch data arrays are generated from the input feature map based at least in part on dimensions of the plurality of output work batch data arrays.  
wherein the plurality of input work batch data arrays are generated from the input feature map based at least in part on dimensions of the plurality of output work batch data arrays. 
Claim 6/1 of the instant application and claim 6/1 of co-pending application ‘375 are identical.
Claim 7/1
Claim 7/1
wherein the one or more positional characteristic parameters relates to an amount of edge elements required by the one or more input work batch data arrays.  
wherein the one or more positional characteristic parameters relates to an amount of edge elements required by the one or more input work batch data arrays.  
Claim 7/1 of the instant application and claim 7/1 of co-pending application ‘375 are identical.
Claim 8/1
Claim 8/1
in which each of the plurality of input work batch data arrays are generated from the input feature map by loading input feature map elements from contiguous areas of the input feature map into each input work batch data array respectively.  
in which each of the plurality of input work batch data arrays are generated from the input feature map by loading input feature map elements from contiguous areas of the input feature map into each input work batch data array respectively.  
Claim 8/1 of the instant application and claim 8/1 of co-pending application ‘375 are identical.
Claim 9/1
Claim 9/1
in which each of the plurality of input work batch data arrays are generated from the input feature map by loading input feature map elements from noncontiguous areas of the input feature map into each input work batch data array respectively. 
in which each of the plurality of input work batch data arrays are generated from the input feature map by loading input feature map elements from noncontiguous areas of the input feature map into each input work batch data array respectively.
Claim 9/1 of the instant application and claim 9/1 of co-pending application ‘375 are identical.
Claim 10/1
Claim 10/1
wherein the method comprises upsampling the one or more positional characteristic parameters.
wherein the method comprises upsampling the one or more positional characteristic parameters.  
Claim 10/1 of the instant application and claim 10/1 of co-pending application ‘375 are identical.
Claim 11/1
Claim 11/1
wherein the method comprises downsampling one or more positional characteristic parameters to determine the amount of input feature map content required by the one or more input work batch data arrays.  
wherein the method comprises downsampling one or more positional characteristic parameters to determine the amount of input feature map content required by the one or more input work batch data arrays.
Claim 11/1 of the instant application and claim 11/1 of co-pending application ‘375 are identical.
Claim 12/1
Claim 12/1
wherein the method comprises upsampling the input work batch data arrays for convolution. 
wherein the method comprises upsampling the input work batch data arrays for convolution.
Claim 12/1 of the instant application and claim 12/1 of co-pending application ‘375 are identical.
Claim 13/2
Claim 13/2
wherein the method comprises determining the one or more positional characteristic parameters based on the convolution configuration data indicating a bilinear deconvolution.  
wherein the method comprises determining the one or more positional characteristic parameters based on the convolution configuration data indicating a bilinear deconvolution.
Claim 13/2 of the instant application and claim 13/2 of co-pending application ‘375 are identical.
Claim 14/2
Claim 14/2
wherein the method comprises receiving output feature map dimensions as part of the convolution configuration data.  
wherein the method comprises receiving output feature map dimensions as part of the convolution configuration data.  
Claim 14/2 of the instant application and claim 14/2 of co-pending application ‘375 are identical.
Claim 15/2
Claim 15/2
wherein the method comprises: receiving a convolutional operation mode as part of the convolution configuration data; and determining the output feature map dimensions using the convolutional operation mode.  
wherein the method comprises: receiving a convolutional operation mode as part of the convolution configuration data; and determining the output feature map dimensions using the convolutional operation mode.  
Claim 15/2 of the instant application and claim 15/2 of co-pending application ‘375 are identical.
Claim 16/1
Claim 16/1
wherein the method comprises generating the input work batch data arrays in the arithmetic logic processing circuitry.  
wherein the method comprises generating the input work batch data arrays in the arithmetic logic processing circuitry.  
Claim 16/1 of the instant application and claim 16/1 of co-pending application ‘375 are identical.
Claim 17/1
Claim 17/1
wherein the method comprises generating the input work batch arrays in the control processor circuitry.  
wherein the method comprises generating the input work batch arrays in the control processor circuitry.
Claim 17/1 of the instant application and claim 17/1 of co-pending application ‘375 are identical.
Claim 18/1
Claim 18/1
wherein the method comprises performing convolutions between the plurality of input work batch data arrays and one or more work batch filter data arrays by storing the plurality of input work batch data arrays and one or more work batch filter data arrays in a data buffer.  
wherein the method comprises performing convolutions between the plurality of input work batch data arrays and one or more work batch filter data arrays by storing the plurality of input work batch data arrays and one or more work batch filter data arrays in a data buffer.  
Claim 18/1 of the instant application and claim 18/1 of co-pending application ‘375 are identical.
Claim 19
Claim 19
A neural processing system comprising: storage circuitry arranged to store an input feature map, convolutional filter data, and an output feature map; control processor circuitry arranged to obtain: one or more dimensional characteristic parameters relating to dimensions of each of a plurality of input work batch data arrays corresponding to the convolution to be performed; and one or more positional characteristic parameters relating to positions of feature map content within the plurality of input work batch data arrays; and arithmetic logic processing circuitry arranged to perform convolutions between: the plurality of input work batch data arrays, generated from the input feature map based at least in part on the one or more dimensional characteristics and the one or more positional characteristics; and one or more work batch filter data arrays corresponding to the convolutional filter data, to produce a plurality of output work batch data arrays which may be combined to generate an output feature map.  
A neural processing system comprising: storage circuitry arranged to store an input feature map, convolutional filter data, and an output feature map; control processor circuitry arranged to obtain: one or more dimensional characteristic parameters relating to dimensions of each of a plurality of input work batch data arrays corresponding to the convolution to be performed; and one or more positional characteristic parameters relating to positions of feature map content within the plurality of input work batch data arrays; and arithmetic logic processing circuitry arranged to perform convolutions between: the plurality of input work batch data arrays, generated from the input feature map based at least in part on the one or more dimensional characteristics and the one or more positional characteristics; and one or more work batch filter data arrays corresponding to the convolutional filter data, to produce a plurality of output work batch data arrays which may be combined to generate an output feature map.  
Claim 19 of the instant application and claim 19 of co-pending application ‘375 are identical.
Claim 20
Claim 20
A non-transitory computer-readable storage medium comprising a set of computer-readable instructions stored thereon which, when executed by at least one processor, cause the at least one processor to output data for controlling the performance of convolutions by: receiving convolution configuration data relating to a convolution to be performed; determining from the convolution configuration data: one or more dimensional characteristic parameters relating to dimensions of each of a plurality of input work batch data arrays corresponding to the convolution to be performed; and one or more positional characteristic parameters relating to positions of feature map content within the plurality of input work batch data arrays; and outputting data for controlling the performance of a convolution between an input feature map and convolutional filter data, based at least in part on the one or more dimensional characteristics and the one or more positional characteristics.
A non-transitory computer-readable storage medium comprising a set of computer-readable instructions stored thereon which, when executed by at least one processor, cause the at least one processor to output data for controlling the performance of convolutions by: receiving convolution configuration data relating to a convolution to be performed; determining from the convolution configuration data: one or more dimensional characteristic parameters relating to dimensions of each of a plurality of input work batch data arrays corresponding to the convolution to be performed; and one or more positional characteristic parameters relating to positions of feature map content within the plurality of input work batch data arrays; and outputting data for controlling the performance of a convolution between an input feature map and convolutional filter data, based at least in part on the one or more dimensional characteristics and the one or more positional characteristics.
Claim 20 of the instant application and claim 20 of co-pending application ‘375 are identical.


This is a provisional statutory double patenting rejection because the patentably indistinct claims have not in fact been patented.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.



Claims 1-20 are rejected under 35 U.S.C. 101 because the claims are directed to an abstract idea, and because the claims as a whole, considering all claim elements both individually and in combination, do not amount to significantly more than the abstract idea. See Alice Corporation Pty. Ltd. v. CLS Bank International, et al., 573 U.S. 208 (2014). In determining whether the claims are subject matter eligible, the Examiner applies the 2019 USPTO Patent Eligibility Guidelines (2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50, Jan. 7, 2019).
Step 1: Is the claim to a process, machine, manufacture, or composition of matter? Yes: claim 1 recites a method, which is a process. Claims 19 and 20 recite a system and a product, respectively.
Step 2A, prong one: Does claim 1 recite an abstract idea, law of nature or natural phenomenon? Yes: the limitations of “performing a convolution between an input feature map and convolutional filter data, resulting in an output feature map”, “performing convolutions between: the plurality of input work batch data arrays … and one or more work batch filter data arrays corresponding to the convolutional filter data”, “to produce a plurality of output work batch data arrays”, and “combined to generate an output feature map”, as drafted, are mathematical steps of performing convolution operations between sets of arrays to generate a set of convolution outputs and of combining those arrays to form an output array/feature map, in which one of the sets of arrays used in the convolution (the input work batch data arrays) is formed/mathematically computed/generated from input arrays according to obtained parameters. These limitations therefore fall within the mathematical concepts group.
Step 2A, prong two: Does the claim recite additional elements that integrate the judicial exception into a practical application? No: the judicial exception is not integrated into a practical application. Although the claim recites “A computer-implemented method”, “a neural processing system”, “control processor circuitry”, and “arithmetic logic circuitry”, the computer is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Further, the element “a neural processing system” is recited at a high level of generality that merely generally links the judicial exception to a particular technological environment and does not impose a meaningful limitation on the judicial exception. In addition, in the limitation “obtaining … one or more dimensional characteristic parameters relating to dimensions of each of a plurality of input work batch data arrays corresponding to a convolution to be performed; and one or more positional characteristic parameters relating to positions of feature map content within the plurality of input work batch data arrays; … generated from the input feature map based at least in part on the one or more dimensional characteristic parameters and the one or more positional characteristic parameters”, the functions of obtaining parameters for use in the mathematical steps and of forming arrays according to those parameters are mere data gathering steps that are recited at a high level of generality and do not impose a meaningful limit on the judicial exception.
Step 2B: Does the claim recite additional elements that amount to significantly more than the judicial exception? No: the only limitation on the performance of the described method is that it must be computer implemented, with other limitations reciting “a neural processing system”. These elements are insufficient to transform a judicial exception into a patentable invention because the recited elements are considered insignificant extra-solution activity (generic computer system, processing resources, links the judicial exception to a particular technological environment). The claim thus recites computing components only at a high level of generality such that it amounts to no more than mere instructions to apply the exception using generic computer components; mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. In addition, the claimed extra-solution data gathering (obtaining data/parameters/characteristics/arrays and forming arrays using parameters) is acknowledged to be well-understood, routine, conventional activity (see, e.g., court recognized WURC examples in MPEP 2106.05(d)(II)(i)).
Taken alone, the additional elements do not amount to significantly more than the above-identified judicial exception (the abstract idea). Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements taken individually. There is no indication that the combination of elements improves the functioning of a computer or improves any other technology. Their collective functions merely provide conventional computer implementation.
For the reasons above, claim 1 is rejected as being directed to non-patentable subject matter under § 101. This rejection applies equally to independent claim 19, which recites a (neural) processing system including storage circuitry (for the feature maps and convolutional filter data) and a computer-readable storage medium with computer-readable instructions; it is noted that storing and obtaining data are data gathering steps.
With respect to claim 20, the corresponding analysis is as follows: Step 2A, prong one: Does claim 20 recite an abstract idea, law of nature or natural phenomenon? Yes: the limitations of “determining from the convolution configuration data: one or more dimensional characteristic parameters relating to dimensions of each of a plurality of input work batch data arrays corresponding to the convolution to be performed; and one or more positional characteristic parameters relating to positions of feature map content within the plurality of input work batch data arrays”, as drafted, are mental steps of determining parameters from (configuration) data, which are acts of observation (and which can be performed with pen and paper). These limitations therefore fall within the mental processes group.
Step 2A, prong two: Does the claim recite additional elements that integrate the judicial exception into a practical application? No: the judicial exception is not integrated into a practical application. Although the claim recites “A non-transitory computer-readable storage medium”, “a set of computer-readable instructions stored thereon”, and “at least one processor”, the computer (processor, storage) is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Further, the element “controlling the performance of convolutions” is recited at a high level of generality that merely generally links the judicial exception to a particular technological environment and does not impose a meaningful limitation on the judicial exception. In addition, in the limitations “output data …”, “receiving convolution configuration data relating to a convolution to be performed”, and “outputting data for … of a convolution between an input feature map and convolutional filter data, based at least in part on the one or more dimensional characteristics and the one or more positional characteristics”, the functions of receiving data and outputting data/parameters, for use in the mental step of determining parameters from data and for controlling the performance of the convolution, are mere data gathering steps that are recited at a high level of generality and do not impose a meaningful limit on the judicial exception.
Step 2B: Does the claim recite additional elements that amount to significantly more than the judicial exception? No: the only limitation on the performance of the described method is that it must be computer implemented, with other limitations reciting “controlling the performance of convolutions”. These elements are insufficient to transform a judicial exception into a patentable invention because the recited elements are considered insignificant extra-solution activity (generic computer system, processing resources, links the judicial exception to a particular technological environment). The claim thus recites computing components only at a high level of generality such that it amounts to no more than mere instructions to apply the exception using generic computer components; mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. In addition, the claimed extra-solution data gathering (obtaining and outputting data/parameters/characteristics/arrays) is acknowledged to be well-understood, routine, conventional activity (see, e.g., court recognized WURC examples in MPEP 2106.05(d)(II)(i)). In addition, it is noted that the limitation of “controlling the performance of a convolution between an input feature map and convolutional filter data, based at least in part on the one or more dimensional characteristics and the one or more positional characteristics” is also well-understood, conventional, and routine (see, for example, Judd et al. (“Stripes: Bit-Serial Deep Neural Network Computing”, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2016, pp. 1-12), who teach a control architecture for efficiently processing the computations in a CNN, including the partitioning of feature maps into blocks/batches and convolving them with kernel filter weights across a set of processing elements (e.g., [p. 3, Section IV, Figures 1, 2])).
Taken alone, the additional elements do not amount to significantly more than the above-identified judicial exception (the abstract idea). Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements taken individually. There is no indication that the combination of elements improves the functioning of a computer or improves any other technology. Their collective functions merely provide conventional computer implementation.
As to dependent claims 2-7, 13, and 15, additional limitations are recited that fall under Step 2A, prong one, as mental steps:
Claim 2: “determining one or more locational characteristic parameters relating to the relative locations of the feature map content in each of the plurality of input work batch data arrays” (can be performed with pen and paper)
Claim 3: “determining a plurality of array position data from the input feature map based on the one or more positional characteristics” (can be performed with pen and paper)
Claim 3: “determining the positions of feature map content within the plurality of input work batch data arrays based on the plurality of array position data” (can be performed with pen and paper)
Claim 4: “determining from the convolution configuration data the one or more dimensional characteristic parameters and the one or more positional characteristic parameters” (can be performed with pen and paper)
Claim 5: “determining the one or more positional characteristic parameters at least in part based on a dimension of the one or more work batch filter data arrays” (can be performed with pen and paper)
Claim 6: “wherein the plurality of input work batch data arrays are generated from the input feature map based at least in part on dimensions of the plurality of output work batch data arrays” (can be performed with pen and paper)
Claim 7: “wherein the one or more positional characteristic parameters relates to an amount of edge elements required by the one or more input work batch data arrays” (can be performed with pen and paper)
Claim 13: “wherein the method comprises determining the one or more positional characteristic parameters based on the convolution configuration data indicating a bilinear deconvolution” (observation; can be performed with pen and paper)
Claim 15: “determining the output feature map dimensions using the convolutional operation mode” (observation; can be performed with pen and paper)
In addition, it is noted that claims 2, 3, 5, 6, 10-12, and 16-18 recite additional limitations that fall under Step 2A, Prong 1 as mathematical steps in the mathematical concepts group:
Claim 2: “combining the plurality of output work batch data arrays based on the one or more locational characteristic parameters to generate the output feature map”
Claim 3: “combining the plurality of output work batch data arrays based on the one or more locational characteristic parameters to generate the output feature map”
Claim 5: “and adjusting the one or more positional characteristic parameters at least in part based on a dimension of the input feature map.”
Claim 6: “wherein the plurality of input work batch data arrays are generated from the input feature map based at least in part on dimensions of the plurality of output work batch data arrays.”
Claim 10: wherein the method comprises upsampling the one or more positional characteristic parameters  
Claim 11: wherein the method comprises downsampling one or more positional characteristic parameters to determine the amount of input feature map content required by the one or more input work batch data arrays
Claim 12: wherein the method comprises upsampling the input work batch data arrays for convolution
Claim 16: wherein the method comprises generating the input work batch data arrays …
Claim 17: wherein the method comprises generating the input work batch arrays …
Claim 18: wherein the method comprises performing convolutions between the plurality of input work batch data arrays and one or more work batch filter data arrays…
In addition, claims 4, 8, 9, and 13-18 recite additional elements to be addressed at Step 2A, Prong 2 and at Step 2B as follows: 
Claims 4, 16, 17, and 18 recite the generic computer components “control processor circuitry” (claims 4, 16, and 17) and “a data buffer” (claim 18). Claims 4, 8, 9, 14, 15, and 18 recite the data gathering steps of “receiving convolution configuration data relating to the convolution to be performed” (claim 4), “in which each of the plurality of input work batch data arrays are generated from the input feature map by loading input feature map elements from contiguous areas of the input feature map into each input work batch data array respectively” (claim 8), “in which each of the plurality of input work batch data arrays are generated from the input feature map by loading input feature map elements from noncontiguous areas of the input feature map into each input work batch data array respectively” (claim 9), “wherein the method comprises receiving output feature map dimensions as part of the convolution configuration data” (claim 14), “receiving a convolutional operation mode as part of the convolution configuration data” (claim 15), and “by storing the plurality of input work batch data arrays and one or more work batch filter data arrays in a data buffer” (claim 18). Further, the element of “bilinear convolution” is recited at a high level of generality that merely generally links the judicial exception to a particular, respective, technological environment and does not impose a meaningful limitation on the judicial exception. In addition, the claimed extra-solution data gathering (forming training and validation sub-sets) is acknowledged to be well-understood, routine, conventional activity (see, e.g., court-recognized WURC examples in MPEP 2106.05(d)(II)(i)). It is further noted that the functionality of generating the input work batch data array in either the arithmetic logic processing circuit or the control processor circuitry is also well-known and understood (see, for example, Huang et al. US20180061059, Published 1 March 2018 [0060, 0090, Figure 7, Figure 11]).
In summary, as shown in the analysis above, claims 1-20 do not provide any additional elements that, when considered individually or as an ordered combination, amount to significantly more than the abstract idea identified. Therefore, as a whole, claims 1-20 do not recite what the courts have identified as “significantly more”. In particular, there is no indication that the combination of elements improves the functioning of a computer or improves another technology when the claims are considered individually or as an ordered combination.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-9, 11-12, and 14-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Huang et al. (US20180173571, Published 21 June 2018), hereinafter referred to as Huang.

In regards to claim 1, Huang teaches A computer-implemented method, performed in a neural processing system comprising control processor circuitry and arithmetic logic circuitry, of performing a convolution between an input feature map and convolutional filter data, resulting in an output feature map, the method comprising: ([0064, 0126, Figure 1, Figure 2, Figure 9, Figure 10, Figure 11] The computing unit 207 may comprise arrays of calculation circuits. The calculation circuits may include arithmetic logic units (ALUs). The ALUs may be in the arrays that are connected via a network which may depend on the dataflow requirements., FIG. 10 illustrates that multiplexers facilitate implementation of a distinct memory access pattern in convolution computation on the chip. The one or more multiplexers 1001 may receive a set of control signals 1007 to select one of a predetermined plurality of routes for transmitting data to one of the plurality of multipliers. The control signals may be decoded from various commands for the multiplexers. These control signals include activation function enablement, and also the input source selection for computation, either from the image buffer or from the output of the previous layer, selection of parameters, biases, input feature maps address (i.e., slice index, offset in slice), size of parameters or input data and so forth., wherein a CNN implementation framework comprises various processing elements (circuitry) for performing the convolution filtering of input feature maps at any layer which generate output feature maps for that layer that are in turn fed into the next layer as the input feature map for the following layer (Figure 1) such that this circuitry includes physical control functionality for allocating instructions and data required for that convolution to on-chip memory (e.g., Figure 2, Figure 9) and for orchestrating the convolution operation itself using arithmetic logic circuitry (Figure 2, Figure 11).)
obtaining, in the control processor circuitry: one or more dimensional characteristic parameters relating to dimensions of each of a plurality of input work batch data arrays corresponding to a convolution to be performed; ([0093, 0094, Figure 3, Figure 7, Figure 10, Figure 11] For example, the parameters and input data may still be stored in a contiguous space which can be identified by an address and a size of the space. The contiguous space may be divided into one or more contiguous regions. The contiguous space or region may be divided into one or more contiguous slices. The slices may be identified by an offset address according to a base address of the contiguous region and a size of the slice. In some cases, the size of the slice may be variable depending on the total size of the parameters in a layer and the total number of slices. The total number of slices may be a variable or a fixed number. In some cases, the total number of slices and the number of units along the slice direction together define a data block which is to be processed by the computing unit in a batch manner. In some cases, the size of the slice may be a pre-determined size whereas the total number of slices may be variable. The size of the slice may vary in a wide range such as from 1 byte to thousands of bytes. For example, given an input image having 128x128 pixels in three channels and a first layer of a CNN having 16 5x5 kernels in three channels, the system can choose to have eight slices for storing the input image. The size of each slice can then be 8,192B (2 to the 13th power) to fit all the features in the input image. This size also allows padding so as to utilize one of the predetermined chip layouts, as further discussed below., In some cases, the contiguous space in the RAM may have the same size as the contiguous space in the main memory. The contiguous in the RAM may receive the data from the contiguous space in the main memory without alternating the arrangement of the data. In some cases, information regarding the data arrangement may also be transmitted to the RAM. Such information may include address and size of the contiguous space, address and size of the contiguous region, slice number, slice index, offset within a contiguous region and the like., wherein the CNN implementation framework determines the slice/partition/block (input work batch data array) of the feature map input into a given layer to be allocated to chip memory for convolution batch processing (Figure 11) such that the dimensions of this slice are determined according to various criteria (such as the size of the RAM/SRAM memory, dimensions of the IFM, particularly the number of channels but also according to other dimensional parameters such as the dimensions of the kernel filter) and are used/obtained by the control circuitry for orchestrating the transfer of that batch data to on-chip memory.)
and one or more positional characteristic parameters relating to positions of feature map content within the plurality of input work batch data arrays; ([0093, 0094, 0096, Figure 7, Figure 11] The slices may be identified by an offset address according to a base address of the contiguous region and a size of the slice. In some cases, the size of the slice may be variable depending on the total size of the parameters in a layer and the total number of slices. … This size also allows padding so as to utilize one of the predetermined chip layouts, as further discussed below., In some cases, information regarding the data arrangement may also be transmitted to the RAM. Such information may include address and size of the contiguous space, address and size of the contiguous region, slice number, slice index, offset within a contiguous region and the like., The instruction may comprise at least an address and a size that together identify a contiguous space storing the input data and an address and a size that identify a contiguous space storing the parameters of the CNN model. For instance, the size of the input data or parameters may be specified by the data-width operand in the data transfer instructions. In some cases, the data transfer instruction may also include an offset of the slice and the pre-determined size of the slice within a group/layer., wherein this control circuitry uses/obtains various positional parameters for determining the location, span, and content of each slice/block within the input feature map corresponding to the input work batch data array according to various positional characteristic parameters which include various relative position/spatial parameters such as the offset (related to the filter dimension) and padding such that the circuitry thereby forms a mapping of the corresponding regions/slices in the input feature map into the main memory for transference to on-chip memory using the same organized memory structure (Figure 7, Figure 11) and wherein, in a more general sense, the organization and indexing of the slices also forms positional characteristic parameters.)
and performing, in the arithmetic logic processing circuitry, convolutions between: the plurality of input work batch data arrays, generated from the input feature map based at least in part on the one or more dimensional characteristic parameters and the one or more positional characteristic parameters; ([0105, 0106, Figure 8, Figure 11, Figure 13] In some embodiments, input feature map within a slice may be arranged such that all items stored at the same offset from starting points of the number of slices are used for the parallel operations. In this way, a chunk of input feature map data to be identified as index of slices or number of slices and offset or number of rows. The chunk of input feature map data may be supplied to a plurality of multipliers for convolution operations in parallel. The chunk of input feature map data may be a data block comprising one or more rows and one or more slices., The number of rows and slices to be processed in parallel may correspond to different configurations of data storage. In some cases, when data are arranged in the same configuration, the same sets of calculation circuits and interconnect configurations can be used for performing the convolution operations. … In some cases, input data or parameter data may not be aligned with a pre-determined configuration of data storage while pertaining to the channels or filter sizes. In this case, the input data or parameter data may be padded with zeros such that the data arrangement may be aligned with a pre-determined configuration of the chip …. In the case when the input data is image data 801 with dimension of 128x128 pixel and three channels, the input data may be padded with a row of zeros such that the input data with original dimension of 128x128x3 is transformed to 128x64x8 which is aligned with a 4-row query configuration. In the example when the parameters are from K kernels each is 5x5 in size across eight channels 803 (i.e., 5x5x3), the parameters may be arranged and padded with zeros such that the parameters data are transformed to 5x3x8 to be aligned with the 4-row query configuration. It should be noted that zeros can be placed in various locations such as to the top or bottom of the rows, or to the first or last columns so as to complete the size of the kernel to be times of 4 or complete the number of channels to be times of 4., wherein the CNN implementation framework executes the CNN convolution for each of the slices/blocks (input work batch data arrays) that were generated/organized according to the positional parameters (e.g., padding) and dimensional parameters (size of the batch data, size of slice, size of filter).)
and one or more work batch filter data arrays corresponding to the convolutional filter data, to produce a plurality of output work batch data arrays which may be combined to generate an output feature map. ([0131, Figure 1, Figure 13] In each cycle, the computing unit may be able to handle a plurality of input values in parallel. In the depicted example, the computing unit may be capable of handling 128 input feature map data and 128 parameters in parallel. The 128 multipliers may be configured to perform multiplication in parallel and each of the plurality of accumulators 1103 may sum the outputs of four multipliers and accumulate the partial sum results for one or more cycles. Then the accumulated partial results may further be summed and accumulated by one or more accumulators to yield a final output of a convolution layer., wherein the CNN implementation framework performs the convolution filtering across the input feature map by successively processing over successive cycles a different set of slices/blocks (input work batch data arrays) in combination with kernel filter arrays (work batch filter data arrays) which are stored locally with the slices, thereby generating a set of output arrays (output work batch data arrays) such that each output array corresponds to the result of processing a slice and such that these output work batch data arrays are accumulated over those successive cycles and combined to form the output feature map for a given layer.)
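For illustration only, the general technique discussed for claim 1 (dividing an input feature map into batches, convolving each batch with the filter, and combining the per-batch outputs into one output feature map) can be sketched as below. This is a minimal sketch under stated assumptions: it is not Huang's circuitry and not the applicant's implementation. The function names, the use of horizontal row bands as the batches, and the stride-1 unpadded convolution are all hypothetical choices made for the example.

```python
def conv2d_valid(x, k):
    """Stride-1, unpadded 2D convolution (cross-correlation) on nested lists."""
    kh, kw = len(k), len(k[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [
        [
            sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def batched_conv2d(x, k, rows_per_batch):
    """Convolve x with k one row band (hypothetical 'work batch') at a time."""
    kh = len(k)
    total_out_rows = len(x) - kh + 1
    out = []
    # r0 plays the role of a positional parameter: the row offset of the batch.
    for r0 in range(0, total_out_rows, rows_per_batch):
        # rows acts as a dimensional parameter: output rows this batch produces.
        rows = min(rows_per_batch, total_out_rows - r0)
        # The input batch needs kh - 1 extra edge rows beyond its output rows.
        batch = x[r0 : r0 + rows + kh - 1]
        # Combining is concatenation in offset order.
        out.extend(conv2d_valid(batch, k))
    return out


if __name__ == "__main__":
    feature_map = [[r * 5 + c for c in range(5)] for r in range(6)]
    kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
    # The batched result should match the single-pass result.
    print(batched_conv2d(feature_map, kernel, 2) == conv2d_valid(feature_map, kernel))
```

In this sketch each batch overlaps its neighbour by `kh - 1` input rows, which is why the offset and the batch height together determine what part of the input feature map each batch must contain.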

In regards to claim 2, the rejection of claim 1 is incorporated and Huang further teaches wherein the method comprises: determining one or more locational characteristic parameters relating to the relative locations of the feature map content in each of the plurality of input work batch data arrays; ([0093, 0104, Figure 1, Figure 7, Figure 11] At step 407, the input data, various CNN model parameters and associated data may be transmitted from the main memory to a random-access-memory (RAM) on the chip. The data to be transmitted may comprise the arranged input data, parameters and other data such as biases, instruction sets associated with the selected CNN model. The data may be loaded to the on-chip RAM and stored in a similar manner as the data stored in the main memory. For example, the parameters and input data may still be stored in a contiguous space which can be identified by an address and a size of the space. The contiguous space may be divided into one or more contiguous regions. The contiguous space or region may be divided into one or more contiguous slices. The slices may be identified by an offset address according to a base address of the contiguous region and a size of the slice., FIG. 7 shows an exemplary arrangement of input features stored into slices within a contiguous region. As illustrated in the figure, the input feature map may be 4x4 (i.e., HxW) in plane dimension across eight channels C0-C7. The contiguous space for storing the parameters may be divided into eight slices Is1-8. In the depicted examples, every row offset pointing to four rows and every two columns/slices may together store parameters corresponding to a point of a 2D plane (i.e., H0W0Ci) in a filter across eight channels. The number of slices corresponding to a point may be determined by the number of channels. For example, when there are four channels, 1 slice may be enough for storing the parameters with respect to a point. In another example, when there are 16 channels, four slices may be used to store the parameters with respect to a point., wherein this control circuitry determines/uses/obtains various locational parameters for determining the array index location in the contiguous memory structure (such as based on an offset) associated with the input feature map array corresponding to each slice/block within that input feature map (as shown, for example, in Figure 7 and Figure 11, such that this information characterizes and organizes the correspondence between the content of particular elements/regions in a given feature map in that memory structure with the content of particular slices that are to be processed).)
and combining the plurality of output work batch data arrays based on the one or more locational characteristic parameters to generate the output feature map. ([0039, 0040, 0134, Figure 1, Figure 13] For example, given the shape of the input feature map plane with size of HxH (i.e., weight and height) across C channels, and N filters each has C channels with filter plane dimension RxR (i.e., weight and height), the computation of the convolution layer may be defined as: … Where O, I, W and B represent the matrices of the output features maps, input features maps, filters and biases, respectively. U represents the stride size., In some cases, after a batch of the input feature map data or data block (e.g., eight slices by four rows) are finished with processing, the offset may be increased by the data block size (e.g., four rows) and the next batch of data are fetched to the computing unit repeatedly until all of the parameters and input feature map are processed for a layer of operation., wherein, as previously noted, the processing of input work batch arrays generates an output array for each slice/block (output work batch array) that are accumulated and combined across successive processing cycles such that this combination takes into account the locational data (input array indices, offset parameters) to construct that output feature map according to the convolution filtering functionality thereby enabling that output feature map to be organized and processed similarly at a succeeding layer.)

In regards to claim 3, the rejection of claim 1 is incorporated and Huang further teaches wherein the method comprises: determining a plurality of array position data from the input feature map based on the one or more positional characteristics; and determining the positions of feature map content within the plurality of input work batch data arrays based on the plurality of array position data. ([0104, 0106, Figure 7, Figure 8, Figure 11] FIG. 7 shows an exemplary arrangement of input features stored into slices within a contiguous region. As illustrated in the figure, the input feature map may be 4x4 (i.e., HxW) in plane dimension across eight channels C0-C7. The contiguous space for storing the parameters may be divided into eight slices Is1-8. In the depicted examples, every row offset pointing to four rows and every two columns/slices may together store parameters corresponding to a point of a 2D plane (i.e., H0W0Ci) in a filter across eight channels. The number of slices corresponding to a point may be determined by the number of channels. For example, when there are four channels, 1 slice may be enough for storing the parameters with respect to a point. In another example, when there are 16 channels, four slices may be used to store the parameters with respect to a point., The number of rows and slices to be processed in parallel may correspond to different configurations of data storage. In some cases, when data are arranged in the same configuration, the same sets of calculation circuits and interconnect configurations can be used for performing the convolution operations. … In some cases, input data or parameter data may not be aligned with a pre-determined configuration of data storage while pertaining to the channels or filter sizes. In this case, the input data or parameter data may be padded with zeros such that the data arrangement may be aligned with a pre-determined configuration of the chip …. In the case when the input data is image data 801 with dimension of 128x128 pixel and three channels, the input data may be padded with a row of zeros such that the input data with original dimension of 128x128x3 is transformed to 128x64x8 which is aligned with a 4-row query configuration. In the example when the parameters are from K kernels each is 5x5 in size across eight channels 803 (i.e., 5x5x3), the parameters may be arranged and padded with zeros such that the parameters data are transformed to 5x3x8 to be aligned with the 4-row query configuration. It should be noted that zeros can be placed in various locations such as to the top or bottom of the rows, or to the first or last columns so as to complete the size of the kernel to be times of 4 or complete the number of channels to be times of 4., wherein this control circuitry determines position data in the feature map to be processed (such as array location indices) using/taking into account various parameters which include positional characteristics such as padding (Figure 8) as well as offset information (Figure 11) to thereby form a mapping from particular elements/regions in the input feature map to be processed to the slices (input work batch data arrays).)

In regards to claim 4, the rejection of claim 1 is incorporated and Huang further teaches wherein the method comprises, in the control processor circuitry: receiving convolution configuration data relating to the convolution to be performed; and determining from the convolution configuration data the one or more dimensional characteristic parameters and the one or more positional characteristic parameters. ([0044, 0045, 0046, 0084, 0092, Figure 7, Figure 8, Figure 11] Stride controls how depth columns around the spatial dimensions (width and height) are allocated. When the stride is 1, a new depth column of neurons is allocated to spatial positions only one spatial unit apart. This leads to heavily overlapping receptive fields between the columns, and also to large output volumes., Sometimes it is convenient to pad the input with zeros on the border of the input volume. The size of this zero-padding is another hyper-parameter. Zero padding provides control of the output volume spatial size. In particular, sometimes it is desirable to exactly preserve the spatial size of the input volume., The spatial size of the output volume can be computed as a function of the input volume size W, the kernel field size of the convolution layer neurons K, the stride with which they are applied S and the amount of zero padding P. The formula for calculating how many neurons fit in a given volume is given by (W - K + 2P)/S + 1. If this number is not an integer, then the strides are set incorrectly and the neurons cannot be tiled to fit across the input volume in a symmetric way. In general, setting zero padding to be P = (K - 1)/2 when the stride is S = 1 ensures that the input volume and output volume will have the same size spatially., For example, the instructions may include high-level instructions corresponding to layers of the CNN such as types of layers (e.g., convolution, pooling, upscale, etc), low-level instructions corresponding to different types of operations including but not limited to convolution, elementwise convolution, upscale, return, or pooling at matrix/matrix or vector/matrix data level., At step 405, the main processor may arrange the input data and store the data into a space on the main memory. The data stored in the main memory may be the raw or processed input data rearranged by the main processor. For example, the processed input data may be a down sized image data or a segmented image data. In some cases, the input data may be arranged according to the selected CNN models. In some cases, the input data may be arranged according to a pre-determined configuration of the chip, which determines the CNN dataflow or data transmission routes. In some cases, the input data may be arranged and zero-padded to conform to the pre-determined configuration of the chip for dataflow or data transmission route in the CNN system., wherein the control circuitry receives/makes use of various parameters that specify the configuration of the CNN such as chip layout/memory characteristics, hyperparameters (stride, depth, presence or absence of padding), and layer type/connectivity (including upscaling/upsampling, pooling/downsampling) upon which the dimensional characteristics of the input work batch data array (size, span) and positional characteristics of the input work batch data array (padding, upsizing/upsampling) depend.)

In regards to claim 5, the rejection of claim 4 is incorporated and Huang further teaches wherein the method comprises one or more of: determining the one or more positional characteristic parameters at least in part based on a dimension of the one or more work batch filter data arrays; adjusting the one or more positional characteristic parameters at least in part based on a dimension of the input feature map ([0005, 0046, 0093, 0100, Figure 6, Figure 7, Figure 8, Figure 12B] In some cases, the method may further comprise: for the one layer, determining a number of slices based on the filter size, that the contiguous space is divided into a plurality of regions, each region being contiguous; and dividing an area within one of the plurality of regions into at least the number of slices, each slice being contiguous, that the storing includes arranging the items across the number of slices such that all items stored at the same offset from starting points of the number of slices are used for the parallel operations., The spatial size of the output volume can be computed as a function of the input volume size W, the kernel field size of the convolution layer neurons K, the stride with which they are applied S and the amount of zero padding P. The formula for calculating how many neurons fit in a given volume is given by (W - K + 2P)/S + 1. If this number is not an integer, then the strides are set incorrectly and the neurons cannot be tiled to fit across the input volume in a symmetric way. In general, setting zero padding to be P = (K - 1)/2 when the stride is S = 1 ensures that the input volume and output volume will have the same size spatially., For example, given an input image having 128x128 pixels in three channels and a first layer of a CNN having 16 5x5 kernels in three channels, the system can choose to have eight slices for storing the input image. The size of each slice can then be 8,192B (2 to the 13th power) to fit all the features in the input image. This size also allows padding so as to utilize one of the predetermined chip layouts, as further discussed below., The parameters within a group of parameters associated with a layer may be arranged in accordance with information about the CNN. The information regarding the CNN may include for example, a distinct combination of a number of filters/kernels [K], a number of channels [C], and a filter size [P]. In some embodiments, the space within a contiguous region where data associated with a layer is stored may be divided into a number of slices. Alternatively, the number of slices and size of each slice may generally be determined based on a kernel size …. The greater the kernel size/number of parameters, the greater the slice size. As illustrated in the figure, a convolution layer may comprise four kernels K0-K3, each kernel may comprise 2x2 parameters (i.e., R0-R1, S0-S1), and each kernel has eight channels C0-C7. The contiguous space for storing the parameters may be divided into eight slices Ps1-8., wherein the size of a feature map slice (dimensional characteristic parameter), as well as the offset associated with that array across different slices and the padding (positional characteristics), corresponds to/is based on/takes account of the kernel size (work batch filter data array dimension) of the kernel filter array to be applied for each batch (e.g., Figure 8 shows zero padding with the slices computed taking into account the dimensions of the kernel filter).)
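The slice-size figure in the passage of Huang quoted above (eight slices of 8,192 B for a 128x128 image in three channels) can be reproduced with the arithmetic sketched below. The sketch rests on two assumptions that the quoted text does not state: one byte per feature element, and rounding each slice up to the next power of two. The function name and parameters are hypothetical; this is one reading that happens to yield the quoted number, not a statement of how Huang computes it.

```python
import math


def slice_size_bytes(height, width, channels, num_slices, bytes_per_element=1):
    """Smallest power-of-two slice size that holds an even share of the input.

    Assumptions (not from the quoted text): bytes_per_element defaults to 1,
    and slice sizes are rounded up to a power of two.
    """
    raw_bytes = height * width * channels * bytes_per_element
    per_slice = math.ceil(raw_bytes / num_slices)
    # Round up to the next power of two (a value already a power of two is kept).
    return 1 << (per_slice - 1).bit_length()


if __name__ == "__main__":
    size = slice_size_bytes(128, 128, 3, 8)
    raw = 128 * 128 * 3
    print(size, 8 * size - raw)
```

Under these assumptions the raw input is 49,152 B, an even share is 6,144 B per slice, and the next power of two is 8,192 B (2 to the 13th power). The eight slices then hold 65,536 B in total, leaving 16,384 B of headroom, which is consistent with the quoted remark that the size also allows padding.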

In regards to claim 6, the rejection of claim 1 is incorporated and Huang further teaches wherein the plurality of input work batch data arrays are generated from the input feature map based at least in part on dimensions of the plurality of output work batch data arrays.  ([0103, 0130, 0144, 0145, Figure 12A, Figure 14]In some embodiments , the number of slices used for input features depends on how much data is processed by the computing unit per cycle . Generally , the number of slices is C * P / NR , where NR is the number of rows in the slices . In addition , the previous layer should generate output data in slices according to current layer ' s requirement for input data . Therefore , when the next layer has the K4C8P4 configuration , the output of current layer can write to eight slices, when the next operation has the K1C16P8 configuration , the output of current layer can write to 32 slices , and when the next operation uses K8C16P1 configuration , the output of current layer can write to four slices , as further discussed below ., The computing unit may be implemented using a plurality of multiplexers , multipliers , adders / accumulators , and / or other elements such as splitters or delay elements . The computing unit can be implemented with various configurations . The various calculation circuits may be inter connected in various different ways . The configurations of the calculation circuits are be advantageous to allow for an efficient utilization of the plurality of calculation circuits while adaptation to different input data / parameters layouts . …The convolution layer can be a depthwise convolution layer or a pointwise convolution layer . It should be noted that the number of multipliers and adders are for illustrative purpose only , any number of multipliers ( e . g . , 32 , 64 , 128 , 256 , 512 , etc . ) and any number of adders can be utilized in the computing unit ., EHOW1Ci * KOROS1Ci for i = 0 - 15 . 
The number of multiplication is determined by the kernel size . In the depicted example , because the kernel contains only one parameter , each cycle the second level adder such as adder 0 ' will output a convolution result . The convolution operations will be applied across the entire input feature map until finish . For example , after at least one cycle , eight convolution output results may be obtained from the eight second level adders 0 ' - 7 ' ., In some embodiments , only one level of adders / accumulators may be used . As a variation example of the configuration as illustrated in FIG . 14 , the computing unit may comprise 32 accumulators each connected with four multipliers without second - level accumulators . Every four multipliers ( e . g . , first four multipliers ) may be used to perform multiplication of a 2x2 region of the input feature map and the kernel and the products are summed and accumulated by an adder / accumulator ( e . g . , adder 0 ) . Multiple cycles clocks may be required to generate one output result . The number of cycles may be determined by the kernel size and the number of channels . For instance , in the depicted example , since the kernel size is 4 and the convolution is applied to eight channels , the total number of clock cycles to generate one output is 8 cycles = 2 ( cycles for one parameter across eight channels ) x4 ( parameters ) . 
It may take at least eight cycles to process input feature map data stored in a data block containing eight slices and eight rows ., wherein the organization of the slices (input work batch data arrays), including their dimensions, are determined according to/are configured to be compatible with the dimensions of the partial sums output by an adder (i.e., an adder at any given level in the convolution computation)  such that the number of elements/dimensions in each slice/input work batch data array (e.g., 8 in Figure 14) is related to (configured to accommodate) the configuration of the multiplier and adder functionality (e.g., each adder receives 4 products in Figure 14) which corresponds to the dimension of each output work batch data array/partial sum (e.g., a dimension of 4 per cycle over 2 cycles as shown in Figure 14) and wherein, in another sense, the slices/input work batch data arrays are generated according to the dimensions associated with the output slices/output work batch data array formed from the previous layer.)
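
For illustration only, and not as part of Huang or of the claims, the slice-count relation quoted above (C * P / NR) can be sketched in a few lines. The function name slice_count and the choice NR = 4 are assumptions made for this sketch; NR = 4 is chosen because it reproduces the slice counts of 8, 32, and 4 that the quoted passage gives for the K4C8P4, K1C16P8, and K8C16P1 configurations.

```python
def slice_count(channels: int, filter_size: int, rows_per_slice: int) -> int:
    """Number of slices given by the quoted relation C * P / NR."""
    total = channels * filter_size
    if total % rows_per_slice != 0:
        raise ValueError("C * P must be a whole multiple of NR")
    return total // rows_per_slice


# Assumption for this sketch: NR = 4 rows per slice.
NR = 4

# (C, P) pairs read off the configuration names quoted from Huang.
configs = {"K4C8P4": (8, 4), "K1C16P8": (16, 8), "K8C16P1": (16, 1)}

counts = {name: slice_count(c, p, NR) for name, (c, p) in configs.items()}
print(counts)
```

Under that assumption the sketch yields the same per-configuration slice counts as the quoted passage, which is the dependency between the next layer's configuration and the current layer's output layout that the rejection relies on.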

In regards to claim 7, the rejection of claim 1 is incorporated and Huang further teaches wherein the one or more positional characteristic parameters relates to an amount of edge elements required by the one or more input work batch data arrays.  ([0106, Figure 8] In some cases , when data are arranged in the same configuration , the same sets of calculation circuits and interconnect configurations can be used for performing the convolution operations . For example , it is possible to have chip designs optimized for the following CNN configurations : K4C8P4 , K1C16P8 , and K8C16P1 . In some cases , input data or parameter data may not be aligned with a pre - determined configuration of data storage while pertaining to the channels or filter sizes . In this case , the input data or parameter data may be padded with zeros such that the data arrangement may be aligned with a pre determined configuration of the chip …. In the case when the input data is image data 801 with dimension of 128x128 pixel and three channels , the input data may be padded with a row of zeros such that the input data with original dimension of 128x128x3 is transformed to 128x64x8 which is aligned with a 4 - row query configuration . In the example when the parameters are from K kernels each is 5x5 in size across eight channels 803 ( i . e . , 5x5x3 ) , the parameters may be arranged and padded with zeros such that the parameters data are transformed to 5x3x8 to be aligned with the 4 - row query configuration . 
It should be noted that zeros can be placed in various locations such as to the top or bottom of the rows , or to the first or last columns so as to complete the size of the kernel to be times of 4 or complete the number of channels to be times of 4 ., wherein the padding is a positional characteristic parameter that corresponds to the number of edge elements used in the formation of the set of slices (work batch data arrays) as required to accommodate the underlying hardware processing configuration and constraints.)
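
As a rough illustration of the zero-padding arithmetic in the quoted passage, the sketch below pads the channel count up to a multiple of 4 and counts the zero elements added. The helper names are hypothetical. The sketch also assumes that the padding raises the channel count from 3 to 4, which is one reading consistent with the element count of the 128x64x8 layout but is not stated in those terms by Huang.

```python
def round_up_to_multiple(n: int, multiple: int) -> int:
    """Smallest multiple of `multiple` that is greater than or equal to n."""
    return -(-n // multiple) * multiple


def pad_channels(height: int, width: int, channels: int, multiple: int = 4):
    """Return (padded_channels, zero_elements_added) for channel zero-padding."""
    padded = round_up_to_multiple(channels, multiple)
    zeros_added = height * width * (padded - channels)
    return padded, zeros_added


# Input image from the quoted example: 128x128 pixels, three channels.
padded_channels, zeros_added = pad_channels(128, 128, 3)
total_elements = 128 * 128 * padded_channels
print(padded_channels, zeros_added, total_elements)
```

With these assumptions the padded input holds 128 * 128 * 4 elements, the same number of elements as a 128x64x8 arrangement.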

In regards to claim 8, the rejection of claim 1 is incorporated and Huang further teaches in which each of the plurality of input work batch data arrays are generated from the input feature map by loading input feature map elements from contiguous areas of the input feature map into each input work batch data array respectively.  ([0044, 0093, 0104, 0105, Figure 7, Figure 11, Figure 12A]Stride controls how depth columns around the spatial dimensions ( width and height ) are allocated . When the stride is 1 , a new depth column of neurons is allocated to spatial positions only one spatial unit apart . This leads to heavily overlapping receptive fields between the columns , and also to large output volumes., The contiguous space may be divided into one or more contiguous regions . The contiguous space or region may be divided into one or more contiguous slices . The slices may be identified by an offset address according to a base address of the contiguous region and a size of the slice ., FIG . 7 shows an exemplary arrangement of input features stored into slices within a contiguous region . As illustrated in the figure , the input feature map may be 4x4 ( i . e . , HxW ) in plane dimension across eight channels CO - C7 . The contiguous space for storing the parameters may be divided into eight slices Is1 - 8 . In the depicted examples , every row offset pointing to four rows and every two columns / slices may together store parameters corresponding to a point of a 2D plane ( i . e . , HOWOC ; ) in a filter across eight channels ., The chunk of input feature map data may be a data block comprising one or more rows and one or more slices . In some cases , multiple rows may provide a data block for a query , and such multi - row data blocks may arrive sequentially representing one query at a time . 
For example , a first query may cause the first four rows and all of the eight slices form the input feature map to arrive at the plurality of multipliers and a second query may cause row 5 - 8 to arrive at the multipliers for processing., wherein, as shown in Figure 7, the slices (or blocks) holding the input feature elements of a work batch data array are formed from contiguous elements in the input feature map (e.g., contiguous over the channel dimension and the row/H dimension for a single slice) and wherein, in a more general sense, the contiguity of the data allocated to the (successive) slices is also determined by the stride, with a stride of 1 indicative of contiguity in the formation of the batch arrays (maximally overlapping receptive fields).)   

In regards to claim 9, the rejection of claim 1 is incorporated and Huang further teaches in which each of the plurality of input work batch data arrays are generated from the input feature map by loading input feature map elements from noncontiguous areas of the input feature map into each input work batch data array respectively.  ([0044, 0093, 0104, 0105, Figure 7, Figure 11, Figure 12A] Stride controls how depth columns around the spatial dimensions ( width and height ) are allocated . When the stride is 1 , a new depth column of neurons is allocated to spatial positions only one spatial unit apart . This leads to heavily overlapping receptive fields between the columns , and also to large output volumes.,
The contiguous space may be divided into one or more contiguous regions . The contiguous space or region may be divided into one or more contiguous slices . The slices may be identified by an offset address according to a base address of the contiguous region and a size of the slice ., FIG . 7 shows an exemplary arrangement of input features stored into slices within a contiguous region . As illustrated in the figure , the input feature map may be 4x4 ( i . e . , HxW) in plane dimension across eight channels CO - C7 . The contiguous space for storing the parameters may be divided into eight slices Is1 - 8 . In the depicted examples , every row offset pointing to four rows and every two columns / slices may together store parameters corresponding to a point of a 2D plane ( i . e . , HOWOC ; ) in a filter across eight channels ., The chunk of input feature map data may be a data block comprising one or more rows and one or more slices . In some cases , multiple rows may provide a data block for a query , and such multi - row data blocks may arrive sequentially representing one query at a time . For example , a first query may cause the first four rows and all of the eight slices form the input feature map to arrive at the plurality of multipliers and a second query may cause row 5 - 8 to arrive at the multipliers for processing., wherein the data allocated to the (successive) slices is determined by the stride such that a stride greater than 1 is indicative of non-contiguity in feature map data in the formation of the batch arrays and wherein, in a different sense, the set of slices comprising a block to be processed (also a work batch data array) includes a non-contiguous arrangement of data (e.g., over the simultaneous W and channel dimensions as shown in Figure 7).) 

In regards to claim 11, the rejection of claim 1 is incorporated and Huang further teaches wherein the method comprises downsampling one or more positional characteristic parameters to determine the amount of input feature map content required by the one or more input work batch data arrays.  ([0057, 0058, 0116, Figure 1]  A pooling layer operates independently on every depth slice of the input ( e . g . , an activation map or feature map from a previous convolutional layer ) , and reduces its spatial dimension by performing a form of non - linear down - sampling …. The number and placement of the pooling layers may be determined based on various factors , such as the design of the convolutional network architecture , the size of the input , the size of convolutional layers 28 , and / or application of CNN model 10 ., Various non - linear functions can be used to implement the pooling layers . For example , max pooling may be used . Max pooling may partition an image slice of the input into a set of overlapping or non - overlapping sub - regions with a predetermined stride . For each sub - region , max pooling outputs the maximum ., In max pooling operations , the input feature map may be partitioned into a set of non - overlapping rectangles and , for each such sub - region , outputs the maxi mum value . In another example , in an average pooling , an average value of a sub - region may be output . The input feature map can be partitioned by any size . For example , pooling may be applied with filters of size 2x2 applied with a stride of 2 at every depth slice . 
A pooling layer of size 2x2 with stride of 2 shrinks the input image to a 1 / 4 of its original size ., wherein the input feature map is downsampled through either pooling or striding such that, in either case, this operation modifies the positional characteristics (span corresponding to a slice or block of slices) corresponding to a region of the input feature map (i.e., stride modifies the offset between different slices or the offset between different elements within a slice or a block of slices while pooling modifies/controls the effective content and dimension used in a work batch for a given layer based upon the downsampled positional characteristics/resolution determined for that layer from the pooling operation).) 
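
The quoted statement that 2x2 max pooling with a stride of 2 shrinks the input to 1/4 of its original size can be checked with a minimal sketch. The function max_pool_2d and the 4x4 sample values are hypothetical and are not taken from Huang; the sketch only illustrates the non-overlapping case.

```python
def max_pool_2d(feature_map, size=2, stride=2):
    """Max pooling over one depth slice given as a list of rows."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            # Collect the size x size window and keep its maximum.
            window = [
                feature_map[r + dr][c + dc]
                for dr in range(size)
                for dc in range(size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4x4 depth slice.
fm = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = max_pool_2d(fm)
print(pooled)
```

The 16-element input reduces to a 2x2 output of 4 elements, which is the 1/4 reduction described in the quoted passage.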

In regards to claim 12, the rejection of claim 1 is incorporated and Huang further teaches wherein the method comprises upsampling the input work batch data arrays for convolution.  ([0068, 0117] In some embodiments the same computing unit may be used to perform convolution , average , maximum value , or dot - product operations without changing the components configuration and interconnections . In some embodiments , different calculation circuits may be used for different types of layers . For example , different sets of calculation circuits may correspond to convolution layers , pooling layers and upscaling layers .,
In some cases , upscale layer may also be operated with convolution layer . Upscaling operations may increase resolution of a feature map by suitable method such as interpolation . Upscaling operations may be implemented using various logic elements such as adders , accumulators , comparators , interpolator , or average , etc ., wherein the input feature map is upscaled/upsampled, such as through an interpolation technique to increase the resolution of the feature map for performing convolution at the higher resolution.)

In regards to claim 14, the rejection of claim 2 is incorporated and Huang further teaches wherein the method comprises receiving output feature map dimensions as part of the convolution configuration data.  ([0042, 0046, 0048, Figure 1] FIG . 1 part A shows a CNN application . This CNN is composed of eight layers . The first five layers are convolutional layers and layers 6 - 8 form a fully connected artificial neural network . The algorithm receives three 224x 224 input images that are from an original 256x256 three channel RGB image . The output vector of 1000 elements represents the likelihoods of 1000 categories . As is shown in the figure , Layerl receives three input feature maps in 224x224 resolution and 96 output feature maps in 55x55 resolution ., The size of the output volume of the convolution layer may also depend on hyper - parameters . The hyper parameters may also control the size of the output volume of the convolutional layer . In some cases , the hyper - parameters may include depth , stride and zero – padding., The spatial size of the output volume can be computed as a function of the input volume size W , the kernel field size of the convolution layer neurons K , the stride with which they are applied S and the amount of zero padding P . The formula for calculating how many neurons fit in a given volume is given by ( W - K + 2 P ) / S + 1 ., wherein the output volume (output feature map size) from each layer is computed directly from convolution configuration hyperparameters such as padding, stride, and input feature map size and is therefore also being interpreted as being contained within the array configuration data.)
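
The output-size relation quoted from Huang, (W - K + 2P) / S + 1, can be written out as a short sketch. The function names are hypothetical; the symbols W, K, S, and P are those of the quoted passage, and the sample values are chosen only for illustration.

```python
def conv_output_size(w: int, k: int, s: int, p: int) -> int:
    """Spatial output size (W - K + 2P) / S + 1 along one dimension."""
    span = w - k + 2 * p
    if span < 0:
        raise ValueError("kernel is larger than the padded input")
    if span % s != 0:
        # The quoted passage treats a non-integer result as a mis-set stride.
        raise ValueError("stride does not tile the padded input evenly")
    return span // s + 1


def same_padding(k: int) -> int:
    """Zero padding P = (K - 1) / 2 that preserves size when S = 1 (odd K)."""
    if k % 2 == 0:
        raise ValueError("P = (K - 1) / 2 is an integer only for odd K")
    return (k - 1) // 2


# A 128-wide input with a 5-wide kernel, stride 1, and size-preserving padding.
preserved = conv_output_size(128, 5, 1, same_padding(5))
# A 7-wide input with a 3-wide kernel, stride 2, and no padding.
strided = conv_output_size(7, 3, 2, 0)
print(preserved, strided)
```

The first call keeps the spatial size at 128, and the second reduces a width of 7 to 3. A combination such as W = 8, K = 3, S = 2, P = 0 raises an error in this sketch because the result would not be an integer.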

In regards to claim 15, the rejection of claim 2 is incorporated and Huang further teaches wherein the method comprises: receiving a convolutional operation mode as part of the convolution configuration data; and determining the output feature map dimensions using the convolutional operation mode.  ([0042, 0046, 0048, Figure 1] FIG . 1 part A shows a CNN application . This CNN is composed of eight layers . The first five layers are convolutional layers and layers 6 - 8 form a fully connected artificial neural network . The algorithm receives three 224x 224 input images that are from an original 256x256 three channel RGB image . The output vector of 1000 elements represents the likelihoods of 1000 categories . As is shown in the figure , Layerl receives three input feature maps in 224x224 resolution and 96 output feature maps in 55x55 resolution ., The size of the output volume of the convolution layer may also depend on hyper - parameters . The hyper parameters may also control the size of the output volume of the convolutional layer . In some cases , the hyper - parameters may include depth , stride and zero – padding., The spatial size of the output volume can be computed as a function of the input volume size W , the kernel field size of the convolution layer neurons K , the stride with which they are applied S and the amount of zero padding P . The formula for calculating how many neurons fit in a given volume is given by ( W - K + 2 P ) / S + 1 ., wherein the output volume (output feature map size) from each layer is computed directly from convolution configuration hyperparameters such as padding, stride, and input feature map size (but also, in a different sense, changes in resolution from upscaling) such that the stride and padding is an example of a convolutional operational mode.)

In regards to claim 16, the rejection of claim 1 is incorporated and Huang further teaches wherein the method comprises generating the input work batch data arrays in the arithmetic logic processing circuitry.  ([0060, Figure 7, Figure 11] In some cases , the CNN model data may be transferred from the main memory to an on - chip RAM 209 whereas the input data may be transferred to an input data buffer on the chip . Typically , both of the input data and the CNN model data are transferred and stored into contiguous regions of the on - chip RAM . The data may have the same storage layout between the RAM and the main memory., wherein the arithmetic logic processing circuitry forms/generates a structured memory (in its RAM/SRAM) corresponding to the slices to be processed by the set of processing elements using/fetching the same corresponding organized memory structure used in the main memory (Figure 7, Figure 11).)  

In regards to claim 17, the rejection of claim 1 is incorporated and Huang further teaches wherein the method comprises generating the input work batch arrays in the control processor circuitry.  ([0090, Figure 7, Figure 11]
The main processor may arrange the parameters associated with each CNN model such that all parameters of the CNN model can be compactly stored in a contiguous space within the memory . In some cases , the parameters may be classified into a plurality of groups with each group associated with a convolution layer in the CNN . The parameters within a layer / group may be arranged and stored consecutively in a contiguous space ., wherein the control circuitry forms a mapping of the corresponding regions/slices in the input feature map into the main memory (control processor circuitry) such that these slices/input work batch arrays are transferred to the on-chip memory for convolution processing (Figure 7, Figure 11).)  

In regards to claim 18, the rejection of claim 1 is incorporated and Huang further teaches wherein the method comprises performing convolutions between the plurality of input work batch data arrays and one or more work batch filter data arrays by storing the plurality of input work batch data arrays and one or more work batch filter data arrays in a data buffer. ([0133, Figure 14] Specifically , the input features H0W0C0 - 7 may be supplied to a first input of each of the first eight multipliers , and the parameters K0R0S0C0 - 7 may be supplied to the second input of the first eight multipliers . A network of adders / accumulators may include two first - level adders ( e . g . , adder 0 and adder 1 ) each for summing the outputs from the first and second set of multipliers , and a second level accumulator 1105 ( e . g . , adder 0 ' ) for summing the outputs from the two first - level adders ., wherein both the kernel filter arrays (batch filter data array) and the slices (input work batch data arrays) are stored in data buffers (as shown in Figure 14 for example where this data is temporarily stored in the on-chip SRAM) such that any particular kernel filter array in that buffer is used in combination with a corresponding slice to compute a partial output feature map array (to be combined with others to form the output feature map).) 

Claim 19 is also rejected because it is just a system implementation of the same subject matter of claim 1 which can be found in Huang.  In addition, it is noted that claim 19 recites a system with storage circuitry for storing an input feature map, convolutional filter data, and an output feature map, which are also found in Huang (for example [Figure 2, Figure 3, Figure 10, Figure 13]).

In regards to claim 20, Huang teaches A non-transitory computer-readable storage medium comprising a set of computer- readable instructions stored thereon which, when executed by at least one processor, cause the at least one processor to output data for controlling the performance of convolutions by: ([0010, Figure 1, Figure 2, Figure 9, Figure 10, Figure 11] In a separate yet related aspect , a non - transitory computer - readable storage medium with instructions stored thereon that is provided . The instructions when executed by a computing system , cause the computing system to perform a method of arranging data to accelerate deep computing , the method comprising : receiving , with aid of one or more processors , data regarding a plurality of objects , each containing a group of three - dimensional numerical arrays ; allocating a space in a main memory to the plurality of objects , wherein the space includes a plurality of regions ; assigning an area within one of the plurality of regions to one of the plurality of objects ; determining a number of slices for the one object based on a size of the group and dimensions of the three - dimensional numerical arrays contained in the one object ; dividing the area into at least the number of slices for the one object ; and storing numerical items in the three dimensional arrays contained in the one object across the number of slices such that at least one numerical item is stored in each of the number of slices ., wherein a CNN implementation framework includes instructions in CRM for controlling the performance of the convolution filtering of input feature maps at any layer that includes physical control functionality for allocating instructions and data required for that convolution to on-chip memory (e.g., Figure 2, Figure 9) and for orchestrating the convolution operation itself using arithmetic logic circuitry (Figure 2, Figure 11).)
receiving convolution configuration data relating to a convolution to be performed; determining from the convolution configuration data: one or more dimensional characteristic parameters relating to dimensions of each of a plurality of input work batch data arrays corresponding to the convolution to be performed; and one or more positional characteristic parameters relating to positions of feature map content within the plurality of input work batch data arrays; ([0044, 0045, 0046, 0084, 0092, Figure 7, Figure 8, Figure 11] Stride controls how depth columns around the spatial dimensions ( width and height ) are allocated . When the stride is 1 , a new depth column of neurons is allocated to spatial positions only one spatial unit apart . This leads to heavily overlapping receptive fields between the columns , and also to large output volumes ., Sometimes it is convenient to pad the input with zeros on the border of the input volume . The size of this zero - padding is another hyper - parameter . Zero padding pro vides control of the output volume spatial size . In particular , sometimes it is desirable to exactly preserve the spatial size of the input volume ., The spatial size of the output volume can be computed as a function of the input volume size W , the kernel field size of the convolution layer neurons K , the stride with which they are applied S and the amount of zero padding P . The formula for calculating how many neurons fit in a given volume is given by ( W - K + 2 P ) / S + 1 . If this number is not an integer , then the strides are set incorrectly and the neurons cannot be tiled to fit across the input volume in a symmetric way . In general , setting zero padding to be P = ( K - 1 ) / 2 when the stride is S = 1 ensures that the input volume and output volume will have the same size spatially., For example , the instructions may include high - level instructions corresponding to layers of the CNN such as types of layers ( e . g . 
, convolution , pooling , upscale , etc ) , low - level instructions corresponding to different types of operations including but not limited to convolution , elementwise convolution , upscale , return , or pooling at matrix / matrix or vector / matrix data level., At step 405 , the main processor may arrange the input data and store the data into a space on the main memory . The data stored in the main memory may be the raw or processed input data rearranged by the main processor . For example , the processed input data may be a down sized image data or a segmented image data . In some cases , the input data may be arranged according to the selected CNN models . In some cases , the input data may be arranged according to a pre - determined configuration of the chip , which determines the CNN dataflow or data transmission routes . In some cases , the input data may be arranged and zero - padded to conform to the pre - determined configuration of the chip for dataflow or data transmission route in the CNN system., wherein the control circuitry receives/makes use of various parameters that specify the configuration of the CNN such as chip layout/memory characteristics, hyperparameters (stride, depth, presence or absence of padding), and layer type/connectivity (including upscaling/upsampling, pooling/downsampling) upon which the dimensional characteristics of the input work batch data array (size, span) and positional  characteristics of the input work batch data array (padding, upsizing/upsampling) depend upon).) and outputting data for controlling the performance of a convolution between an input feature map and convolutional filter data, based at least in part on the one or more dimensional characteristics and the one or more positional characteristics. 
([0105, 0106, , Figure 8, Figure 11, Figure 13] In some embodiments , input feature map within a slice may be arranged such that all items stored at the same offset from starting points of the number of slices are used for the parallel operations . In this way , a chunk of input feature map data to be identified as index of slices or number of slices and offset or number of rows . The chunk of input feature map data may be supplied to a plurality of multipliers for convolution operations in parallel . The chunk of input feature map data may be a data block comprising one or more rows and one or more slices ., The number of rows and slices to be processed in parallel may correspond to different configurations of data storage . In some cases , when data are arranged in the same configuration , the same sets of calculation circuits and interconnect configurations can be used for performing the convolution operations . … In some cases , input data or parameter data may not be aligned with a pre - determined configuration of data storage while pertaining to the channels or filter sizes . In this case , the input data or parameter data may be padded with zeros such that the data arrangement may be aligned with a pre determined configuration of the chip …. In the case when the input data is image data 801 with dimension of 128x128 pixel and three channels , the input data may be padded with a row of zeros such that the input data with original dimension of 128x128x3 is transformed to 128x64x8 which is aligned with a 4 - row query configuration . In the example when the parameters are from K kernels each is 5x5 in size across eight channels 803 ( i . e . , 5x5x3 ) , the parameters may be arranged and padded with zeros such that the parameters data are transformed to 5x3x8 to be aligned with the 4 - row query configuration . 
It should be noted that zeros can be placed in various locations such as to the top or bottom of the rows , or to the first or last columns so as to complete the size of the kernel to be times of 4 or complete the number of channels to be times of 4 ., wherein the CNN implementation framework controls the execution of the CNN convolution for each of the slices/blocks (input work batch data arrays) that were generated/organized/outputted according to the (outputted) positional parameters and dimensional parameters (size of the batch data, size of slice, size of filter) and wherein the generation and dissemination of the instructions to perform the convolution over successive cycles as well as the organization/orchestration of the intermediate batch output arrays are also data that are output for controlling the execution of the convolution in the CNN.) 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 10 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Huang, in view of Chen et al. (“DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 40, No. 4, April 2018, pp. 834-848), hereinafter referred to as Chen.

In regards to claim 10, the rejection of claim 1 is incorporated and Huang further teaches wherein the method comprises upsampling the one or more positional characteristic …. ([0117, Figure 1] In some cases, upscale layer may also be operated with convolution layer. Upscaling operations may increase resolution of a feature map by suitable method such as interpolation. Upscaling operations may be implemented using various logic elements such as adders, accumulators, comparators, interpolator, or average, etc., wherein the input feature map is upscaled/upsampled, such as through an interpolation technique to increase the resolution of the feature map.)
However, Huang does not explicitly teach wherein the method comprises upsampling the one or more positional characteristic parameters. Although Huang teaches an upsampling/upscaling function that increases resolution which modifies the positional characteristics of the elements in the (dilated) feature map, he does not clearly disclose that this functionality is accompanied by a corresponding modification in the positional characteristic parameters themselves such as might be observed in a slice or a block of slices. 
However, Chen, in the analogous environment of implementing a CNN, teaches wherein the method comprises upsampling the one or more positional characteristic parameters ([pp. 837-838, Section 3.1, Figure 2, Figure 3, Figure 4] Considering one-dimensional signals first, the output y[i] of atrous convolution of a 1-D input signal x[i] with a filter w[k] of length K is defined as <equation 1> The rate parameter r corresponds to the stride with which we sample the input signal…. If one implants the resulting feature map in the original image coordinates, we realize that we have obtained responses at only 1/4 of the image positions. Instead, we can compute responses at all image positions if we convolve the full resolution image with a filter ‘with holes’, in which we upsample the original filter by a factor of 2, and introduce zeros in between filter values. Although the effective filter size increases, we only need to take into account the non-zero filter values, hence both the number of filter parameters and the number of operations per position stay constant…. For example, in order to double the spatial density of computed feature responses in the VGG-16 or ResNet-101 networks, we find the last pooling or convolutional layer that decreases resolution (’pool5’ or ’conv5_1’ respectively), set its stride to 1 to avoid signal decimation, and replace all subsequent convolutional layers with atrous convolutional layers having rate r = 2….
Atrous convolution with rate r introduces r - 1 zeros between consecutive filter values, effectively enlarging the kernel size of a k x k filter to k_e = k + (k - 1)(r - 1) without increasing the number of parameters or the amount of computation., wherein an upsampling operation is performed in the implementation of a CNN to increase resolution (and to mitigate loss of information from downsampling operations) by dilating the kernel filter with zeroes (according to specified dilation rate/factor) and applying that to the feature map (Figures 3, 4) such that the span of each region in the feature map (input work batch data array) acted upon by the kernel is modified (i.e., the positional characteristics are upsampled by the dilation factor/rate) but such that the mechanization of the convolution preserves the number (multiply/add) of operations and parameters.)
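For illustration only, the relationship described in the passage quoted from Chen can be sketched in a few lines of Python. This is a minimal sketch and not Chen's or Huang's implementation: the function names, the 0-based tap index, and the choice to evaluate only positions where every tap falls inside the input are assumptions made for the example. It shows that sampling the input with rate r gives the same result as an ordinary convolution with a filter in which r - 1 zeros have been inserted between taps, while the number of non-zero taps stays the same.

```python
def effective_kernel_size(k: int, r: int) -> int:
    """Span covered by a length-k filter at rate r: k + (k - 1)(r - 1)."""
    return k + (k - 1) * (r - 1)


def dilate_filter(w, r):
    """Insert r - 1 zeros between consecutive filter taps (filter 'with holes')."""
    out = [0.0] * effective_kernel_size(len(w), r)
    for k, tap in enumerate(w):
        out[k * r] = tap
    return out


def atrous_conv1d(x, w, r):
    """y[i] = sum_k x[i + r*k] * w[k], with k counted from 0 (an assumption).

    Only positions where every tap lies inside x are evaluated.
    """
    span = effective_kernel_size(len(w), r)
    return [
        sum(x[i + r * k] * w[k] for k in range(len(w)))
        for i in range(len(x) - span + 1)
    ]


# Hypothetical example data, chosen only to make the arithmetic easy to check.
x = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0, 22.0]
w = [1.0, -2.0, 1.0]

y_atrous = atrous_conv1d(x, w, 2)                 # 3 multiplies per position
y_dense = atrous_conv1d(x, dilate_filter(w, 2), 1)  # same result, zero-filled filter
```

With k = 3 and r = 2 the filter spans 3 + (3 - 1)(2 - 1) = 5 input positions, yet only three multiplications per output position are needed, which is the property relied upon in the rejection above.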
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Huang to incorporate the teachings of Chen to upsample the one or more positional characteristic parameters. The modification would have been obvious because one of ordinary skill would have been motivated to achieve improved accuracy and effectiveness in dense feature extraction performance in a framework that uses upsampling/dilation functionality to control the multi-scale resolution of feature response, particularly when that functionality employs atrous convolution  (Chen, [Abstract, Figure 9, Figure 10, Table 5, Table 6]).
In regards to claim 13, the rejection of claim 2 is incorporated and Huang does not further teach wherein the method comprises determining the one or more positional characteristic parameters based on the convolution configuration data indicating a bilinear deconvolution. Although Huang discloses various convolution models/types that can be implemented according to the convolution configuration data, he does not explicitly disclose bilinear deconvolution.
However, Chen, in the analogous environment of implementing a CNN, teaches wherein the method comprises determining the one or more positional characteristic parameters based on the convolution configuration data indicating a bilinear deconvolution ([p. 835, Section 1, p. 838, Section 3.1, Figure 1, Figure 3, Figure 4] In practice, we recover full resolution feature maps by a combination of atrous convolution, which computes feature maps more densely, followed by simple bilinear interpolation of the feature responses to the original image size. This scheme offers a simple yet powerful alternative to using deconvolutional layers [13], [14] in dense prediction tasks…. We then employ bi-linear interpolation to upsample by a factor of 8 the score map to reach the original image resolution, yielding the input to a fully-connected CRF [22] that refines the segmentation results., We have adopted instead a hybrid approach that strikes a good efficiency/accuracy trade-off, using atrous convolution to increase by a factor of 4 the density of computed feature maps, followed by fast bilinear interpolation by an additional factor of 8 to recover feature maps at the original image resolution., wherein a deconvolution operation is performed by performing (atrous) upsampling (to increase resolution in preceding convolutional layers) followed by bi-linear interpolation to recover the original resolution in a deconvolution operation (i.e., this is a bi-linear deconvolution process) such that, as noted previously, the atrous convolution step modifies the positional characteristics according to dilation rate but such that this process is specifically coupled with the deconvolution process in order to restore images to the original resolution.) 
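For illustration only, the bilinear interpolation step referred to in the passage quoted from Chen can be sketched as follows. This is a minimal sketch under stated assumptions, not Chen's implementation: the function name is hypothetical, the input is assumed to be a 2-D list with at least two rows and two columns, and an "align corners" convention is assumed, so that an h x w map upsampled by an integer factor f becomes ((h - 1)f + 1) x ((w - 1)f + 1). Chen instead interpolates back to the original image size; the output-size convention here was chosen only to keep the example short.

```python
def bilinear_upsample(grid, factor):
    """Upsample a 2-D list by an integer factor using bilinear interpolation.

    Assumes at least a 2 x 2 input and an 'align corners' convention, so the
    corner samples of the input and the output coincide.
    """
    h, w = len(grid), len(grid[0])
    out_h, out_w = (h - 1) * factor + 1, (w - 1) * factor + 1
    out = []
    for oy in range(out_h):
        y = oy / factor
        y0 = min(int(y), h - 2)      # upper neighbouring row
        fy = y - y0                  # vertical weight in [0, 1]
        row = []
        for ox in range(out_w):
            x = ox / factor
            x0 = min(int(x), w - 2)  # left neighbouring column
            fx = x - x0              # horizontal weight in [0, 1]
            top = (1 - fx) * grid[y0][x0] + fx * grid[y0][x0 + 1]
            bottom = (1 - fx) * grid[y0 + 1][x0] + fx * grid[y0 + 1][x0 + 1]
            row.append((1 - fy) * top + fy * bottom)
        out.append(row)
    return out


# Hypothetical 2 x 2 score map, upsampled by a factor of 2.
score_map = [[0.0, 2.0],
             [4.0, 6.0]]
upsampled = bilinear_upsample(score_map, 2)
```

Each output value is a weighted average of its four nearest input samples, so the step involves no learned parameters, which is consistent with Chen's characterization of it as a simple alternative to deconvolutional layers.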
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Huang to incorporate the teachings of Chen to determine the one or more positional characteristic parameters based on the convolution configuration data indicating a bilinear deconvolution. The modification would have been obvious because one of ordinary skill would have been motivated to achieve improved accuracy and effectiveness in dense feature extraction performance in a framework that uses upsampling/dilation functionality to control the multi-scale resolution of feature response along with bi-linear interpolation to complete the deconvolution to restore the original feature map resolution, particularly when that functionality employs atrous convolution  (Chen, [Abstract, Figure 9, Figure 10, Table 5, Table 6]).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Luo et al. (“DaDianNao: A Neural Network Supercomputer”, IEEE Transactions on Computers, Vol. 66, No. 1, January 2017, pp. 73-88) teach a control architecture for efficiently processing the computations in a CNN, including the partitioning of feature maps into blocks/batches and convolving them with kernel filter weights across a set of processing elements.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ROBERT LEWIS KULP whose telephone number is (571)272-7983. The examiner can normally be reached M, Th, F 8-5:30; Tu 8-3.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached on 571-270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ROBERT LEWIS KULP/Examiner, Art Unit 2124                                                                                                                                                                                                        
/BRIAN M SMITH/Primary Examiner, Art Unit 2122