Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: 
“an address generator configured to” in claims 1 and 11
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claim limitation “an address generator configured to” in claims 1 and 11 invokes 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. The address generator is shown in Figure 2 as elements 210 and 220; however, this representation is merely a black box and does not disclose any structure that one of ordinary skill in the art would be able to recognize.  The remainder of the instant specification does not cure this deficiency.  Therefore, the claims are indefinite and are rejected under 35 U.S.C. 112(b) or pre-AIA 35 U.S.C. 112, second paragraph.
Applicant may:
(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph; 
(b)        Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(c)        Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: 
(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(b)        Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.

Regarding claim 19, "the changed position" is indefinite.  The limitation lacks antecedent basis, and there is no indication of a first position or of a method of changing the position such that it would be evident to one of ordinary skill in the art how the position had been changed or whether it had been changed at all.  "A changed position" is recommended.  In the interest of further examination, the changed position is interpreted as a second position at which to apply the first weight group.
The remaining claims are rejected based on their dependence on the rejected claims. 

Claim Rejections - 35 USC § 101
101 Rejection
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 USC § 101 because the claimed invention is directed to non-statutory subject matter.

Regarding Claim 1:  Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 1 is directed to an apparatus, which is a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 1 recites computer-implemented processing of neural networks, which, under its broadest reasonable interpretation, is a series of mental processes and mathematical calculations.  For example, but for the generic computer components language, the following limitations in the context of this claim encompass neural network processing: 
to determine a second position spaced from a first position of a first input pixel of the input feature map based on size of the first weight group (observation, evaluation, and judgement based on mathematical relationships);
determine a plurality of adjacent pixels adjacent to the second position (observation, evaluation, and judgement based on mathematical relationships); and
to apply the first weight group to the plurality of adjacent pixels to obtain a first output pixel corresponding to the first position (mathematical calculation).
Therefore, claim 1 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 1 recites the additional element “a processor”.  However, this additional element is a computer component recited at a high level of generality, such that it amounts to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 1 is directed to a judicial exception.
Step 2B Analysis:  As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 1 amount to no more than mere instructions to apply the judicial exception using a generic computer component.  Claim 1 also recites additional elements “a weight memory configured to store a first weight group of a first layer” and “a feature map memory configured to store an input feature map where the first weight group is to be applied”, which amount to storing and retrieving information in memory, which is well-understood, routine, and conventional (See MPEP 2106.05(d): Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93).
For the reasons above, claim 1 is rejected as being directed to non-patentable subject matter under §101. This rejection applies equally to dependent claims 2-4. The additional limitations of the dependent claims are addressed briefly below:
Dependent claim 2 recites additional mathematical calculations “the processor applies the second weight group of the second layer, which is the next layer after the first layer, to the first output feature map to generate a final output feature map” as well as additional storing and retrieving from memory “the address generator loads the input feature map from an external memory, and transmits the final output feature map to the external memory” which is well-understood, routine, and conventional.  
Dependent claim 3 recites additional observation, evaluation, and judgement “determines the second position based on the address information of the first position and the size of the first weight group among the address information of the plurality of input pixels” as well as additional insignificant extra-solution activity of gathering and outputting data “address generator obtains the address information of the input feature map and a plurality of input pixels contained in the input feature map” and “and transmits the second position to the processor” (See Mayo, 566 U.S. at 79, 101 USPQ2d at 1968; OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1092-93 (Fed. Cir. 2015)).
Dependent claim 4 recites additional observation, evaluation, and judgement “configures part of the plurality of adjacent pixels to padding based on a result of comparing the address information of the plurality of adjacent pixels and the address information of the plurality of input pixels” as well as additional insignificant extra-solution activity of gathering data “address generator obtains address information of the plurality of adjacent pixels”.

Regarding Claim 5:  Claim 5 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 5 is directed to a method, which is a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 5 recites a computer-implemented method of processing neural networks, which, under its broadest reasonable interpretation, is a series of mental processes and mathematical calculations.  For example, but for the generic computer components language, the following limitations in the context of this claim encompass neural network processing: 
loading an M-th (M is natural number) input pixel of a N-th (N is natural number) channel on an N*(M−1)-th address of the address space (mathematical relationships and calculations); and
loading an M-th input pixel of an (N+1)-th channel on an (N+1)*(M−1)-th address of the address space (mathematical relationships and calculations).
Therefore, Claim 5 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 5 recites additional elements “systolic array” and “memory”. However, these additional elements are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, Claim 5 is directed to a judicial exception.
Step 2B Analysis:  As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in Claim 5 amount to no more than mere instructions to apply the judicial exception using a generic computer component.  Claim 5 also recites the additional element “loading an input feature map including a plurality of channels on address space of a memory”, which amounts to storing and retrieving information in memory, which is well-understood, routine, and conventional (See MPEP 2106.05(d): Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93).
For the reasons above, Claim 5 is rejected as being directed to non-patentable subject matter under §101. This rejection applies equally to independent claim 11 as well as dependent claims 6-10 and 12-20. The additional limitations of the dependent claims are addressed briefly below:
Dependent claims 6 and 12 recite additional mathematical calculations “applying a weight to an M-th input pixel of the N-th channel to obtain an N*(M−1)-th output pixel” and “the N*(M−1)-th output pixel on the N*(M−1)-th address” as well as additional storing and retrieving from memory “storing the…output pixel on the…address” which is well-understood, routine, and conventional.  
Dependent claims 7 and 13 recite additional mathematical calculations “applying a weight to an M-th input pixel of the (N+1)-th channel to obtain an (N+1)*(M−1)-th output pixel” and “the (N+1)*(M−1)-th output pixel at the (N+1)*(M−1)-th address” as well as additional storing and retrieving from memory “storing the…output pixel on the…address” which is well-understood, routine, and conventional.  
Dependent claims 8 and 14 recite additional mathematical calculations “the (M+1)-th input pixel of the N-th channel on the N*M-th address” as well as additional storing and retrieving from memory “loading the… input pixel of the… address of the address space”, which is well-understood, routine, and conventional.  
Dependent claims 9 and 15 recite additional observation, evaluation, and judgement “the (M+1)-th input pixel of the N-th channel is a pixel included in a next column after a column including the M-th input pixel of the N-th channel.”
Dependent claims 10 and 16 recite additional mathematical calculations “applying a weight to an (M+1)-th input pixel of the N-th channel to obtain an N*M-th output pixel” as well as additional storing and retrieving from memory “storing the…output pixel on the…address”, which is well-understood, routine, and conventional.  
Dependent claim 17 recites additional observation, evaluation, and judgement “the address generator determines a plurality of adjacent pixels to apply the first weight group based on the size of the first weight group” as well as additional mathematical calculations “the processor applies the first weight group to the plurality of adjacent pixels to obtain a first output pixel mapped to the N*(M−1)-th address.” 
Dependent claim 18 recites additional mathematical calculations “the processor applies a second weight group of a second layer that is a next layer after the first layer to the output feature map to generate the final output feature map” as well as additional storing and retrieving from memory “the address generator loads the input feature map from the external memory and transfers the final output feature map to the external memory”, which is well-understood, routine, and conventional.
Dependent claim 19 recites additional mathematical calculations “the processor generates the output feature map by applying the first weight group to a plurality of adjacent pixels adjacent to the changed position.” as well as additional insignificant extra-solution activity of gathering data “the address generator obtains the input feature map and the address of the plurality of input pixels included in the input feature map, and transmits the changed position to apply the first weight group based on the N*(M−1)-th address of the address of the plurality of input pixels and the size of the first weight group to the processor”.
Dependent claim 20 recites additional observation, evaluation, and judgement “the address generator configures some of the adjacent pixels as padding based on a result of comparing the address information of the changed locations and the plurality of input pixels.”

Therefore, when considering the elements separately and in combination, they do not add significantly more to the inventive concept. Accordingly, claims 1-20 are rejected under 35 U.S.C. § 101. 


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


	Claims 1-4 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang (US20180314671) in view of Chen (US20200117519A1), and further in view of Ujjwalkarn (“An Intuitive Explanation of Convolutional Neural Networks”, 2017). 

	Regarding claim 1, Zhang teaches An apparatus for processing a convolutional neural network (CNN), comprising: ([Abstract] "A CNN computation is mapped onto the two-dimensional array of reconfigurable processing elements using an automated system configured to determine suitable reconfigurable processing element parameters.")
	a weight memory configured to store a first weight group ([¶0032] "weight buffer 218 reads in weight data via a weight line 262, and sequentially sends this weight data to weight buffer 228, and so on.")
	a feature map memory configured to store an input feature map where the first weight group is to be applied; ([¶0004] "An input feature map buffer having a double buffer can be used for storing input feature maps" [¶0031] " Each of input buffer 202, input buffer 204, input buffer 206 and input buffer 208 sends data to process element 220, process element 222, process element 224, and process element 228 respectively via a unique unidirectional interconnect.")
	a processor configured to apply the first weight group to the plurality of adjacent pixels to obtain a first output pixel corresponding to the first position. ([¶0039] " input A_in 322 is read by processing element 300 into an A register 304. A register 304 also produces an output A_out 332. Output A_out 332 is a horizontal output comprised of weight data, transmitted to a PE in the same row as processing element 300. In some embodiments, input B_in 328 is read by processing element into a B register 306. B register 306 produces an output B_out 326. Output B_out 326 is a vertical output comprised of input data, transmitted to a PE in the same column as processing element 300" Zhang explicitly teaches that the weights are applied to adjacent input elements (pixels).  See also FIG. 2.).
	However, Zhang does not explicitly teach a weight memory configured to store a first weight group of a first layer; 
an address generator configured to determine a second position spaced from a first position of a first input pixel of the input feature map based on size of the first weight group, and determine a plurality of adjacent pixels adjacent to the second position. 

Chen, in the same field of endeavor teaches a weight memory configured to store a first weight group of a first layer; ([¶0131] "segmenting convolutional layer computation of the neural network may include the follows. Input neurons of a convolutional layer of the neural network form a three-dimensional matrix (Nfin, Nxin, Nyin). Weights form a four-dimensional matrix (Nfout, Nfout, Kx, Ky) and output neurons form a three-dimensional matrix (Nfout, Nxout, Nyout)...the weights are simultaneously segmented according to a block size of (Bfout, Bfin, Bx, By)." Chen explicitly teaches that the weights are grouped by layer.). 

Zhang and Chen are both directed towards systolic arrays for processing convolutional neural networks.  Therefore, Zhang and Chen are analogous art in the same field of endeavor.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Zhang with the teachings of Chen by storing a layer-wise weight group in a weight buffer.  It would be obvious to one of ordinary skill in the art that neural network computations are propagated layer-wise through a directed graph, and that splitting weights into weight groups based on the layers would be a practical way of partitioning the weights.  This is further reinforced in Chen who explicitly teaches segmenting the weights of each individual layer into blocks.  Chen provides as a motivation for combination ([¶0257] “According to the method, data interaction efficiency may be effectively improved, and an interaction delay may be greatly reduced. A public storage module for each layer may be accessed by licensed access units, and for private storage modules, data interaction and access between access units may be completed directly or through a certain rule or a certain protocol.”).  This motivation for combination also applies to the remaining claims depending on this combination.   

	While it would be obvious to one of ordinary skill in the art that convolutional neural networks use sliding windows to convolve input images, the combination of Zhang and Chen does not explicitly teach an address generator configured to determine a second position spaced from a first position of a first input pixel of the input feature map based on size of the first weight group, and determine a plurality of adjacent pixels adjacent to the second position.  

Ujjwalkarn, in the same field of endeavor, teaches an address generator configured to determine a second position spaced from a first position of a first input pixel of the input feature map based on size of the first weight group, and determine a plurality of adjacent pixels adjacent to the second position ([p. 4] "As we discussed above, every image can be considered as a matrix of pixel values. Consider a 5 x 5 image whose pixel values are only 0 and 1 (note that for a grayscale image, pixel values range from 0 to 255, the green matrix below is a special case where pixel values are only 0 and 1)…We slide the orange matrix over our original image (green) by 1 pixel (also called ‘stride’) and for every position, we compute element wise multiplication (between the two matrices) and add the multiplication outputs to get the final integer which forms a single element of the output matrix (pink). Note that the 3×3 matrix “sees” only a part of the input image in each stride." Figure 5 shows that the second position spaced from a first position is selected based on feature map adjacency.). 
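
For illustration only, the sliding-window operation described in the quoted passage can be sketched as follows.  This is a minimal sketch and is not drawn from the application or from any cited reference: the function name convolve2d_valid is hypothetical, the input and filter are assumed to be square, and the 5 x 5 image and 3 x 3 filter values are illustrative.

```python
def convolve2d_valid(image, kernel, stride=1):
    """Slide a square kernel over a square image with no padding.

    Each output element is the sum of the element-wise products between
    the kernel and the window of adjacent pixels it currently covers.
    """
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    output = []
    for r in range(out_size):
        row = []
        for c in range(out_size):
            top, left = r * stride, c * stride  # window origin for this output
            acc = sum(
                image[top + i][left + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            row.append(acc)
        output.append(row)
    return output


# Illustrative 5 x 5 binary image and 3 x 3 filter (assumed values).
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

feature_map = convolve2d_valid(image, kernel, stride=1)
print(feature_map)  # [[4, 3, 4], [2, 4, 3], [2, 3, 4]]
```

With a stride of 1, each window origin is one pixel away from the previous one, and the 3 x 3 window covers only the pixels adjacent to that origin, so a 5 x 5 input yields a 3 x 3 output.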

	Zhang, Chen, and Ujjwalkarn are all directed towards processing convolutional neural networks.  Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Zhang and Chen with the teachings of Ujjwalkarn by using a sliding window in the convolution calculations. As reinforced by Ujjwalkarn this is a fundamental aspect of convolutional neural networks and is well known to one of ordinary skill in the art.  Zhang and Chen both explicitly teach the use of feature maps which one of ordinary skill in the art would recognize as being derived from the convolution operation shown in Ujjwalkarn ([p. 5] “In CNN terminology, the 3×3 matrix is called a ‘filter‘ or ‘kernel’ or ‘feature detector’ and the matrix formed by sliding the filter over the image and computing the dot product is called the ‘Convolved Feature’ or ‘Activation Map’ or the ‘Feature Map‘. It is important to note that filters acts as feature detectors from the original input image.”).  This motivation for combination also applies to the remaining claims depending on this combination.  

	Regarding claim 2, the combination of Zhang, Chen, and Ujjwalkarn teaches The apparatus of claim 1, wherein: the processor applies the second weight group of the second layer, which is the next layer after the first layer, to the first output feature map to generate a final output feature map; and (Ujjwalkarn Fig. 3 shows that the weight groups of a given layer are applied to the subsequent layer in order, as is well known in the art. With respect to the instant specification, convolution, normalization, activation, and pooling are all considered to be part of one layer.  Therefore, the output of the second convolution as shown in figure 3 of Ujjwalkarn is interpreted as synonymous with the final output feature map.)
	the address generator loads the input feature map from an external memory, (Zhang [¶0035] "Each of input buffer 202 through input buffer 208 contains a double buffer—one buffer is used to store the data fetched from external memory, and the other is used to feed the data into the PE.")
	and transmits the final output feature map to the external memory. (Zhang [¶0024] "and output feature map buffer 108 is configured to read data from processing element array 104 and transmit this data to a data sink 118...data sink 118 may include any combination of memory units, display drivers, and so on" Data sink interpreted as being part of the external memory.). 

	Regarding claim 3, the combination of Zhang, Chen, and Ujjwalkarn teaches The apparatus of claim 2, wherein the address generator obtains the address information of the input feature map and a plurality of input pixels contained in the input feature map, determines the second position based on the address information of the first position and the size of the first weight group among the address information of the plurality of input pixels, and transmits the second position to the processor. (Ujjwalkarn [p. 4] "We slide the orange matrix over our original image (green) by 1 pixel (also called ‘stride’) and for every position, we compute element wise multiplication (between the two matrices) and add the multiplication outputs to get the final integer which forms a single element of the output matrix (pink). Note that the 3×3 matrix “sees” only a part of the input image in each stride"). 

	Regarding claim 4, the combination of Zhang, Chen, and Ujjwalkarn teaches The apparatus of claim 3, wherein the address generator obtains address information of the plurality of adjacent pixels, and configures part of the plurality of adjacent pixels to padding based on a result of comparing the address information of the plurality of adjacent pixels and the address information of the plurality of input pixels. (Ujjwalkarn [p. 7] "Zero-padding: Sometimes, it is convenient to pad the input matrix with zeros around the border, so that we can apply the filter to bordering elements of our input image matrix. A nice feature of zero-padding is that it allows us to control the size of the feature maps. Adding zero-padding is also called wide convolution, and not using zero-padding would be a narrow convolution"). 
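
For illustration only, the zero-padding described in the quoted passage can be sketched as follows.  This is a minimal sketch under stated assumptions and is not drawn from the application or from any cited reference: the function names zero_pad and output_size are hypothetical, and the input values are illustrative.

```python
def zero_pad(image, pad):
    """Return a copy of a 2-D image surrounded by `pad` rings of zeros."""
    width = len(image[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]           # top border rows
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)  # left/right borders
    padded.extend([0] * width for _ in range(pad))        # bottom border rows
    return padded


def output_size(input_size, kernel_size, pad, stride=1):
    """Side length of the feature map for a square input and square kernel."""
    return (input_size + 2 * pad - kernel_size) // stride + 1


padded = zero_pad([[1, 2], [3, 4]], pad=1)
print(padded)  # [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0], [0, 0, 0, 0]]

# A 5 x 5 input with a 3 x 3 filter: 3 x 3 output without padding,
# 5 x 5 output with one ring of zeros.
print(output_size(5, 3, pad=0), output_size(5, 3, pad=1))  # 3 5
```

The second helper shows the point made in the quoted passage that padding controls the size of the feature map: one ring of zeros lets a 3 x 3 filter be applied at the bordering pixels so that the output keeps the input's dimensions.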

	Claims 5-17 are rejected under 35 U.S.C. 103 as being unpatentable over Aliabadi (US20180096226A1) in view of Shi (“A Locality Aware Convolutional Neural Networks Accelerator”, 2015). 

	Regarding claim 5, Aliabadi teaches A method for processing a convolutional neural network (CNN) using a systolic array, comprising: loading an input feature map including a plurality of channels on address space of a memory; (Aliabadi [¶0030] "A kernel stack of a CNN can include M rows of kernels and N columns of kernels, with each column also referred to as a filter bank of the kernel stack. The kernels of the kernel stack can have the same width and the same height. The convolutional layer can have M input channels for receiving M input activation maps." Input activation map interpreted as synonymous with input feature map.)
	an input feature map including a plurality of channels (Aliabadi [¶0030] "A kernel stack of a CNN can include M rows of kernels and N columns of kernels, with each column also referred to as a filter bank of the kernel stack. The kernels of the kernel stack can have the same width and the same height. The convolutional layer can have M input channels for receiving M input activation maps.")
	loading an M-th (M is natural number) input pixel of a N-th (N is natural number) channel (Aliabadi [¶0079] "For example, the first output activation map can include the first pixel, the (N+1)th pixel, the (2N+1)th pixel, and so on, of the reordered output activation map. As another example, the second output activation map can include the second pixel, the (N+2)th pixel, the (2N+2)th pixel, and so on, of the reordered output activation map." [¶0169-0174] "(1) For each row r of a reordered output activation map: (2) For each column c of the output activation map:...(5a) Load the corresponding reordered input activation map pixel value(s) and duplicate to a SIMD register.").
	However, Aliabadi does not explicitly teach loading an M-th (M is natural number) input pixel of a N-th (N is natural number) channel on an N*(M−1)-th address of the address space; and
	loading an M-th input pixel of an (N+1)-th channel on an (N+1)*(M−1)-th address of the address space.

Shi, in the same field of endeavor, teaches loading an M-th (M is natural number) input pixel of a N-th (N is natural number) channel on an N*(M−1)-th address of the address space; and ([p. 593 §III] "Image Data: Though input data can be reused by all PEs in Inter OFm scheme, some low data reuse situations still exist, e.g. when the number of OFm is less than the number of PE. During CNN processing, neighboring convolution windows have large overlap area, which facilitate data reuse in IntraOFmP scheme. A controllable multiplexer can distribute input image data to appropriate PEs" [p. 595 §IV] "Input Tiling and Distribution Multiplexer: Based on the data locality optimization schemes in Section III, input buffer and multiplexer are set as shown in Fig. 11. Width of input buffer is 22(words)x8(bits/word) and its depth 48. Each buffer line stores 22 pixel data of one image column...Bank Width = (MaxConvSize + MaxPoolingSize × (NumPE − 1)) × DataWidth" Bank interpreted as synonymous with address space.  NumPE interpreted as synonymous with M, and DataWidth interpreted as synonymous with N.)
	loading an M-th input pixel of an (N+1)-th channel on an (N+1)*(M−1)-th address of the address space. (Shi [p. 593 §III] "Image Data: Though input data can be reused by all PEs in Inter OFm scheme, some low data reuse situations still exist, e.g. when the number of OFm is less than the number of PE. During CNN processing, neighboring convolution windows have large overlap area, which facilitate data reuse in IntraOFmP scheme. A controllable multiplexer can distribute input image data to appropriate PEs" [p. 595 §IV] "Input Tiling and Distribution Multiplexer: Based on the data locality optimization schemes in Section III, input buffer and multiplexer are set as shown in Fig. 11. Width of input buffer is 22(words)x8(bits/word) and its depth 48. Each buffer line stores 22 pixel data of one image column...Bank Width = (MaxConvSize + MaxPoolingSize × (NumPE − 1)) × DataWidth" Bank interpreted as synonymous with address space.  NumPE interpreted as synonymous with M, and DataWidth interpreted as synonymous with N.). 
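	As a worked illustration, the address expression recited in claim 5 can be evaluated literally for sample values. The helper name claimed_address and the values of N and M below are hypothetical; the sketch only evaluates the recited arithmetic and is not a representation of the memory layout of Aliabadi, Shi, or the instant specification.

```python
def claimed_address(channel_n, pixel_m):
    """Evaluate the recited expression N*(M-1) for natural numbers N and M."""
    if channel_n < 1 or pixel_m < 1:
        raise ValueError("N and M are natural numbers (1 or greater)")
    return channel_n * (pixel_m - 1)


# Hypothetical sample values: N = 2, M = 3.
n, m = 2, 3
addr_n = claimed_address(n, m)           # M-th pixel of the N-th channel
addr_n_plus_1 = claimed_address(n + 1, m)  # M-th pixel of the (N+1)-th channel
addr_next_pixel = claimed_address(n, m + 1)  # (M+1)-th pixel of the N-th channel
print(addr_n, addr_n_plus_1, addr_next_pixel)
```

	Substituting M+1 for M in N*(M−1) gives N*M, which is the address recited in claim 8 for the (M+1)-th input pixel of the N-th channel.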

	Aliabadi and Shi are both directed towards methods of accelerating convolutional neural networks.  Therefore, Aliabadi and Shi are analogous art in the same field of endeavor.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Aliabadi and Shi by reusing memory relative to stride overlap. Shi provides as a motivation for combination ([p. 591 §1] “By avoiding unnecessary data movement, memory access bottlenecks can be eased to reduce the complexity of circuit design. Hence, data locality in CNN processing is studied in this paper and prioritized as the major optimization target in the accelerator design”).  This motivation for combination also applies to the remaining claims depending on this combination.  

	Regarding claim 6, the combination of Aliabadi and Shi teaches The method of claim 5, comprising: applying a weight to an M-th input pixel of the N-th channel to obtain an N*(M−1)-th output pixel; and (Aliabadi [Abstract] "The output activation maps can be determined using the clusters of the input activation map pixels and kernels tile by tile." [¶0032] "In some embodiments, no reordering may be necessary. For example, the first convolution layer can convolve one input activation map and produces multiple output activation maps. In this case, no reordering of the pixel values of the input activation map may be necessary." [¶0054] "To determine the pixel value 112 a at position (1, 1) of the output activation map 112, the following multiplications can be performed: A pixel value 104 a can be multiplied by a weight value 108 j;" Convolving input activation interpreted as synonymous with applying weight to input pixel.  See also FIG. 1, which shows weights applied tile by tile, interpreted as synonymous with applying a weight to an M-th input pixel of the N-th channel.)
	storing the N*(M−1)-th output pixel on the N*(M−1)-th address. (Shi [p. 593 §III] "Image Data: Though input data can be reused by all PEs in Inter OFm scheme, some low data reuse situations still exist, e.g. when the number of OFm is less than the number of PE. During CNN processing, neighboring convolution windows have large overlap area, which facilitate data reuse in IntraOFmP scheme. A controllable multiplexer can distribute input image data to appropriate PEs" [p. 595 §IV] "Input Tiling and Distribution Multiplexer: Based on the data locality optimization schemes in Section III, input buffer and multiplexer are set as shown in Fig. 11. Width of input buffer is 22(words)x8(bits/word) and its depth 48. Each buffer line stores 22 pixel data of one image column...Bank Width = (MaxConvSize + MaxPoolingSize × (NumPE − 1)) × DataWidth" Bank interpreted as synonymous with address space.  NumPE interpreted as synonymous with M, and DataWidth interpreted as synonymous with N.). 

	Regarding claim 7, the combination of Aliabadi and Shi teaches The method of claim 6, comprising: applying a weight to an M-th input pixel of the (N+1)-th channel to obtain an (N+1)*(M−1)-th output pixel; and (Aliabadi [¶0079] "an output activation map may be ordered channel by channel, such that all pixel values that belong to the first output activation map, can be stored before all pixels that belong to the second output activation map (in terms of memory location) and so on. In some implementations, pixel values of the reordered output activation map in the interleaved output activation map layout can be ordered into the basic output activation map layout. For example, the first output activation map can include the first pixel, the (N+1)th pixel, the (2N+1)th pixel, and so on" (2N+1)th output pixel interpreted as synonymous with output pixel of 2nd channel.)
	storing the (N+1)*(M−1)-th output pixel at the (N+1)*(M−1)-th address. (Shi [p. 593 §III] "Image Data: Though input data can be reused by all PEs in Inter OFm scheme, some low data reuse situations still exist, e.g. when the number of OFm is less than the number of PE. During CNN processing, neighboring convolution windows have large overlap area, which facilitate data reuse in IntraOFmP scheme. A controllable multiplexer can distribute input image data to appropriate PEs" [p. 595 §IV] "Input Tiling and Distribution Multiplexer: Based on the data locality optimization schemes in Section III, input buffer and multiplexer are set as shown in Fig. 11. Width of input buffer is 22(words)x8(bits/word) and its depth 48. Each buffer line stores 22 pixel data of one image column...Bank Width = (MaxConvSize + MaxPoolingSize × (NumPE − 1)) × DataWidth" Bank interpreted as synonymous with address space.  NumPE interpreted as synonymous with M, and DataWidth interpreted as synonymous with N.). 
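	For reference, the bank width expression quoted from Shi, Bank Width = (MaxConvSize + MaxPoolingSize × (NumPE − 1)) × DataWidth, can be evaluated as in the following sketch. The function name bank_width_bits and every numeric value below are assumptions selected for illustration; they are not the configuration reported by Shi.

```python
def bank_width_bits(max_conv_size, max_pooling_size, num_pe, data_width):
    """Evaluate (MaxConvSize + MaxPoolingSize * (NumPE - 1)) * DataWidth."""
    words = max_conv_size + max_pooling_size * (num_pe - 1)
    return words * data_width


# Hypothetical parameters (not taken from Shi): 5-wide convolution window,
# 2-wide pooling window, 4 processing elements, 8-bit data words.
width = bank_width_bits(max_conv_size=5, max_pooling_size=2, num_pe=4, data_width=8)
print(width)  # (5 + 2 * 3) * 8
```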

	Regarding claim 8, the combination of Aliabadi and Shi teaches The method of claim 5, comprising: loading the (M+1)-th input pixel of the N-th channel on the N*M-th address of the address space. (Aliabadi [¶0169-0174] "(1) For each row r of a reordered output activation map: (2) For each column c of the output activation map:...(5a) Load the corresponding reordered input activation map pixel value(s) and duplicate to a SIMD register."). 

	Regarding claim 9, the combination of Aliabadi and Shi teaches The method of claim 8, wherein the (M+1)-th input pixel of the N-th channel is a pixel included in a next column after a column including the M-th input pixel of the N-th channel. (Aliabadi [Abstract] "The output activation maps can be determined using the clusters of the input activation map pixels and kernels tile by tile." [¶0032] "In some embodiments, no reordering may be necessary. For example, the first convolution layer can convolve one input activation map and produces multiple output activation maps. In this case, no reordering of the pixel values of the input activation map may be necessary." [¶0054] "To determine the pixel value 112 a at position (1, 1) of the output activation map 112, the following multiplications can be performed: A pixel value 104 a can be multiplied by a weight value 108 j;" Convolving input activation interpreted as synonymous with applying weight to input pixel.  See also FIG. 1, which shows weights applied tile by tile, interpreted as synonymous with applying a weight to an M-th input pixel of the N-th channel.). 

	Regarding claim 10, the combination of Aliabadi and Shi teaches The method of claim 9, comprising: applying a weight to an (M+1)-th input pixel of the N-th channel to obtain an N*M-th output pixel; and (Aliabadi [Abstract] "The output activation maps can be determined using the clusters of the input activation map pixels and kernels tile by tile." [¶0032] "In some embodiments, no reordering may be necessary. For example, the first convolution layer can convolve one input activation map and produces multiple output activation maps. In this case, no reordering of the pixel values of the input activation map may be necessary." [¶0054] "To determine the pixel value 112 a at position (1, 1) of the output activation map 112, the following multiplications can be performed: A pixel value 104 a can be multiplied by a weight value 108 j;" Convolving input activation interpreted as synonymous with applying weight to input pixel.  See also FIG. 1, which shows weights applied tile by tile, interpreted as synonymous with applying a weight to an M-th input pixel of the N-th channel.)
	storing the N*M-th output pixel at the N*M-th address. (Aliabadi [¶0079] "an output activation map may be ordered channel by channel, such that all pixel values that belong to the first output activation map, can be stored before all pixels that belong to the second output activation map (in terms of memory location) and so on. In some implementations, pixel values of the reordered output activation map in the interleaved output activation map layout can be ordered into the basic output activation map layout. For example, the first output activation map can include the first pixel, the (N+1)th pixel, the (2N+1)th pixel, and so on"). 
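	The relationship between the interleaved layout and the channel-by-channel layout quoted from Aliabadi ¶0079 can be sketched as follows, where the first output activation map takes the first, (N+1)th, (2N+1)th pixels and so on. The function name deinterleave and the twelve-element sample buffer are hypothetical and serve only to illustrate the quoted ordering.

```python
def deinterleave(interleaved, num_maps):
    """Split an interleaved pixel sequence into num_maps activation maps.

    Map k (0-based) receives every num_maps-th pixel starting at offset k,
    so map 0 holds the 1st, (N+1)th, (2N+1)th pixels and so on for N maps.
    """
    return [interleaved[k::num_maps] for k in range(num_maps)]


# Hypothetical buffer whose values are the 1-based pixel ordinals 1..12.
interleaved = list(range(1, 13))
maps = deinterleave(interleaved, num_maps=3)
print(maps)
```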

	Regarding claims 11-16, claims 11-16 are directed towards an apparatus for performing the methods of claims 5-10.  Therefore, the rejection applied to claims 5-10 also applies to claims 11-16.   

	Regarding claim 17, the combination of Aliabadi and Shi teaches The apparatus of claim 11, wherein: the address generator determines a plurality of adjacent pixels to apply the first weight group based on the size of the first weight group; and the processor applies the first weight group to the plurality of adjacent pixels to obtain a first output pixel mapped to the N*(M−1)-th address. (Aliabadi [¶0054] "To determine the pixel value 112 a at position (1, 1) of the output activation map 112, the following multiplications can be performed: A pixel value 104 a can be multiplied by a weight value 108 j; A pixel value 104 b can be multiplied by a weight value 108 i; A pixel value 104 c can be multiplied by a weight value 108 h; A pixel value 104 e can be multiplied by a weight value 108 g; A pixel value 104 f can be multiplied by a weight value 108 f; A pixel value 104 g can be multiplied by a weight value 108 e; A pixel value 104 h can be multiplied by a weight value 108 c; A pixel value 104 i can be multiplied by a weight value 108 b; and A pixel value 104 j can be multiplied by a weight value 108 a. Furthermore, an accumulation or a summation of the results of the above multiplications can be performed." See FIG. 1 which shows that the plurality of adjacent pixels to apply the first weight group corresponds to element 112a of output map 112.). 
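	The multiplications quoted from Aliabadi ¶0054 pair the first listed pixel value with the last listed weight value, the second with the second to last, and so on, followed by a summation. A minimal sketch of that pairing is given below; the function name output_pixel, the flat list ordering of the nine values, and the sample numbers are assumptions for illustration and are not taken from FIG. 1.

```python
def output_pixel(pixels, weights):
    """Sum the products of each pixel with the weight in the mirrored position.

    pixels and weights are flat lists of equal length; pixel i is multiplied
    by weight (len - 1 - i), matching the a*j, b*i, ... j*a pairing quoted above.
    """
    if len(pixels) != len(weights):
        raise ValueError("pixels and weights must have the same length")
    return sum(p * w for p, w in zip(pixels, reversed(weights)))


# Hypothetical values for a nine-element neighborhood and weight group.
pixels = [1, 2, 3, 4, 5, 6, 7, 8, 9]
weights = [1, 0, 0, 0, 0, 0, 0, 0, 2]
value = output_pixel(pixels, weights)
print(value)  # first pixel * last weight + last pixel * first weight
```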

	Claims 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Aliabadi and Shi, and in further view of Ujjwalkarn and Zhang.  

	Regarding claim 18, the combination of Aliabadi and Shi teaches The apparatus of claim 17.
	However, the combination of Aliabadi and Shi does not explicitly teach the processor applies a second weight group of a second layer that is a next layer after the first layer to the output feature map to generate the final output feature map; and
	the address generator loads the input feature map from the external memory and transfers the final output feature map to the external memory.  

Ujjwalkarn, in the same field of endeavor, teaches the processor applies a second weight group of a second layer that is a next layer after the first layer to the output feature map to generate the final output feature map (Fig. 3 shows that the weight groups of a given layer are applied to the subsequent layer in order, as is well known in the art. With respect to the instant specification, convolution, normalization, activation, and pooling are all considered to be part of one layer.  Therefore, the output of the second convolution as shown in figure 3 of Ujjwalkarn is interpreted as synonymous with the final output feature map.). 

	Aliabadi, Shi, and Ujjwalkarn are all directed towards processing convolutional neural networks.  Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Aliabadi and Shi with the teachings of Ujjwalkarn by using a sliding window in the convolution calculations. As reinforced by Ujjwalkarn this is a fundamental aspect of convolutional neural networks and is well known to one of ordinary skill in the art.  Zhang and Chen both explicitly teach the use of feature maps which one of ordinary skill in the art would recognize as being derived from the convolution operation shown in Ujjwalkarn ([p. 5] “In CNN terminology, the 3×3 matrix is called a ‘filter‘ or ‘kernel’ or ‘feature detector’ and the matrix formed by sliding the filter over the image and computing the dot product is called the ‘Convolved Feature’ or ‘Activation Map’ or the ‘Feature Map‘. It is important to note that filters acts as feature detectors from the original input image.”).  This motivation for combination also applies to the remaining claims depending on this combination.  

	However, the combination of Aliabadi, Shi, and Ujjwalkarn does not explicitly teach the address generator loads the input feature map from the external memory and transfers the final output feature map to the external memory.  

Zhang, in the same field of endeavor, teaches the address generator loads the input feature map from the external memory ([¶0035] "Each of input buffer 202 through input buffer 208 contains a double buffer—one buffer is used to store the data fetched from external memory, and the other is used to feed the data into the PE.")
	and transfers the final output feature map to the external memory. ([¶0024] "and output feature map buffer 108 is configured to read data from processing element array 104 and transmit this data to a data sink 118...data sink 118 may include any combination of memory units, display drivers, and so on" Data sink interpreted as being part of the external memory.). 

	Aliabadi, Shi, Ujjwalkarn, and Zhang are all directed towards processing convolutional neural networks.  Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Aliabadi, Shi, and Ujjwalkarn with the teachings of Zhang by storing and retrieving inputs and outputs from an external memory. It would be obvious to one of ordinary skill in the art that storing and retrieving data from external memory is routine in the art, which is further reinforced by Zhang.  Zhang provides as an additional motivation for combination ([¶0069] “a use of double buffering in the input and output allows a modeling of the throughput in a decoupled way, so the overall throughput T is dominated by the lower one of computation throughput (PT) and external memory transfer throughput (MT)”).  
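	The decoupled throughput relationship quoted from Zhang ¶0069, in which the overall throughput T is dominated by the lower of computation throughput (PT) and external memory transfer throughput (MT), can be sketched as follows. The function names and the sample throughput figures are hypothetical and do not reflect measurements reported by Zhang.

```python
def overall_throughput(pt, mt):
    """Return T as the lower of computation (PT) and memory transfer (MT) throughput."""
    return min(pt, mt)


def limiting_factor(pt, mt):
    """Name which side bounds the overall throughput."""
    if pt < mt:
        return "computation"
    if mt < pt:
        return "memory transfer"
    return "balanced"


# Hypothetical throughput figures in arbitrary, identical units.
pt, mt = 120.0, 80.0
t = overall_throughput(pt, mt)
bound = limiting_factor(pt, mt)
print(t, bound)
```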

	Regarding claim 19, the combination of Aliabadi, Shi, Ujjwalkarn, and Zhang teaches The apparatus of claim 18, wherein: the address generator obtains the input feature map and the address of the plurality of input pixels included in the input feature map, and transmits the changed position to apply the first weight group based on the N*(M−1)-th address of the address of the plurality of input pixels and the size of the first weight group to the processor; and (Aliabadi [¶0055] "Other pixel values of the output activation map 112 can be similarly determined. Equation (3) below shows determining pixel values 112 a-112 i of the output activation map 112" See FIG. 1 output activation pixel 112b interpreted as the result of applying the first weight group to a changed position.)
	the processor generates the output feature map by applying the first weight group to a plurality of adjacent pixels adjacent to the changed position. (Aliabadi [¶0055] "Other pixel values of the output activation map 112 can be similarly determined. Equation (3) below shows determining pixel values 112 a-112 i of the output activation map 112" See FIG. 1 output activation pixel 112b interpreted as the result of applying the first weight group to a changed position.). 

	Regarding claim 20, the combination of Aliabadi, Shi, Ujjwalkarn, and Zhang teaches The apparatus of claim 19, wherein the address generator configures some of the adjacent pixels as padding based on a result of comparing the address information of the changed locations and the plurality of input pixels. (Aliabadi [¶0230] "FIG. 11 shows an application of a 1×3 kernel to the interleaved input image. Therefore, to account for padding, Winput=Woutput+2. The A matrix has Winput*3M=(Woutput+2)*3M values in it because it needs 3 values from every input image for every column in the row to compute the whole output row."). 
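	The padding arithmetic quoted from Aliabadi ¶0230, namely Winput=Woutput+2 and an A matrix holding Winput*3M values, can be checked with the following sketch. The function name a_matrix_values and the sample values of Woutput and M are hypothetical; the constants 2 and 3 are taken directly from the quoted passage for a 1×3 kernel.

```python
def a_matrix_values(w_output, num_input_maps):
    """Return (Winput, value count) for one output row with a 1x3 kernel.

    Winput = Woutput + 2 accounts for padding, and the A matrix holds
    Winput * 3 * M values, where M is the number of input images.
    """
    w_input = w_output + 2
    return w_input, w_input * 3 * num_input_maps


# Hypothetical sizes: output row of width 8 and M = 4 input images.
w_input, count = a_matrix_values(w_output=8, num_input_maps=4)
print(w_input, count)
```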

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Cloutier (US5892962A) is directed towards a systolic array capable of performing convolutions and neural network calculations.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720.  The examiner can normally be reached on M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang can be reached on (571)270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SB/Examiner, Art Unit 2124

/LUIS A SITIRICHE/Primary Examiner, Art Unit 2126