Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: 
“a splitter configured to” in claim 14 
“an operator configured to” in claim 14
“a generator configured to” in claim 14
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-15 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

Regarding claims 1 and 13-15, the limitation "respective operational parameters in each column of the operational parameter array being from different subsets of the set of kernels of the weight parameter respectively and having the same one or more channels" is indefinite.  The previous limitation suggests "respective operational parameters in each row of the operational parameter array being from a same subset of a set of kernels of the weighted parameter and having different channels respectively" but it would be unclear to one of ordinary skill in the art how the column direction and the row direction can both represent multiple channels unless the column direction and row direction were the same direction.  One of ordinary skill in the art would recognize that the channel/kernel direction in a convolutional neural network is in the depth direction/z-axis and that the corresponding elements in the X and Y directions correspond to the same channel.  In the interest of further examination "respective operational parameters in each column of the operational parameter array being from different subsets of the set of kernels of the weight parameter respectively and having the same one or more channels" is interpreted as "respective operational parameters in each column of the operational parameter array being from different subsets of the set of kernels of the weight parameter respectively and having the same channel".

Regarding claims 1-15, the use of the term “weight parameter” is indefinite.  The term is inconsistent with a term widely used in the art, and it is unclear from the instant specification what a weight parameter actually is.  With respect to the instant specification it appears that a filter-bank is being referred to as a weight parameter, however, it is possible that the term weight parameter is intentional, and that other aspects of the usage of the term are unclear, such as how a single weight parameter can have multiple channels and kernels, and how a layer can simultaneously operate with only a single weight.  It is also unclear how the weight parameter might consist of multiple operational parameters.  For these reasons, in the interest of further examination, a weight parameter is interpreted as synonymous with a filter-bank.

Claim limitations “a splitter configured to” in claim 14, “an operator configured to” in claim 14, and “a generator configured to” in claim 14 invoke 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function.  With respect to the instant specification there is no mention of whether or not the splitter, operator, or generator are hardware computer components, software, or something altogether different.  Figure 12 depicts each of these elements, however, the depiction is seen as merely a black box without inherent structure.  Therefore, the claim is indefinite and is rejected under 35 U.S.C. 112(b) or pre-AIA  35 U.S.C. 112, second paragraph.
Applicant may:
(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph; 
(b)        Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(c)        Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: 
(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(b)        Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.

The remaining claims are rejected with respect to their dependence on the rejected claims. 
Claim Rejections - 35 USC § 101
101 Rejection
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-15 are rejected under 35 USC § 101 because the claimed invention is directed to non-statutory subject matter.

Regarding Claim 1:  Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 1 is directed to a method, which is a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 1 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes and mathematical calculations.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: 
splitting a weight parameter of a selected layer in the convolutional neural network in at least one of dimension of depth and number of kernels to obtain an operational parameter array including a plurality of operational parameters, respective operational parameters in each row of the operational parameter array being from a same subset of a set of kernels of the weighted parameter and having different channels respectively, and respective operational parameters in each column of the operational parameter array being from different subsets of the set of kernels of the weight parameter respectively and having the same one or more channels (mathematical concepts),
 performing, by using each operational parameter in the operational parameter array, operations of the selected layer on data of input data for the selected layer that are in the channel corresponding to the channel of the operational parameter that is in use, to obtain a partial operation result array including a plurality of partial operation results (mathematical calculations)
generating one or more output data of the selected layer based on the partial operational result array (output data is interpreted as the result of mathematical calculations and relationships)
Therefore, claim 1 recites an abstract idea which is a judicial exception.
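For illustration only, the three limitations quoted in the Prong One analysis above (splitting, performing partial operations, and generating output) can be sketched as plain arithmetic. This is a minimal sketch under stated assumptions: it uses 1×1 kernels applied to a single input position, and the names split_weight, apply_block, layer_output, kernel_block, and channel_block are hypothetical and do not appear in the claims or the specification.

```python
def split_weight(weight, kernel_block, channel_block):
    """Split weight[k][c] into a 2-D array of blocks (hypothetical helper).

    Each row holds blocks drawn from the same subset of kernels;
    each column holds blocks covering the same group of channels.
    """
    num_k, num_c = len(weight), len(weight[0])
    array = []
    for k0 in range(0, num_k, kernel_block):
        row = []
        for c0 in range(0, num_c, channel_block):
            block = [w_k[c0:c0 + channel_block]
                     for w_k in weight[k0:k0 + kernel_block]]
            row.append((c0, block))  # remember which channels the block covers
        array.append(row)
    return array


def apply_block(block, x_slice):
    """Partial result: one value per kernel in the block, over its channels only."""
    return [sum(w * x for w, x in zip(w_k, x_slice)) for w_k in block]


def layer_output(weight, x, kernel_block, channel_block):
    """Sum the partial results in each row, then concatenate rows in depth."""
    out = []
    for row in split_weight(weight, kernel_block, channel_block):
        partials = [apply_block(block, x[c0:c0 + len(block[0])])
                    for c0, block in row]
        out.extend(sum(vals) for vals in zip(*partials))
    return out


# Assumed toy data: 3 kernels, 4 channels, one input position.
weight = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
x = [1, 0, 2, 1]

direct = [sum(w * v for w, v in zip(w_k, x)) for w_k in weight]
blocked = layer_output(weight, x, kernel_block=2, channel_block=2)
assert blocked == direct
```

In this sketch the blocked computation reproduces the unsplit result exactly, which is consistent with reading the recited steps as a rearrangement of the same multiply-accumulate arithmetic.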
Step 2A Prong Two Analysis:  Claim 1 does not recite additional elements that are sufficient to integrate the judicial exception into a practical application.  Therefore, claim 1 is directed to a judicial exception.
Step 2B Analysis:  Claim 1 includes the additional element “generating one or more output data,” which amounts to outputting data, an insignificant extra-solution activity (See Mayo, 566 U.S. at 79, 101 USPQ2d at 1968; OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1092-93 (Fed. Cir. 2015)). As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 1 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
For the reasons above, claim 1 is rejected as being directed to non-patentable subject matter under §101. This rejection applies equally to independent claims 13-15, which recite a system and a computer program product, respectively, as well as to dependent claims 2-12. The additional limitations of the dependent claims are addressed briefly below:
Dependent claim 2 recites additional mathematical calculations and relationships “splitting the weight parameter in a case where a size of the weight parameter exceeds a first threshold, such that each operational parameter in the operational parameter array obtained by the splitting has a size less than or equal to the first threshold.”
Dependent claim 3 recites additional mathematical calculations and relationships “splitting the weight parameter in a case where a number of kernels of the weight parameter exceeds a second threshold, such that each operational parameter in the operational parameter array obtained by the splitting has a number of kernels less than or equal to the second threshold.”
Dependent claim 4 recites additional mathematical calculations and relationships “splitting the weight parameter in a case where the weight parameter has a number of kernels greater than or equal to a first predetermined number, such that the operational parameter array obtained by the splitting has a number of rows equal to a multiple of the first predetermined number.”
Dependent claim 5 recites additional mathematical calculations and relationships “splitting the weight parameter in a case where the weight parameter has a number of channels exceeding a third threshold, such that each operational parameter in the operational parameter array obtained by the splitting has a number of channels less than or equal to the third threshold.” 
Dependent claim 6 recites additional mathematical calculations and relationships “splitting the weight parameter in a case where the weight parameter has a number of channels greater than or equal to a second predetermined number, such that the operational parameter array obtained by the splitting has a number of columns equal to a multiple of the second predetermined number.” 
Dependent claim 7 recites additional insignificant extra-solution activity of gathering data “when the selected layer receives a plurality of partial input data, any two of which do not have the same channel, and the plurality of partial input data collectively correspond to a complete input data of the selected layer, then the weight parameter is split according to each partial input data such that the operational parameter array obtained by the splitting has a number of columns equal to the number of the received plurality of partial input data, and all the operational parameters in each column correspond to the same one or more channels as one of the plurality of partial input data.” 
Dependent claim 8 recites additional mathematical calculations and relationships “subdividing at least a row and/or column of the operational parameter array in at least one of dimensions of depth and number of kernels when the row and/or column includes an operational parameter having a size exceeding a first threshold, such that each operational parameter in the operational parameter array obtained by the subdividing has a size less than or equal to the first threshold.” 
Dependent claim 9 recites additional insignificant extra-solution activity “wherein each partial operation result in the partial operation result array corresponds to one output data of the selected layer,” which amounts to selection of a data type (See also MPEP 2106.05(g); Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354-55, 119 USPQ2d 1739, 1742 (Fed. Cir. 2016)).
Dependent claim 10 recites additional mathematical calculations and relationships “compressing the partial operation result array into one column by adding up all the partial operation results in each row of the partial operation result array in a point-to-point manner when the partial operation result array includes a plurality of columns, each partial operation result in the compressed partial operation result array corresponding to an output data of the selected layer.”
Dependent claim 11 recites additional mathematical calculations and relationships “compressing the partial operation result array into one row by combining all the partial operation results in each column of the partial operation result array in the depth direction when the partial operation result array includes a plurality of rows, each partial operation result in the compressed partial operation result array corresponding to an output data of the selected layer.”
Dependent claim 12 recites additional mathematical calculations and relationships “generating an output data of the selected layer by adding up all the partial operation results in each row of the partial operation result array in a point-to-point manner and then combining, in the depth direction, all the partial operation results in each column of the partial operation result array compressed by the adding up, or by combining all the partial operation results in each column of the partial operation result array in the depth direction and then adding up all the partial operation results in each row of the partial operation result array compressed by the combining in a point-to-point manner, when the partial operation result array includes a plurality of rows and a plurality of columns.”
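For illustration only, the two alternative orders recited in claim 12 (adding up each row point-to-point and then combining in the depth direction, or combining each column in the depth direction and then adding up point-to-point) can be sketched as follows. This is a minimal sketch: the partial results are assumed toy values represented as flat lists, and the names add_pointwise, concat_depth, rows_then_columns, and columns_then_rows are hypothetical.

```python
def add_pointwise(vectors):
    """Element-by-element ("point-to-point") sum of equally sized vectors."""
    return [sum(vals) for vals in zip(*vectors)]


def concat_depth(vectors):
    """Combine vectors end to end along the depth (kernel) direction."""
    return [v for vec in vectors for v in vec]


def rows_then_columns(partials):
    # First alternative: compress each row by addition, then combine in depth.
    return concat_depth([add_pointwise(row) for row in partials])


def columns_then_rows(partials):
    # Second alternative: combine each column in depth, then add point-to-point.
    return add_pointwise([concat_depth(col) for col in zip(*partials)])


# Assumed 2x2 partial operation result array: row 0 covers two kernels,
# row 1 covers one kernel; each column covers one group of channels.
partials = [
    [[1, 5], [10, 22]],
    [[9], [34]],
]

assert rows_then_columns(partials) == columns_then_rows(partials)
```

Both orders yield the same output in this sketch because addition across channel groups and concatenation across kernel subsets act on independent axes.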
 	Independent claim 15 is further rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. It recites a “non-temporary storage medium …”, and there is no description in the specification that states this storage medium to be limited only to non-transitory medium; therefore it can also include any type of medium. In the broadest reasonable sense, this encompasses signals or transmission media which are not statutory embodiments under 35 USC 101. See In re Nuijten, 500 F.3d 1346, 84 USPQ2d 1495 (Fed. Cir. 2007).  A “non-transitory storage medium” is recommended.

Therefore, when the elements are considered separately and in combination, they do not add significantly more to the inventive concept. Accordingly, claims 1-15 are rejected under 35 U.S.C. § 101. 


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-11 and 13-15 are rejected under 35 U.S.C. 103 as being unpatentable over Yang ("A Systematic Approach to Blocking Convolutional Neural Networks", 2016).

	Regarding claim 1, Yang teaches A method for performing operations in a convolutional neural network, comprising: ([Abstract] "This paper explores how to block CNN computations for memory locality by creating an analytical model for CNN-like loop nests. Using this model we automatically derive optimized blockings for common networks that improve the energy efficiency of custom hardware implementations by up to an order of magnitude")
	splitting a weight parameter of a selected layer in the convolutional neural network in at least one of dimension of depth and number of kernels to obtain an operational parameter array including a plurality of operational parameters, ([p. 2 §2] "A convolutional layer (Conv) corresponds to a filter bank. In the standard case of 3D input and output, a convolutional layer maps a C×X×Y input to a K×X×Y output using K shift-invariant 3D stencils, where each stencil is of the size Fw×Fh×C (i.e., a set of K 3-dimensional convolutions). These K Fw×Fh×C stencil coefficients are the “weights” of the convolutional layer. Here, (X,Y) and (Fw,Fh) are the image and kernel width and height dimensions and both image and kernels have the same depth dimension, which we define as C, or the number of channels. Typically the dimensions of the kernels are much smaller than the image dimensions." [p. 4 §3.1] "The computation being performed by a convolutional layer can be easily expressed as a 6 layer loop nest as shown in Algorithm 1...blocking can be thought of as simply splitting a number of loops, and then exchanging the order in which these split loops are executed" See also Figure 1.  Splitting a kernel depthwise (splitting a weight parameter in a depth dimension) interpreted as synonymous with blocking along the channel dimension as described in Yang.)
	respective operational parameters in each row of the operational parameter array being from a same subset of a set of kernels of the weighted parameter and having different channels respectively, ([p. 5 §3.2] "Figure 1 demonstrates two levels of nested blocking for each dimension, and the associated buffers. The inner loop takes a small amount of input data with block size X0Y0C0 and convolves it with K0 kernels to create some partial outputs with block size X0Y0K0. A complete output cannot be generated until all the channels of the input are processed for that kernel and the output pixel is generated, which will happen only when all of the channels (C2 loop) finish." Yang explicitly teaches that each depthwise layer corresponds to a channel ([p. 2 §2] "Here, (X,Y) and (Fw,Fh) are the image and kernel width and height dimensions and both image and kernels have the same depth dimension, which we define as C, or the number of channels") and shows depthwise blocking such that the depthwise direction can be considered the column axis.  Operational parameters interpreted as synonymous with block. Weight parameter interpreted as synonymous with filter bank.  With respect to figure 1 a row is interpreted as being in the depthwise direction.)
	and respective operational parameters in each column of the operational parameter array being from different subsets of the set of kernels of the weight parameter respectively and having the same one or more channels; ([p. 2 §2.1] "Pooling and LRN layers have no learned parameters (weights)." [p. 5 §3.2] "A complete output cannot be generated until all the channels of the input are processed for that kernel and the output pixel is generated, which will happen only when all of the channels (C2 loop) finish...Figure 2: Multicore partitioning. Top: kernel partitioning broadcasts a shared input to separate cores, each of which processes a disjoint subset of the kernels to produce a disjoint slab of the output (in the K dimension)." With respect to figure 1 of Yang, X or Y direction is interpreted as synonymous with the column direction from the same subset of a set of kernels of the weighted parameter corresponding to one or more channels.)
	performing, by using each operational parameter in the operational parameter array, operations of the selected layer on data of input data for the selected layer that are in the channel corresponding to the channel of the operational parameter that is in use, to obtain a partial operation result array including a plurality of partial operation results; and ([p. 4] "Figure 1: Hierarchical blocking of a single convolutional layer. The six-dimensional overall problem domain (X,Y,C,Fw,Fh,K) depicted in Figure 1 is blocked to three levels in the input domain ({X,Y,C}{0,1,2}), and two levels in the set of kernels (K) which correspond to the third dimension of the output domain ({X,Y}{0,1,2},{K}{0,1}). Partial results for each output pixel are accumulated hierarchically across the three levels of blocking in C").
	generating one or more output data of the selected layer based on the partial operational result array. ([p. 4 §3] "Partial results for each output pixel are accumulated hierarchically across the three levels of blocking in C"). 
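For illustration only, the passages of Yang quoted above (a six-layer loop nest, and blocking as "splitting a number of loops, and then exchanging the order in which these split loops are executed") can be sketched as follows. Algorithm 1 of Yang is not reproduced in this action, so the loop order, the index names, the toy data, and the function names conv_loop_nest and conv_blocked_in_c are assumptions made for this sketch.

```python
def zeros(K, X, Y):
    return [[[0] * Y for _ in range(X)] for _ in range(K)]


def conv_loop_nest(inp, w, K, C, X, Y, Fw, Fh):
    """Unblocked six-loop nest; out[k][x][y] accumulates over c, fw, fh."""
    out = zeros(K, X, Y)
    for k in range(K):
        for c in range(C):
            for y in range(Y):
                for x in range(X):
                    for fh in range(Fh):
                        for fw in range(Fw):
                            out[k][x][y] += inp[c][x + fw][y + fh] * w[k][c][fw][fh]
    return out


def conv_blocked_in_c(inp, w, K, C, X, Y, Fw, Fh, c_block):
    """Same nest with the C loop split in two and the outer half hoisted.

    After each channel block, out holds partial results that later blocks
    keep accumulating into.
    """
    out = zeros(K, X, Y)
    for c_start in range(0, C, c_block):          # outer half of the split C loop
        c_stop = min(c_start + c_block, C)
        for k in range(K):
            for c in range(c_start, c_stop):      # inner half of the split C loop
                for y in range(Y):
                    for x in range(X):
                        for fh in range(Fh):
                            for fw in range(Fw):
                                out[k][x][y] += inp[c][x + fw][y + fh] * w[k][c][fw][fh]
    return out


# Assumed toy sizes and integer data so that results compare exactly.
K, C, X, Y, Fw, Fh = 2, 4, 3, 3, 2, 2
inp = [[[c + x + 2 * y for y in range(Y + Fh - 1)]
        for x in range(X + Fw - 1)] for c in range(C)]
w = [[[[k + c + fw - fh for fh in range(Fh)]
       for fw in range(Fw)] for c in range(C)] for k in range(K)]

plain = conv_loop_nest(inp, w, K, C, X, Y, Fw, Fh)
blocked = conv_blocked_in_c(inp, w, K, C, X, Y, Fw, Fh, c_block=2)
assert plain == blocked
```

Blocking along C changes only the order in which channel contributions are accumulated, so in this sketch the blocked and unblocked nests produce identical outputs.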

	Regarding claim 2, Yang teaches The method of claim 1 wherein splitting the weight parameter comprises: splitting the weight parameter in a case where a size of the weight parameter exceeds a first threshold, such that each operational parameter in the operational parameter array obtained by the splitting has a size less than or equal to the first threshold. ([p. 5 §3.2] "When a new C loop Ci is added, a series of images and kernels are streamed and Ci channels reductions are being performed on the same set of outputs. Therefore those partial outputs are being reduced Ci/Ci-1 times, and should be stored in a new output buffer to prevent these fetches from going to a larger memory at a higher level in the memory hierarchy" First threshold interpreted as Ci-1 such that if Ci is equal to Ci-1 no splitting occurs.). 
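For illustration only, the threshold condition recited in claim 2 (split only when the size exceeds a first threshold, so that every resulting part is no larger than that threshold) can be sketched as follows. This is one possible reading, not the applicant's or Yang's implementation: "size" is treated as a simple count along one dimension, the near-even split is an assumption, and the name split_if_over is hypothetical.

```python
def split_if_over(size, threshold):
    """Return part sizes that sum to size, each no larger than threshold.

    If size does not exceed threshold, no splitting occurs.
    """
    if size <= threshold:
        return [size]
    num_parts = -(-size // threshold)      # ceiling division
    base, extra = divmod(size, num_parts)  # spread the remainder evenly
    return [base + 1] * extra + [base] * (num_parts - extra)


parts = split_if_over(10, 4)
assert sum(parts) == 10
assert all(p <= 4 for p in parts)
```

The case in which the size equals the threshold returns a single part, which parallels the interpretation above that no splitting occurs when Ci is equal to Ci-1.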

	Regarding claim 3, Yang teaches The method of claim 1 wherein splitting the weight parameter comprises: splitting the weight parameter in a case where a number of kernels of the weight parameter exceeds a second threshold, such that each operational parameter in the operational parameter array obtained by the splitting has a number of kernels less than or equal to the second threshold. ([p. 5 §3.2] "When a new C loop Ci is added, a series of images and kernels are streamed and Ci channels reductions are being performed on the same set of outputs. Therefore those partial outputs are being reduced Ci/Ci-1 times, and should be stored in a new output buffer to prevent these fetches from going to a larger memory at a higher level in the memory hierarchy" [p. 5 §3.2] "Suppose we apply parallelism for S cores at a given level p by unrolling that loop p across the processors. The first constraint is that we need to block the application such that the dimension being unrolled, e.g. Cp, is S times that of the previous level, Cp−1. The parallelism can be performed by partitioning the problem across the input XY, the kernels K, or the channels C" First threshold interpreted as Ci-1 such that if Ci is equal to Ci-1 no splitting occurs.  Yang explicitly teaches that the partitioning may occur as a function of kernels.  The second threshold is interpreted as Ki-1 such that if Ki=Ki-1 no splitting occurs.). 

	Regarding claim 4, Yang teaches The method of claim 1 wherein splitting the weight parameter comprises: splitting the weight parameter in a case where the weight parameter has a number of kernels greater than or equal to a first predetermined number, such that the operational parameter array obtained by the splitting has a number of rows equal to a multiple of the first predetermined number. ([p. 5 §3.3] "The first constraint is that we need to block the application such that the dimension being unrolled, e.g. Cp, is S times that of the previous level, Cp−1. The parallelism can be performed by partitioning the problem across the input XY, the kernels K, or the channels C" S interpreted as multiple of predetermined number (Cp-1).). 

	Regarding claim 5, Yang teaches The method of claim 1 wherein splitting the weight parameter comprises: splitting the weight parameter in a case where the weight parameter has a number of channels exceeding a third threshold, such that each operational parameter in the operational parameter array obtained by the splitting has a number of channels less than or equal to the third threshold. ([p. 5 §3.2] "When a new C loop Ci is added, a series of images and kernels are streamed and Ci channels reductions are being performed on the same set of outputs. Therefore those partial outputs are being reduced Ci/Ci-1 times, and should be stored in a new output buffer to prevent these fetches from going to a larger memory at a higher level in the memory hierarchy" [p. 5 §3.2] "Suppose we apply parallelism for S cores at a given level p by unrolling that loop p across the processors. The first constraint is that we need to block the application such that the dimension being unrolled, e.g. Cp, is S times that of the previous level, Cp-1. The parallelism can be performed by partitioning the problem across the input XY, the kernels K, or the channels C" Third threshold interpreted as Cp-1 such that if Cp is equal to Cp-1 no splitting occurs.  Yang explicitly teaches that the partitioning may occur as a function of channels (Cp = Cp-1).). 

	Regarding claim 6, Yang teaches The method of claim 1 wherein splitting the weight parameter comprises: splitting the weight parameter in a case where the weight parameter has a number of channels greater than or equal to a second predetermined number, such that the operational parameter array obtained by the splitting has a number of columns equal to a multiple of the second predetermined number. ([p. 5 §3.3] "The first constraint is that we need to block the application such that the dimension being unrolled, e.g. Cp, is S times that of the previous level, Cp−1. The parallelism can be performed by partitioning the problem across the input XY, the kernels K, or the channels C" S interpreted as multiple of predetermined number (Cp-1).). 

	Regarding claim 7, Yang teaches The method of claim 1 wherein splitting the weight parameter comprises: when the selected layer receives a plurality of partial input data, any two of which do not have the same channel, and the plurality of partial input data collectively correspond to a complete input data of the selected layer, ([p. 5 §3.2] "The inner loop takes a small amount of input data with block size X0Y0C0 and convolves it with K0 kernels to create some partial outputs with block size X0Y0K0. A complete output cannot be generated until all the channels of the input are processed for that kernel and the output pixel is generated" Small amount of input data interpreted as synonymous with plurality of partial input data.  If block X0 and Y0 are 1 (which is explicitly taught in Algorithm 1) then any two of the partial input data would not have the same channel.  Therefore Yang explicitly teaches receiving a plurality of partial input data of which any two do not have the same channel.)
	then the weight parameter is split according to each partial input data such that the operational parameter array obtained by the splitting has a number of columns equal to the number of the received plurality of partial input data, and all the operational parameters in each column correspond to the same one or more channels as one of the plurality of partial input data. ([p. 5 §3.2] "The inner loop takes a small amount of input data with block size X0Y0C0 and convolves it with K0 kernels to create some partial outputs with block size X0Y0K0. A complete output cannot be generated until all the channels of the input are processed for that kernel and the output pixel is generated"). 

	Regarding claim 8, Yang teaches The method of claim 1 wherein splitting the weight parameter further comprises: subdividing at least a row and/or column of the operational parameter array in at least one of dimensions of depth and number of kernels when the row and/or column includes an operational parameter having a size exceeding a first threshold, such that each operational parameter in the operational parameter array obtained by the subdividing has a size less than or equal to the first threshold. ([p. 5 §3.3] "The first constraint is that we need to block the application such that the dimension being unrolled, e.g. Cp, is S times that of the previous level, Cp−1. The parallelism can be performed by partitioning the problem across the input XY, the kernels K, or the channels C" S interpreted as multiple of predetermined number (Cp-1).  See also figure 1.  Cp interpreted as operational parameter exceeding the threshold.  Cp-1 interpreted as synonymous with operational parameter having a size less than Cp.). 

	Regarding claim 9, Yang teaches The method of claim 1 wherein each partial operation result in the partial operation result array corresponds to one output data of the selected layer. ([p. 4] "Partial results for each output pixel are accumulated hierarchically across the three levels of blocking in C" Output pixel interpreted as synonymous with one output data of the selected layer.). 

	Regarding claim 10, Yang teaches The method of claim 1 wherein generating the output data comprises: compressing the partial operation result array into one column by adding up all the partial operation results in each row of the partial operation result array in a point-to-point manner when the partial operation result array includes a plurality of columns, each partial operation result in the compressed partial operation result array corresponding to an output data of the selected layer. ([p. 2] "Partial results for each output pixel are accumulated hierarchically across the three levels of blocking in C" [p. 5 §3.2] "The inner loop takes a small amount of input data with block size X0Y0C0 and convolves it with K0 kernels to create some partial outputs with block size X0Y0K0. A complete output cannot be generated until all the channels of the input are processed for that kernel and the output pixel is generated, which will happen only when all of the channels (C2 loop) finish" For the iteration of X0=1 partial output is compressed into a single column.  Yang explicitly teaches that the partial results are accumulated (summed) from the channels (rows).). 
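	For illustration only, the accumulation of partial results across channel blocks may be sketched as follows. This is a minimal sketch under stated assumptions: a 1x1 convolution with a single kernel is used for brevity, and the names `pointwise_conv` and `add_point_to_point` are hypothetical and do not appear in Yang or in the application.

```python
def pointwise_conv(x, w, channels):
    """1x1 convolution of feature maps x with per-channel weights w,
    restricted to the given channel indices (one kernel, assumed)."""
    height, width = len(x[0]), len(x[0][0])
    return [
        [sum(x[c][i][j] * w[c] for c in channels) for j in range(width)]
        for i in range(height)
    ]


def add_point_to_point(partials):
    """Elementwise sum of equally sized 2D partial results."""
    height, width = len(partials[0]), len(partials[0][0])
    return [
        [sum(p[i][j] for p in partials) for j in range(width)]
        for i in range(height)
    ]


# Four input channels, each a 2x2 map, and one weight per channel.
x = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[1, 0], [0, 1]], [[2, 2], [2, 2]]]
w = [1, 2, 3, 4]

full = pointwise_conv(x, w, range(4))
partials = [pointwise_conv(x, w, range(0, 2)), pointwise_conv(x, w, range(2, 4))]
print(add_point_to_point(partials) == full)
```

	Under these assumptions, summing the partial results of the two channel blocks position by position reproduces the result computed over all channels at once.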

	Regarding claim 11, Yang teaches The method of claim 1 wherein generating the output data comprises: compressing the partial operation result array into one row by combining all the partial operation results in each column of the partial operation result array in the depth direction when the partial operation result array includes a plurality of rows, each partial operation result in the compressed partial operation result array corresponding to an output data of the selected layer. ([p. 2] "Partial results for each output pixel are accumulated hierarchically across the three levels of blocking in C" [p. 5 §3.2] "The inner loop takes a small amount of input data with block size X0Y0C0 and convolves it with K0 kernels to create some partial outputs with block size X0Y0K0. A complete output cannot be generated until all the channels of the input are processed for that kernel and the output pixel is generated, which will happen only when all of the channels (C2 loop) finish" For the iteration of X0=1 partial output is compressed into a single row.  See Figure 1 of how the partial operation results correspond to the output.). 

	Regarding claims 13-15, claims 13-15 are directed towards an apparatus for performing the method of claim 1.  Therefore, the rejection applied to claim 1 also applies to claims 13-15.  Claims 13 and 15 also mention additional elements including a processor to perform the method ([p. 1 §1] "Early attempts [20, 1, 24, 2] to optimize CPU and GPU CNN implementations treated the convolutional layers as matrix multiplication and used an optimized BLAS matrix matrix-multiplication (GEMM) routine") as well as memory to store the instructions performed by the processor ([p. 1 §1] "the design of the memory hierarchy and how the data is choreographed has a dramatic effect on the energy required for the computation."). 

	Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Yang in view of Stanford (“CS231n Convolutional Neural Networks for Visual Recognition”, 2015). 

	Regarding claim 12, Yang teaches The method of claim 1.
	However, Yang does not explicitly teach generating the output data comprises: generating an output data of the selected layer by adding up all the partial operation results in each row of the partial operation result array in a point-to-point manner and then combining, in the depth direction, all the partial operation results in each column of the partial operation result array compressed by the adding up, or by combining all the partial operation results in each column of the partial operation result array in the depth direction and then adding up all the partial operation results in each row of the partial operation result array compressed by the combining in a point-to-point manner, when the partial operation result array includes a plurality of rows and a plurality of columns.  

Stanford, in the same field of endeavor, teaches generating the output data comprises: generating an output data of the selected layer by adding up all the partial operation results in each row of the partial operation result array in a point-to-point manner and then combining, in the depth direction, all the partial operation results in each column of the partial operation result array compressed by the adding up, or by combining all the partial operation results in each column of the partial operation result array in the depth direction and then adding up all the partial operation results in each row of the partial operation result array compressed by the combining in a point-to-point manner, when the partial operation result array includes a plurality of rows and a plurality of columns. ([p. 11] "The visualization below iterates over the output activations (green), and shows that each element is computed by elementwise multiplying the highlighted input (blue) with the filter (red), summing it up, and then offsetting the result by the bias." See FIG. on p. 12.). 
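	For illustration only, the two orderings recited in the limitation (adding up each row and then combining in the depth direction, or combining each column in the depth direction and then adding up) may be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Stanford, Yang, or the application: each partial operation result is modeled as a depth vector for a single output position, and all function names are hypothetical.

```python
def add_vectors(vectors):
    """Point-to-point (elementwise) sum of equally long vectors."""
    return [sum(vals) for vals in zip(*vectors)]


def concat_depth(vectors):
    """Combine vectors in the depth direction by concatenation."""
    return [v for vec in vectors for v in vec]


def sum_then_concat(array):
    """Add up each row, then combine the compressed column in depth."""
    return concat_depth([add_vectors(row) for row in array])


def concat_then_sum(array):
    """Combine each column in depth, then add up the compressed row."""
    columns = list(zip(*array))
    return add_vectors([concat_depth(col) for col in columns])


# 2x2 partial operation result array: rows are kernel groups,
# columns are channel groups (an assumed layout).
array = [
    [[1, 2], [10, 20]],
    [[3, 4], [30, 40]],
]
print(sum_then_concat(array))  # [11, 22, 33, 44]
print(concat_then_sum(array))  # [11, 22, 33, 44]
```

	Under these assumptions both orderings yield the same output data, because elementwise addition and depth concatenation act on independent dimensions of the array.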

	Yang and Stanford are both directed towards accelerating convolutional neural networks. Therefore, Yang and Stanford are analogous art in the same field of endeavor.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Yang with the teachings of Stanford.  While the content on the Stanford convnet course website would be considered to be well-understood to one of ordinary skill in the art, a motivation for combination with regard to matrix factorization of convolutional neural networks has been provided ([p. 13 "Implementation as Matrix Multiplication"] "the benefit is that there are many very efficient implementations of Matrix Multiplication that we can take advantage of (for example, in the commonly used BLAS API). Moreover, the same im2col idea can be reused to perform the pooling operation").  
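	For illustration only, the im2col approach referenced in the quoted Stanford passage may be sketched as follows. This is a minimal sketch under stated assumptions (single-channel input, one square kernel, stride 1, no padding, pure Python in place of a BLAS routine); the function names are hypothetical and the code is not reproduced from the Stanford materials.

```python
def im2col(image, k):
    """Flatten every k x k patch of a 2D image into one row."""
    height, width = len(image), len(image[0])
    return [
        [image[i + di][j + dj] for di in range(k) for dj in range(k)]
        for i in range(height - k + 1)
        for j in range(width - k + 1)
    ]


def conv_via_matmul(image, kernel):
    """Convolution expressed as a patch matrix times a flattened kernel."""
    k = len(kernel)
    flat_kernel = [v for row in kernel for v in row]
    out_w = len(image[0]) - k + 1
    flat_out = [
        sum(a * b for a, b in zip(patch, flat_kernel))
        for patch in im2col(image, k)
    ]
    return [flat_out[r:r + out_w] for r in range(0, len(flat_out), out_w)]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]
print(conv_via_matmul(image, kernel))  # [[6, 8], [12, 14]]
```

	Each row of the patch matrix is multiplied against the same flattened kernel, which is why the operation reduces to a single matrix multiplication of the kind the quoted passage describes.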
	
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Howard (US20180137406A1) and Huang (US20190179674A1) are both directed towards convolutional neural network accelerators utilizing pointwise and depthwise convolution operations.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720.  The examiner can normally be reached on M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571)270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/SB/Examiner, Art Unit 2124

/LUIS A SITIRICHE/Primary Examiner, Art Unit 2126