DETAILED ACTION
Claims 1-20 have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on March 14, 2022, has been entered.
 
Drawings
All replacement FIGs submitted on March 14, 2022, are objected to for failing to comply with 37 CFR 1.84(a)(1) and 37 CFR 1.84(l), which requires the drawings be in black, and that all drawings be made by a process which will give them satisfactory reproduction characteristics.  Every line, number, and letter must be durable, clean, solid black (except for color drawings), sufficiently dense and dark, and uniformly thick and well-defined.  The weight of all lines and letters must be heavy enough to permit adequate reproduction.  This requirement applies to all lines however fine, to shading, and to lines representing cut surfaces in sectional views.  The drawings are pixelated because they are blurry and are not entirely in black (RGB = 000), despite appearing black to the naked eye.  This has been confirmed by inspecting applicant’s pdf file submitted through EFS.  When black is not used, the dithering used to convert applicant's grayscale image to black and white will add white pixels to try to estimate applicant's "gray" color, and the final drawings may not print properly.  Therefore, applicant must be sure to use only black and white.  FIG.1, when printed, is illegible.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Objections
Claim 8 is objected to because of the following informalities:
In the last line, delete the space before the period.
Claim 11 is objected to because of the following informalities:
Replace all three instances of “operation” with --operations-- to be consistent with prior language.
Claim 19 is objected to because of the following informalities:
Replace all three instances of “operation” with --operations-- to be consistent with prior language.
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claim 18 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
The claims recite the following limitations for which there is a lack of antecedent basis:
In claim 18, line 6, “the input data”.  Please amend in similar fashion to claim 8, i.e., insert --of the grouped operations-- after “the input data”.
In claim 18, 2nd to last line, “the input data”.  Please amend in similar fashion to claim 10, i.e., insert --of the grouped operations-- after “the input data”.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-3, 6-13, and 16-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Alwani et al., “Fused-Layer CNN Accelerators”, 2016 (as cited by applicant and herein referred to as Alwani).
Referring to claim 1, Alwani has taught a method of data processing, said method comprising:
(a) receiving an input data to be processed by a series of sequentially performed operations (see FIG.2 and page 2, section II (up to subsection A).  A CNN receives input data for processing by a series of sequential layers/operations);
(b) identifying a first operation from the series of operations, wherein the first operation has a first amount of an input data and an output data exceeding a capacity of a memory unit (from the abstract, Alwani fuses the first five operations/layers (e.g. those in FIG.2), including a first operation (layer 2).  From the paragraph below FIG.2, each layer includes an amount of input and output data.  The reason layer 2 is considered for fusion with others is because the associated input and output data (the former coming from layer 1 and the latter being sent to layer 3) is “too large to fit on chip” (abstract).  Thus, costly external memory accesses must normally be made.  Alwani recognizes that fusion with an adjacent layer would reduce capacity requirements.  Note that if on-chip memory were large enough to accommodate all input and output data for any given layer, there would be no reason to fuse as no external data accesses would be made for intermediate data.  As such, it is because of smaller on-chip capacity and large amounts of data that fusion is considered);
(c) selecting at least one second operation from the series of operations to be grouped with the first operation based at least in part on a second amount of an input data and an output data of the grouped operations and the capacity of the memory unit, wherein the at least one second operation comprises an operation from the series of operations which is immediately preceding the first operation (again, based on the capacity being exceeded, and based on a reduced amount of input and output data resulting from fusing two layers, layer 2 is fused with layer 1, which immediately precedes layer 2); and
(d) processing a portion of the input data of the grouped operations, wherein the portion of the input data of the grouped operations is determined based at least in part on an amount of an intermediate data result of the grouped operations (the input data of layer 2 (which is part of the input data of the grouped operations) is output/intermediate data produced (determined) by layer 1 (see abstract; p.1, right column, 1st full paragraph; and section A spanning pages 3-4)).
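For illustration only, the reading of steps (b) and (c) applied above can be sketched as follows.  The sketch is not taken from Alwani or from the application.  The Op type, the helper names, the 13.0 MB capacity, and the layer 2 output size are hypothetical; only the 0.6 MB and 12.3 MB figures for layer 1 come from Alwani, section II(B).  The sketch further assumes that the input and output data of a group are the input of its earliest operation and the output of its latest operation, with intermediate data kept on chip and not counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical record of one operation (layer) and its data sizes in MB."""
    name: str
    input_mb: float
    output_mb: float


def first_oversized(ops, capacity_mb):
    """Step (b): index of the first op whose input + output exceeds capacity, else None."""
    for i, op in enumerate(ops):
        if op.input_mb + op.output_mb > capacity_mb:
            return i
    return None


def group_with_predecessor(ops, i):
    """Step (c): group op i with the immediately preceding op.

    Returns (input_mb, output_mb) of the group under the stated assumption
    that intermediate data stays on chip and is not counted.
    """
    if i == 0:
        raise ValueError("the first operation has no preceding operation")
    return ops[i - 1].input_mb, ops[i].output_mb


# Layer 1 sizes are from Alwani, section II(B); the layer 2 output size and
# the capacity are assumptions chosen only to make the example concrete.
ops = [Op("layer1", 0.6, 12.3), Op("layer2", 12.3, 12.3)]
capacity_mb = 13.0

idx = first_oversized(ops, capacity_mb)
group_in, group_out = group_with_predecessor(ops, idx)
group_fits = group_in + group_out <= capacity_mb
```

Under these assumed numbers, layer 2 is the operation identified in step (b), and grouping it with layer 1 brings the counted input and output data back under the assumed capacity.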
Referring to claim 2, Alwani has taught the method of claim 1, wherein an amount of the output data of the first operation is determined based on an amount of the input data of the first operation and one or more parameters of the first operation (see page 3, left column, which inherently generates some amount of output data based on 12.3 MB of input and 144 KB of parameters (weights)).
*NOTE: Struck-through claim language is alternative in nature and not addressed at this time with respect to the prior art.
Referring to claim 3, Alwani has taught the method of claim 1, wherein (c) comprises determining whether the second amount of the input data and the output data of the grouped operations exceeds the capacity of the memory unit (this is the reason fusion is performed - because the memory unit capacity is exceeded, more off-chip transfers need to be made.  Thus, one or more additional layers are selected for fusion to try to reduce these expensive transfers.  From page 8, section B, and FIG.7, all fusion permutations are considered.  If fusing two layers still exceeds the capacity of the on-chip memory, fusion with a third layer may be considered).
Referring to claim 6, Alwani has taught the method of claim 1, wherein the portion of the input data of the grouped operations comprises one of N equal parts of the input data of the grouped operations and a marginal data, N being an integer of 2 or larger (see FIG.3.  Each pyramid is the same size.  The marginal data could be the weights.  Or, the overlapping portions (6M blue circles) may be considered the input data and the non-overlapping could be the marginal data (see section B on page 4).  As the window shifts through the data, the overlapping amounts are the same size).
Referring to claim 7, Alwani has taught the method of claim 6, (in FIG.3, there are two equal parts/pyramids, and the processing is performed twice.  The 3rd pyramid will require a 3rd processing; a 4th pyramid will require a 4th processing, and so on).
Referring to claim 8, Alwani has taught the method of claim 1, wherein (c) further comprises storing the output data of the grouped operations in an external memory when (1) a number of operations in the grouped operations is equal to a number of operations in the series of operations (note, from page 8, section B, that one possible CNN configuration includes fusing all layers, denoted by “(3)” in a 3-layer CNN), and (2) the second amount of the input data and the output data of the grouped operations exceeds the capacity of the memory unit (see page 3, last couple of sentences before section A.  “These last output feature maps are either written to off-chip memory or simply retained in an on-chip memory (if they are small enough).”  This means that in any system with an on-chip memory that is too small to store the output data, said output data is sent off-chip), and (d) comprises storing the input data of the grouped operations and the intermediate data result of the grouped operations in the memory unit (see page 3, last few sentences before section A.  Intermediate data is stored in the on-chip memory.  Also, from the last paragraph on page 3, input values are stored in on-chip buffers/memory).
Referring to claim 9, Alwani has taught the method of claim 1, wherein (c) further comprises storing the input data of the grouped operations in an external memory (per the abstract, input data is brought on chip for processing, which means it is stored off-chip first) when (1) a number of operations in the grouped operations is equal to a number of operations in the series of operations (again, note, from page 8, section B, that any number of layers may be fused, which includes fusing all layers, denoted by “(3)” in a 3-layer CNN), and (2) the second amount of the input data and the output data of the grouped operations exceeds the capacity of the memory unit (see the last sentence before section A on page 3.  The output may be too large for on-chip memory, which would mean that input and output exceed memory capacity.  When conditions (1) and (2) exist, and even when they don’t exist, the claimed storing in an external memory still occurs.  That is, this storing always occurs, when (1) and (2) occur, and when (1) or (2) don’t occur.  The examiner notes that the claim is not limited to storing in response to these two conditions occurring, i.e., that the occurrence of (1) and (2) cause the storing).
Referring to claim 10, Alwani has taught the method of claim 9, wherein (d) comprises receiving the portion of the input data of the grouped operations from the external memory (again, see the abstract.  All input data comes from an off-chip memory when it is brought on chip), or storing the intermediate data result and the output data of the grouped operations in the memory unit (see the last few sentences before section A on page 3.  All intermediate data is stored in the on-chip memory unit, as is the output (if it is small enough to fit)), or storing the input data of the grouped operations, the intermediate data result and the output data of the grouped operations in the memory unit (see the abstract and the last few sentences before section A on page 3.  All input data is brought into the on-chip memory unit, intermediate data is stored in the on-chip memory unit, and the output is stored in the on-chip memory unit (if it is small enough to fit)). 
Referring to claim 11, Alwani has taught the method of claim 1, further comprising (e) obtaining a portion of the output data of the grouped operation, and assembling each portion of the output data of the grouped operation to obtain the output data of the grouped operation (see FIG.3 and the three subsequent paragraphs.  One pyramid is processed at a time, which serves to generate one output at a time.  When all pyramids are complete, the entire output has been assembled).
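For illustration only, processing one pyramid at a time and assembling the partial outputs can be sketched in one dimension as follows.  The sketch is not Alwani's implementation.  The function names, the kernels, and the input values are hypothetical, and the sketch recomputes overlapping intermediate values for simplicity.

```python
def conv1d(xs, kernel):
    """Valid (no padding), stride-1 sliding-window sum of products."""
    k = len(kernel)
    return [
        sum(xs[i + j] * kernel[j] for j in range(k))
        for i in range(len(xs) - k + 1)
    ]


def layer_by_layer(xs, k1, k2):
    """Unfused evaluation: the whole intermediate result exists at once."""
    return conv1d(conv1d(xs, k1), k2)


def fused_by_pyramid(xs, k1, k2):
    """Fused evaluation: one pyramid (input tile) yields one final output value."""
    base = len(k1) + len(k2) - 1          # inputs needed for one final output
    assembled = []
    for start in range(len(xs) - base + 1):
        tile = xs[start:start + base]     # portion of the input data
        intermediate = conv1d(tile, k1)   # small enough to stay local
        assembled.extend(conv1d(intermediate, k2))  # exactly one value per tile
    return assembled


# Hypothetical data: eight inputs and two 3-tap kernels.
xs = list(range(8))
k1 = [1, 0, -1]
k2 = [1, 2, 1]
full_output = layer_by_layer(xs, k1, k2)
assembled_output = fused_by_pyramid(xs, k1, k2)
```

Once every pyramid has been processed, the assembled portions equal the output that unfused, layer-by-layer evaluation would produce.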
Claim 12 is partly rejected for similar reasons as claim 1.  Alwani has further taught a system of data processing, said system comprising:
a) one or more processors (see the paragraph above section B on page 6.  The CNN is run on one or more DSP slices, which means there is inherently at least one digital signal processor (DSP)); and
b) a memory unit having instructions stored thereon which when executed by the one or more processors cause the one or more processors to carry out the claimed steps (a DSP executes instructions which are inherently stored in a memory unit).  Step (a) is performed by sending data to one or more layers of the CNN for processing.  Step (b) is performed when the first operation of a fused group of operations is started.  Step (c) is performed by starting the second operation of the fused group.  And, step (d) is performed by running one or more operations of the fused group on various data.
Claims 13, 16-17, and 19-20 are respectively rejected for similar reasons as claims 2, 6-7, 11-12.
Claim 18 is rejected for similar reasons as claims 8-10.

Claim Rejections - 35 USC § 103
Claims 4-5 and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Alwani in view of the examiner’s taking of Official Notice.
Referring to claim 4, Alwani has taught the method of claim 3, wherein (c) comprises incrementally increasing a number of operations in the at least one second operation (see section B spanning pages 8-9, and FIG.7.  This shows trying all fusion permutations).  While Alwani has not taught the incremental increasing until the second amount of the input data and the output data of the grouped operations does not exceed the capacity of the memory unit, the examiner does note that no intermediate data is stored off-chip (page 3, last few sentences before section A).  Alwani notes that different fusions require different amounts of on-chip memory to avoid off-chip transfer of inputs/outputs (intermediate data between layers, which is an output of the Mth layer and an input of the (M+1)th layer) (from page 5, fusing two layers “requires 55.86 KB of additional on-chip storage…”, and, from page 3, fusing five layers is “at the cost of only 362 KB of extra on-chip storage”, etc.).  FIG.7 shows how much extra memory one needs to avoid off-chip transfers for given layer combinations.  Alwani, from page 8, section A, generally tries to find an ideal hardware cost (in additional memory) given an analyzed CNN.  However, the examiner notes that it is known to have a hardware design already fixed and then to run compatible applications on that hardware.  As such, if one of ordinary skill in the art were given hardware with some amount of fixed on-chip storage, one could select which layers and how many to fuse (based on FIG.7) so as to fine-tune the CNN for the given hardware.  In other words, given FIG.7, one could choose the third parameter, given the other two known parameters, the parameters being DRAM transfer, extra storage, and CNN configuration (represented by dots in graphs).
As a result, in order to more efficiently run a CNN based on on-chip storage amount and desired DRAM transfer/cost, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Alwani to perform the incremental increasing until the second amount of the input data and the output data of the grouped operations does not exceed the capacity of the memory unit.
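For illustration only, choosing a fusion configuration for hardware with a fixed amount of on-chip storage can be sketched as follows.  The sketch is not taken from Alwani.  The FusionConfig type, the helper name, and every DRAM transfer value are hypothetical; the 55.86 KB and 362 KB storage figures are those quoted above and are combined here purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionConfig:
    """Hypothetical point on a storage-versus-transfer tradeoff curve."""
    label: str
    extra_storage_kb: float
    dram_transfer_mb: float


def best_config(configs, on_chip_budget_kb):
    """Return the feasible configuration with the least DRAM transfer, else None."""
    feasible = [c for c in configs if c.extra_storage_kb <= on_chip_budget_kb]
    if not feasible:
        return None
    return min(feasible, key=lambda c: c.dram_transfer_mb)


# Storage figures are quoted in this action; transfer figures are made up.
configs = [
    FusionConfig("no fusion", 0.0, 80.0),
    FusionConfig("fuse two layers", 55.86, 60.0),
    FusionConfig("fuse five layers", 362.0, 4.0),
]

choice_small = best_config(configs, on_chip_budget_kb=100.0)
choice_large = best_config(configs, on_chip_budget_kb=400.0)
```

With the storage amount fixed and the candidate configurations known, the remaining parameter (which layers to fuse) follows from a simple selection of this kind.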
Referring to claim 5, Alwani, as modified, has taught the method of claim 4, wherein (c) further comprises storing the output data of the grouped operations in an external memory when (1) a number of operations in the grouped operations is equal to a number of operations in the series of operations (note, from page 8, section B, that one possible CNN configuration includes fusing all layers, denoted by “(3)” in a 3-layer CNN), and (2) the second amount of the input data and the output data of the grouped operations exceeds the capacity of the memory unit (see page 3, last couple of sentences before section A.  “These last output feature maps are either written to off-chip memory or simply retained in an on-chip memory (if they are small enough).”  This means that in any system with an on-chip memory that is too small to store the output data, said output data is sent off-chip).
Claim 14 is rejected for similar reasons as claims 3-4.
Claim 15 is rejected for similar reasons as claim 5.

---------------------------------------------------------------------------------------------------------------------

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-4, 6-7, 11-14, 16-17, and 19-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Woo, U.S. Patent No. 10,019,668.
Referring to claim 1, Woo has taught a method of data processing, said method comprising:
(a) receiving an input data to be processed by a series of sequentially performed operations (see FIGs.2A-B, for instance, and the description thereof.  A neural network may have sequential layers (i.e., series of operations A, B, C, D, E), which receive input data for processing);
(b) identifying a first operation from the series of operations, wherein the first operation has a first amount of an input data and an output data exceeding a capacity of a memory unit (see column 9, line 60, to column 10, line 14.  The system identifies when a particular layer’s working set will exceed the on-chip memory capacity.  The working set includes the number of inputs and outputs (column 7, lines 44-53).  For instance, the first operation (layer B) is identified as one that cannot perform three batches (working set) due to limited on-chip memory);
(c) selecting at least one second operation from the series of operations to be grouped with the first operation based at least in part on a second amount of an input data and an output data of the grouped operations and the capacity of the memory unit, wherein the at least one second operation comprises an operation from the series of operations which is immediately preceding the first operation (see column 14, line 18, to column 15, line 36.  Taking into account the memory capacity, if a layer’s working set will exceed that capacity, the layer is combined with another layer to form a super-layer that operates on just a portion of that working set so as to not exceed capacity.  For instance, from FIG.3, it is recognized that performing layer B on the entire working set at once will exceed capacity.  Thus, it is split and combined with layer A, which immediately precedes B, such that any given super-layer will not exceed capacity); and
(d) processing a portion of the input data of the grouped operations, wherein the portion of the input data of the grouped operations is determined based at least in part on an amount of an intermediate data result of the grouped operations (again, see FIG.3.  Each super-layer would include input into layer A, which generates intermediate data to be operated on by layer B, which in turn generates intermediate data to be operated on by layer C (column 9, lines 31-37).  Thus, a portion of this data is based on intermediate data results of the super-layer).
Referring to claim 2, Woo has taught the method of claim 1, wherein an amount of the output data of the first operation is determined based on an amount of the input data of the first operation and one or more parameters of the first operation (see column 4, lines 13-15.  An amount of output is generated based on input and parameters).
*NOTE: Struck-through claim language is alternative in nature and not addressed at this time with respect to the prior art.
Referring to claim 3, Woo has taught the method of claim 1, wherein (c) comprises determining whether the second amount of the input data and the output data of the grouped operations exceeds the capacity of the memory unit (again, see column 9, line 60, to column 10, line 14, and column 14, line 18, to column 15, line 36.  For instance, if layer B by itself is going to exceed capacity, the system will determine if superlayer A+B+C will exceed capacity, and, if not, it creates the superlayer).
Referring to claim 4, Woo has taught the method of claim 3, wherein (c) comprises incrementally increasing a number of operations in the at least one second operation until the second amount of the input data and the output data of the grouped operations does not exceed the capacity of the memory unit (see column 15, lines 27-30.  Each group of layers is tested against the storage capacity.  This is equivalent to incrementally increasing group size until an appropriate super-layer is determined). 
Referring to claim 6, Woo has taught the method of claim 1, wherein the portion of the input data of the grouped operations comprises one of N equal parts of the input data of the grouped operations and a marginal data, N being an integer of 2 or larger (any group of three or more inputs is made up of two or more equal parts plus marginal data.  For example, column 15, lines 12-13, give an example of 200 MB of inputs.  This includes two 90 MB portions and a marginal 20 MB portion.  This 200 MB can be arbitrarily divided up into any number of parts).
Referring to claim 7, Woo has taught the method of claim 6, (to process two 90 MB groups and a 20 MB group (N being 2), processing must occur at least two times (once for each 90 MB portion)).
Referring to claim 11, Woo has taught the method of claim 1, further comprising (e) obtaining a portion of the output data of the grouped operation, and assembling each portion of the output data of the grouped operation to obtain the output data of the grouped operation (see FIG.3.  The final output will be the output of layer E.  However, a portion of the output is generated in the 3rd to last column, and another portion of the output is generated in the last column.  Thus, at the last column, the full output is assembled).
Claim 12 is partly rejected for similar reasons as claim 1.  Woo has further taught a system of data processing, said system comprising:
a) one or more processors (see column 5, lines 18-33); and
b) a memory unit having instructions stored thereon which when executed by the one or more processors cause the one or more processors to carry out the claimed steps (see column 2, line 59, to column 3, line 2, and column 5, lines 62-65.  A program to cause all steps is stored in, and executed from, memory).
Claims 13, 16-17, and 19-20 are respectively rejected for similar reasons as claims 2, 6-7, 11-12.
Claim 14 is rejected for similar reasons as claims 3-4.

Response to Arguments
On page 15 of applicant’s response (hereafter “the response”), applicant firstly argues feature I, stating that the action appears to rely on a description of the problem solved by Alwani to reject an element of the solution recited by the pending claims.
The examiner does not understand the argument.  The problem Alwani recognizes is that there is too much intermediate data to fit on chip (thus requiring many external memory accesses for unfused CNNs).  Thus, Alwani fuses layers so that intermediate data fits in on-chip memory (e.g., from the last paragraph on page 1, “we demonstrate the ability to restructure the CNN evaluation such that intermediate data do not need to be shuffled on and off chip.”).  The examiner merely pointed out why Alwani wants to fuse.

On pages 15-16 of the response, applicant secondly argues feature I, stating that Alwani does not identify a first operation whose amount of input and output data exceeds a memory capacity because the intermediate data for each layer are too large to fit on chip.
The examiner does not understand the argument.  The intermediate data are too large to fit on chip prior to fusion.  If all intermediate data could fit on chip, there would be no need to fuse.  However, because the amount of input/output data is high for given layers, the layers are fused to reduce external data transfer.  Section II(B) explains how the input/output feature maps consume the majority of the memory in the early stages.  Thus, stage 2, for instance, is identified as exceeding on-chip memory with 12.3 MB of feature maps.  As such, this stage is identified as a candidate for fusion.

On page 16 of the response, applicant thirdly argues feature I, stating that, while Alwani selects a layer to fuse, Alwani does not identify the layer from a series of operations, i.e., selecting is not identifying.
The examiner sees no difference between the two.  All layers make up the series of operations.  Those that are selected for fusion are also identified for fusion.

On page 16 of the response, applicant fourthly argues feature I, stating that just a first layer must be read from DRAM to start the operations.
The examiner does not understand how this argument relates to the claim language.  Further, it is not just the first layer that reads.  From section II(B), the first layer reads in 0.6 MB of inputs.  It produces 12.3 MB of outputs.  These 12.3 MB of outputs are then read by the second layer.  Thus, the outputs are intermediate data.  Prior to fusion, the outputs (intermediate data) would be read from external memory.  Post-fusion, the outputs (intermediate data) would be read from on-chip storage.

On page 16 of the response, applicant finally argues feature I, stating that Alwani has not taught selecting layer 1 or 2 to be fused just because the amount of input and output data exceeds the capacity of a memory unit.
The examiner respectfully disagrees.  For what other reason are the layers fused?  The entire premise of Alwani is to stop transferring data that doesn’t fit on chip in an unfused CNN to external memory.

On page 17 of the response, applicant firstly argues feature II, stating that Alwani describes that the fusion of layer 2 with layer 1 still exceeds the capacity of the on-chip memory.
The examiner does not see where Alwani describes this.  Please point this out.  In the last paragraph on page 1, intermediate data does not need to be shuffled on/off chip.  Thus, fusing layers allows the intermediate data, which comprises both input and output data depending on layer perspective, to fit in on-chip memory.

On page 17 of the response, applicant secondly argues feature II, stating that Alwani does not describe any selection criteria for fusion.
The examiner respectfully disagrees.  The selection criterion is based on the amount of data.  If it does not fit on chip, then it is selected for fusion.  This is apparent to one of ordinary skill in the art reading Alwani.

On page 17 of the response, applicant thirdly argues feature II, stating that the fusion still causes the data to exceed the memory unit by 362 KB.
The examiner respectfully disagrees.  The on-chip memory includes this 362 KB.  Prior to fusion, the entire on-chip memory would be exceeded.  Post-fusion, the intermediate data fits entirely on-chip. 

On page 17 of the response, applicant fourthly argues feature II, stating that Alwani does not teach the claimed order of fusion.
The examiner respectfully disagrees.  If first and second layers are fused in Alwani, then this necessarily includes identifying layer 2, and selecting layer 1, which precedes layer 2.  It appears applicant may be reading the claim language too narrowly.

On page 18 of the response, applicant firstly argues feature III, stating that Alwani does not disclose that the input data of layer 2 is output/intermediate data of layer 1.
The examiner respectfully disagrees.  This is how a CNN works.  Layer 1 produces output, which is then input into layer 2, which in turn produces output for the next layer.  Thus, the data flowing into layers 2+ is intermediate data generated by the preceding layer.  Alwani explains this in section III with reference to FIG.3.  That is, 5x5 inputs are processed by layer 1 to produce 3x3 outputs.  These 3x3 outputs are then processed by layer 2 to produce 1x1 outputs.  The 3x3 data is thus intermediate data.
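For illustration only, the 5x5, 3x3, and 1x1 sizes discussed above can be reproduced by working backward from the final output.  The function names are hypothetical, and the 3x3 kernel with a stride of 1 is an assumption that is consistent with those sizes rather than a value stated in this action.

```python
def required_input_size(output_size, kernel, stride=1):
    """Side length of the input window needed to produce output_size outputs."""
    return (output_size - 1) * stride + kernel


def pyramid_sizes(final_output_size, layers):
    """Side lengths from the pyramid base to the tip.

    layers is a list of (kernel, stride) pairs ordered from first layer to last.
    """
    sizes = [final_output_size]
    for kernel, stride in reversed(layers):
        sizes.append(required_input_size(sizes[-1], kernel, stride))
    return list(reversed(sizes))


# Two fused layers, each assumed to use a 3x3 kernel with stride 1.
sizes = pyramid_sizes(1, [(3, 1), (3, 1)])  # [5, 3, 1]
```

The middle entry is the size of the intermediate data: it is the output of layer 1 and the input of layer 2.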

On page 18 of the response, applicant secondly argues feature III, stating that Alwani performs a backwards determination and that the output values have no logical relationship with the amount of intermediate data result.
The examiner again respectfully disagrees for reasons set forth above.  The outputs are intermediate values.  The pyramid determination is merely used to determine which operations to fuse to reduce data store requirements of the CNN.

On page 18 of the response, applicant thirdly argues feature III, stating that the amount of intermediate data result is determined by the intermediate data.
The examiner agrees, as explained above, but fails to see how this would preclude rejection.  Again, intermediate data makes up a portion of the overall inputs processed by a fused layer.

On pages 19-20 of the response, applicant firstly argues feature I, stating that Woo’s working set includes parameters in addition to inputs and outputs.
The examiner does not understand how this precludes rejection.  Parameters are inputs.

On page 20 of the response, applicant secondly argues feature I, stating that Woo does not describe layer B as having an amount of input and output data that exceeds a memory capacity, but instead describes layer B supporting only two batch elements.
The examiner respectfully disagrees.  The point of column 9, line 60, to column 10, line 3, is that B can include data that exceeds the memory capacity.  However, when layer B is not fused with another layer, this excess data cannot be processed because memory will be exceeded.  Thus, by fusing, layer B may be spread out so that an excess of 20 elements could be processed (where capacity is 20).  FIG.5 shows an example of this, where layer B processes many more than 20 total elements.  It can do so in this fused configuration, but could not in an unfused configuration such as that in FIG.2B.

On pages 20-21 of the response, applicant firstly argues feature II, stating that Woo takes into account intermediate data.
As described above, intermediate data is input/output data depending on the layer in question.  Also, Woo does not appear to mention intermediate data as part of a working set.  The working set includes inputs, outputs, and weights.  Of course, the inputs of a subsequent layer would be intermediate data generated by a previous layer.

On page 21 of the response, applicant secondly argues feature II, stating that Woo performs global scheduling or batch and layer dimensions.
The examiner does not understand the relevance of this argument to the claimed invention, nor how applicant is asserting that this teaching precludes rejection.

On page 21 of the response, applicant thirdly argues feature II, stating that Woo has not taught that the combination of layer A and B will not exceed capacity.
The examiner does not understand how applicant came to this conclusion.  FIG.3, for instance, for each batch element (0 and 1) shows that the total number of storage elements required per superlayer is under the capacity of 20.  For instance, layer ABC, when processing element 0, requires only 11 storage units in total (1 for A, 8 for B, and 2 for C), and each sub-layer individually requires fewer still.  FIG.5 shows that the combined layers do not require more than 14 storage elements for B3 (column 13, lines 8-21).
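For illustration only, the FIG.3 storage count discussed above may be checked as follows.  The per-layer unit counts (1, 8, and 2) and the capacity of 20 are taken from the discussion above; the variable and function names are hypothetical and do not appear in Woo.

```python
CAPACITY = 20  # memory capacity, in storage units, as discussed above

# Storage units required by each sub-layer of superlayer ABC for batch element 0.
working_set = {"A": 1, "B": 8, "C": 2}


def superlayer_storage(units: dict) -> int:
    """Total storage units required by all sub-layers of one superlayer."""
    return sum(units.values())


def fits_in_memory(units: dict, capacity: int) -> bool:
    """True when the combined requirement does not exceed the capacity."""
    return superlayer_storage(units) <= capacity


total = superlayer_storage(working_set)
within_capacity = fits_in_memory(working_set, CAPACITY)
```

The combined requirement is 1 + 8 + 2 = 11 storage units, which is under the capacity of 20.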

On page 21 of the response, applicant fourthly argues feature II, stating that a working set for B only includes 14 storage units, as opposed to 8 storage units.
The combining of layers reduces the storage requirements.  Applicant is asked to explain how this argument relates to the claimed invention.

On page 22 of the response, applicant argues feature II, stating that Woo does not describe the relationship between input data of the grouped operations and the amount of intermediate data result of the grouped operations.
Again, intermediate data makes up a portion of the overall inputs to be processed by the superlayer.  For instance, the first superlayer in FIG.3 processes inputs corresponding to sub-layer A, sub-layer B, and sub-layer C.  The inputs into sub-layers B and C are intermediate data and outputs of the preceding layer.

Examiner Note
The examiner again notes the two distinct rejections set forth above.  The first rejection (Alwani) is based on the rejection provided in the Extended European Search Report cited by applicant.  Modifications have been made where appropriate.  The second rejection (Woo) is based on the examiner’s search.  The examiner notes that because all claims are rejected under Alwani, the examiner has not fully addressed all dependent claims with respect to Woo.  Only those that could be quickly addressed, given time constraints, have been addressed at this time.  However, this is not an indication that these dependent claims are allowable over Woo.  Should applicant amend to overcome Alwani (in a manner other than simply writing a dependent claim in independent form), the examiner may reject the dependent claims with respect to Woo at that time, where possible.  Such dependent claim rejections would be necessitated by applicant's amendment to overcome Alwani.  The examiner is not required to make secondary rejections based on Woo at this point in time but has done so only to expedite prosecution.

Conclusion
The following prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Park, 2018/0032859, has taught a non-fused CNN (FIG.1) and a fused CNN (FIG.3) that makes use of temporary storage in an internal buffer.
Bokhari, 10,664,310, has taught scheduling layers of a CNN application based on memory constraints to minimize external memory access (see description of FIG.1, for instance).
All claims are either identical to or patentably indistinct from claims in the application prior to the entry of the submission under 37 CFR 1.114 (that is, restriction would not be proper) and all claims could have been finally rejected on the grounds and art of record in the next Office action if they had been entered in the application prior to entry under 37 CFR 1.114. Accordingly, THIS ACTION IS MADE FINAL even though it is a first action after the filing of a request for continued examination and the submission under 37 CFR 1.114.  See MPEP § 706.07(b). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to David J. Huisman whose telephone number is 571-272-4168.  The examiner can normally be reached on Monday-Friday, 9:00 am-5:30 pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta, can be reached at 571-270-3995.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/David J. Huisman/
Primary Examiner, Art Unit 2183