DETAILED ACTION
Claims 1-20 have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Specification
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.
The clean substitute disclosure submitted on October 22, 2021, is objected to because of the following informalities:
In paragraphs 120, 122, 123, and 124, insert a space between “(d)” and “comprises”.
In paragraph 145, applicant states “size of W x H (i.e., weight and height)”.  How is weight indicative of size in this context?  Does applicant mean --width and height-- instead?
Appropriate correction is required.

Drawings
All replacement FIGs submitted on October 22, 2021 (except for FIG.7), are objected to for failing to comply with 37 CFR 1.84(a)(1) and 37 CFR 1.84(l), which requires the drawings 
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will 

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 3-10 and 14-18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
The claims recite the following limitations for which there is a lack of antecedent basis:
In claim 3, “the amount of the input data”, which could refer to claim 1, line 5, or claim 1, line 8.  Applicant could claim this as a second amount in claims 1 and 3.
In claims 4-5, 7-9, 14-15, and 17-18, all instances of “the amount of the input data” for similar reasons.
In claim 5, line 11, both instances of “the input data”.  Which input data of claim 1 (line 2, 5, or 8)?
In claim 6, line 2, “the input data” for similar reasons.
In claim 8, 2nd to last line, “the input data” for similar reasons.
In claim 10, line 4, “the input data” for similar reasons.
In claim 15, line 11, “the input data” for similar reasons.
In claim 16, line 2, “the input data” for similar reasons.
In claim 18, lines 5 and 14, both instances of “the input data” for similar reasons.
Claims 7 and 17 are indefinite because the alternatives and formatting are presented in such a way as to be overly expansive and to bring into question the metes and bounds of the claim.  For example, is applicant trying to claim (c) comprises determining the at least one second operation such that the value of N is minimal as an alternative to (d) is performed at least N times, or is applicant trying to claim that the value of N is minimal is an alternative to (d) is performed at least N times?  The same applies to the third alternative in lines 4-5.  In other words, does each alternative fill in for “the value of N is minimal”?  Or does any alternative fully replace everything after “wherein” in lines 1-2?  Though applicant attempts to clarify in the remarks, the claim must be amended to be clear.  See MPEP 2173.05(h).
Claims 4-5, 7, 10, and 17 are rejected for being indefinite due to dependency on an indefinite claim.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-3, 6-13, and 16-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Alwani et al., “Fused-Layer CNN Accelerators”, 2016 (as cited by applicant and herein referred to as Alwani).
Referring to claim 1, Alwani has taught a method of data processing, said method comprising:
(a) receiving an input data to be processed by a series of sequentially performed operations (see FIG.2 and page 2, section II (up to subsection A).  A CNN receives input data for processing by a series of sequential layers/operations);
(b) identifying a first operation from the series of operations, wherein the first operation has an amount of an input data and an output data exceeding a capacity of a memory unit (from the abstract, Alwani fuses the first five operations/layers (e.g. those in FIG.2), including a first operation (layer 2).  From the paragraph below FIG.2, each layer includes an amount of input and output data.  The reason layer 2 is considered for fusion with others is because the associated input and output data (the former coming from layer 1 and the latter being sent to layer 3) is “too large to fit on chip” (abstract).  Thus, costly external memory accesses must normally be made.  Alwani recognizes that fusion with an adjacent layer would reduce capacity requirements.  Note that if on-chip memory were large enough to accommodate all input and output data for any given layer, there would be no reason to fuse as no external data accesses would be made for intermediate data.  As such, it is because of smaller on-chip capacity and large amounts of data that fusion is considered);
(c) selecting at least one second operation from the series of operations to be grouped with the first operation based at least in part on an amount of an input data and an output data of the grouped operations and the capacity of the memory unit, wherein the at least one second operation comprises an operation from the series of operations which is immediately preceding the first operation (again, based on the capacity being exceeded, and ; and
(d) processing a portion of the input data of the grouped operations, wherein the portion of the input data of the grouped operations is determined based at least in part on an amount of an intermediate data result of the grouped operations (the input data of layer 2 (which is part of the input data of the grouped operations) is output/intermediate data produced (determined) by layer 1 (see abstract; p.1, right column, 1st full paragraph; and section A spanning pages 3-4)).
Referring to claim 2, Alwani has taught the method of claim 1, wherein an amount of the output data of the first operation is determined based on an amount of the input data of the first operation and one or more parameters of the first operation (see page 3, left column, which inherently generates some amount of output data based on 12.3 MB of input and 144 KB of parameters (weights)).
*NOTE: Struck through claim language is alternative in nature and not addressed at this time with respect to the prior art.
Referring to claim 3, Alwani has taught the method of claim 1, wherein (c) comprises determining whether the amount of the input data and the output data of the grouped operations exceeds the capacity of the memory unit (this is the reason fusion is performed - because the memory unit capacity is exceeded, more off-chip transfers need to be made.  Thus, one or more additional layers are selected for fusion to try to reduce these expensive transfers.  From page 8, section B, and FIG.7, all fusion permutations are considered.  If fusing two layers still exceeds the capacity of the on-chip memory, fusion with a third layer may be considered).
Referring to claim 6, Alwani has taught the method of claim 1, wherein the portion of the input data of the grouped operations comprises one of N equal parts of the input data and a marginal data, N being an integer of 2 or larger (see FIG.3.  Each pyramid is the same size.  The marginal data could be the weights.  Or, the overlapping portions (6M blue circles) may be considered the input data and the non-overlapping could be the marginal data (see section B on page 4).  As the window shifts through the data, the overlapping amounts are the same size).
Referring to claim 7, Alwani has taught the method of claim 6, wherein (in FIG.3, there are two equal parts/pyramids, and the processing is performed twice.  The 3rd pyramid will require a 3rd processing; a 4th pyramid will require a 4th processing, and so on).
Referring to claim 8, Alwani has taught the method of claim 1, wherein (c) further comprises storing the output data of the grouped operations in an external memory when (1) a number of operations in the grouped operations is equal to a number of operations in the series of operations (note, from page 8, section B, that one possible CNN configuration includes fusing all layers, denoted by “(3)” in a 3-layer CNN), and (2) the amount of the input data and the output data of the grouped operations exceeds the capacity of the memory unit (see page 3, last couple sentences before section A.  “These last output feature maps are either written to off-chip memory or simply retained in an on-chip memory (if they are small enough).”  This means that in any system with an on-chip memory that is too small to store the output data, said output data is sent off-chip), and (d) comprises storing the input data and the intermediate data result of the grouped operations in the memory unit (see page 3, last few sentences before section A.  Intermediate data is stored in the on-chip memory.  Also, from the last paragraph on page 3, input values are stored in on-chip buffers/memory).
Referring to claim 9, Alwani has taught the method of claim 1, wherein (c) further comprises storing the input data of the grouped operations in an external memory (per the abstract, input data is brought on chip for processing, which means it is stored off-chip first) when (1) a number of operations in the grouped operations is equal to a number of operations in the series of operations (again, note, from page 8, section B, that any number of layers may be fused, which includes fusing all layers, denoted by “(3)” in a 3-layer CNN), and (2) the amount of the input data and the output data of the grouped operations exceeds the capacity of the memory unit (see the last sentence before section A on page 3.  The output may be too large for on-chip memory, which would mean that input and output exceed memory capacity.  When conditions (1) and (2) exist, and even when they don’t exist, the claimed storing in an external memory still occurs.  That is, this storing always occurs, when (1) and (2) occur, and when (1) or (2) don’t occur.  The examiner notes that the claim is not limited to storing in response to these two conditions occurring, i.e., that the occurrence of (1) and (2) cause the storing).
Referring to claim 10, Alwani has taught the method of claim 9, wherein (d) comprises receiving the portion of the input data of the grouped operations from the external memory (again, see the abstract.  All input data comes from an off-chip memory when it is brought on , or storing the intermediate data result and the output data of the grouped operations in the memory unit (see the last few sentences before section A on page 3.  All intermediate data is stored in the on-chip memory unit, as is the output (if it is small enough to fit)), or storing the input data, the intermediate data result and the output data of the grouped operations in the memory unit (see the abstract and the last few sentences before section A on page 3.  All input data is brought into the on-chip memory unit, intermediate data is stored in the on-chip memory unit, and the output is stored in the on-chip memory unit (if it is small enough to fit)). 
Referring to claim 11, Alwani has taught the method of claim 1, further comprising (e) obtaining a portion of the output data of the grouped operation, and assembling each portion of the output data of the grouped operation to obtain the output data of the grouped operation (see FIG.3 and the three subsequent paragraphs.  One pyramid is processed at a time, which serves to generate one output at a time.  When all pyramids are complete, the entire output has been assembled).
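For illustration only, the portion-by-portion processing and assembly discussed with respect to claims 6 and 11 can be sketched as follows.  This is a minimal sketch under stated assumptions and is not taken from Alwani or from applicant's disclosure: the function names conv1d and tiled_conv1d are hypothetical, the data is one-dimensional, and a single sliding-window operation stands in for the grouped operations.  Each portion of the input carries a small overlap (marginal data) with its neighbor, and the per-portion outputs are concatenated to obtain the full output.

```python
def conv1d(x, k):
    """Plain sliding-window sum of products over the whole input (no padding)."""
    n = len(k)
    return [sum(x[i + j] * k[j] for j in range(n)) for i in range(len(x) - n + 1)]


def tiled_conv1d(x, k, parts):
    """Hypothetical helper: compute the same result one input portion at a time.

    Each portion is extended by len(k) - 1 overlapping elements (the marginal
    data) so that its outputs match the corresponding outputs of conv1d.
    """
    halo = len(k) - 1                 # overlap needed between adjacent portions
    out_len = len(x) - halo           # total number of outputs
    step = -(-out_len // parts)       # ceiling division: outputs per portion
    out = []
    for start in range(0, out_len, step):
        stop = min(start + step, out_len)
        portion = x[start:stop + halo]    # equal part plus marginal data
        out.extend(conv1d(portion, k))    # assemble the output portion by portion
    return out


x = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
k = [1, 2, 1]
assert tiled_conv1d(x, k, parts=2) == conv1d(x, k)
```

In this sketch, only one portion of the input needs to be held at a time, while the assembled result is identical to processing the entire input at once.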
Claim 12 is partly rejected for similar reasons as claim 1.  Alwani has further taught a system of data processing, said system comprising:
a) one or more processors (see the paragraph above section B on page 6.  The CNN is run on one or more DSP slices, which means there is inherently at least one digital signal processor (DSP)); and
b) a memory unit having instructions stored thereon which when executed by the one or more processors cause the one or more processors to carry out the claimed steps (a DSP executes instructions which are inherently stored in a memory unit).  Step (a) is performed by sending data to one or more layers of the CNN for processing.  Step (b) is performed when the 
Claims 13, 16-17, and 19-20 are respectively rejected for similar reasons as claims 2, 6-7, 11-12.
Claim 18 is rejected for similar reasons as claims 8-10.

Claim Rejections - 35 USC § 103
Claims 4-5 and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Alwani in view of the examiner’s taking of Official Notice.
Referring to claim 4, Alwani has taught the method of claim 3, wherein (c) comprises incrementally increasing a number of operations in the at least one second operation (see section B spanning pages 8-9, and FIG.7.  This shows trying all fusion permutations).  While Alwani has not taught the incremental increasing until the amount of the input data and the output data of the grouped operations does not exceed the capacity of the memory unit, the examiner does note that no intermediate data is stored off-chip (page 3, last few sentences before section A).  Alwani notes that different fusions require different amounts of on-chip memory to avoid off-chip transfer of inputs/outputs (intermediate data between layers which is an output of the Mth layer and an input of the (M+1)th layer (from page 5, fusing two layers “requires 55.86 KB of additional on-chip storage…”, and, from page 3, fusing five layers is “at the cost of only 362 KB of extra on-chip storage”, etc.).  FIG.7 shows how much extra memory one needs to avoid off-chip transfers for given layer combinations.  Alwani, from page 8, section A, generally tries to find an ideal hardware cost (in additional memory) given an analyzed CNN.  However, until the amount of the input data and the output data of the grouped operations does not exceed the capacity of the memory unit). 
Referring to claim 5, Alwani, as modified, has taught the method of claim 4, wherein (c) further comprises storing the output data of the grouped operations in an external memory when (1) a number of operations in the grouped operations is equal to a number of operations in the series of operations (note, from page 8, section B, that one possible CNN configuration includes fusing all layers, denoted by “(3)” in a 3-layer CNN), and (2) the amount of the input data and the output data of the grouped operations exceeds the capacity of the memory unit (see page 3, last couple sentences before section A.  “These last output feature maps are either written to off-chip memory or simply retained in an on-chip memory (if they are small enough).”  This means that in any system with an on-chip memory that is too small to store the output data, said output data is sent off-chip).
Claim 14 is rejected for similar reasons as claims 3-4.
Claim 15 is rejected for similar reasons as claim 5.

---------------------------------------------------------------------------------------------------------------------

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-4, 6-7, 11-14, 16-17, and 19-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Woo, U.S. Patent No. 10,019,668.
Referring to claim 1, Woo has taught a method of data processing, said method comprising:
(a) receiving an input data to be processed by a series of sequentially performed operations (see FIGs.2A-B, for instance, and the description thereof.  A neural network may have sequential layers (i.e., series of operations A, B, C, D, E), which receive input data for processing);
(b) identifying a first operation from the series of operations, wherein the first operation has an amount of an input data and an output data exceeding a capacity of a memory unit (see column 9, line 60, to column 10, line 14.  The system identifies when a particular layer’s working set will exceed the on-chip memory capacity.  The working set includes the number of inputs and outputs (column 7, lines 44-53).  For instance, first operation (layer B) is identified as one that cannot perform three batches (working set) due to limited on-chip memory);
(c) selecting at least one second operation from the series of operations to be grouped with the first operation based at least in part on an amount of an input data and an output data of the grouped operations and the capacity of the memory unit, wherein the at least one second operation comprises an operation from the series of operations which is immediately preceding the first operation (see column 14, line 18, to column 15, line 36.  Taking into account the memory capacity, if a layer’s working set will exceed that capacity, the layer is combined with another layer to form a super-layer that operates on just a portion of that working set so as to not exceed capacity.  For instance, from FIG.3, it is recognized that performing layer B on the entire working set at once will exceed capacity.  Thus, it is split and combined with layer A, which immediately precedes B, such that any given super-layer will not exceed capacity); and
(d) processing a portion of the input data of the grouped operations, wherein the portion of the input data of the grouped operations is determined based at least in part on an amount of an intermediate data result of the grouped operations (again, see FIG.3.  Each super-layer .
Referring to claim 2, Woo has taught the method of claim 1, wherein an amount of the output data of the first operation is determined based on an amount of the input data of the first operation and one or more parameters of the first operation (see column 4, lines 13-15.  An amount of output is generated based on input and parameters).
*NOTE: Struck through claim language is alternative in nature and not addressed at this time with respect to the prior art.
Referring to claim 3, Woo has taught the method of claim 1, wherein (c) comprises determining whether the amount of the input data and the output data of the grouped operations exceeds the capacity of the memory unit (again, see column 9, line 60, to column 10, line 14, and column 14, line 18, to column 15, line 36.  For instance, if layer B by itself is going to exceed capacity, the system will determine if superlayer A+B+C will exceed capacity, and, if not, it creates the superlayer).
Referring to claim 4, Woo has taught the method of claim 3, wherein (c) comprises incrementally increasing a number of operations in the at least one second operation until the amount of the input data and the output data of the grouped operations does not exceed the capacity of the memory unit (see column 15, lines 27-30.  Each group of layers is tested against the storage capacity.  This is equivalent to incrementally increasing group size until an appropriate super-layer is determined). 
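For illustration only, the incremental grouping recited in claim 4 can be sketched as follows.  This is a minimal sketch under stated assumptions and is not Woo's algorithm or applicant's disclosed implementation: the function name grow_group and the example sizes are hypothetical, and the amount attributed to a group is assumed to be the input of its earliest operation plus the output of the first operation (the last operation in the group), with intermediate data not counted.

```python
def grow_group(sizes, first, capacity):
    """Hypothetical helper: add immediately preceding operations one at a time.

    sizes[i] is an (input_amount, output_amount) pair for operation i.
    Returns the indices of the grouped operations, or None if even a group
    reaching back to operation 0 still exceeds the capacity.
    """
    start = first
    while True:
        group_in = sizes[start][0]     # input of the earliest grouped operation
        group_out = sizes[first][1]    # output of the first operation (group end)
        if group_in + group_out <= capacity:
            return list(range(start, first + 1))
        if start == 0:
            return None                # every operation is grouped; still too large
        start -= 1                     # group one more preceding operation


# Assumed example: operation 1 alone needs 20 + 8 = 28 units, above capacity 16.
sizes = [(4, 20), (20, 8)]
group = grow_group(sizes, first=1, capacity=16)
assert group == [0, 1]                 # grouping with operation 0 needs 4 + 8 = 12
```

The None result in this sketch corresponds to the situation in which all operations of the series are grouped and the capacity is still exceeded.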
Referring to claim 6, Woo has taught the method of claim 1, wherein the portion of the input data of the grouped operations comprises one of N equal parts of the input data and a marginal data, N being an integer of 2 or larger (any group of three or more inputs is made up of two or more equal parts plus marginal data.  For example, column 15, lines 12-13, give an example of 200 MB of inputs.  This includes two 90 MB portions and a marginal 10 MB portion.  This 200 MB can be arbitrarily divided up into any number of parts).
Referring to claim 7, Woo has taught the method of claim 6, wherein (to process two 90 MB groups and a 10 MB group (N being 2), processing must occur at least two times (once for each 90 MB portion)).
Referring to claim 11, Woo has taught the method of claim 1, further comprising (e) obtaining a portion of the output data of the grouped operation, and assembling each portion of the output data of the grouped operation to obtain the output data of the grouped operation (see FIG.3.  The final output will be the output of layer E.  However, a portion of the output is generated in the 3rd to last column, and another portion of the output is generated in the last column.  Thus, at the last column, the full output is assembled).
Claim 12 is partly rejected for similar reasons as claim 1.  Woo has further taught a system of data processing, said system comprising:
a) one or more processors (see column 5, lines 18-33); and
b) a memory unit having instructions stored thereon which when executed by the one or more processors cause the one or more processors to carry out the claimed steps (see column 2, line 59, to column 3, line 2, and column 5, lines 62-65.  A program to cause all steps is stored in, and executed from, memory).
Claims 13, 16-17, and 19-20 are respectively rejected for similar reasons as claims 2, 6-7, 11-12.
Claim 14 is rejected for similar reasons as claims 3-4.

Response to Arguments
On page 11 of the response, applicant argues that the claims are definite because “the amount of the input data and the output data of the grouped operations” should be considered as a whole.
While the examiner agrees that this is one interpretation of this phrase, it is not the only interpretation.  In claim 3, for instance, while the output data is clearly of the grouped operations, the amount of the input data may or may not be.  That is, the amount of the input data may refer to that of the grouped operations or to the amount of claim 1, step (b).  Hence, it is unclear which is being referred to, and amendment is needed for clarification.

Applicant similarly argues that there is proper antecedent basis for “the input data” in various claims.
The examiner respectfully disagrees for similar reasons as above.  For instance, in claim 5, line 11, “the input data” is not necessarily “of the grouped operations”, which means it could refer to one of multiple instances of “input data” in claim 1.

On pages 11-12 of applicant’s response, applicant provides clarification of the alternatives in claims 7 and 17.
While the examiner appreciates clarification, the claims themselves must be clear and, thus, must be amended to clearly cover what is intended.

On pages 14-15 of the response, applicant argues that Alwani fuses downwards while the invention fuses upwards.
The examiner notes that there is no practical distinction between the two as claimed.  That is, while one can say that it is decided to fuse layer 2 with layer 1 to reduce memory requirements, this is no different than fusing layer 1 with layer 2 to reduce memory requirements.  In Alwani, layer 2 has the highest storage requirements (FIG.2), and Alwani fuses layer 2 with layer 1 to reduce memory requirements (of both layers).

On page 16 of the response, applicant argues that Alwani does not teach the same technical solution to the same technical problem.
The examiner respectfully disagrees.  Alwani fuses multiple layers so as to not exceed on-chip memory.  The idea is that off-chip memory accesses are costly.  Thus, layers are fused to reduce such accesses by keeping intermediate data on chip.

On page 18 of the response, applicant appears to conclude that Woo has not taught sequentially performed operations where a first layer is fused with an immediately preceding operation.
The examiner disagrees.  The idea in Woo is to split up batches so as to reduce the size of working sets so that they don’t exceed memory capacity.  This is discussed in reference to FIGs.2B and 5, wherein in FIG.2B, Woo contemplates executing a 3rd batch in layer B, but can’t because there are only 16 storage units, which is not enough storage to handle three batches requiring 8 storage units each.  Thus, the layer is capped at two batches (column 9, line 60, to column 10, line 3).  However, if B is fused with previous layer/operation A, which immediately precedes B, the batches are split up.  For instance, see FIG.5.  Not only can more batches be handled (four, in this example), but the maximum number of storage units required is reduced to 14 (B3), which is less than the capacity of 16.  As such, no off-chip transfer is required (e.g. see column 13, lines 8-21).  As such, because operation B has too much data (when adding up all batches), it is fused with immediately preceding operation A so as to split the batches and reduce on-chip storage requirements.
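For illustration only, the storage arithmetic relied upon above can be checked with the following minimal sketch.  The figures (16 storage units of capacity, 8 storage units per batch, and a peak of 14 storage units after fusion) are those cited above from Woo; the function names max_batches and fits are hypothetical and introduced solely for this example.

```python
def max_batches(capacity_units, units_per_batch):
    """Number of whole batches whose working sets fit in on-chip storage."""
    return capacity_units // units_per_batch


def fits(required_units, capacity_units):
    """True when the required storage does not exceed the on-chip capacity."""
    return required_units <= capacity_units


CAPACITY = 16          # storage units available on chip
UNITS_PER_BATCH = 8    # storage units needed per batch in layer B

assert not fits(3 * UNITS_PER_BATCH, CAPACITY)       # three batches need 24 units
assert max_batches(CAPACITY, UNITS_PER_BATCH) == 2   # layer B is capped at two
assert fits(14, CAPACITY)                            # fused peak of 14 units fits
```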

Examiner Note
The examiner notes the two distinct rejections set forth above.  The first rejection (Alwani) is based on the rejection provided in the Extended European Search Report cited by applicant.  Modifications have been made where appropriate.  The second rejection (Woo) is based on the examiner’s search.  The examiner notes that because the full set of claims are rejected under Alwani, the examiner has not fully addressed all dependent claims with respect to Woo.  Only those that could be quickly addressed, given time constraints, have been addressed at 

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to David J. Huisman whose telephone number is 571-272-4168.  The examiner can normally be reached on Monday-Friday, 9:00 am-5:30 pm.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/David J. Huisman/Primary Examiner, Art Unit 2183