DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
Claims 1, 13, and 17 were amended. Claims 1-20 are pending.
Applicant’s amendment overcomes the previous grounds of rejection of claims 1-20 under 35 USC 112; however, Applicant’s amendment necessitated the new grounds of rejection under 35 USC 112 presented herein.
The previous grounds of rejection under 35 USC 103 are maintained. See response to arguments.

Response to Arguments
Applicant’s arguments filed 11/26/2021 have been fully considered, but are not persuasive.

Applicant argues, see especially page 11, that Tucker and Becher fail to teach or suggest “identifying a number of intermediate data layers needed during execution of the neural network at the time-based processing slice and defining the minimum number of data storage portions to equal a largest number of identified numbers of intermediate data layers needed during execution of the neural network at the time-based processing slices”. 
Applicant argues that the subgraphs taught by Tucker are not analogous to the claimed time-based processing slices. Examiner believes this point to be moot. Tucker is relied upon to teach breaking a neural network up into a computational graph consisting of tasks (i.e., subgraphs) to be performed and dependencies between these tasks. In the combination with Becher, these tasks may be part of a time-based processing slice, but are not themselves interpreted as time-based processing slices.
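For illustration only, the relationship between a dependency graph of tasks and time-based processing slices can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not drawn from Tucker, Becher, or the instant application: the function name `time_slices`, the dictionary representation of dependencies, and the rule of placing each task in the earliest slice after all of its prerequisites are hypothetical.

```python
from collections import defaultdict


def time_slices(dependencies):
    """Group tasks into ordered slices based on their dependencies.

    `dependencies` maps each task name to the list of tasks it depends on.
    The graph is assumed to be acyclic. Each task lands in the earliest
    slice that follows every one of its prerequisites (an illustrative
    rule, not one prescribed by either reference).
    """
    level = {}

    def depth(task):
        # Depth 0 means no prerequisites; otherwise one more than the
        # deepest prerequisite.
        if task not in level:
            prereqs = dependencies.get(task, [])
            level[task] = 1 + max((depth(p) for p in prereqs), default=-1)
        return level[task]

    for task in dependencies:
        depth(task)

    slices = defaultdict(list)
    for task, d in level.items():
        slices[d].append(task)
    return [sorted(slices[i]) for i in range(len(slices))]


# Hypothetical network: B and C both consume A; D consumes B and C.
example = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
slices = time_slices(example)
print(slices)
```

In this sketch the tasks B and C share a slice because neither depends on the other, which mirrors the point above that tasks may be part of a slice without themselves being slices.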

Applicant further argues, see especially page 11, that 
Moreover, Becher fails to obviate the deficiencies of the Tucker. Instead, Becher merely discloses identifying tasks to be processed in parallel tasks based upon if there are no interdependencies between two tasks and if there are processing threads available in the storage area network. See Becher Col. 3, lines 15-26. Becher attempts to utilize all available processing threads to run tasks in parallel. See Becher Col. 7, lines 45-51. That is, Becher does not identify the minimum number of allocations that is possible (e.g., based upon the largest number of portions needed at a given time slice). Furthermore, Becher assigns threads from a given number of available processing threads in a storage area network rather than defining a minimum number of the processing threads that are needed for execution of the neural network. See Becher Col. 3, lines 22-23. (Arguments filed 11/26/2021, page 11)

Examiner respectfully disagrees. Becher does identify the smallest number of processing threads. It is the number of processing threads to which a task is assigned. As to Applicant’s argument that Becher assigns threads from a given number of available processing threads, Examiner agrees that the minimum number of threads defined by Becher may, at least in some examples, be selected from a potentially larger number of threads available. However, the claim does not rule out this possibility.

Applicant further argues, see especially page 12, that
Tucker and Becher, taken alone or in hypothetical combination, fail to teach or suggest assigning each of the intermediate data layers to one of the minimum number of data storage portions, as generally recited by independent claims 1, 13, and 17. Independent claim 1 recites, inter alia, “assigning each of the intermediate data layers to one of the minimum number of data storage portions.”…Independent claims 13 and 17 recite similar recitations. Tucker and Becher, taken alone or in hypothetical combination, fail to teach or suggest “assigning each of the intermediate data layers to one of the minimum number of data storage portions,” as recited by independent claims 1, 13, and 17. Emphasis added. In contrast, Tucker merely discloses dividing a computational graph into subgraphs for processing on different devices. See Tucker paragraph [0041]-[0045]. Moreover, Becher fails to obviate the deficiencies of the Tucker. In sharp contrast, Becher merely discloses allocating available threads to tasks. See Becher Col 3, lines 13-21. That is, Becher merely assigns threads from a given number of available processing threads in a storage area network rather than from a defined minimum amount of processing threads that must be available for execution of the neural network. See Becher Col. 3, lines 22-23. (Arguments filed 11/26/2021, page 12)

Examiner respectfully disagrees. As is described in more detail in the rejection, the threads taught by Becher correspond to a portion of memory for executing that thread.

The rejection under 35 USC 103 is maintained.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 9-11, and 13-20 are rejected under 35 U.S.C. 103 as being unpatentable over “Tucker” (US 2017/0124452 A1) in view of “Becher” (US 8,387,066 B1).

Regarding claim 1, Tucker teaches
A method comprising: accessing a data processing architecture associated with a neural network to determine dependencies between intermediate data layers of the neural network, the intermediate data layers comprising interconnected nodes of the neural network; (Abstract 
obtaining dimensions of the intermediate data layers in the neural network; ([0045] describes determining a dimension of a tensor along an edge. That is, it determines the dimension of the output of the starting node.)
[representing the neural network as a collection of tasks including dependencies between the tasks and system resources required to perform each task, wherein the tasks may correspond to layers of the neural network, see discussion regarding the combination of references below] wherein each data storage portion comprises a memory location configured to store, during execution of the neural network, at least one of the intermediate data layers based on the dependencies; assigning each of the intermediate data layers to one of a plurality of data storage portions (Figure 2, elements 208 and 210, described at [0041-0045], show partitioning the graph representing the network into subgraphs and assigning these to processing devices. The computing devices include memory (i.e., data storage portions), so the determination of the subgraphs represents a determination of data storage portions for executing the neural network. Each subgraph represents a data storage portion because a particular device/memory will be used to process that portion. [0050] clarifies that this may correspond to a memory location which is accessed during execution of the neural network. An assignment of a layer to a subgraph then corresponds to an assignment of that layer to a data storage portion. As a device may process a subgraph comprising a neural network layer, it is configured to store at least one of the intermediate data layers. Tucker abstracts the various tasks of a neural network (e.g., layer execution as described above) into nodes (i.e., tasks in the language of Becher described below).)
…determining a memory allocation size for each respective data storage portion of the data storage portions based on the dimensions and dependencies; ([0045] indicates that the dimension of a tensor on each directed edge to and from each node of a subgraph is determined to determine a size of memory necessary to perform the operation. Since this looks at dimensions across the subgraph and the subgraph is based on the dependencies, the determination is based on both the dimensions and dependencies.)
allocating memory on a storage device for each data storage portion in accordance with its respective determined memory allocation size. ([0045] describes assigning each subgraph to a device that has memory capable of storing the largest tensor. Assigning a subgraph to a device allocates that device’s memory (i.e., storage portion) to the subgraph.)
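For illustration only, the sizing and device selection attributed to Tucker at [0045] can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not Tucker's implementation: the helper names `tensor_bytes`, `largest_tensor_bytes`, and `pick_device`, the 4-byte element size, the example shapes and device capacities, and the choice of the smallest sufficient device are all hypothetical.

```python
from math import prod


def tensor_bytes(shape, bytes_per_element=4):
    """Memory needed for one tensor, given its dimensions."""
    return prod(shape) * bytes_per_element


def largest_tensor_bytes(edge_shapes):
    """Largest tensor on any directed edge to or from the subgraph's nodes."""
    return max(tensor_bytes(shape) for shape in edge_shapes)


def pick_device(edge_shapes, device_memory):
    """Return a device whose memory can store the largest tensor.

    Picking the smallest sufficient device is an assumption made here
    for the example; None is returned if no device is large enough.
    """
    need = largest_tensor_bytes(edge_shapes)
    candidates = {name: mem for name, mem in device_memory.items() if mem >= need}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Hypothetical tensor shapes on the edges of one subgraph.
edge_shapes = [(1, 224, 224, 3), (1, 112, 112, 64)]
# Hypothetical device memory capacities in bytes.
device_memory = {"dev0": 1_000_000, "dev1": 4_000_000, "dev2": 16_000_000}

largest = largest_tensor_bytes(edge_shapes)
chosen = pick_device(edge_shapes, device_memory)
print(largest, chosen)
```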
Tucker does not appear to explicitly teach
calculating a minimum number of data storage portions for executing the neural network based on the dependencies, by:
segmenting the neural network into time-based processing slices;
for each time-based processing slice, identifying a number of intermediate data layers needed during execution of the neural network at the time-based processing slice; and
defining the minimum number of data storage portions to equal a largest number of the identified numbers of intermediate data layers needed during execution of the neural network at the time-based processing slices, 
…assigning each of the intermediate data layers to one of the minimum number of data storage portions;
However, Becher—modifying Tucker in the manner described below—teaches
calculating a minimum number of data storage portions for executing the neural network based on the dependencies, by: (Becher, Abstract describes scheduling and executing a plurality of tasks based on preconditions (i.e., dependencies.) Becher, Column 3, lines 34-41 indicate that the tasks may be represented as a graph showing dependencies. In the combination with Tucker, it is understood that Tucker teaches representing a neural network as a graph which represents the neural network as a collection of tasks related by dependencies. Once the neural network is represented in this way, the 
segmenting the neural network into time-based processing slices; for each time-based processing slice, identifying a number of intermediate data layers needed during execution of the neural network at the time-based processing slice; and (Becher, column 3, lines 15-26 describes identifying tasks (i.e., layers of a neural network in the combination with Tucker) to be executed for each time period or step. The determination of which tasks are to be performed during which period is a segmentation of the job (i.e., all of the tasks) into time-based processing slices and an identification of which tasks belong to each slice. Figure 1, element 160, described at column 6 lines 27-39 shows the scheduling data structure. Tasks are assigned to both time periods and processing threads. The minimum number of processing threads is the number of threads which have a task assigned, which is the number of tasks which are running concurrently at the time slice with the largest number of tasks. Compare with Figure 4 of the instant application.)
defining the minimum number of data storage portions to equal a largest number of the identified numbers of intermediate data layers needed during execution of the neural network at the time-based processing slices, … assigning each of the intermediate data layers to one of the minimum number of data storage portions; (Becher, column 3, lines 15-26 describes identifying tasks (i.e., layers of a neural network in the combination with Tucker) to be executed for each time period or step. Becher, column 3, lines 16-33 indicate that a minimum number of threads is determined which respects the dependencies and times at which the tasks are to be performed. Becher, Column 3, lines 8-14 indicate that each thread may use a portion of the storage area during execution of the tasks. That is, Becher determines which tasks belong to which time step (i.e., time slice) and then determines a minimum number of threads to be used in executing the tasks. Each thread requires memory in the storage.)
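For illustration only, the computation recited in these limitations (count the intermediate data layers needed at each time-based processing slice, take the largest count as the minimum number of data storage portions, then assign each layer to one of those portions) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the names `min_storage_portions` and `assign_portions`, the representation of each layer by a first and last slice in which it is needed, and the first-fit assignment order are hypothetical and are not taken from Tucker, Becher, or the instant application.

```python
def min_storage_portions(lifetimes):
    """Return (minimum number of portions, per-slice counts).

    `lifetimes` maps each layer to (first_slice, last_slice), inclusive,
    i.e. the slices during which the layer must be held in storage.
    """
    last = max(end for _, end in lifetimes.values())
    counts = [
        sum(1 for start, end in lifetimes.values() if start <= t <= end)
        for t in range(last + 1)
    ]
    return max(counts), counts


def assign_portions(lifetimes):
    """Assign each layer to a portion, reusing a portion once it is free."""
    free_at = []  # free_at[p] is the first slice at which portion p is free
    assignment = {}
    ordered = sorted(lifetimes.items(), key=lambda item: item[1][0])
    for layer, (start, end) in ordered:
        for p, free in enumerate(free_at):
            if free <= start:
                break  # portion p is no longer needed by an earlier layer
        else:
            free_at.append(0)  # no free portion: open a new one
            p = len(free_at) - 1
        free_at[p] = end + 1
        assignment[layer] = p
    return assignment


# Hypothetical lifetimes for five intermediate data layers.
lifetimes = {"L0": (0, 1), "L1": (1, 2), "L2": (1, 3), "L3": (2, 3), "L4": (3, 4)}
minimum, counts = min_storage_portions(lifetimes)
assignment = assign_portions(lifetimes)
print(minimum, counts, assignment)
```

In this example at most three layers are needed in any one slice, and the first-fit assignment uses exactly three portions, with L3 and L4 reusing portions released by L0 and L1.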

It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify Tucker to calculate a minimum number as taught by Becher because the scheduling system of Becher may minimize resource usage in the processing of the tasks as described by Becher at column 3, lines 7-15. The minimization of resource usage described by Becher does not rely on Becher processing any particular type of task, so appears to apply equally to neural network computation.

Regarding claim 9, the rejection of claim 1 is incorporated herein. Furthermore, Tucker teaches
wherein accessing a data processing architecture associated with the neural network comprises: accessing metadata of the neural network that identifies the intermediate data layers and data dependencies between the intermediate data layers. (Abstract describes methods/systems/apparatus for processing a computational graph. [0005] indicates that this may be for processing neural networks. Figure 2, described at [0037-0047], shows an overview of a method for processing computational graphs. In particular, step 204, described at [0039], shows receiving the computational graph. Figure 3, described at [0048-0055], provides an example of a computational graph. It shows dependencies between nodes of the graph. The nodes are described at [0020-0022]. In particular, [0021] indicates that a node may represent a layer of the neural network. Since the inputs/outputs may be multidimensional, this indicates that the layer comprises a plurality of neurons.)

Regarding claim 10, the rejection of claim 9 is incorporated herein. Furthermore, Tucker teaches
wherein the dimensions of the intermediate data layers are stored in the metadata. ([0045, 0052] indicate that the dimensions of the tensors are associated with the directed edges in the graph.)

Regarding claim 11, the rejection of claim 1 is incorporated herein. Furthermore, Tucker teaches
wherein the determined memory allocation size for each respective data storage portion is stored as memory allocation configuration data associated with an executing architecture. ([0045] describes determining a memory size for each subgraph (which corresponds to a portion as described above). As the method is computer-implemented, this data is necessarily stored. Since the data represents memory allocation data associated with the executing architecture, it is necessarily stored as such. The claim does not appear to require storage using any particular data structure or arrangement of data structures.)

Regarding claim 13, Tucker teaches
A non-transitory computer readable medium comprising instructions, which when executed by at least one processor, configure the processor to perform operations comprising: (Tucker, [0066-0067] describes an embodiment as a computer readable medium storing instructions which may be executed by a data processing apparatus, which may include a processor.)
The remainder of claim 13 is substantially similar to claim 1; claim 13 is rejected with the same rationale.

Regarding claim 14, the rejection of claim 13 is incorporated herein. Furthermore, Tucker teaches
for each respective intermediate data layer of the intermediate data layers: designating a data storage portion of the data storage portions for the respective intermediate data layer, (Figure . )

Regarding claim 15, the rejection of claim 14 is incorporated herein. Furthermore, Tucker teaches
wherein a first data storage portion of the plurality of data storage portions is designated for a first intermediate data layer and a second data storage portion is designated for a second intermediate data layer. (Figure 3, described at [0048-0055], shows different nodes (which may correspond to intermediate layers as described above) assigned to different subgraphs (which correspond to portions as described above).)

Regarding claim 16, the rejection of claim 14 is incorporated herein. Furthermore, Tucker teaches
wherein a first data storage portion is designated for a plurality of intermediate data layers. (Figure 3, described at [0048-0055], shows subgraphs (which correspond to portions as described above)

Regarding claim 17, Tucker teaches
A system comprising: a storage device; at least one processor, comprising instructions, which when executed by the at least one processor, configure the processor to perform operations comprising: (Tucker, [0066-0071] teaches an embodiment as a system comprising a storage device and processor for executing instructions.)
The remainder of claim 17 is substantially similar to claim 1; claim 17 is rejected with the same rationale, mutatis mutandis.

Claims 18-20 are substantially similar to claims 14-16 and are rejected with the same rationale in view of the rejection of claim 17.

Claims 2-6 and 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over “Tucker” (US 2017/0124452 A1) in view of “Becher” (US 8,387,066 B1), further in view of “Craddock” (US 2017/0161604 A1).

Regarding claim 2, the rejection of claim 1 is incorporated herein. Furthermore, Tucker teaches
for each respective intermediate data layer of the intermediate data layers: designating a data storage portion of the data storage portions for the respective intermediate data layer, (Figure 3, described at [0048-0055], shows each node of the computational graph being assigned to one of the subgraphs (which correspond to portions as described above).)
The combination of Tucker and Becher does not appear to explicitly teach
wherein the data storage portions are stored on a single data storage device.
However, Craddock—directed to analogous art—teaches
wherein the data storage portions are stored on a single data storage device. (Abstract describes executing a neural network in a memory-constrained environment. Figure 2, described at [0049-0053], shows portions of a single memory being allocated to different layers of a neural network.)
It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify Tucker and Becher to assign data storage portions on a single data storage device as taught by Craddock because the available computing resources may be so constrained as described by Craddock at [0002]. For example, the device may be a smartphone as described at [0068, 0081, 0084]. A person of ordinary skill in the art wishing to implement a neural network on a single memory-constrained device with the decreased resource utilization obtained from parallelizing the computation as in Tucker and Becher would necessarily need to apportion the single memory of, say, the phone taught by Craddock.

Claims 3-4 are substantially similar to claims 15-16 and are rejected with the same rationale in view of the rejection of claim 2.

Regarding claim 5, the rejection of claim 4 is incorporated herein. Furthermore, Tucker teaches
further comprising determining a memory allocation size for the first data storage portion based on the dimensions of the plurality of intermediate data layers. ([0045] indicates that the dimension of a tensor on each directed edge to and from each node of a subgraph is determined to determine a size of memory necessary to perform the operation. Since this looks at dimensions across the subgraph and the subgraph is based on the dependencies, the determination is based on both the dimensions and dependencies.)

Regarding claim 6, the rejection of claim 5 is incorporated herein. Furthermore, Tucker teaches
wherein the memory allocation size for the first data storage portion is based on a largest total size of an intermediate data layer in the plurality of intermediate data layers that are assigned to the first data storage portion. ([0045] indicates that the dimension of a tensor on each directed edge to and from each node of a subgraph is determined to determine a size of memory necessary to perform the operation. Since this looks at dimensions across the subgraph and the subgraph is based on the dependencies, the determination is based on both the dimensions and dependencies.)
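For illustration only, the sizing rule recited in claim 6 (a portion shared by several intermediate data layers is sized to the largest of those layers) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `portion_sizes`, the 4-byte element size, and the example shapes and assignment are hypothetical and are not taken from Tucker or the instant application.

```python
from math import prod


def portion_sizes(assignment, layer_shapes, bytes_per_element=4):
    """Size each portion to the largest layer assigned to it.

    `assignment` maps layer -> portion index; `layer_shapes` maps
    layer -> dimensions of that intermediate data layer.
    """
    sizes = {}
    for layer, portion in assignment.items():
        layer_bytes = prod(layer_shapes[layer]) * bytes_per_element
        sizes[portion] = max(sizes.get(portion, 0), layer_bytes)
    return sizes


# Hypothetical example: L0 and L3 share portion 0, L1 uses portion 1.
assignment = {"L0": 0, "L1": 1, "L3": 0}
layer_shapes = {"L0": (32, 32, 16), "L1": (16, 16, 32), "L3": (8, 8, 64)}
sizes = portion_sizes(assignment, layer_shapes)
print(sizes)
```

Portion 0 is sized for L0 rather than for the sum of L0 and L3 because, in this sketch, the two layers are assumed never to be needed at the same time.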

Regarding claim 11, the rejection of claim 1 is incorporated herein. Note that this claim is also rejected in view of Tucker and Becher as described above. The rejection of claim 12 below is clearer when Craddock is applied to this limitation. Craddock—directed to analogous art—teaches
wherein the determined memory allocation size for each respective data storage portion is stored as memory allocation configuration data associated with an executing architecture. (Abstract describes scheduling execution of a neural network. [0032-0034] describes storing memory allocation configuration data corresponding to the executing architecture (the neural network) “as an annotation specifying the memory address”)
It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify the combination of Tucker and Becher to store the allocation sizes as memory allocation configuration data associated with the architecture on computing devices with the same executing architecture because this allows for the additional devices to 

Regarding claim 12, the rejection of claim 11 is incorporated herein. The combination of Tucker and Becher does not appear to explicitly teach, but Craddock teaches
wherein the memory allocation configuration data is stored on a plurality of computing devices with the same executing architecture. ([0057, 0068] indicate that the method can be implemented on one or more computing devices, and the memory allocation configuration data related to the neural network (the executing architecture) may be sent to remote computing devices to be executed (thereby storing it on a plurality of computing devices with the same executing architecture).)
It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention to have performed this combination for the reasons given above with respect to claim 11.

Claims 7-8 are rejected under 35 U.S.C. 103 as being unpatentable over “Tucker” (US 2017/0124452 A1) in view of “Becher” (US 8,387,066 B1), further in view of “Dickenson” (US 2006/0053421 A1).

Regarding claim 7, the rejection of claim 1 is incorporated herein. Tucker teaches
wherein allocating memory on the storage device for each memory data storage portion in accordance with its respective allocation size comprises: ([0045] describes assigning each subgraph to a device that has memory capable of storing the largest tensor. Assigning a subgraph to a device is the same as allocating that device’s memory (i.e., storage portion) to the subgraph.)
The combination of Tucker and Becher does not appear to explicitly teach, but Dickenson teaches
linearly allocating the memory on the storage device for a respective memory portion. (Abstract describes methods/systems/media for increasing the efficiency of a computer system by selectively implementing different memory processes. [0045-0059] describes the memory allocation 
It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify the combination of Tucker and Becher to dynamically optimize the selection of linear or non-linear memory (resulting in linear allocation sometimes being used and contiguous allocation sometimes being used) because this results in more desirable performance characteristics as described by Dickenson at [0059].

Regarding claim 8, the rejection of claim 1 is incorporated herein. Tucker teaches
wherein allocating memory on the storage device for each memory data storage portion in accordance with its respective allocation size comprises: ([0045] describes assigning each subgraph to a device that has memory capable of storing the largest tensor. Assigning a subgraph to a device is the same as allocating that device’s memory (i.e., storage portion) to the subgraph.)
The combination of Tucker and Becher does not appear to explicitly teach, but Dickenson teaches
allocating a single contiguous block of memory on the storage device. (Abstract describes methods/systems/media for increasing the efficiency of a computer system by selectively implementing different memory processes. [0045-0059] describes the memory allocation process. [0059] indicates that either linear or non-linear (i.e., contiguous as shown in Figure 4, element 405 and described at [0051]) may be selected based on which will result in better performance.)
It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify the combination of Tucker and Becher to dynamically optimize the selection of linear or non-linear memory (resulting in linear allocation sometimes being used and contiguous allocation sometimes being used) because this results in more desirable performance characteristics as described by Dickenson at [0059].
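For illustration only, one possible reading of allocating a single contiguous block of memory is to reserve one buffer and place each data storage portion at an offset within it, as sketched in Python below. This is a minimal sketch under stated assumptions and does not describe Dickenson's mechanism, which is not detailed in this action: the function name `layout_contiguous`, the 64-byte alignment, and the example portion sizes are hypothetical.

```python
def layout_contiguous(portion_sizes, alignment=64):
    """Place portions back to back in one block; return (offsets, total).

    `portion_sizes` maps portion index -> size in bytes. Each portion
    starts at the next multiple of `alignment` (an assumed value).
    """
    offsets = {}
    cursor = 0
    for portion in sorted(portion_sizes):
        cursor = -(-cursor // alignment) * alignment  # round up to alignment
        offsets[portion] = cursor
        cursor += portion_sizes[portion]
    return offsets, cursor


# Hypothetical portion sizes in bytes.
sizes = {0: 100, 1: 200, 2: 50}
offsets, total = layout_contiguous(sizes)

# A single allocation backs every portion; portions are views at offsets.
block = bytearray(total)
portion_1 = memoryview(block)[offsets[1]:offsets[1] + sizes[1]]
print(offsets, total, len(portion_1))
```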

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
U.S. Patent Pub. No. 2016/0364644 to Brothers (teaching determination of neural network dependencies to optimize the portions of the network stored in internal vs external memory)
U.S. Patent Pub. No. 2017/0344882 to Ambrose (teaching optimization of neural network memory usage to reduce execution time)
“Optimizing the use of GPU Memory in Applications with Large data sets” by Satish et al. (teaching improved GPU memory management for applications with large and intermediate data sets that do not completely fit in GPU memory)

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Markus A Vasquez whose telephone number is (303)297-4432. The examiner can normally be reached Monday to Friday 9AM to 2PM MT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li Zhen can be reached on (571) 272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit 





/M.A.V./             Examiner, Art Unit 2121      



/Li B. Zhen/             Supervisory Patent Examiner, Art Unit 2121