DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4-12, and 14-20 are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (US 20190156185) in view of Kalamkar et al. (US 20180293492).

As to claim 1, Li discloses a method comprising (FIG. 6): 
receiving a representation of a neural network (NN) model to be executed on an electronic device (FIG. 4 and [0070], convolutional neural network; see processor 1110 and memory 1120 in FIG. 11), the representation of the NN model including nodes corresponding to intermediate layers of the NN model (FIG. and [0070], layers in the convolutional neural network), wherein at least some of the nodes each corresponds to a respective operation of a respective intermediate layer of the NN model to be performed by the electronic device (see [0022] and [0081], convolution and pooling operations); 
determining, for the respective operation corresponding to each node in each respective intermediate layer of the NN model, a respective set of operations that are mathematically equivalent to the respective operation such that an aggregation of outputs of the respective set of operations is equivalent to an output of the respective operation (FIG. 4 and FIG. 6, S325-S330; see [0028]), wherein a respective output of each operation from the respective set of operations is constrained based at least in part on a memory constraint (see [0025-27], [0066]).
Li fails to explicitly disclose generating a graph based on each respective set of operations, wherein the graph includes a set of branches, each branch includes a plurality of operations, the plurality of operations including a particular operation from each respective set of operations; determining a respective order for executing each branch of the graph; and storing the graph and the respective order.
However, Kalamkar teaches generating a graph based on each respective set of operations, wherein the graph includes a set of branches, each branch includes a plurality of operations, the plurality of operations including a particular operation from each respective set of operations (FIGS. 14C-14D, convolutional layers and node 0, node 1, node 2 and node 3, each node is a branch including a plurality of operations; see FIG. 9B describing operations of convolutional layers); 
determining a respective order for executing each branch of the graph (see [0194] and [0197], hybrid parallelism); and 
storing the graph and the respective order (see [0194] and [0197]).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Li using Kalamkar’s teachings to include generating a graph based on each respective set of operations, wherein the graph includes a set of branches, each branch includes a plurality of operations, the plurality of operations including a particular operation from each respective set of operations; determining a respective order for executing each branch of the graph; and storing the graph and the respective order, in order to accelerate the training process using a distributed network of computational nodes. Parallel processor accelerated machine learning enables computer vision applications to be trained using significantly larger training datasets than previously feasible, enables inferencing systems to be deployed using low power parallel processors, and enables rapid training of increasingly complex neural networks (Kalamkar; [0174], [0181]-[0182]).

As to claim 2, the combination of Li and Kalamkar further discloses compiling a binary package for the electronic device based at least in part on the graph and the respective order for executing each branch of the graph, wherein the electronic device performs each respective set of operations based on the respective order (Kalamkar; see [0051], [0241]).

As to claim 4, the combination of Li and Kalamkar further discloses wherein an output of each operation from the respective set of operations is constrained based at least in part on an amount of available memory in a cache of the electronic device (Li; see [0025]-[0027], [0066]).

As to claim 5, Li as modified by Kalamkar fails to explicitly disclose wherein the aggregation of the outputs of the respective set of operations is stored in memory of the electronic device, the memory being slower memory than the cache of the electronic device.
However, Kalamkar teaches wherein the aggregation of the outputs of the respective set of operations is stored in memory of the electronic device, the memory being slower memory than the cache of the electronic device (Kalamkar; FIGS. 2A-2C; partition unit 220 including L2 cache 221, processing cluster 214 including L1 cache 248, and memory units 224A-224N including dynamic random access memory (DRAM); see [0057] and [0060]).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Li using Kalamkar’s teachings to include wherein the aggregation of the outputs of the respective set of operations is stored in memory of the electronic device, the memory being slower memory than the cache of the electronic device, in order to accelerate the training process using a distributed network of computational nodes. Parallel processor accelerated machine learning enables computer vision applications to be trained using significantly larger training datasets than previously feasible, enables inferencing systems to be deployed using low power parallel processors, and enables rapid training of increasingly complex neural networks (Kalamkar; [0174], [0181]-[0182]).

As to claim 6, the combination of Li and Kalamkar further discloses wherein the plurality of operations of each branch start after an input node of the NN model and end before an output node of an output layer of the NN model (Li; FIG. 4; Kalamkar; FIGS. 14C-14D).

As to claim 7, the combination of Li and Kalamkar further discloses wherein the plurality of operations of each branch provides a portion of an output of the NN model from an output layer (Li; FIG. 4 and FIG. 6, S325-S330).

As to claim 8, the combination of Li and Kalamkar further discloses wherein an aggregation of each output of each branch is equal to the output of the NN model from the output layer (Li; FIG. 6, S330; see [0028]).

As to claim 9, Li as modified by Kalamkar fails to explicitly disclose wherein the output of the NN model from the output layer is stored in dynamic random access memory (DRAM).
However, Kalamkar teaches wherein the output of the NN model from the output layer is stored in dynamic random access memory (DRAM) (Kalamkar; FIGS. 2A-2B; partition unit 220 including L2 cache 221, processing cluster 214 including L1 cache 248, and memory units 224A-224N including dynamic random access memory (DRAM); see [0057] and [0060]).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Li using Kalamkar’s teachings to include wherein the output of the NN model from the output layer is stored in dynamic random access memory (DRAM), in order to accelerate the training process using a distributed network of computational nodes. Parallel processor accelerated machine learning enables computer vision applications to be trained using significantly larger training datasets than previously feasible, enables inferencing systems to be deployed using low power parallel processors, and enables rapid training of increasingly complex neural networks (Kalamkar; [0174], [0181]-[0182]).

As to claim 10, Li as modified by Kalamkar fails to explicitly disclose wherein the electronic device includes cache memory and dynamic random access memory (DRAM).
However, Kalamkar teaches wherein the electronic device includes cache memory and dynamic random access memory (DRAM) (Kalamkar; FIGS. 2A-2B; partition unit 220 including L2 cache 221, processing cluster 214 including L1 cache 248, and memory units 224A-224N including dynamic random access memory (DRAM); see [0057] and [0060]).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Li using Kalamkar’s teachings to include wherein the electronic device includes cache memory and dynamic random access memory (DRAM), in order to accelerate the training process using a distributed network of computational nodes. Parallel processor accelerated machine learning enables computer vision applications to be trained using significantly larger training datasets than previously feasible, enables inferencing systems to be deployed using low power parallel processors, and enables rapid training of increasingly complex neural networks (Kalamkar; [0174], [0181]-[0182]).

As to claims 11-12 and 14-19, system claims 11-12 and 14-19 correspond to method claims 1-2 and 4-9, recite the same features as those recited in claims 1-2 and 4-9 respectively, and are therefore rejected for the same reasons of obviousness as those used above in rejecting claims 1-2 and 4-9.

As to claim 20, CRM claim 20 corresponds to method claim 1, recites the same features as those recited in claim 1, and is therefore rejected for the same reasons of obviousness as those used above in rejecting claim 1.

Allowable Subject Matter
Claims 3 and 13 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Response to Arguments
Applicant’s arguments, filed on 09/12/2022, with respect to the objections to the drawings and the specification have been fully considered and are persuasive. The objections to the drawings and the specification have been withdrawn.
Applicant's arguments filed on 09/12/2022 have been fully considered but they are not persuasive. 
Applicant argues that the cited portions of Li do not disclose or suggest “a respective output of each operation from the respective set of operations is constrained based at least in part on a memory constraint.” 
The examiner respectfully disagrees. First, based on Applicant’s disclosure (see, for example, paragraph [0024] and FIGS. 4-5), a respective output of each operation from the respective set of operations is constrained based at least in part on a memory constraint as a result of splitting the input data based on memory constraints; each output is dependent on the split input data. Therefore, the interpretation of the limitation encompasses splitting the input data based on memory constraints to generate output data, as taught by Li in paragraphs [0025]-[0027] and [0066]. Li discloses in [0025]: due to limitations such as costs, the capacity of high speed memory is usually limited … [0026] Therefore, in a technical solution according to an embodiment of the present disclosure, an input feature data of a designated layer (for example, the input layer or a layer in the middle of the convolutional neural network) in a convolutional neural network is “split” into multiple subdata. Then, the obtained subdata may be used instead of the original feature data, and each of the obtained subdata can be provided to the designated layer as the input feature data, respectively. [0027] For example, having sufficient number of subdata and/or making the size of each subdata sufficiently small, as needed, for each subdata input, data involved in the operation of each layer from the consecutive layers, beginning from the above designated layer, may be completely buffered in the high speed memory, or even operations in the consecutive layers starting from the designated layer may only use the high speed memory. [0028] The above “splitting” should at least ensure that the result of the final output from the convolutional neural network is not changed.
In other words, in the case where each of the obtained subdata is respectively provided, as an input, to the above designated layer, the result obtained by combining (for example, “splicing” or “lapping”) the plurality of output subdata obtained from the operations in the consecutive layers should be the same as the output feature data obtained by directly providing the original input feature data before “splitting” to the designated layer and performing the operations in the consecutive layers. Li further discloses in [0066]: For example, the expected number of subdata blocks of the output feature data for the last layer may be determined to satisfy the following condition:
$E > \left( \max_{1 \le i \le N} \{ F_i + P_i \} \right) / R,$
wherein $E$ is the expected number, max is a function that returns a maximum value, $N$ is the number of layers selected in step S301, $F_i$ and $P_i$ are respectively the sizes of the input feature data of the $i$-th layer in the selected layers and the sizes of the related parameters, and $R$ is a reference value.
Therefore, Li discloses “a respective output of each operation from the respective set of operations is constrained based at least in part on a memory constraint.” 
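For illustration only, the relationship relied upon above (Li; [0026]-[0028], [0066]) can be sketched as follows. The sketch is not taken from Li or from Applicant’s disclosure: the function names, the example sizes, the use of integer sizes, and the choice of a non-overlapping one-dimensional max-pooling operation are all assumptions made for this example. It computes the smallest integer E that satisfies the condition quoted from [0066], splits an input into that many pieces, and checks that the concatenated partial outputs equal the output obtained from the unsplit input.

```python
def min_expected_blocks(feature_sizes, param_sizes, reference):
    """Smallest integer E with E > max_i(F_i + P_i) / R.

    Sizes and the reference value R are assumed to be positive integers
    in the same (hypothetical) unit.
    """
    peak = max(f + p for f, p in zip(feature_sizes, param_sizes))
    # For integers, the smallest E strictly greater than peak / R is peak // R + 1.
    return peak // reference + 1


def max_pool_1d(data, window):
    """Non-overlapping max pooling; len(data) must be a multiple of window."""
    return [max(data[i:i + window]) for i in range(0, len(data), window)]


def split_on_window_boundaries(data, blocks, window):
    """Split data into at most `blocks` pieces, each a whole number of windows."""
    total_windows = len(data) // window
    windows_per_block = -(-total_windows // blocks)  # ceiling division
    step = windows_per_block * window
    return [data[i:i + step] for i in range(0, len(data), step)]


if __name__ == "__main__":
    # Hypothetical sizes for N = 3 selected layers and a reference value R.
    expected = min_expected_blocks([96, 64, 32], [8, 16, 16], 40)

    data = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
    parts = split_on_window_boundaries(data, expected, window=2)

    # Aggregating the per-piece outputs reproduces the unsplit output.
    spliced = [v for part in parts for v in max_pool_1d(part, 2)]
    assert spliced == max_pool_1d(data, 2)
    print(expected, parts, spliced)
```

Max pooling with non-overlapping windows is chosen here only because its pieces need no overlap at the split boundaries; a convolution would generally require overlapping regions between adjacent pieces, which this sketch does not model.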

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BOUBACAR ABDOU TCHOUSSOU whose telephone number is (571)272-7625. The examiner can normally be reached M-F 8am-4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chris Kelley, can be reached at (571)272-7331. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BOUBACAR ABDOU TCHOUSSOU/Primary Examiner, Art Unit 2482