DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-23 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Che et al. (US 2020/0249998).
For claim 1, Che teaches a method of implementing a neural network model ([0014] and Figures 2-6), comprising:
partitioning a neural network model (via 211 and 212 of Figure 3) into first sub-models (nodes and super nodes shown in Figure 4, [0033]-[0035]) based on a partitioning standard (e.g., by parsing a machine learning model, [0033]);
determining second sub-models (213 of Figure 3) by merging at least a portion of the first sub-models (into subsets, Figures 3-4 and [0038]-[0042]) based on characteristics of the first sub-models (“available accelerator resources, each accelerator's capacity, time requirements, properties of a data structure…minimum cut algorithm”, [0038]-[0042]); and 
deploying the second sub-models (214-216 of Figure 3 and [0043]-[0056]).
For claim 2, Che further teaches that the partitioning comprises: 
partitioning the neural network model into the first sub-models based on at least one of a number of times of an operation of an operating segment comprised in the neural network model, an input data size, or an output data size (“1) system and target device information, 2) operation profiling information per target device, and 3) subgraph profiling information per target device”, [0036]).
For claim 3, Che further teaches that the partitioning comprises:
verifying the first sub-models based on a verification standard (whether a super node can be created, [0035]-[0038]); and
in response to the first sub-models not meeting the verification standard for the verifying, modifying the partitioning standard (use of nodes instead of super nodes, [0035]-[0038]).
For claim 4, Che further teaches that the verifying comprises:
detecting performances of the first sub-models (estimating operation profiling information via simulations, [0036]-[0037]) in response to the first sub-models being executed in one or more accelerators (target device information, [0036]-[0037]).
For claim 5, Che further teaches that the detecting of the performances comprises:
for each of the first sub-models, detecting at least one of a consumed time, a consumed power, or an occupied memory size in response to the first sub-model being executed in an accelerator of the one or more accelerators ([0036]-[0037]).
For claim 6, Che further teaches that the partitioning comprises:
for each of the first sub-models, matching the first sub-model to a type of an accelerator of the one or more accelerators in which the first sub-model is executed with a specific performance ([0036]-[0037]).
For claim 7, Che further teaches that the determining comprises:
determining the second sub-models (e.g., S1, S21 and S22 of Figure 4) by merging first sub-models among the first sub-models that are adjacent to each other in terms of an execution order and have a specific performance in response to being executed in an accelerator of a same type ([0039]).
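For illustration only, the merging limitation of claim 7 can be sketched in Python as follows. The sketch is not taken from Che or from the application. The `SubModel` class, the `accel_type` field, and the sample names are hypothetical, and the sketch assumes each first sub-model has already been matched to the accelerator type on which it meets the required performance.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class SubModel:
    name: str
    accel_type: str  # hypothetical: accelerator type matched to this sub-model


def merge_adjacent(first_sub_models):
    """Merge runs of sub-models that are consecutive in execution order
    and share the same accelerator type.

    Returns a list of (accel_type, [names]) pairs, one per merged sub-model.
    """
    # groupby only groups *consecutive* equal keys, which is exactly the
    # "adjacent in terms of an execution order" condition.
    return [
        (accel, [m.name for m in run])
        for accel, run in groupby(first_sub_models, key=lambda m: m.accel_type)
    ]


# Hypothetical execution order: a, b, c, d
first = [
    SubModel("a", "GPU"),
    SubModel("b", "GPU"),
    SubModel("c", "NPU"),
    SubModel("d", "GPU"),
]
second = merge_adjacent(first)
```

In this sketch, sub-model `d` is not merged with `a` and `b` even though all three share a type, because `c` separates them in the execution order.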
For claim 8, Che further teaches that the deploying comprises:
writing a heterogeneous graph based on the second sub-models (“sequence of nodes and a sequence of target devices for a subset of a computation graph”, [0055]); and
deploying the second sub-models based on the heterogeneous graph ([0061]-[0063]).
For claim 9, Che further teaches that the writing comprises:
writing a connecting relationship between the second sub-models based on an input and output relationship between the second sub-models (“sequence of nodes and a sequence of target devices for a subset of a computation graph”, [0055]).
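For illustration only, writing a connecting relationship from an input and output relationship, as recited in claims 8 and 9, can be sketched in Python as follows. The sketch is not taken from Che or from the application. The function name `write_connections`, the dictionary layout, and the tensor names are hypothetical.

```python
def write_connections(sub_models):
    """Derive graph edges from input/output relationships.

    sub_models maps a sub-model name to a pair (inputs, outputs), where
    each element is a set of tensor names (hypothetical layout).
    An edge (producer, consumer) is written whenever an output of one
    sub-model is an input of another.
    """
    edges = []
    for src, (_, outputs) in sub_models.items():
        for dst, (inputs, _) in sub_models.items():
            if src != dst and outputs & inputs:
                edges.append((src, dst))
    return sorted(edges)


# Hypothetical second sub-models, each annotated with a target device.
devices = {"S1": "GPU", "S21": "NPU", "S22": "CPU"}
sub_models = {
    "S1": ({"image"}, {"t1"}),
    "S21": ({"t1"}, {"t2"}),
    "S22": ({"t1", "t2"}, {"logits"}),
}
edges = write_connections(sub_models)
```

The `devices` mapping together with `edges` stands in for a heterogeneous graph: nodes carry a target device, and edges carry the connecting relationship.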
For claim 10, Che further teaches a non-transitory computer-readable storage medium storing instructions that, when executed by a processor, configure the processor to perform the method of claim 1 ([0004]).
For claim 11, Che teaches an apparatus for implementing a neural network model (Figures 1-2 and [0003]), comprising:
a memory configured to store therein instructions ([0003]); and
a processor configured to execute the instructions, wherein, when the instructions are executed, the processor is configured to ([0003]):
partition a neural network model (via 211 and 212 of Figure 3) into first sub-models (nodes and super nodes shown in Figure 4, [0033]-[0035]) based on a partitioning standard (e.g., by parsing a machine learning model, [0033]);
determine second sub-models (213 of Figure 3) by merging at least a portion of the first sub-models (into subsets, Figures 3-4 and [0038]-[0042]) based on characteristics of the first sub-models (“available accelerator resources, each accelerator's capacity, time requirements, properties of a data structure…minimum cut algorithm”, [0038]-[0042]); and
deploy the second sub-models (214-216 of Figure 3 and [0043]-[0056]).
For claim 12, Che further teaches that for the partitioning, the processor is configured to:
partition the neural network model into the first sub-models based on at least one of a number of times of an operation of an operating segment comprised in the neural network model, an input data size, or an output data size (“1) system and target device information, 2) operation profiling information per target device, and 3) subgraph profiling information per target device”, [0036]).
For claim 13, Che further teaches that for the partitioning, the processor is configured to:
verify the first sub-models based on a verification standard (whether a super node can be created, [0035]-[0038]); and
in response to the first sub-models not meeting the verification standard, modify the partitioning standard (use of nodes instead of super nodes, [0035]-[0038]).
For claim 14, Che further teaches that for the verifying, the processor is configured to:
detect performances of the first sub-models (estimating operation profiling information via simulations, [0036]-[0037]) in response to the first sub-models being executed in one or more accelerators (target device information, [0036]-[0037]).
For claim 15, Che further teaches that for the detecting of the performances, the processor is configured to:
for each of the first sub-models, detect at least one of a consumed time, a consumed power, or an occupied memory size in response to the first sub-model being executed in an accelerator of the one or more accelerators ([0036]-[0037]).
For claim 16, Che further teaches that for the partitioning, the processor is configured to:
for each of the first sub-models, match the first sub-model to a type of an accelerator of the one or more accelerators in which the first sub-model is executed with a specific performance ([0036]-[0037]).
For claim 17, Che further teaches that for the determining, the processor is configured to:
determine the second sub-models (e.g., S1, S21 and S22 of Figure 4) by merging first sub-models among the first sub-models that are adjacent to each other in terms of an execution order and have a specific performance in response to being executed in an accelerator of a same type ([0039]).
For claim 18, Che further teaches that the processor is configured to:
write a heterogeneous graph based on the second sub-models (“sequence of nodes and a sequence of target devices for a subset of a computation graph”, [0055]); and
deploy the second sub-models based on the heterogeneous graph ([0061]-[0063]).
For claim 19, Che further teaches that the processor is configured to:
write a connecting relationship between the second sub-models based on an input and output relationship between the second sub-models (“sequence of nodes and a sequence of target devices for a subset of a computation graph”, [0055]).
For claim 20, Che further teaches that the processor comprises:
a plurality of accelerators in which the second sub-models are deployed (Figure 2 and [0014]).
For claim 21, Che further teaches that:
the number of times of the operation of the operating segment is a number of multiply-addition (MAdd) operations of the operating segment ([0018], [0034]), and
the input data size is of input data comprising either one of a single input task and streaming input data input to the neural network (as understood by examination of the Figures).
For claim 22, Che further teaches that the deploying comprises: 
executing the second sub-models in accelerators, based on received input data (Figure 2 and [0014]).
For claim 23, Che further teaches that the determining comprises: 
determining one of the second sub-models (S1) by merging one of the first sub-models (n0, N0) and another one of the first sub-models (n5-n12), 
an output end of the one of the first sub-models (output of N0) is an input end of the other one of the first sub-models (input to n5, as understood by examination of Figure 4), 
an input end of the one of the second sub-models is an input end of the one of the first sub-models (input to N0), and
an output end of the one of the second sub-models is an output end of the other one of the first sub-models (output of n12).
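For illustration only, the input-end and output-end relationship recited in claim 23 can be sketched in Python as follows. The sketch is not taken from Che or from the application. The `Segment` class, the `merge_pair` function, and the tensor names `x`, `t0`, and `y` are hypothetical; only the labels N0, n5-n12, and S1 follow the mapping to Figure 4 given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    input_end: str   # hypothetical tensor name at the input end
    output_end: str  # hypothetical tensor name at the output end


def merge_pair(first, other, merged_name):
    """Merge two sub-models where first's output end feeds other's input end.

    The merged sub-model takes its input end from `first` and its output
    end from `other`.
    """
    if first.output_end != other.input_end:
        raise ValueError("output end of first must be input end of other")
    return Segment(merged_name, first.input_end, other.output_end)


n0 = Segment("N0", input_end="x", output_end="t0")
tail = Segment("n5-n12", input_end="t0", output_end="y")
s1 = merge_pair(n0, tail, "S1")
```

The check on `first.output_end` reflects the condition that the two first sub-models are directly connected before they are merged.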
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DANIEL CALRISSIAN PUENTES whose telephone number is (571)270-5070. The examiner can normally be reached M-F 9-6:30 (flex).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Menatoallah Yousseff, can be reached at 571-270-3684. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DANIEL C PUENTES/
Primary Examiner, Art Unit 2849