DETAILED ACTION
Status of Claims
This action is in response to the applicant's amendment filed on 9/14/2022. Claims 1 – 4, 6 – 9, 11 – 15, 17, 19 – 20 and 22 are pending and have been examined.
Claims 1, 2, 7, 9, 11, 12, 13, 17, and 20 are amended.
Claims 5, 10, 16, 18 and 21 are canceled.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on March 16, 2013 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Response to Argument
Applicant's remarks filed on 9/14/2022 have been fully considered, but they are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Objection
Claim 22 is objected to because of the following informalities: Claim 22 depends on Claim 21; however, Claim 21 is canceled. This appears to be a typographical error. For examination purposes, Claim 22 will be examined as depending on Claim 12. Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.



Claim 22 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

	Claim 22 recites the limitation “the specific DNN layer”. There is insufficient antecedent basis for this limitation in the claim or in the claim from which it depends. For examination purposes, the term is interpreted as “the first DNN layer”.

Claim Rejections - 35 USC § 102
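In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –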
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 1 – 3, 7 and 12 – 14 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Fleming et al. (hereinafter Fleming), US 20190205269.

Regarding Claim 1, Fleming discloses:
A deep neural network (DNN) system, comprising: a memory comprising a plurality of queues (Fleming, fig. 30, ingress control with queues 3032), each queue mapped to a different one of a plurality of DNN layers (Fleming, para. 0185, “input buffer (queue) of consumer PEs”; para. 0131, a neural network tuned extension may include dataflow operations (inference pipeline); Examiner’s note: each PE performs a task in the dataflow of a neural network, which is a task in a layer of the neural network. Input buffers are linked to PEs and thus map to a layer of the neural network) and each queue having queue packets (Examiner’s BRI: packets are data, and each queue stores data; Fleming, para. 0132, “dataflow graph includes … a buffer (queue) may optionally be included along the communication channel”; Examiner’s note: buffers hold data (packets) of the dataflow graph throughout the system so that the system can perform parallel processing); each queue packet comprising:
instructions to process an input for one of the DNN layers in an inference pipeline (Examiner’s BRI: the processors process queue packets as long as there is data in the queue, i.e., data in a queue is an instruction to process the data; Fleming, para. 0451, “buffer … first in first out FIFO data characteristic”);
a layer identifier identifying a next DNN layer in the inference pipeline (Examiner’s BRI: the packet identifies the destination of the next processing stage; Fleming, para. 0283, “allow data to be routed to and/or from … according to (e.g., a header of) a data packet”);
a plurality of processing elements (Fleming, fig. 36, plurality of PEs), each processing element associated with one of the plurality of queues (Fleming, para. 0277, “a processing element may include an input buffer (queue)”), each processing element configured to:
process an input for a DNN layer in the inference pipeline according to instruction of a respective queue packet (Examiner’s BRI: each processing element performs the data processing task of a step in a DNN; Fleming, para. 0131, “for example, more complex mathematical dataflow operations … may be included in certain embodiment to accelerate certain mathematics intensive HPC workload. Similarly, a neural network tuned extension may include dataflow operations (inference pipeline)”; para. 0144, “for example a fabric for high-performance computing might include some customization for double-precision, fused multiply-add (a data processing computation to input during the inference in a DNN layer), while a fabric targeting deep neural networks might include low-precision floating point operation”) in parallel with inputs for other DNN layers being processed by other processing elements according to instruction of other queue packets (Examiner’s BRI: processing elements can perform parallel processing according to program instructions; Fleming, para. 0159, “instruction register may be set during a special configuration step. During this step, auxiliary control wires and state, in addition to the inter-PE network may be used to stream in configuration across the several PEs comprising the fabric. As result of parallelism, certain embodiments of such network may provide for rapid reconfiguration”); and
when the processing element completes the processing of the input for the DNN layer, push a new queue packet to a queue mapped to the next DNN layer in the inference pipeline identified by the layer identifier in the respective queue packet (Examiner’s BRI: when a processing element finishes processing, the data is sent to the next processing point in the dataflow, and the next processing point is identified in the data packet; Fleming, para. 0288, where, when a network dataflow endpoint circuit is to transmit input data … generate a packet including input data and a header to steer that data to the network).
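For illustration only, the arrangement recited in Claim 1 under the interpretation above can be sketched in Python as follows. This is a minimal sketch and is not taken from the instant application or from Fleming. The names QueuePacket and run_pipeline are hypothetical, a Python callable stands in for the "instructions" of a queue packet, an integer stands in for the layer identifier, and one thread per layer stands in for each processing element.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class QueuePacket:
    # Hypothetical packet layout, assumed for illustration only.
    op: Callable[[float], float]   # stands in for the packet's "instructions"
    data: float                    # the input to be processed
    next_layer: Optional[int]      # layer identifier of the next layer, None at the end


def run_pipeline(layer_ops: List[Callable[[float], float]],
                 inputs: List[float]) -> List[float]:
    n = len(layer_ops)
    # One FIFO queue per layer, i.e. each queue is mapped to a different layer.
    queues = [queue.Queue() for _ in range(n)]
    results: "queue.Queue[float]" = queue.Queue()

    def next_id(layer: int) -> Optional[int]:
        return layer + 1 if layer + 1 < n else None

    def processing_element(layer: int) -> None:
        while True:
            pkt = queues[layer].get()
            if pkt is None:
                # Shutdown marker: forward it downstream, then stop.
                if layer + 1 < n:
                    queues[layer + 1].put(None)
                return
            out = pkt.op(pkt.data)
            if pkt.next_layer is None:
                results.put(out)
            else:
                nxt = pkt.next_layer
                # Push a new packet to the queue mapped to the identified layer.
                queues[nxt].put(QueuePacket(layer_ops[nxt], out, next_id(nxt)))

    workers = [threading.Thread(target=processing_element, args=(i,))
               for i in range(n)]
    for w in workers:
        w.start()
    for x in inputs:
        queues[0].put(QueuePacket(layer_ops[0], x, next_id(0)))
    queues[0].put(None)
    for w in workers:
        w.join()
    return [results.get() for _ in range(len(inputs))]
```

Because each layer has its own queue and worker, a later input can be processed at the first layer while an earlier input is still being processed at a subsequent layer, which is the parallelism the claim language is read to require.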

Regarding Claim 2, Fleming further discloses: the respective queue packet identifies at least one of a DNN network identifier, a DNN layer identifier and a pointer to buffer for data (Examiner’s BRI: a pointer to a buffer for data is a pointer to the storage location where the data resides; Fleming, para. 0275, “load two input data operands (e.g., indicated by pointers *a and *b)”).

Regarding Claim 3, Fleming further discloses: the DNN layer identifier identifies a DNN layer type, which is used to determine a nature of computation to be performed and what kernel to launch (Examiner’s BRI: the identifier is an instruction indicating which operation is to be performed by a PE; Fleming, para. 0194, “PEs … are configured such that … operations to be performed on that data by second PE and third PE”; Examiner’s note: the dataflow graph identifies which operation is to be performed by which PE).

Regarding Claim 7, Fleming further discloses: wherein one or more of the queues and associated processing elements receive queue packets through remote direct memory access (Fleming, para. 0359, “CSA approach may use a wide data word, is distributed and includes mechanism to fetch program data directly from memory”; Examiner’s note: memory that is not in the PE is remote).

Regarding Claim 12, Fleming discloses: 
A method for deep neural network (DNN) processing, the method comprising: processing an input for a first DNN layer of an inference pipeline, by a processing element, in parallel with inputs for other DNN layers being processed by other processing elements (Examiner’s BRI: the method processes data of different DNN layers in parallel; Fleming, para. 0131, “for example, more complex mathematical dataflow operations … may be included in certain embodiment to accelerate certain mathematics intensive HPC workload. Similarly, a neural network (DNN) tuned extension may include dataflow operations (inference pipeline)”; para. 0159, “instruction register may be set during a special configuration step. During this step, auxiliary control wires and state, in addition to the inter-PE network may be used to stream in configuration across the several PEs comprising the fabric. As result of parallelism, certain embodiments of such network may provide for rapid reconfiguration”), wherein each processing element is associated with one of a plurality of queues (Fleming, para. 0277, “a processing element may include an input buffer (queue)”) and wherein each queue is mapped to a DNN layer in the inference pipeline (Fleming, para. 0185, “input buffer (queue) of consumer PEs”; para. 0131, a neural network tuned extension may include dataflow operations (inference pipeline); Examiner’s note: each PE performs a task in the dataflow of a neural network, which is a task in a layer of the neural network. Input buffers are linked to PEs and thus map to a layer of the neural network), the processing including:
writing a queue packet to a queue mapped to the first DNN layer (Examiner’s BRI: the data is sent to the buffer of the processing element of the first layer; since the PEs of Fleming use a buffer at the input, the data to be processed is sent to the buffer of the processing element of the first layer);
processing, by the processing element, the input for the first DNN layer based on a DNN processing profile determined from the queue packet (Examiner’s BRI: processing elements are operated based on a dataflow graph; Fleming, para. 0357, “programs viewed as control-dataflow graphs (DNN processing profile) … Generally, PE maybe configured as dataflow operators and once all input operands arrive at the PE, some operation occurs”); and
when the processing element completes the processing of the input for the first DNN layer, pushing a new queue packet to another queue mapped to a second DNN layer in the inference pipeline which is identified from a layer identifier in the queue packet (Examiner’s BRI: when a processing element finishes processing, the data is sent to the next processing point in the dataflow, and the next processing point is identified in the data packet; Fleming, para. 0288, where, when a network dataflow endpoint circuit is to transmit input data … generate a packet including input data and a header to steer that data to the network).

Regarding Claims 13 – 14, these claims are the method claims corresponding to Claims 2 – 3 and are rejected for the same reasons as Claims 2 – 3.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 4 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Fleming et al. (hereinafter Fleming), US 20190205269, in view of Li et al. (hereinafter Li), CA3013680, “Data Flow Processing Method and Apparatus and System”, Aug. 2017.

Regarding Claim 4, Fleming does not explicitly disclose:
the DNN network identifier enables processing of multiple DNN workloads by designating which network to use.
Li explicitly discloses: the DNN network identifier enables processing of multiple DNN workloads by designating which network to use (Examiner’s BRI: having an identifier for each pipelined task in order to process multiple pipelined tasks; Li, para. 0112, “each pipeline queue (DNN) is differentiated by using unique queue ID (DNN identifier)”; Examiner’s note: the queue of Li refers to the dataflow of tasks to be done for a job).
Fleming and Li both teach data processing pipelines using multiple processors and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Fleming’s disclosure of the hardware-based deep learning accelerator system with Li’s disclosure of a system that performs multiple pipelined tasks to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification in order for the processor to identify and perform the corresponding action (see at least Li, para. 0112, ln. 5 – 7).

Regarding Claim 15, Claim 15 is the method claim corresponding to Claim 4 and is rejected for the same reasons as Claim 4.

Claims 6 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Fleming et al. (hereinafter Fleming), US 20190205269, in view of Mittal, US5860126.

Regarding Claim 6, Fleming further discloses: wherein the queue packets include at least instructions on how to launch threads, provide a size of private memory allocation, provide a size of group memory allocation, and control and synchronization information (Examiner’s BRI: the system is configurable with respect to thread launching, memory size, control, and synchronization; Fleming, para. 0351, “CSA PEs and the network together enable the expression many kinds of parallelism: instruction data, pipeline, vector memory, thread and task parallelism (control and synchronization) may all be implemented”; para. 0451, “when shared (group/private) the input queues and the completion queues may be implemented as a ring buffer of a fixed size”; Examiner’s note: Fleming’s system implements control information over threads, memory allocation, and synchronization).
Fleming does not explicitly disclose:
provide a handle for an object in memory that includes an executable ISA image for a computation kernel.
Mittal explicitly discloses:
provide a handle for an object in memory that includes an executable ISA image for a computation kernel (Examiner’s BRI: multiple processors can share ISA images to launch; Mittal, col. 10, ln. 13 – 33, “when separate MFDA and MFDR instruction are utilized, change are not needed to the existing load/store instruction of an instruction set architecture”; Examiner’s note: for the same operations, the instruction set used to initialize the PEs can be shared).
Fleming and Mittal both teach data processing using multiple processors and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Fleming’s disclosure of the hardware-based deep learning accelerator system with Mittal’s disclosure of sharing the initialization of processors to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification so as to improve performance (Mittal, col. 1, ln. 26 – 31).

Regarding Claim 17, Claim 17 is the method claim corresponding to Claim 6 and is rejected for the same reasons as Claim 6.


Claims 8 – 9, 11, 19 – 20 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Fleming et al. (hereinafter Fleming), US 20190205269, in view of Goyal et al. (hereinafter Goyal), US20170316312.

Regarding Claim 8, Fleming does not explicitly disclose: the plurality of DNN layers are different DNN layer types.
Goyal explicitly discloses: the plurality of DNN layers are different DNN layer types (Goyal, para. 0024, “each TE perform a portion/subtask [different layer type] of neural network in parallel”).

Fleming and Goyal both disclose multi-processor accelerators for neural networks and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Fleming’s disclosure of a configurable distributed processing engine with Goyal’s disclosure of detailed multi-processor processing of neural network layers to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification because the combination yields predictable results.

Regarding Claim 9, depending on Claim 1, Fleming in view of Goyal further discloses: each input is processed at a different DNN layer type (Examiner’s BRI: inputs are processed at different layers; Goyal, para. 0024, “each TE perform a portion/subtask [different layer type] of neural network in parallel”; Examiner’s note: inputs are processed at different processing elements and thus at different layer types).

Regarding Claim 11, Fleming in view of Goyal further discloses: the DNN layer is supported by different DNN networks to enable multiple use of the DNN layer (Examiner’s BRI: a DNN layer can be used in different DNN networks; Goyal, para. 0022, ln. 1 – 3, where the DLP is configured to implement one or more neural networks; Examiner’s note: the tensor engine can be used in multiple neural networks).

Regarding Claims 19 – 20 and 22, these claims are the method claims corresponding to Claims 8 – 9 and 11 and are rejected for the same reasons as Claims 8 – 9 and 11.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure: Lin et al., US20160275123A1, “Pipeline Execution of Multiple Map-reduce Jobs”. Lin discloses a pipeline of tasks performed by data node computing devices. The workflow configuration tables include job ID and data node IP information that is analogous to the network ID and layer ID of the instant application.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHIEN MING CHOU whose telephone number is (571)272-9354.  The examiner can normally be reached on Monday – Friday, 9 am – 5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CHAKI KAKALI can be reached on (571) 272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/S.C./Examiner, Art Unit 2122

/VIKER A LAMARDO/Primary Examiner, Art Unit 2126