Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This office action is in response to the amendment filed 5/24/2021. Claims 1-48 remain pending.

Response to Arguments
Applicant's arguments filed 5/24/2021 have been fully considered but they are not persuasive.
112(b)
In response: The rejection is withdrawn.

103
Regarding the argument that the Khan reference does not disclose the claim limitation “wherein the virtual channel specifier identifies one of the one or more virtual channels” in claim 1 (similarly, claims 2, 25, and 26):
In response: Khan teaches the above claim limitation. A virtual channel may be understood as a communication path between nodes for virtual connections in a network. In Khan (pg. 2853, col. 2, paragraph 3), the neural network mapping module sets up packet routing using routing entries of the nodes, including the origin and the target, in a virtual address space. This discloses “the virtual channel specifier identifies one of the one or more virtual channels,” where the route set up in the virtual address space is interpreted as a virtual channel.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 2, 7, 8, 9, 21, 22, 24, 25, 26, 31, 32, 33, 45, 46, and 48 are rejected under 35 U.S.C. 103 as being unpatentable over Imam et al. (US 2018/0174041 A1) in view of Korthikanti et al. (US 2018/0189652 A1) and Khan et al. (“SpiNNaker: Mapping Neural Networks onto a Massively-Parallel Chip Multiprocessor”).
As per claim 1, Imam teaches:
performing dataflow-based and instruction-based processing and exchanging fabric packets respectively in and between a plurality of processing elements interconnected as a fabric, each processing element comprising a compute engine and a fabric router; (0032: computing device organized as a fabric of neural nodes for processing fabric packets as data; Fig. 2A: each squared block [processing element] includes a core [compute engine] and router connecting the nodes)
specifying communications and computations respectively corresponding to a plurality of branches and a plurality of nodes of a dataflow graph; (Fig. 2A and 3A: fabric contains a plurality of PEs that correspond to nodes, with the routers acting as branches for data to flow as a graph)
allocating a plurality of the processing elements to locally perform the computations, at least two of the processing elements being allocated to respectively locally perform a plurality of computation portions corresponding to a partitioned one of the nodes; (Fig. 2A: processing is local in the sense that it is performed on the computing device).
While Imam does not teach the remaining limitations, Korthikanti does teach:
at least two of the processing elements being allocated to respectively locally perform a plurality of computation portions corresponding to a partitioned one of the nodes (0054: matrix operations [computation portions] of node can be partitioned to be processed by multiple processing elements).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the partitioning of Korthikanti, as partitioning allows for multiple elements to work on a node at the same time, thus speeding up performance.
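
For illustration only, the partitioning concept discussed above can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the row-wise split, and the two-element default are hypothetical and are not drawn from Imam, Korthikanti, or the application. It shows only that a node's matrix operation can be divided into portions that are computed independently and then combined.

```python
# Illustrative sketch only: all names are hypothetical and do not come from
# Imam, Korthikanti, or the application under examination.

def matvec(rows, x):
    """Compute one portion of a node's matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in rows]

def partition_rows(weights, num_pes):
    """Split the node's weight rows into at most num_pes contiguous portions."""
    size = -(-len(weights) // num_pes)  # ceiling division
    return [weights[i:i + size] for i in range(0, len(weights), size)]

def partitioned_matvec(weights, x, num_pes=2):
    """Compute each portion independently (as if on its own PE), then concatenate."""
    result = []
    for portion in partition_rows(weights, num_pes):
        result.extend(matvec(portion, x))
    return result

weights = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, 1]
print(partitioned_matvec(weights, x))  # matches matvec(weights, x)
```

In this sketch the portions run one after another; on parallel hardware each portion could be assigned to a separate processing element.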

While Imam and Korthikanti do not teach the remaining limitations, Khan does teach:
performing the computations and communications in accordance with the specifying, the allocating, and a virtual channel specifier of each fabric packet sent via one or more virtual channels between the at least two processing elements to transfer between the respective computation portions data comprising one or more sources and results; and (pg. 2854: each neuron has a virtual address; thus, every data packet that goes to a node is specified a virtual channel [virtual channel specifier]; pg. 2853, col. 2, paragraph 3: the routing path for packets is an example of a virtual channel; Imam 0032: routers send synapses between neurons, which are the sources and results).
wherein the virtual channel specifier identifies one of the one or more virtual channels (pg. 2853, col. 2, paragraph 3: the module for neural network mapping sets up the routing for packets with routing entries of the nodes, including the origin and the target, in a virtual address space; pg. 2854: each neuron has a virtual address; thus, every data packet that goes to a node is specified a virtual channel [virtual channel specifier]).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the virtual addresses of Khan, as the virtual network approach to neural modelling extends the size and class of neural networks that hardware can model (Khan pg. 2857 left).
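
For illustration only, the interpretation applied above (a packet field that selects one of several routes) can be sketched as follows. This is a minimal sketch under stated assumptions: the class names, the `virtual_channel` field, and the route table are hypothetical and are not drawn from Khan, Imam, or the application. It is not a description of how any cited reference implements routing.

```python
# Illustrative sketch only: all names are hypothetical and do not come from
# Khan, Imam, or the application under examination.
from dataclasses import dataclass

@dataclass(frozen=True)
class FabricPacket:
    virtual_channel: int  # hypothetical "virtual channel specifier" field
    payload: float

class Router:
    def __init__(self, routes):
        # routes: virtual channel id -> name of the destination processing element
        self.routes = dict(routes)

    def forward(self, packet):
        """Return the destination selected by the packet's virtual channel specifier."""
        return self.routes[packet.virtual_channel]

router = Router({0: "PE_A", 1: "PE_B"})
packet = FabricPacket(virtual_channel=1, payload=0.5)
print(router.forward(packet))  # prints PE_B
```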

As per claim 2, Imam teaches:
performing dataflow-based and instruction-based processing and exchanging fabric packets respectively in and between a plurality of processing elements interconnected as a fabric, each processing element comprising a compute engine and a fabric router; (0032: computing device organized as a fabric of neural nodes for processing fabric packets as data; Fig. 2A: each squared block [processing element] includes a core [compute engine] and router connecting the nodes)
specifying communications and computations respectively corresponding to a plurality of branches and a plurality of nodes of a dataflow graph; (Fig. 2A and 3A: fabric contains a plurality of PEs that correspond to nodes, with the routers acting as branches for data to flow as a graph)
allocating a plurality of the processing elements to locally perform the computation; (Fig. 2A: processing is local in the sense that it is performed on the computing device).
While Imam does not teach the remaining limitations, Korthikanti does teach:
at least a single one of the processing elements being allocated to locally perform a plurality of respective first computation portions of each of at least two partitioned ones of the nodes, each of the partitioned nodes comprising a respective plurality of computation portions including the respective first computation portions (0054: matrix operations [computation portions] of node can be partitioned to be processed by multiple processing elements).
While Imam and Korthikanti do not teach the remaining limitations, Khan does teach:
performing the computations and communications in accordance with the specifying, the allocating, and a virtual channel specifier of each fabric packet sent via one or more virtual channels between the at least single one of the processing elements and other ones of the allocated processing elements to transfer between the respective computation portions data comprising one or more sources and results; and (pg. 2854: each neuron has a virtual address; thus, every data packet that goes to a node is specified a virtual channel [virtual channel specifier]; pg. 2853, col. 2, paragraph 3: the routing path for packets is an example of a virtual channel).
wherein the virtual channel specifier identifies one of the one or more virtual channels (pg. 2853, col. 2, paragraph 3: the module for neural network mapping sets up the routing for packets with routing entries of the nodes, including the origin and the target, in a virtual address space; pg. 2854: each neuron has a virtual address; thus, every data packet that goes to a node is specified a virtual channel [virtual channel specifier]).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the virtual addresses of Khan, as the virtual network approach to neural modelling extends the size and class of neural networks that hardware can model (Khan pg. 2857 left).
	
As per claim 7, the rejection of claims 1 and 2 is incorporated.
Imam teaches:
wherein the dataflow graph corresponds to all or any portions of a neural network, and at least a portion of the performing the computations corresponds to computing weights of the neural network (0027: neural network along with weight computation).

As per claim 8, the rejection of claims 1 and 2 is incorporated.
Imam teaches:
wherein the locally performed computations and the exchanging fabric packets are respectively performed by the compute engines and the fabric routers of the respective processing elements (0033: computations and fabric exchanges are performed by processing blocks and routers of fabric).

As per claim 9, the rejection of claims 1 and 2 is incorporated.
Imam teaches:
wherein the sources and results are with respect to one or more of: multiply and accumulate operations, partial sums, activations, and final sums (0034: activations with regard to nodes).

As per claim 21, the rejection of claim 7 is incorporated.
Khan teaches:
wherein the allocating is performed by a node to processing element mapping process in accordance with predetermined criteria (pg. 2854 left: mapping is done so as to minimize routes and traffic).


As per claim 22, the rejection of claim 21 is incorporated.
Khan teaches:
wherein the mapping process is performed at least in part manually (Fig. 3: mapping is done through a manually chosen pattern).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the virtual addresses of Khan, as the virtual network approach to neural modelling allows minimizing of routes and traffic (Khan pg. 2854 left).

As per claim 24, the rejection of claim 21 is incorporated.
Khan teaches:
wherein the predetermined criteria comprises one or more of: reducing at least one data movement latency metric (pg. 2854 left: mapping is done so as to minimize routes and traffic [metric]).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the virtual addresses of Khan, as the virtual network approach to neural modelling allows minimizing of routes and traffic (Khan pg. 2854 left).

Claims 25, 26, 31, 32, 33, 45, 46, and 48 contain similar limitations to those of claims 1, 2, 7, 8, 9, 21, 22, and 24 respectively. Thus, they are similarly rejected under 35 U.S.C. 103.


Claims 3, 4, 5, 27, 28, and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Imam in view of Korthikanti and Khan, and further in view of Schemmel et al. (“Wafer-scale integration of analog neural networks”).
As per claim 3, the rejection of claim 1 is incorporated.
While the previously cited art does not teach the claim’s limitations, Schemmel teaches:
wherein the processing elements are fabricated via wafer-scale integration (Abstract: neural network implemented with wafer-scale integration).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the wafer-scale integration of Schemmel, so as to allow the mapping of a multitude of network models derived from biology on the VLSI neural network while maintaining a high resource usage (Schemmel Abstract).

As per claim 4, the rejection of claim 1 is incorporated.
Schemmel teaches:
wherein the at least two processing elements are fabricated via wafer-scale integration on separate die of a single wafer (Sec. B(1): “Fig. 7 shows an exemplary connection from the neuron labeled N1 to the neuron N2 located on a different HICANN die”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the wafer-scale integration of Schemmel, so as to allow the mapping of a multitude of network models derived from biology on the VLSI neural network while maintaining a high resource usage (Schemmel Abstract).

As per claim 5, the rejection of claim 2 is incorporated.
Schemmel teaches:
wherein the at least single one of the processing elements and other ones of the allocated processing elements are fabricated via wafer-scale integration on separate die of a single wafer (Sec. B(1): “Fig. 7 shows an exemplary connection from the neuron labeled N1 to the neuron N2 located on a different HICANN die”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the wafer-scale integration of Schemmel, so as to allow the mapping of a multitude of network models derived from biology on the VLSI neural network while maintaining a high resource usage (Schemmel Abstract).

Claims 27-29 contain similar limitations to those of claims 3-5 respectively. Thus, they are similarly rejected under 35 U.S.C. 103.


Claims 6 and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Imam in view of Korthikanti and Khan, and further in view of Akopyan et al. (US 2013/0073497 A1).
As per claim 6, the rejection of claims 1 and 2 is incorporated.
While the previously cited art does not teach the claim’s limitations, Akopyan does teach:
at least some of the exchanged fabric packets are fabric vectors (0068: F is a binary vector).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the vectors of Akopyan, as binary vectors allow for efficient computation of bits (Akopyan 0068).

Claim 30 contains similar limitations to those of claim 6. Thus, the claim is similarly rejected under 35 U.S.C. 103.

Claims 10, 11, 12, 13, 34, 35, 36, and 37 are rejected under 35 U.S.C. 103 as being unpatentable over Imam in view of Korthikanti and Khan, and further in view of Shams (“Mapping of Neural Networks onto Programmable Parallel Machines”). 
As per claim 10, the rejection of claims 1 and 2 is incorporated.
While the previously cited art does not teach the claim’s limitations, Shams does teach:
wherein the allocating enables parallel partitioned node computations on multiple of the processing elements providing reduced wall-clock time, compared to performing sequential non-partitioned node computations on a single one of the processing elements (pg. 2615 right column: parallel computation within nodes; pg. 2616: increased performance from parallel computing, and thus reduced wall-clock time).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the parallel computation of Shams, as doing so improves performance (Shams pg. 2616).

As per claim 11, the rejection of claim 10 is incorporated.
Shams teaches:
wherein the parallel computations at times comprise the concurrent use of respective all digital multipliers (pg. 2615: multipliers can be used in parallel and thus concurrently).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the parallel computation of Shams, as doing so improves performance (Shams pg. 2616).

As per claim 12, the rejection of claim 10 is incorporated.
Shams teaches:
wherein the parallel computations comprise at least partially overlapped computations (pg. 2615: “Additional parallelism was exploited by overlapping computation”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the parallel computation of Shams, as doing so improves performance (Shams pg. 2616).

As per claim 13, the rejection of claims 1 and 2 is incorporated.
Shams teaches:
further comprising initializing the fabric with all node and branch parameters required for the concurrent execution of the communications and computations respectively corresponding to the dataflow graph (pg. 2614 right column: the nodes are naturally initialized with starting parameters before the actual concurrent learning phase).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the parallel computation of Shams, as doing so improves performance (Shams pg. 2616).

Claims 34-37 contain similar limitations to those of claims 10-13 respectively. Thus, they are similarly rejected under 35 U.S.C. 103.

Claims 14-17 and 38-41 are rejected under 35 U.S.C. 103 as being unpatentable over Imam in view of Korthikanti, Khan, and Shams, and further in view of Li (US 10,268,679 B2).
As per claim 14, the rejection of claim 13 is incorporated.
While the previously cited art does not teach the claim’s limitations, Li does teach:
subsequent to the initializing, concurrently executing all layers of the dataflow graph for one or more of inference and training (col. 6 lines 15-20: multiple layers of neural network can be processed simultaneously).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the parallel computation of Li so as to improve accuracy and processing speed (Li col. 2 lines 1-5).

As per claim 15, the rejection of claim 14 is incorporated.
Li teaches:
wherein the layers of the dataflow graph comprise input, hidden, and output layers (col. 12 lines 30-40: input, hidden, and output layers).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the parallel computation of Li so as to improve accuracy and processing speed (Li col. 2 lines 1-5).

As per claim 16, the rejection of claim 14 is incorporated.
Li teaches:
wherein the concurrently executing does not require any access to storage external to the fabric for any intermediate state or additional node and branch parameters of the dataflow graph (Fig. 2: no external storage required for nodes and branches; note that Data Store is for input data, not the model itself).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the internal storage of Li, so as to avoid unnecessary and slow access to external memory for computations.

As per claim 17, the rejection of claim 16 is incorporated.
Imam teaches:
wherein the dataflow graph is a neural network, the nodes correspond to neurons, the partitioned node corresponds to a split neuron, and at least some of the node and branch parameters of the dataflow graph correspond to a plurality of weights of the neural network (Fig. 2C: each node corresponds to a neuron, where the neuron is split into dendrite and soma; 0026: weights are stored for neural network).

Claims 38-41 contain similar limitations to those of claims 14-17 respectively. Thus, they are similarly rejected under 35 U.S.C. 103.

Claims 18 and 42 are rejected under 35 U.S.C. 103 as being unpatentable over Imam in view of Korthikanti and Khan, and further in view of Venkatarami (US 2019/0303743 A1) and Davis (US 2016/0182398 A1).
As per claim 18, the rejection of claims 1 and 2 is incorporated.
While the previously cited art does not teach the claim’s limitations, Venkatarami teaches:
wherein except for defects, the fabric is homogeneous, the plurality of processing elements numbers three million (0040: the number of nodes can range from 0.65 to 15 million; since each node corresponds to a PE, the number of PEs is in that range too).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the server of Venkatarami, as having millions of nodes can allow for a 6 to 28 times improvement in performance on DNN topologies (Venkatarami 0040).
While Venkatarami does not teach the remaining limitations, Davis teaches:
each processing element comprises 48kB of private local storage for instructions and data (0052: each core can have 48 kb in cache).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the cache of Davis, as having local cache allows for fast access to time-sensitive data.

Claim 42 contains similar limitations to those of claim 18. Thus, the claim is similarly rejected under 35 U.S.C. 103.

Claims 19, 20, 43, and 44 are rejected under 35 U.S.C. 103 as being unpatentable over Imam in view of Korthikanti and Khan, and further in view of Sigal (US 2008/0222646 A1).
As per claim 19, the rejection of claims 1 and 2 is incorporated.
While the previously cited art does not teach the claim’s limitations, Sigal teaches:
wherein the fabric is enabled to concurrently store and execute a dataflow graph having communications and computations requirements of up to a combined 24GB of instruction and data storage (0028: storing requirements for neural network can be max of 16 GB).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the storage maximum of Sigal, as keeping the memory requirements from becoming too big reduces hardware requirements.

As per claim 20, the rejection of claims 1 and 2 is incorporated.
Imam teaches:
wherein the data storage is used for one or more of weights, forward partial sums, activations, gradient accumulations, delta partial sums, layer errors, duplicated weights, and other implementation overhead, as required by the concurrently executing (0026: local memory is used to store synaptic weights).


Claims 23 and 47 are rejected under 35 U.S.C. 103 as being unpatentable over Imam in view of Korthikanti and Khan, and further in view of Venkatarami.
As per claim 23, the rejection of claim 21 is incorporated.
While the previously cited art does not teach the claim’s limitations, Venkatarami teaches:
wherein the mapping process is performed at least in part via software executing on a placement server external to the fabric (Fig. 26: mapping module is external to fabric; 0159: Fig. 26 architecture can be implemented on a server). 
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the fabric of Imam to include the server of Venkatarami, as external servers provide flexibility in terms of the connectivity of the system to external components (Venkatarami 0159).

Claim 47 contains similar limitations to those of claim 23. Thus, the claim is similarly rejected under 35 U.S.C. 103.



Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  

Any inquiry concerning this communication or earlier communications from the examiner should be directed to LiWu Chang whose telephone number is (571)270-3809, email: li-wu.chang@uspto.gov.  The examiner can normally be reached on M-F. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda M Huang can be reached on (571)270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-

/LI WU CHANG/
Primary Examiner, Art Unit 2124
May 28, 2021