DETAILED ACTION
Claims 1-23 are pending.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The office acknowledges the following papers:
Claims and remarks filed on 1/11/2021,
IDS filed on 1/12/2021.

	Withdrawn objections and rejections
The drawing objections have been withdrawn.

New Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 22-23 are rejected under 35 U.S.C. 103 as being unpatentable over Farabet et al. (U.S. 2012/0303932), in view of Bittner, JR. (U.S. 2007/0186036), in view .
As per claim 22:
Farabet disclosed a data flow computer architecture comprising: 
a dataflow processor providing set of functional units and programmable switches interconnecting the functional units between input ports receiving input values and output ports providing output values (Farabet: Figure 1 elements 100-110, paragraphs 19 and 26-29)(The dataflow processor includes processing tiles that are configured to receive input data and output data via multiplexer interconnections (i.e. programmable switches). Each processing tile includes a set of functional units.), the functional units providing programmable arithmetic functions (Farabet: Figures 1-3 elements 110 and 320, paragraphs 26 and 34)(The processing tiles are configured to perform a specific operation out of a plurality of selectable operations.) and the interconnection providing paths from input ports through functional units to output ports determined by the switch programming (Farabet: Figure 1 element 110, paragraphs 19 and 26-29)(The configured multiplexer connections provide dataflow paths between data inputs sent from the multiplexers to the operators and data outputs sent from the operators to the multiplexers.);
a configuration store holding data configuring the interconnection of the functional units and the arithmetic functions of the functional units to execute a predetermined program in which data received at the input ports is clocked through the functional units and programmable switches to the output ports to implement a sequence of arithmetic functions on the data (Farabet: Figures 1-2 element 110 and 120, paragraphs 17, 21, 26, 28, and 35)(The controller outputs a configuration to the reconfigurable dataflow 
wherein the functional units operate so that calculations occur as soon as operands are available at the functional units and so that memories for storing operands at the functional units are not required (Farabet: Figures 1-2 elements 110a-b and 110d-I, paragraphs 22, 24, and 28)(The operators in the processing tiles perform the configured operation once the data is received in a given clock cycle. FIFOs aren’t required for the operation in figure 2. The use of FIFOs allows for buffering when paths are configured to require more processing than is capable.), and wherein the configuration store defines paths of data through the dataflow processor ensuring corresponding operands arrive at the same time at each functional unit according to the program by adjusting the path of data through the dataflow processor without a need for additional buffer storage elements (Farabet: Figures 1-2 elements 110 and 120, paragraphs 17, 21, 26, 28, and 35)(The controller outputs a configuration to the reconfigurable dataflow processor to configure multiplexer routing and operator functions performed in each processing tile to perform a processing task. The configuration allows 
Farabet failed to teach a clock requiring synchronous movement of data among functional units and programmable switches by one step for each clock cycle, a step being from a functional unit to a switch or from a switch to a functional unit.
However, Bittner combined with Farabet disclosed a clock requiring synchronous movement of data among functional units and programmable switches by one step for each clock cycle, a step being from a functional unit to a switch or from a switch to a functional unit (Bittner: Figures 22-23, paragraphs 131-133)(Farabet: Figures 1-2 element 110, paragraphs 22, 28, and 35)(Farabet disclosed operators in the processing tiles producing one result per clock cycle. Farabet in figure 2 shows an example dataflow calculation where tile 110h stalls one cycle waiting for data from tile 110g. Bittner disclosed vector dot-product operations and mapping portions of the dataflow to machines for execution. The combination allows for Farabet to be configured to execute vector dot-product operations. The vector dot-product operations can be mapped to tiles in Farabet in a way that each tile receiving data produces an execution result to output each clock cycle.).
Farabet disclosed a single algorithm mapped to the processing tiles for execution. The advantage of mapping other complex calculations is that the dataflow processor can perform more useful calculations. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement the vector dot-product calculation of Bittner as an operation 
As per claim 23:
Farabet and Bittner disclosed the dataflow computer architecture of claim 22 further including a set of buffers associated with input ports of the dataflow processor, the buffers synchronized with the clock to release data to the input ports at times adapted to ensure corresponding operands arrive at the same time at each functional unit according to the program and the configuration store (Farabet: Figures 1-2 element 110, paragraphs 22, 28, and 35)(The operators in the processing tiles produce one result per clock cycle and can be configured to form pipelines of functions to be performed. Official notice is given that pipeline buffers hold data between clock cycles for the advantage of synchronized processing. Thus, it would have been obvious to one of ordinary skill in the art to include pipeline buffers in the multiplexer routing elements to store input data.).

Claims 1-5, 7-8, and 12-16 are rejected under 35 U.S.C. 103 as being unpatentable over Farabet et al. (U.S. 2012/0303932), in view of Khailany et al. (U.S. 2012/0011349).
As per claim 1:
Farabet and Khailany disclosed a reconfigurable accelerator architecture comprising:
(1) a microcontroller adapted to receive instructions and data to control other components of the accelerator (Khailany: Figure 1 elements 101 and 103, paragraph 
(2) a stream processor receiving instructions from the microcontroller to autonomously read multiple input values stored in memory according to a selected set of predefined memory access patterns (Khailany: Figures 2, 9A-B, and 11A-B element 143, paragraphs 215-217 and 220)(Farabet: Figure 1 element 130, paragraph 26)(Farabet disclosed the controller configuring the memory access module to prefetch data for processing. Khailany disclosed a stream load/store unit that performs load/store operations according to memory access patterns that include strides and record sizes. The combination implements the stream load/store unit (i.e. stream processor) of Khailany into the system of Farabet to read multiple input values based on memory access patterns.) and to autonomously write multiple output values from the accelerator to memory according to a selected set of predefined memory access patterns (Khailany: Figures 2, 9A-B, and 11A-B element 143, paragraphs 215-217 and 220)(Farabet: Figure 1 element 130, paragraphs 26 and 29)(Farabet disclosed the controller configuring the memory access module to store processed results back to memory. Khailany disclosed a stream load/store unit that performs load/store operations according to memory access patterns that include strides and record sizes. The combination implements the stream 
(3) a reconfigurable dataflow processor configured by the microcontroller to receive the multiple input values to provide output values according to the configuration (Farabet: Figure 1 elements 110-120 and 150, paragraphs 19-21, 26, and 28-29)(The processing tiles are configured by the controller to receive input values, perform dataflow calculations, and provide output value back to the memory access module.).
Farabet disclosed a reconfigurable dataflow processor that can perform a plurality of tasks, but doesn’t explicitly state how one of a plurality of tasks to be performed is chosen. Khailany disclosed a host processor instructing a coprocessor to perform operations. The advantage of offloading processing tasks from a host processor to a coprocessor is that various processing tasks can have increased performance. The advantage of the stream load/store operations of Khailany is that large amounts of data can be loaded/stored through single operations, which improves code density. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement the host processor and stream load/store unit of Khailany into the system of Farabet for the advantages of increased performance and code density.
As per claim 2:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 1 wherein successive instructions from the microcontroller to the stream processor may be received asynchronously with respect to the operation of the dataflow processor and provide for autonomous reading of multiple input values stored in memory or an autonomous writing of multiple output values from the accelerator according to different 
As per claim 3:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 1 wherein the reconfigurable dataflow processor provides a set of programmable switches interconnecting functional units between input ports receiving input values and output ports providing output values (Farabet: Figure 1 element 110, paragraphs 19 and 26-29)(The processing tiles are configured to receive input data and output data via multiplexer interconnections.), the functional units providing selectable multiple arithmetic functions (Farabet: Figures 1-2 element 110, paragraph 26)(The processing tiles are configured to perform a specific operation out of a plurality of selectable operations.) and the interconnection providing paths from input ports through functional units to output ports determined by the switch programming (Farabet: Figure 1 element 110, paragraphs 19 and 26-29)(The multiplexer connections provide dataflow paths from data inputs to data outputs.).
As per claim 4:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 3 wherein the interconnection provides direct interconnections between switches and functional units and direct interconnections between switches (Farabet: Figures 1-2 element 110, paragraph 19)(The processing tiles provide direct connections between 
As per claim 5:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 3 wherein the interconnection provides for at least 32 parallel data lines between switches and between switches and functional units (Farabet: Figures 1-2 element 110, paragraph 19)(Each processing tile provides at least 5 direct connections between neighbors and operators. The processing tiles can be arranged in various configurations. The shown configuration provides at least 32 connections. In addition, according to “In re Rose” (105 USPQ 237 (CCPA 1955)), a change in size or range does not confer patentability over the prior art.).
As per claim 7:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 3 wherein the functional units operate in parallel (Farabet: Figures 1-2 element 110, paragraphs 28-29)(The processing tile functional units are configured to operate in parallel according to the dataflow operation to be performed.).
As per claim 8:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 3 further including a clock permitting a moving of data between switches by one switch or between functional units by one functional unit for each clock cycle (Farabet: Figures 1-2 element 110, paragraphs 22, 28, and 35)(The operators in the processing tiles produce one result per clock cycle and can be configured to form pipelines of functions to be performed. It would have been obvious to one of ordinary skill in the art that operators not performing a function on a processing tile can instead pass data to 
As per claim 12:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 1 wherein the stream processor provides pre-defined memory access patterns including a linear access pattern of contiguous addresses between two memory addresses (Khailany: Figures 2, 9A-B, and 11A-B element 143, paragraphs 215-217 and 220)(Farabet: Figure 1 elements 120 and 130, paragraphs 26-28)(The controller separately configures the memory access module for prefetching data for various tasks. The combination allows for prefetching multiple input values based on memory access patterns. A stride value of one allows for contiguous address data fetching.) and a strided access pattern of regularly spaced discontiguous addresses between two memory addresses (Khailany: Figures 2, 9A-B, and 11A-B element 143, paragraphs 215-217 and 220)(Farabet: Figure 1 elements 120 and 130, paragraphs 26-28)(The controller separately configures the memory access module for prefetching data for various tasks. The combination allows for prefetching multiple input values based on memory access patterns. A stride value of greater than one allows for a strided access pattern of discontiguous addresses.).
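For illustration only, and not as part of the disclosure of either reference, the two access patterns discussed for claim 12 can be sketched as follows. The function name stream_addresses and its parameters are hypothetical and do not come from Farabet, Khailany, or the application, and the record-size parameter mentioned in Khailany is omitted for brevity. The sketch merely shows that a stride of one visits contiguous addresses between two bounds, while a stride greater than one visits regularly spaced, discontiguous addresses.

```python
def stream_addresses(start, end, stride=1):
    """Return the addresses visited from start (inclusive) to end (exclusive).

    Hypothetical helper for illustration; it is not the interface of any
    cited reference. stride counts addressable units between accesses.
    """
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    return list(range(start, end, stride))


# Linear pattern: stride of one, so every address between the bounds is visited.
linear = stream_addresses(0x100, 0x108)

# Strided pattern: stride of four, so only every fourth address is visited.
strided = stream_addresses(0x100, 0x110, stride=4)
```

Under these assumptions the linear pattern is simply the special case of the strided pattern in which the stride equals one.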
As per claim 13:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 12 wherein the stream processor uses data obtained with the pre-defined memory access patterns as addresses of data to be used as the multiple input values provided to the reconfigurable dataflow processor (Khailany: Figures 2, 9A-B, and 11A-B element 143, paragraphs 215-217 and 220)(Farabet: Figure 1 elements 120 and 130, paragraphs 
As per claim 14:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 1 wherein the stream processor operates autonomously with respect to the microcontroller after programming by the microcontroller (Khailany: Figures 2, 9A-B, and 11A-B element 143, paragraphs 215-217 and 220)(Farabet: Figure 1 elements 120 and 130, paragraphs 26-28)(The controller separately configures the memory access module for prefetching data for various tasks. The combination allows for prefetching multiple input values based on memory access patterns.).
As per claim 15:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 1 wherein the reconfigurable dataflow processor includes input and output buffers to operate asynchronously with respect to the stream processor (Khailany: Figures 2, 9A-B, and 11A-B element 143, paragraphs 215-217 and 220)(Farabet: Figure 1 element 110, paragraph 24)(The operators include input and output FIFOs. The dataflow operators operate asynchronously with the stream load/store units based on input data and configurations.).
As per claim 16:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 1 wherein the microcontroller issues stream commands to the stream processor .

Claims 6, 9, 17, and 20-21 are rejected under 35 U.S.C. 103 as being unpatentable over Farabet et al. (U.S. 2012/0303932), in view of Khailany et al. (U.S. 2012/0011349), in view of Official Notice.
As per claim 6:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 3 wherein the functional units may provide different selections of arithmetic and logical operations (Farabet: Figure 2 elements 110a-I, paragraphs 28-29)(The processing tiles can be configured to perform different arithmetic operations. Official notice is given that logical operations can be implemented for the advantage of performing data movement and filtering operations. Thus, it would have been obvious to one of ordinary skill in the art to implement logical operators in the processing tiles of Farabet.).
As per claim 9:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 3 wherein the microcontroller controls the reconfigurable dataflow processor by 
As per claim 17:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 16 wherein the microcontroller further issues barrier commands to the stream processor defining a necessary completion order of memory accesses before and after the barrier command (Farabet: Figure 1 element 120, paragraph 26)(Official notice is given that synchronization commands can be used for the advantage of ensuring proper data ordering. Thus, it would have been obvious to implement barrier operations in Farabet.).
As per claim 20:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 1, wherein the microcontroller responds to predetermined instructions to provide information about a number and type of functional units in the reconfigurable dataflow processor (Khailany: Figure 1 elements 101 and 103, paragraph 18)(Farabet: Figure 1 element 120)(Official notice is given that host processors can offload processing tasks to multiple coprocessors for the advantage of increased performance. Thus, it would have been obvious to one of ordinary skill in the art that the host processor of Khailany can offload multiple dataflow processing tasks to multiple instances of dataflow processors. 
As per claim 21:
Claim 21 essentially recites the same limitations of claim 1. Claim 21 additionally recites the following limitations:
an out-of-order, speculative processor core communicating with a memory for receiving instructions and reading and writing data (Khailany: Figure 1 element 101, paragraph 18)(Farabet: Figure 1 element 100)(Official notice is given that processors can implement out-of-order processing for the advantage of increased performance. Thus, it would have been obvious to one of ordinary skill in the art to implement out-of-order processing in the host processor of Khailany.); and 
a plurality of reconfigurable accelerators controlled by the out-of-order, speculative processor core (Khailany: Figure 1 elements 101 and 103, paragraph 18)(Farabet: Figure 1 element 100)(Official notice is given that host processors can offload processing tasks to multiple coprocessors for the advantage of increased performance. Thus, it would have been obvious to one of ordinary skill in the art that the host processor of Khailany can offload multiple dataflow processing tasks to multiple instances of dataflow processors.).

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Farabet et .
As per claim 10:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 1.
Farabet and Khailany failed to teach wherein the microcontroller is a von Neumann, single-issue, in-order core.
However, Dockser combined with Farabet and Khailany disclosed wherein the microcontroller is a von Neumann, single-issue, in-order core (Dockser: Figure 2 elements 206 and 240-241, paragraphs 25 and 36)(Khailany: Figure 1 elements 101 and 103, paragraph 18)(Farabet: Figure 1 element 120, paragraphs 21 and 26)(Dockser disclosed a single-issue, in-order coprocessor execution pipeline for executing instructions not executable by a main processor. Khailany disclosed a host processor outputting commands to a stream coprocessor. The combination allows for a coprocessor to receive executable commands by the in-order core and dataflow processing tasks.).
The advantage of performing instructions in a coprocessor is that certain instructions can be offloaded for acceleration, which improves processor performance. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement the coprocessor of Dockser in Farabet to offload additional tasks for acceleration.

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Farabet et al. (U.S. 2012/0303932), in view of Khailany et al. (U.S. 2012/0011349), in view of Dockser et al. (U.S. 2012/0204008), further in view of Official Notice.
As per claim 11:
Farabet, Khailany, and Dockser disclosed the reconfigurable accelerator architecture of claim 10 wherein the microcontroller is further adapted to receive instructions and data from a primary processor to execute logical and arithmetic instructions in response to the instructions (Dockser: Figure 2 elements 206 and 240-241, paragraphs 25 and 36)(Khailany: Figure 1 elements 101 and 103, paragraph 18)(Farabet: Figure 1 element 120, paragraphs 21 and 26)(Dockser disclosed a single-issue, in-order coprocessor execution pipeline for executing instructions not executable by a main processor. Khailany disclosed a host processor outputting commands to a stream coprocessor. The combination allows for a coprocessor to receive executable commands by the in-order core and dataflow processing tasks. Official notice is given that coprocessor execution units can process arithmetic and logical operations for the advantage of accelerating execution of both. Thus, it would have been obvious to one of ordinary skill in the art to implement both types of execution in Dockser.) and data to return data to the primary processor without involvement of a stream processor and the reconfigurable dataflow processor (Dockser: Figure 2 element 242, paragraph 36)(Khailany: Figure 1 elements 101 and 103, paragraph 18)(Official notice is given that coprocessors can return execution results to main processors for the advantage of a main processor performing further execution. Thus, it would have been obvious to one of ordinary skill in the art to implement a coprocessor result return path in Dockser.).

Claims 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Farabet et al. (U.S. 2012/0303932), in view of Khailany et al. (U.S. 2012/0011349), .
As per claim 18:
Farabet and Khailany disclosed the reconfigurable accelerator architecture of claim 1.
Farabet and Khailany failed to teach including a scratchpad memory communicating with the stream processor to read data from the memory or write data to the memory as controlled by the stream processor.
However, Asher combined with Farabet and Khailany disclosed including a scratchpad memory communicating with the stream processor to read data from the memory or write data to the memory as controlled by the stream processor (Asher: Figures 1 and 3C element 130, paragraphs 21, 55-57, and 70)(Khailany: Figures 2, 9A-B, and 11A-B element 143, paragraphs 215-217 and 220)(Farabet: Figure 1 element 130, paragraphs 20 and 26)(Farabet disclosed a DMA memory control module prefetching data from memory. Khailany disclosed a stream load/store unit reading/writing data from/to memory based on an access pattern. Asher disclosed a scratchpad memory to store prefetched data based on input output bridge direct memory access instructions. The combination allows for Farabet to include a scratchpad memory to store data prefetched from the off-chip memory.).
The advantage of including a scratchpad memory for storing prefetch data is that multiple prefetch requests can be performed (Asher: Paragraph 56). Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement a scratchpad memory in Farabet to store prefetched data.
As per claim 19:
.

Response to Arguments
The arguments presented by Applicant in the response received on 1/11/2021 are considered persuasive.
Applicant argues for claim 22:
“The Applicant submits that this indicates that Farabet both has a mechanism to stall data and expects the data to have to be stalled in order to align operands. As noted, stalling is inconsistent with the current claim limitations that require movement on each clock cycle.
Importantly, it is believed that this citation from Farabet indicates that Farabet does not recognize the possibility of a simple clocking system that moves data on each clock cycle. Instead Farabet requires a more complex system of buffering and managing those buffers to accommodate different path lengths.”

This argument is found to be persuasive for the following reason. The examiner agrees that Farabet failed to teach execution of a large operation that reads upon the newly claimed limitation. However, a new ground of rejection, necessitated by the amendment, is set forth above.

	Conclusion
THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
The following is text cited from 37 CFR 1.111(c): In amending in reply to a rejection of claims in an application or patent under reexamination, the applicant or patent owner must clearly point out the patentable novelty which he or she thinks the claims present in view of the state of the art disclosed by the references cited or the objections made. The applicant or patent owner must also show how the amendments avoid such references or objections.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACOB A. PETRANEK whose telephone number is (571)272-5988.  The examiner can normally be reached on M-F 8:00-4:30.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JACOB PETRANEK/Primary Examiner, Art Unit 2183