DETAILED ACTION
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. In the event the determination of the status of the application as subject to AIA 35 U.S.C. §102 and §103 (or as subject to pre-AIA 35 U.S.C. §102 and §103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Priority
Examiner acknowledges Applicant's claim for benefit of 62/465,063 filed 2/28/2017.

CLAIM INTERPRETATION

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

Claim limitations “pipeline configured to process…”, “decoder configured to decode… further configured to map…”, “matrix vector unit is configured to process…”, “first multifunction unit is configured to process…”, “second multifunction unit is configured to process…”, “third multifunction unit is configured to process…”, “matrix vector unit or the first multifunction unit is configured to process…”, “input message processor configured to process…”, “scalar processor configured to process…”, and “neural function unit configured to process…” have been interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because they use the generic placeholders “pipeline”, “decoder”, “matrix vector unit”, “first multifunction unit”, “second multifunction unit”, “third multifunction unit”, “input message processor”, “scalar processor”, and “neural function unit” coupled with functional language (“configured to process…”) without reciting sufficient structure to achieve the function. Furthermore, the generic placeholders are not preceded by a structural modifier. The claims lack specific structure linked to these claimed placeholders or their functions.

A review of the specification shows that the following appears to be the corresponding structure described in the specification for the 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph limitation:
Figure 9: Depicts Matrix Vector Unit as being hardware including SRAM;
¶21 as filed: FIG. 14 shows a diagram of how chains of instructions may be processed using a hardware node (e.g., an FPGA), and Figure 14 depicts that the decoder and multifunction units (MFU) are part of this hardware FPGA;
¶23 as filed: the invention is implemented on hardware blocks of a node (e.g. logic blocks and reconfigurable interconnects of an FPGA);
¶26 as filed: The nodes may be hardware programmable logic devices that could be customized specifically to perform the types of operations that occur in the context of neural networks, such as DNNs. In one example, the state of a neural network model and the parameters used to control the model may be pinned to the on-chip memories of the nodes comprising a distributed hardware platform. The neural network model may be pinned (e.g., preloaded) to the on-chip memories at the service start up time and the contents of the on-chip memories may not be altered unless the model requires alteration or another event that requires reloading the on-chip memories with the model. Thus, in this example, contrary to other arrangements, neural network model may not be accessed from the DRAM associated with the hardware platform, and instead, be loaded directly into the on-chip memories (e.g., SRAMs) of the hardware node. Pinning a model across a distributed set of programmable logic blocks (e.g., FPGA resources) may allow the nodes (e.g., FPGAs) to operate at full capacity and that may advantageously improve the throughput and the latency associated with the service. As an example, even a single request from the service may result in the distributed set of nodes to operate at full capacity and thereby delivering results requested by a user of the service at very low latency;
¶27 as filed: Programmable hardware logic blocks in the nodes may process the matrices or vectors to perform various operations, including multiply, add, and other operations against input vectors representing encoded information related to the service. In one example, the matrices or vectors of weights may be partitioned and pinned across multiple nodes by using techniques such as graph partitioning;
¶¶30-31 as filed: each node may be implemented as a server and may further include at least one hardware node (e.g., an FPGA) or several FPGAs which may be coupled via transport links or a network interface;
¶32 as filed: Figure 3 depicts a hardware node which may include an input message processor (IMP), an output message processor (OMP), a control/scalar processor (CSP), and a neural function unit (NFU).
Accordingly, the “pipeline”, “decoder”, “matrix vector unit”, “first multifunction unit”, “second multifunction unit”, “third multifunction unit”, “input message processor”, “scalar processor”, and “neural function unit” are each interpreted to be hardware based on the instant disclosure, such as an FPGA or equivalents, or dedicated on-chip memories with preloaded models.
If applicant wishes to provide further explanation or dispute the examiner’s interpretation of the corresponding structure, applicant must identify the corresponding structure with reference to the specification by page and line number, and to the drawing, if any, by reference characters in response to this Office action. 
If applicant does not intend to have the claim limitation(s) treated under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112 , sixth paragraph, applicant may amend the claim(s) so that it/they will clearly not invoke 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, or present a sufficient showing that the claim recites/recite sufficient structure, material, or acts for performing the claimed function to preclude application of 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
For more information, see MPEP § 2173 et seq. and Supplementary Examination Guidelines for Determining Compliance With 35 U.S.C. 112 and for Treatment of Related Issues in Patent Applications, 76 FR 7162, 7167 (Feb. 9, 2011).

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. §112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. §112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 6, 14, and 18 are rejected under 35 U.S.C. §112(b) or 35 U.S.C. §112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.
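Claims 6, 14, and 18: The term "substantially" in the phrase "substantially in parallel" is a relative term which renders the claim indefinite. The term is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.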
Appropriate corrections are required.

PRIOR ART
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. §102 and §103 (or as subject to pre-AIA 35 U.S.C. §102 and §103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. §103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. §102(b)(2)(C) for any potential 35 U.S.C. §102(a)(2) prior art against the later invention.
Claims 1-20 are rejected under 35 U.S.C. §103 as being unpatentable over Pechanek (US 2015/0039855) in view of Sankaranarayanan (US 2016/0124651).

Claim 1 (Independent).
Pechanek discloses: A method in a processor (e.g. ¶8: executed by a processor) including a pipeline for processing instructions, the pipeline including a matrix vector unit, a first multifunction unit, wherein the first multifunction unit is connected to receive an input from the matrix vector unit, a second multifunction unit, wherein the second multifunction unit is connected to receive an output from the first multifunction unit, and a third multifunction unit, wherein the third multifunction unit is connected to receive an output from the second multifunction unit (e.g. ¶172: FIG. 24 illustrates … first set of pipeline latches … The T node system 2440 comprises a decoder 2441 having node operation (NodeOp) inputs 2442, three node function units 2444-2446 and a multiplexer 2453 or Figures 24-27 and the associated disclosure; EN: The three node function units 2444-2446 are interpreted to be the multifunction units which process scalar information by multiplying, adding, etc. as instructed in the corresponding instructions using the corresponding weights, etc., and the multiplexer 2453 is interpreted to be the matrix vector unit which multiplexes the outputs from multifunction units / node function units 2444-2446), the method comprising:
decoding instructions received via an input queue, wherein a subset of the received instructions comprises a set of instructions including a first type of instruction for processing by only the matrix vector unit and a second type of instruction for processing by only at least one of the first multifunction unit, the second multifunction unit, or the third multifunction unit (e.g. ¶8: packet of chained instructions … instruction of the chain of instructions is decoded to determine a function specified by the first instruction to identify an execution unit to provide the function, and to identify an operand input pipeline register (OIPR) of a destination instruction of the chain of instructions as a destination for the result generated by the identified execution unit or ¶9: identify control information encoded in the first instruction that is used for execution of a second instruction that is a pre-specified destination instruction placed in a sequence of instructions at a pre-specified location relative to the first instruction, and to identify an operand input pipeline register (OIPR) associated with the second instruction as a destination for a result generated by execution of the first instruction. The control information is transferred across a local network between execution units to store the control information in a pending register. The first instruction is executed to produce the result which is transferred across the local network between execution units to the identified OIPR. The second instruction is executed to fetch the result from the identified OIPR and operate on the result using the control information fetched from the pending register to adjust the second execution unit for executing the second instruction or ¶172: decoder 2441 or Figure 24: 2441; EN: The reference stipulates that the decoder identifies an execution unit for the function of the instruction and a destination for the result generated, and one of ordinary skill in the art before the earliest effective filing date of the invention would have clearly understood this to include ensuring that the identified components are capable of performing the instructions assigned to them on the data given to them); and 
mapping a first instruction for processing by the matrix vector unit or to any one of the first multifunction unit, the second multifunction unit, or the third multifunction unit (e.g. ¶8: packet of chained instructions …).
Pechanek fails to explicitly recite:
mapping depending on whether the first instruction is the first type of instruction or the second type of instruction.
Sankaranarayanan discloses:
mapping a first instruction for processing by the matrix vector unit or to any one of the first multifunction unit, the second multifunction unit, or the third multifunction unit depending on whether the first instruction is the first type of instruction or the second type of instruction (e.g. ¶44 or Figures 1, 2: element 115 is a scalar multifunction unit, which can be accessed independently of the element 116 which is a vector processing unit, based on the type of the instruction decoded by element 113).
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pechanek to incorporate mapping scalar and vector operations to corresponding units based on the operation type as taught by Sankaranarayanan for the benefit of multidimensional memory accesses, increasing the available bandwidth to the functional units, minimizing cache miss stall by bypassing level one data cache, handling address generation automatically freeing up the address generation instruction slots for other computations and enabling advantageous use of vector SIMD processing (Sankaranarayanan especially e.g. ¶¶73,112). 
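EN (illustrative only): The type-dependent mapping recited in claim 1 can be sketched as follows. This is a minimal sketch, not a characterization of how either reference or the instant application implements the decoder. The names InstrType, Instruction, map_instruction, and decode_and_map, the unit labels, and the opcodes in the usage note are hypothetical, and filling the multifunction units in order is an assumption made only for the example.

```python
from dataclasses import dataclass
from enum import Enum, auto


class InstrType(Enum):
    MATRIX_VECTOR = auto()  # "first type": processed only by the matrix vector unit
    MULTIFUNCTION = auto()  # "second type": processed only by a multifunction unit


@dataclass(frozen=True)
class Instruction:
    opcode: str        # hypothetical opcode name
    itype: InstrType   # type determined by the decoder


# Hypothetical labels for the pipeline units named in the claim.
MVU = "MVU"
MFUS = ("MFU1", "MFU2", "MFU3")


def map_instruction(instr, mfu_index=0):
    """Return the unit an instruction is mapped to, based only on its type."""
    if instr.itype is InstrType.MATRIX_VECTOR:
        return MVU
    return MFUS[mfu_index % len(MFUS)]


def decode_and_map(queue):
    """Walk an input queue and map each instruction to a unit.

    Assumption: successive second-type instructions are assigned to
    MFU1, MFU2, MFU3 in order (wrapping around after MFU3).
    """
    mapping = []
    next_mfu = 0
    for instr in queue:
        unit = map_instruction(instr, next_mfu)
        if instr.itype is InstrType.MULTIFUNCTION:
            next_mfu += 1
        mapping.append((instr.opcode, unit))
    return mapping
```

For example, a chain consisting of one first-type instruction followed by two second-type instructions is mapped by decode_and_map to the MVU, MFU1, and MFU2 labels respectively.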


Claim 9 (Independent).
Pechanek discloses: A processor (e.g. ¶¶7-8: processor) comprising:
a pipeline configured to process instructions, the pipeline including a matrix vector unit, a first multifunction unit, wherein the first multifunction unit is connected to receive an input from the matrix vector unit, a second multifunction unit, wherein the second multifunction unit is connected to receive an output from the first multifunction unit, and a third multifunction unit, wherein the third multifunction unit is connected to receive an output from the second multifunction unit (e.g. ¶172: FIG. 24 illustrates … first set of pipeline latches … The T node system 2440 comprises a decoder 2441 having node operation (NodeOp) inputs 2442, three node function units 2444-2446 and a multiplexer 2453 or Figures 24-27 and the associated disclosure; EN: The three node function units 2444-2446 are interpreted to be the multifunction units which process scalar information by multiplying, adding, etc. as instructed in the corresponding instructions using the corresponding weights, etc., and the multiplexer 2453 is interpreted to be the matrix vector unit which multiplexes the outputs from multifunction units / node function units 2444-2446); and 
a decoder configured to decode instructions received via an input queue, wherein a subset of the received instructions comprises a set of instructions including a first type of instruction for processing by only the matrix vector unit and a second type of instruction for processing by only at least one of the first multifunction unit, the second multifunction unit, or the third multifunction unit (e.g. ¶8: packet of chained instructions … instruction of the chain of instructions is decoded to determine a function specified by the first instruction to identify an execution unit to provide the function, and to identify an operand input pipeline register (OIPR) of a destination instruction of the chain of instructions as a destination for the result generated by the identified execution unit or ¶9: identify control information encoded in the first instruction that is used for execution of a second instruction …; EN: The reference stipulates that the decoder identifies an execution unit for the function of the instruction and a destination for the result generated, and one of ordinary skill in the art before the earliest effective filing date of the invention would have clearly understood this to include ensuring that the identified components are capable of performing the instructions assigned to them on the data given to them).
Pechanek fails to explicitly recite:
mapping to the units depending on instruction type.
Sankaranarayanan discloses:
wherein the decoder (e.g. Figure 1 element 113 and the associated disclosure) is further configured to:
map a first instruction for processing by the matrix vector unit or the first multifunction unit depending on whether the first instruction is the first type of instruction or the second type of instruction (e.g. ¶44: decoded instructions are supplied to scalar datapath side A and vector datapath side B to functional units within scalar datapath side A and vector datapath side B … scalar datapath side A includes plural functional units that preferably operate in parallel or Figure 1: element 115 is a scalar multifunction unit, which can be accessed independently of the element 116 which is a vector processing unit, based on the type of the instruction decoded by element 113 or Figure 2: elements 221, 241 and the associated discussion or ¶52 :mapping to scalar functional unit 221 for scalar instructions or ¶57: mapping to vector functional unit 241 for vector operation), 
map a second instruction for processing by the second multifunction unit depending on whether the second instruction is the second type (e.g. ¶44: decoded instructions are supplied to scalar datapath side A to functional units within scalar datapath side A … scalar datapath side A includes plural functional units that preferably operate in parallel or Figure 2: element 222 and the associated discussion or ¶52: mapping to scalar functional unit 222 for scalar instructions), and 
map a third instruction for processing by the third multifunction unit depending on whether the third instruction is the second type (e.g. ¶44: decoded instructions are supplied to scalar datapath side A to functional units within scalar datapath side A … scalar datapath side A includes plural functional units that preferably operate in parallel or Figure 2: element 223 and the associated discussion or ¶53: mapping to scalar functional unit 223 for scalar instructions).  
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pechanek to incorporate mapping scalar and vector operations to corresponding units based on the operation type as taught by Sankaranarayanan for the benefit of multidimensional memory accesses, increasing the available bandwidth to the functional units, minimizing cache miss stall by bypassing level one data cache, handling address generation automatically freeing up the address generation instruction slots for other computations and enabling advantageous use of vector SIMD processing (Sankaranarayanan especially e.g. ¶¶73,112). 


Claim 17 (Independent).
Pechanek discloses: A system comprising:
an input message processor configured to process incoming messages, wherein the input message processor is further configured to split the incoming messages into a first set of messages and a second set of messages (e.g. ¶8: packet of chained instructions … instruction of the chain of instructions is decoded to determine a function specified by the first instruction to identify an execution unit to provide the function, and to identify an operand input pipeline register (OIPR) of a destination instruction of the chain of instructions as a destination for the result generated by the identified execution unit or ¶172: decoder 2441 or Figure 24: 2441);
a scalar processor configured to process the first set of messages and not the second set of messages (e.g. ¶172: three node function units 2444-2446; EN: The node function units 2444-2446 are interpreted to be the multifunction units which process scalar information by multiplying, adding, etc. as instructed in the corresponding instructions using the corresponding weights, etc.);
a neural function unit configured to process instructions placed in a plurality of queues by the scalar processor on input data received at least via the second set of messages (e.g. ¶172: node T22 755 is coupled to the three memory nodes 731, 735, and 739 which supply the weights and a current neuron value for …), the neural function unit comprising:
a pipeline configured to process the instructions, the pipeline including a matrix vector unit, a first multifunction unit, wherein the first multifunction unit is connected to receive an input from the matrix vector unit, a second multifunction unit, wherein the second multifunction unit is connected to receive an output from the first multifunction unit, and a third multifunction unit, wherein the third multifunction unit is connected to receive an output from the second multifunction unit (e.g. ¶172: FIG. 24 illustrates … first set of pipeline latches … The T node system 2440 comprises a decoder 2441 having node operation (NodeOp) inputs 2442, three node function units 2444-2446 and a multiplexer 2453 or Figures 24-27 and the associated disclosure; EN: The three node function units 2444-2446 are interpreted to be the multifunction units which process scalar information by multiplying, adding, etc. as instructed in the corresponding instructions using the corresponding weights, etc., and the multiplexer 2453 is interpreted to be the matrix vector unit which multiplexes the outputs from multifunction units / node function units 2444-2446); and 
a decoder configured to decode instructions received via an input queue, wherein a subset of the received instructions comprises a set of instructions including a first type of instruction for processing by only the matrix vector unit and a second type of instruction for processing by only at least one of the first multifunction unit, the second multifunction unit, or the third multifunction unit (e.g. ¶8: packet of chained instructions … instruction of the chain of instructions is decoded to determine a function specified by the first instruction to identify an execution unit to provide the function, and to identify an operand input pipeline register (OIPR) of a destination instruction of the chain of instructions as a destination for the result generated by the identified execution unit or ¶9: identify control information encoded in the first instruction that is used for execution of a second instruction that is a pre-specified destination instruction placed in a sequence of instructions at a pre-specified location relative to the first instruction, and to identify an operand input pipeline register (OIPR) associated with the second instruction as a destination for a result generated by execution of the first instruction. The control information is transferred across a local network between execution units to store the control information in a pending register. The first instruction is executed to produce the result which is transferred across the local network between execution units to the identified OIPR …; EN: The reference stipulates that the decoder identifies an execution unit for the function of the instruction and a destination for the result generated, and one of ordinary skill in the art before the earliest effective filing date of the invention would have clearly understood this to include ensuring that the identified components are capable of performing the instructions assigned to them on the data given to them).
Pechanek fails to explicitly recite:
mapping to the units depending on instruction type.
Sankaranarayanan discloses:
wherein the decoder (e.g. Figure 1 element 113 and the associated disclosure) is further configured to:
map a first instruction for processing by the matrix vector unit or the first multifunction unit depending on whether the first instruction is the first type of instruction or the second type of instruction (e.g. ¶44: decoded instructions are supplied to scalar datapath side A and vector datapath side B to functional units within scalar datapath side A and vector datapath side B … scalar datapath side A includes plural functional units that preferably operate in parallel or Figure 1: element 115 is a scalar multifunction unit, which can be accessed independently of the element 116 which is a vector processing unit, based on the type of the instruction decoded by element 113 or Figure 2: elements 221, 241 and the associated discussion or ¶52 :mapping to scalar functional unit 221 for scalar instructions or ¶57: mapping to vector functional unit 241 for vector operation), 
map a second instruction for processing by the second multifunction unit depending on whether the second instruction is the second type (e.g. ¶44: decoded instructions are supplied to scalar datapath side A to functional units within scalar datapath side A … scalar datapath side A includes plural functional units that preferably operate in parallel or Figure 2: element 222 and the associated discussion or ¶52: mapping to scalar functional unit 222 for scalar instructions), and 
map a third instruction for processing by the third multifunction unit depending on whether the third instruction is the second type (e.g. ¶44: decoded instructions are supplied to scalar datapath side A to functional units within scalar datapath side A … scalar datapath side A includes plural functional units that preferably operate in parallel or Figure 2: element 223 and the associated discussion or ¶53: mapping to scalar functional unit 223 for scalar instructions).
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pechanek to incorporate mapping scalar and vector operations to corresponding units based on the operation type as taught by Sankaranarayanan for the benefit of multidimensional memory accesses, increasing the available bandwidth to the functional units, minimizing cache miss stall by bypassing level one data cache, handling address generation automatically freeing up the address generation instruction slots for other computations and enabling advantageous use of vector SIMD processing (Sankaranarayanan especially e.g. ¶¶73,112). 
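EN (illustrative only): The message flow recited in claim 17 (an input message processor splitting messages into two sets, a scalar processor handling only the first set and placing instructions in a plurality of queues, and a neural function unit applying those instructions to data from the second set) can be sketched as follows. This is a minimal sketch under stated assumptions: the "control"/"data" message kinds, the dictionary message layout, the two-queue round-robin placement, and the toy operations in OPS are all hypothetical and are not taken from the claims or the cited references.

```python
from collections import deque

# Hypothetical element-wise operations standing in for queued instructions.
OPS = {
    "double": lambda vec: [2 * x for x in vec],
    "inc": lambda vec: [x + 1 for x in vec],
}


def split_messages(messages):
    """Input message processor: split messages into a first and a second set.

    Assumption: "control" messages form the first set and all other
    messages (here, "data") form the second set.
    """
    first, second = [], []
    for msg in messages:
        (first if msg["kind"] == "control" else second).append(msg)
    return first, second


def scalar_processor(first_set, num_queues=2):
    """Process only the first set; place its instructions into several queues."""
    queues = [deque() for _ in range(num_queues)]
    for i, msg in enumerate(first_set):
        queues[i % num_queues].append(msg["instr"])  # round-robin placement
    return queues


def neural_function_unit(queues, second_set):
    """Apply the queued instructions to input data from the second set."""
    vectors = [msg["payload"] for msg in second_set]
    for queue in queues:            # drain the queues in order
        while queue:
            op = OPS[queue.popleft()]
            vectors = [op(vec) for vec in vectors]
    return vectors
```

With one data message carrying [1.0, 2.0] and control messages carrying "double" and then "inc", the sketch doubles the vector and then increments it, yielding [3.0, 5.0].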


Claim 2.
Pechanek fails to explicitly recite:
mapping to the units depending on instruction type.
Sankaranarayanan further discloses:
providing the first instruction as an input for processing by the matrix vector unit if the first instruction is of the first type (e.g. ¶44: decoded instructions are supplied to vector datapath side B to functional units within vector datapath side B or Figure 1: providing instructions to element 116 which is a vector processing unit, based on the type of the instruction decoded by element 113 or Figure 2: element 241 and the associated discussion or ¶57: mapping to vector functional unit 241 for vector operation).
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pechanek to incorporate mapping scalar and vector operations to corresponding units based on the operation type as taught by Sankaranarayanan for the benefit of multidimensional memory accesses, increasing the available bandwidth to the functional units, minimizing cache miss stall by bypassing level one data cache, handling address generation automatically freeing up the address generation instruction slots for other computations and enabling advantageous use of vector SIMD processing (Sankaranarayanan especially e.g. ¶¶73,112). 

Claim 3.
Pechanek fails to explicitly recite:
mapping to the units depending on instruction type.
Sankaranarayanan further discloses: 
further comprising providing the first instruction as an input for processing by the first multifunction unit if the first instruction is of the second type (e.g. ¶44: decoded instructions are supplied to scalar datapath side A to functional units within scalar datapath side A … scalar datapath side A includes plural functional units that preferably operate in parallel or Figure 1: element 115 is a scalar multifunction unit, which can be accessed independently of the element 116 which is a vector processing unit, based on the type of the instruction decoded by element 113 or Figure 2: element 221 and the associated discussion or ¶52 :mapping to scalar functional unit 221 for scalar instructions).  
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pechanek to incorporate mapping scalar and vector operations to corresponding units based on the operation type as taught by Sankaranarayanan for the benefit of multidimensional memory accesses, increasing the available bandwidth to the functional units, minimizing cache miss stall by bypassing level one data cache, handling address generation automatically freeing up the address generation instruction slots for other computations and enabling advantageous use of vector SIMD processing (Sankaranarayanan especially e.g. ¶¶73,112). 

Claim 4.
Pechanek fails to explicitly recite:
mapping to the units depending on instruction type.
Sankaranarayanan further discloses: 
further comprising mapping a second instruction for processing by the second multifunction unit depending on whether the second instruction is the second type of instruction and providing the second instruction as an input for processing by the second multifunction unit if the second instruction is of the second type (e.g. ¶44: decoded instructions are supplied to scalar datapath side A to functional units within scalar datapath side A … scalar datapath side A includes plural functional units that preferably operate in parallel or Figure 2: element 222 and the associated discussion or ¶52: mapping to scalar functional unit 222 for scalar instructions).  
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pechanek to incorporate mapping scalar and vector operations to corresponding units based on the operation type as taught by Sankaranarayanan for the benefit of multidimensional memory accesses, increasing the available bandwidth to the functional units, minimizing cache miss stall by bypassing level one data cache, handling address generation automatically freeing up the address generation instruction slots for other computations and enabling advantageous use of vector SIMD processing (Sankaranarayanan especially e.g. ¶¶73,112). 

Claim 5.
Pechanek fails to explicitly recite:
mapping to the units depending on instruction type.
Sankaranarayanan further discloses: 
further comprising mapping a second instruction for processing by the second multifunction unit depending on whether the second instruction is the second type of instruction (e.g. ¶44: decoded instructions are supplied to scalar datapath side A to functional units within scalar datapath side A … scalar datapath side A includes plural functional units that preferably operate in parallel or Figure 2: element 222 and the associated discussion or ¶52: mapping to scalar functional unit 222 for scalar instructions) and providing the third instruction as an input for processing by the third multifunction unit if the third instruction is of the second type (e.g. ¶44: decoded instructions are supplied to scalar datapath side A to functional units within scalar datapath side A … scalar datapath side A includes plural functional units that preferably operate in parallel or Figure 2: element 223 and the associated discussion or ¶53: mapping to scalar functional unit 223 for scalar instructions).  
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pechanek to incorporate mapping scalar and vector operations to corresponding units based on the operation type as taught by Sankaranarayanan for the benefit of multidimensional memory accesses, increasing the available bandwidth to the functional units, minimizing cache miss stall by bypassing level one data cache, handling address generation automatically freeing up the address generation instruction slots for other computations and enabling advantageous use of vector SIMD processing (Sankaranarayanan especially e.g. ¶¶73,112).

Claim 6.
Pechanek further discloses: 
wherein the first instruction is input for processing by the matrix vector unit or the first multifunction unit, the second instruction is input for processing by the second multifunction unit, and the third instruction is input for processing by the third multifunction unit substantially in parallel (e.g. ¶172: FIG. 24 illustrates … first set of pipeline latches … The T node system 2440 comprises a decoder 2441 having node operation (NodeOp) inputs 2442, three node function units 2444-2446 and a multiplexer 2453 or Figures 24-27 and the associated disclosure; EN: The three node function units 2444-2446 are interpreted to be the multifunction units which process scalar information by multiplying, adding, etc. as instructed in the corresponding instructions using the corresponding weights, etc., and the multiplexer 2453 is interpreted to be the matrix vector unit which multiplexes the outputs from multifunction units / node function units 2444-2446 and these units operate on instructions in parallel).
Sankaranarayanan also discloses: 
wherein the first instruction is input for processing by the matrix vector unit or the first multifunction unit, the second instruction is input for processing by the second multifunction unit, and the third instruction is input for processing by the third multifunction unit substantially in parallel (e.g. ¶40: operate on plural instructions in parallel or ¶44: decoded instructions are supplied to scalar datapath side A and vector datapath side B to functional units within scalar datapath side A and vector datapath side B … scalar datapath side A includes plural functional units that preferably operate in parallel or Figure 1: element 115 is a scalar multifunction unit, which can be accessed independently of the element 116 which is a vector processing unit, based on the type of the instruction decoded by element 113 or Figure 2: elements 221, 222, 223, 241 and the associated discussion or ¶52 :mapping to scalar functional unit 221 for scalar instructions or ¶53:mapping to scalar functional unit 222 for scalar instructions or ¶54 :mapping to scalar functional unit 223 for scalar instructions or ¶57: mapping to vector functional unit 241 for vector operation).  
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pechanek to incorporate mapping scalar and vector operations to corresponding units based on the operation type as taught by Sankaranarayanan for the benefit of multidimensional memory accesses, increasing the available bandwidth to the functional units, minimizing cache miss stall by bypassing level one data cache, handling address generation automatically freeing up the address generation instruction slots for other computations and enabling advantageous use of vector SIMD processing (Sankaranarayanan especially e.g. ¶¶73,112).
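For illustration only, the type-dependent mapping recited in the claims above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Pechanek, Sankaranarayanan, or the application: the names InstrType, Instruction, map_instructions, and the unit labels are hypothetical, and the "first type" is assumed to be a vector type and the "second type" a scalar type.

```python
from dataclasses import dataclass
from enum import Enum


class InstrType(Enum):
    VECTOR = "vector"  # assumed to correspond to the claimed "first type"
    SCALAR = "scalar"  # assumed to correspond to the claimed "second type"


@dataclass(frozen=True)
class Instruction:
    name: str          # hypothetical mnemonic, for illustration only
    type: InstrType


def map_instructions(first, second, third):
    """Return a {unit label: instruction} routing for three instructions.

    The unit labels are hypothetical stand-ins for the claimed units.
    """
    routing = {}
    # First instruction: matrix vector unit if vector type,
    # otherwise the first multifunction unit.
    if first.type is InstrType.VECTOR:
        routing["matrix_vector_unit"] = first
    else:
        routing["multifunction_unit_1"] = first
    # Second and third instructions are provided to their units
    # only when they are of the scalar (second) type.
    if second.type is InstrType.SCALAR:
        routing["multifunction_unit_2"] = second
    if third.type is InstrType.SCALAR:
        routing["multifunction_unit_3"] = third
    return routing
```

Because each instruction is routed to a distinct unit, the three routed instructions have no shared unit and could, in principle, be processed substantially in parallel; the sketch models only the mapping decision, not the hardware timing.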

Claim 10.
Pechanek fails to explicitly recite:
mapping to the units depending on instruction type.
Sankaranarayanan further discloses: 
wherein only the matrix vector unit is configured to process the first instruction when the first instruction is of the first type (e.g. ¶44: decoded instructions are supplied to vector datapath side B to functional units within vector datapath side B or Figure 1: providing instructions to element 116 which is a vector processing unit, based on the type of the instruction decoded by element 113 or Figure 2: element 241 and the associated discussion or ¶57: mapping to vector functional unit 241 for vector operation).
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pechanek to incorporate mapping scalar and vector operations to corresponding units based on the operation type as taught by Sankaranarayanan for the benefit of multidimensional memory accesses, increasing the available bandwidth to the functional units, minimizing cache miss stall by bypassing level one data cache, handling address generation automatically freeing up the address generation instruction slots for other computations and enabling advantageous use of vector SIMD processing (Sankaranarayanan especially e.g. ¶¶73,112). 


Claim 11.
Pechanek fails to explicitly recite:
mapping to the units depending on instruction type.
Sankaranarayanan further discloses: 
wherein the first multifunction unit is configured to process the first instruction when the first instruction is of the second type (e.g. ¶44: decoded instructions are supplied to scalar datapath side A to functional units within scalar datapath side A … scalar datapath side A includes plural functional units that preferably operate in parallel or Figure 1: element 115 is a scalar multifunction unit, which can be accessed independently of the element 116 which is a vector processing unit, based on the type of the instruction decoded by element 113 or Figure 2: element 221 and the associated discussion or ¶52 :mapping to scalar functional unit 221 for scalar instructions).  
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pechanek to incorporate mapping scalar and vector operations to corresponding units based on the operation type as taught by Sankaranarayanan for the benefit of multidimensional memory accesses, increasing the available bandwidth to the functional units, minimizing cache miss stall by bypassing level one data cache, handling address generation automatically freeing up the address generation instruction slots for other computations and enabling advantageous use of vector SIMD processing (Sankaranarayanan especially e.g. ¶¶73,112). 

Claim 12.
Pechanek fails to explicitly recite:
mapping to the units depending on instruction type.
Sankaranarayanan further discloses: 
wherein the second multifunction unit is configured to process the second instruction when the second instruction is of the second type (e.g. ¶44: decoded instructions are supplied to scalar datapath side A to functional units within scalar datapath side A … scalar datapath side A includes plural functional units that preferably operate in parallel or Figure 2: element 222 and the associated discussion or ¶52: mapping to scalar functional unit 222 for scalar instructions).
Rationale:
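It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pechanek to incorporate mapping scalar and vector operations to corresponding units based on the operation type as taught by Sankaranarayanan for the benefit of multidimensional memory accesses, increasing the available bandwidth to the functional units, minimizing cache miss stall by bypassing level one data cache, handling address generation automatically freeing up the address generation instruction slots for other computations and enabling advantageous use of vector SIMD processing (Sankaranarayanan especially e.g. ¶¶73,112).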

Claim 13.
Pechanek fails to explicitly recite:
mapping to the units depending on instruction type.
Sankaranarayanan further discloses: 
wherein the third multifunction unit is configured to process the third instruction when the second instruction is of the second type (e.g. ¶44: decoded instructions are supplied to scalar datapath side A to functional units within scalar datapath side A … scalar datapath side A includes plural functional units that preferably operate in parallel or Figure 2: element 223 and the associated discussion or ¶53: mapping to scalar functional unit 223 for scalar instructions).
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pechanek to incorporate mapping scalar and vector operations to corresponding units based on the operation type as taught by Sankaranarayanan for the benefit of multidimensional memory accesses, increasing the available bandwidth to the functional units, minimizing cache miss stall by bypassing level one data cache, handling address generation automatically freeing up the address generation instruction slots for other computations and enabling advantageous use of vector SIMD processing (Sankaranarayanan especially e.g. ¶¶73,112). 

Claims 14 and 18.
Pechanek further discloses: 
wherein the first instruction is input for processing by the matrix vector unit or the first multifunction unit, the second instruction is input for processing by the second multifunction unit, and the third instruction is input for processing by the third multifunction unit substantially in parallel (e.g. ¶172: FIG. 24 illustrates … first set of pipeline latches … The T node system 2440 comprises a decoder 2441 having node operation (NodeOp) inputs 2442, three node function units 2444-2446 and a multiplexer 2453 or Figures 24-27 and the associated disclosure; EN: The three node function units 2444-2446 are interpreted to be the multifunction units which process scalar information by multiplying, adding, etc. as instructed in the corresponding instructions using the corresponding weights, etc., and the multiplexer 2453 is interpreted to be the matrix vector unit which multiplexes the outputs from multifunction units / node function units 2444-2446 and these units operate on instructions in parallel).
Pechanek fails to explicitly recite:
mapping to the units depending on instruction type.
Sankaranarayanan further discloses: 
wherein one of the matrix vector unit or the first multifunction unit is configured to process the first instruction depending on whether the first instruction is of the first type or the second type, the second multifunction unit is configured to process the second instruction when the second instruction is of the second type, the third multifunction unit is configured to process the third instruction when the second instruction is of the second type substantially in parallel (e.g. ¶40: operate on plural instructions in parallel or ¶44: decoded instructions are supplied to scalar datapath side A and vector datapath side B to functional units within scalar datapath side A and vector datapath side B … scalar datapath side A includes plural functional units that preferably operate in parallel or Figure 1: element 115 is a scalar multifunction unit, which can be accessed independently of the element 116 which is a vector processing unit, based on the type of the instruction decoded by element 113 or Figure 2: elements 221, 222, 223, 241 and the associated discussion or ¶52 :mapping to scalar functional unit 221 for scalar instructions or ¶53:mapping to scalar functional unit 222 for scalar instructions or ¶54 :mapping to scalar functional unit 223 for scalar instructions or ¶57: mapping to vector functional unit 241 for vector operation).  
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pechanek to incorporate mapping scalar and vector operations to corresponding units based on the operation type as taught by Sankaranarayanan for the benefit of multidimensional memory accesses, increasing the available bandwidth to the functional units, minimizing cache miss stall by bypassing level one data cache, handling address generation automatically freeing up the address generation instruction slots for other computations and enabling advantageous use of vector SIMD processing (Sankaranarayanan especially e.g. ¶¶73,112). 

Claims 7, 15, and 19.
Pechanek further discloses: 
wherein each of the first multifunction unit, the second multifunction unit, and the third multifunction unit further comprises a pointwise addition block, a pointwise multiplication block, a sigmoid block, a hyperbolic tangent block, and a no-operation block (e.g. ¶172: node T22 755 is coupled to the three memory nodes 731, 735, and 739 which supply the weights and a current neuron value for processing neural functions in a neural network. As controlled by the NodeOp inputs 2442 and decoder 2441, the multipliers 2447-2449 are configured to multiply their input values and provide the results as input to the corresponding three-input adders 2450-2452 that are configured to provide a sum of the weighted neuron node results or Figures 24-27 and the associated disclosure).
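For illustration only, the five blocks recited in this limitation can be sketched as a single selectable unit. This is a minimal sketch under stated assumptions and is not drawn from Pechanek or the application: the function name multifunction_unit, the operation labels, and the use of Python lists as operands are all hypothetical.

```python
import math


def _sigmoid(x):
    # Logistic sigmoid of a single value.
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical operation labels mapped to the five recited blocks.
_BLOCKS = {
    "add": lambda a, b: [x + y for x, y in zip(a, b)],   # pointwise addition
    "mul": lambda a, b: [x * y for x, y in zip(a, b)],   # pointwise multiplication
    "sigmoid": lambda a, b: [_sigmoid(x) for x in a],    # sigmoid block
    "tanh": lambda a, b: [math.tanh(x) for x in a],      # hyperbolic tangent block
    "nop": lambda a, b: list(a),                         # no-operation: pass through
}


def multifunction_unit(op, a, b=None):
    """Apply the block selected by `op` to operand `a` (and `b` if binary)."""
    if op not in _BLOCKS:
        raise ValueError(f"unknown operation: {op}")
    if op in ("add", "mul") and (b is None or len(a) != len(b)):
        raise ValueError("binary blocks need two operands of equal length")
    return _BLOCKS[op](a, b)
```

For example, multifunction_unit("add", [1.0, 2.0], [3.0, 4.0]) yields [4.0, 6.0], and multifunction_unit("nop", [1.0, 2.0]) returns the operand unchanged.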

Claims 8, 16, and 20.
Pechanek fails to explicitly recite:
mapping to the units depending on instruction type.
Sankaranarayanan further discloses: 
wherein the first type of instruction comprises a vector type of instruction and the second type of instruction comprises a scalar type of instruction (e.g. ¶40: operate on plural instructions in parallel or ¶44: decoded instructions are supplied to scalar datapath side A and vector datapath side B to functional units within scalar datapath side A and vector datapath side B … scalar datapath side A includes plural functional units that preferably operate in parallel or Figure 1: element 115 is a scalar multifunction unit, which can be accessed independently of the element 116 which is a vector processing unit, based on the type of the instruction decoded by element 113 or Figure 2: elements 221, 222, 223, 241 and the associated discussion or ¶52 :mapping to scalar functional unit 221 for scalar instructions or ¶53:mapping to scalar functional unit 222 for scalar instructions or ¶54 :mapping to scalar functional unit 223 for scalar instructions or ¶57: mapping to vector functional unit 241 for vector operation).  
Rationale:
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pechanek to incorporate mapping scalar and vector operations to corresponding units based on the operation type as taught by Sankaranarayanan for the benefit of multidimensional memory accesses, increasing the available bandwidth to the functional units, minimizing cache miss stall by bypassing level one data cache, handling address generation automatically freeing up the address generation instruction slots for other computations and enabling advantageous use of vector SIMD processing (Sankaranarayanan especially e.g. ¶¶73,112).

Examiner’s Note
The Examiner respectfully requests of the Applicant in preparing responses, to fully consider the entirety of the reference(s) as potentially teaching all or part of the claimed invention.  It is noted, REFERENCES ARE RELEVANT AS PRIOR ART FOR ALL THEY CONTAIN.  “The use of patents as references is not limited to what the patentees describe as their own inventions or to the problems with which they are concerned.  They are part of the literature of the art, relevant for all they contain.”  In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)).  A reference may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art, including non-preferred embodiments (see MPEP 2123).  The Examiner has cited particular locations in the reference(s) as applied to the claim(s) above for the convenience of the Applicant.  Although the specified citations are representative of the teachings of the art and are applied to the specific limitations within the individual claim(s), typically other passages and figures will apply as well.

Conclusion
Any prior art made of record on the attached PTO-892 and not relied upon is considered pertinent to applicant's disclosure.
Applicant is reminded that in amending in response to a rejection of claims, the patentable novelty must be clearly shown in view of the state of the art disclosed by the references cited and the objections made.  Applicant must also show how the amendments avoid such references and objections.  See 37 CFR §1.111(c).  Additionally when amending, in their remarks Applicant should particularly cite to the supporting paragraphs in the original disclosure for the amendments.

Correspondence Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BENJAMIN J BUSS whose telephone number is (571)272-5831.  The examiner can normally be reached on M-F 9A-5P ET.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar can be reached on 571-272-7796.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Applicant may file form PTO/SB/439 if applicant desires the examiner to be able to communicate by email.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 


/B. B./
Examiner, Art Unit 2125



/KAMRAN AFSHAR/
Supervisory Patent Examiner, Art Unit 2125