DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claim 10 is objected to because of the following informalities: “claim 13” (line 1). It appears that claim 10 may depend from claim 1, not claim 13. Claim 10 also has other issues; see the lack of antecedent basis and improper dependency discussed below.
Claim 18 is objected to because of the following informalities: “claim 11” (line 1). It appears that claim 18 should depend from claim 15, because the proper antecedent basis for the first column and the Nth column is found in claim 15. Correction and feedback from applicant are advised in the next response. For examination purposes, claim 18 is assumed to depend from claim 15. See also the lack of antecedent basis discussed below.
Claim 13 is objected to because of the following informalities:  
“the left/right adjacent or nonadjacent operation units” (line 2);
“the upper/lower adjacent or nonadjacent operation units” (lines 2, 3).
Suggested correction: 
“
“
Appropriate correction(s) is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 10 recites the limitations:
“the N*N operation units” (line 1);
“the first column” (line 2);
“the right and right diagonally adjacent operation units” (line 4);
“the first row” (line 4);
“the Nth column” (line 7);
“the left diagonal and the left operation units” (line 7);
“the ALU” (line 8);
“the (N*N)th operation unit” (last line).
There is insufficient antecedent basis for each of the limitations in the claim.
Claim 18 recites the limitations:
“the right and right lower adjacent operation units” (line 3);
“the left upper and the left operation units” (lines 6, 7).
There is insufficient antecedent basis for each of the limitations in the claim.
Claim 19 recites the limitations:
“the first row” (lines 1, 2);
“the Nth column” (line 2);
“the left upper adjacent operation units” (line 3);
“the right lower adjacent operation units” (line 3).
There is insufficient antecedent basis for each of the limitations in the claim.
Claim 20 recites the limitations:
“the ALU” (line 3);
“the exponential operation” (line 3).
There is insufficient antecedent basis for each of the limitations in the claim.

The following is a quotation of 35 U.S.C. 112(d):
(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

The following is a quotation of pre-AIA  35 U.S.C. 112, fourth paragraph:
Subject to the following paragraph [i.e., the fifth paragraph of pre-AIA  35 U.S.C. 112], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

Claim 10 is rejected under 35 U.S.C. 112(d) or pre-AIA 35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which it depends, or for failing to include all the limitations of the claim upon which it depends. Claim 10 depends from claim 13, a later claim, and therefore does not contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 2, 11, 12, and 21 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kamo 20160097853.
As to claim 1, Kamo teaches a computation device comprising (see fig.13 that includes a signal processing circuit 30, which includes at least the neural network NN, a memory 31, and a transmission reception circuit 20, which includes at least a controller 28)  
a control module [controller 28] (see [0180], the controller 28 may be realized by a central processing unit which controls the entire transmission/reception circuit 20 and signal processing circuit 30); and

wherein the control module [controller 28] is configured to send an instruction (e.g. by the program) to the multiple operation units [nodes/neurons] (see [0165], the neural network NN is configured or programmed to perform computation using the reception signals or secondary signal(s) and learned data, and output a signal indicating the number of arriving waves) and control data transmit [transmission/reception] between the operation units [nodes/neurons]. (See the controller 28 may be a microcomputer, an electronic control unit, etc., for example.  Based on a computer program which is stored in a memory such as a ROM, the controller 28 controls the entire transmission/reception circuit 20.  The controller 28 does not need to be provided inside the transmission/reception circuit 20, but may be provided inside the signal processing circuit 30.  In other words, the transmission/reception circuit 20 may operate in accordance with a control signal from the signal processing circuit 30.  Alternatively, some or all of the functions of the controller 28 may be realized by a central processing unit which controls the entire transmission/reception circuit 20 and signal processing circuit 30. See also how the input signals xi (i.e. data) are provided to the neurons to compute the output Yk(x) of the neural network in [0084]-[0086]. The neural network NN is shown in fig.13 as one of the functional components of the signal processing circuit 30; see [0142], the signal processing circuit 30 is configured or programmed to perform computation by 
As to claim 2, Kamo teaches wherein each of the operation units [nodes/neurons] is configured to receive the instruction (as to the instruction, see the citation of the neural network NN programmed to perform the computation in [0165]) and transmit data [input xi] to be computed or an intermediate result [output Ф] to the other operation units except itself in one or more directions according to the instruction (see fig.3, in which the input xi is input to input node I, the output node J computes the output Ф, and the output Ф is subsequently multiplied by weight Wkj; this process is expressed in Equation 7 of para [0085] as the sum of the weight Wkj times the output Ф; none of the data is fed back to the node itself. J denotes the jth neuron. Fig.3 shows specific details of the neural network NN in fig.13).
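For illustration only, the weighted sum described in the claim 2 mapping (the sum over j of the weight Wkj times the output Ф) can be sketched as follows. The function name, the list-of-lists weight layout, and the sample numbers are hypothetical and are not taken from Kamo; the sketch treats the node outputs Ф as already computed values.

```python
def output_layer(weights, phi):
    """Return y[k] = sum over j of weights[k][j] * phi[j].

    weights: one row of Wkj values per output k (hypothetical layout).
    phi: the already-computed node outputs, one per node j.
    """
    return [sum(w_kj * phi_j for w_kj, phi_j in zip(row, phi))
            for row in weights]


# Two outputs, each fed by two node outputs.
print(output_layer([[1, 2], [3, 4]], [10, 1]))
```

Each node output is only forwarded to the next stage in this sketch; nothing is routed back to the node that produced it.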
As to claim 11, Kamo teaches a computation method comprising: 
sending, by a control module [controller 28], an instruction (e.g. by the control program or programming, see [0180], the controller 28 may be realized by a central processing unit which controls the entire transmission/reception circuit 20 and signal processing circuit 30); and 
receiving, by multiple operation units [node/neuron] of an operation module [Neural Network NN], the instruction [program], and performing, by the multiple operation units [nodes/neurons] of an operation module [Neural Network NN], data transmit according to the instruction. (see [0165], the neural network NN is configured or programmed to perform computation using the reception signals or secondary signal(s) and learned data, and output a signal indicating the number of arriving waves. See also how the input signals xi (i.e. data) are programmed to perform computation by using the reception signals or secondary signal(s), as well as learned data of the neural network NN, and output a signal indicating the number of arriving waves)
As to dependent claim 12, dependent claim 12 includes limitations similar to those of dependent claim 2, except that claim 12 is directed to a method. Claim 12 is rejected for the same reasons as claim 2 above.
As to claim 21, claim 21 includes limitations similar to those of claim 1, except that it is directed to “an electronic device” (preamble). Kamo is also directed to an electronic device (see the radar system in [0020]). Therefore, claim 21 is rejected for the same reasons as claim 1 above. The details of the rejection are not repeated herein.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
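A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.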

Claims 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Kamo 20160097853 in view of Stewart et al. 20040133750.
As to claim 3, Kamo does not but Stewart teaches wherein the direction includes a direction of transmitting to the left/right adjacent (e.g. East/West) or nonadjacent operation units, a direction of transmitting to the upper/lower adjacent (e.g. North/South) or nonadjacent operation units, and a direction of transmitting to diagonally adjacent (e.g. NW/SE) or nonadjacent operation units. (See the example of the center computational unit CU 253 in fig.6, para [0066]: it is connected to the north CU 255/south CU 257, east CU 261/west CU 259, and NW CU 262/SE CU 265. Fig.6 is for illustration purposes; the arrangement shown for the central CU 253 should be applicable to every CU, and each CU is also connected indirectly to non-adjacent CUs. See [0065]: in addition to performing north to south (or south to north) and east to west (or west to east) byte-wise transfers, it may also be desirable to perform byte-wise transfers between neighboring groups of switching elements on the diagonal. An example of a two dimensional array of computational units in which byte-wise transfers are performed between a computational unit and its nearest eight neighbors is shown in FIG. 6, [0066].)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include a direction of transmitting to the left/right adjacent or nonadjacent operation units, a direction of transmitting to the upper/lower adjacent or nonadjacent operation units, and a direction of transmitting to diagonally adjacent or nonadjacent operation units, as claimed (see details of claim mapping above) because one of 
As to claim 13, claim 13 includes limitations similar to those of claim 3 and is rejected for the same reasons as claim 3 above.
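For illustration only, the eight-neighbor connectivity relied on for claims 3 and 13 (north/south, east/west, and the diagonals) can be sketched as a lookup over a two-dimensional grid. The helper name, the direction labels, and the convention that row 0 is the northern edge are assumptions made for this sketch and are not taken from Stewart.

```python
# Hypothetical helper: compass offsets as (row delta, column delta),
# assuming row 0 is the northern edge and column 0 the western edge.
OFFSETS = {
    "N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1),
    "NE": (-1, 1), "NW": (-1, -1), "SE": (1, 1), "SW": (1, -1),
}


def neighbors(row, col, n_rows, n_cols):
    """Map each direction to the adjacent unit's (row, col), skipping off-grid ones."""
    found = {}
    for name, (dr, dc) in OFFSETS.items():
        r, c = row + dr, col + dc
        if 0 <= r < n_rows and 0 <= c < n_cols:
            found[name] = (r, c)
    return found


# The center unit of a 3 x 3 array reaches all eight neighbors directly.
print(neighbors(1, 1, 3, 3))
```

A non-adjacent unit would be reached in this sketch by repeating the lookup from an intermediate unit, which corresponds to the indirect connection noted in the mapping.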
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Kamo 20160097853 in view of Culurciello et al. 20180341495.
As to claim 4, Kamo does not but Culurciello teaches further comprising 
a storage module (see fig.2 storage module 200 includes at least Memory, Memory Interconnect 160, Compute Cluster 102A-N that includes at least MAPS CACHE 132), wherein the storage module includes a data storage unit [Memory] and a temporary cache [MAPS Cache 132] and is configured to store the data [data] to be computed or the intermediate result in the same region or separately  (See [0036], the maps cache 132 stores map trace data (also referred to as "maps") corresponding to a contiguous set of map data in a contiguous regions of memory in the original input to the CNN or intermediate input data received from another layer of the CNN), 
wherein the control module [trace decoder 230][vMAX 140] includes: 

a computational control unit [trace decoder 230] configured to control computation manners (e.g. the signal for the last trace of the computation)  in the operation units [MACs 128A-D] and data transmit between the operation units [MACs 128A-D]  ([0041], the trace decoders 230 include a MAC trace decoder, a MAX trace decoder, and a trace move decoder, and a VMOV decoder.  The MAC trace decoder fetches cache lines from the maps cache 132 and forwards them to the vMACs 128A-128D.  It also provides addresses to index into the weights cache that is accessed by each vMAC.  This decoder is also responsible for signaling the vMACs to output the result accumulated in their internal accumulate registers. This signal is sent along with the last address of the final trace of the computation).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include a storage module, wherein the storage module includes a data storage unit and a temporary cache and is configured to store the data to be computed or the intermediate result in the same region or separately, wherein the control module includes: a storage control unit configured to control the storage module to store or read required data, and a computational control unit configured to control computation manners in the operation units and data transmit between the operation units, as claimed (see details of the claim mapping above) because one of ordinary skill in the art should be able to recognize the application of a known technique, such as the Culurciello’s storage module that .
Claims 5, 6, and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Kamo 20160097853 in view of Baji 5091864.
As to claim 5, Kamo does not but Baji teaches:
wherein each operation unit [SPE-1-M 15] (see fig.2, each of the SPE-1 to SPE-M; see fig.1 for more details of each SPE 15) includes multiple input ports [Qin, Din, and the input from the coefficient memory 1 to multiply-accumulate circuit 114 in fig.2], and the multiple input ports include a port configured to receive data [coefficient] (see the input from the coefficient memory 1) transmitted by the storage module [coefficient memory 1] and a port [Din] configured to receive data transmitted by the other operation units (the Din port in each of the SPE 15), an output port [Dout/Qout] configured to transmit the data back to the storage module or to a specified operation unit (see the Dout and Qout connected to the next SPE 15), and at least one multiplier [multiplier] and at least one adder [adder] (see an alternate embodiment in which output 15 derived from Qout of SPE-M is fed back to the Qin of SPE-1, col.12, lines 24-37; see that each SPE includes a multiplier and an adder as shown in fig.1).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include wherein each operation unit includes multiple input ports, and the multiple input ports include a port configured to receive data transmitted by the storage module and a port configured to receive data transmitted by the other operation units,
As to claim 6, Kamo does not but Baji teaches wherein each operation unit further includes at least one memory (see fig.1 coefficient memory 1 for storing the coefficient data, col.8, lines 1-8).
Claim 6 depends from claim 5, and the obviousness reasoning for claim 5 is also applicable to claim 6 and is not repeated herein.
As to claim 7, Kamo does not but Baji teaches wherein the operation module includes N*N operation units [3 x 3] (see fig.6 shows at least operation units SPEs 22, 32,33 that compute and provide the result at the sigmoid function  lookup table 16, see col.12, lines 12-23)  and an ALU  (Note 1: an ALU is not explicitly shown, but see col.5, lines 51-53, a bus is provided that allows the weight coefficient memories and the nonlinear function look-up tables to be accessed by the host CPU. Examiner holds that ALU is the heart of CPU, i.e. without ALU, CPU cannot function. Therefore, CPU must have ALU), wherein a data transmit direction between the N*N operation units is an S shape and N is a positive integer [3 x 3] (see fig.6, the 
Claim 7 depends from claim 6 and is rejected for the same reasons as claim 5. The details of the obviousness reasoning are not repeated herein.
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Kamo 20160097853 in view of Shi et al. 20150310311 and further in view of Kim et al. 20180285104.
As to claim 9, Kamo does not but Shi teaches wherein the operation module [reconfigurable PE/SOM neural network 2]  includes N*N operation units [4 x 4 PEs] and N-1 ALUs (N-1 ALUs are not specifically shown, but see the obviousness reasoning in view of Kim below), and the N*N operation units [PEs] are configured to transmit computational data in different directions [west east, north south], wherein N is a positive integer (see the illustrated example in fig.1 and more details of each PE in fig.3 that includes the connections of the directions to the adjacent PEs, [0029]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the operation module including N*N operation units, the N*N operation units being configured to transmit computational data in different directions, wherein N is a positive integer, as claimed, because one of ordinary skill in the art should be able to recognize the application of a known technique, such as the 4 x 4 operating units (PEs) that transfer data north, south, east, and west to the adjacent PEs as taught by Shi, to a known device/method, such as the neural network NN of Kamo, for the purpose of communicating data with the four adjacent PE units at east, south, west, and north (see Shi [0029]; MPEP 2143, KSR Example D).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include N-1 ALUs, as claimed, because one of ordinary skill in the art should be able to recognize the application of a known technique, such as the use of any number of ALUs as taught by Kim, which encompasses any number of ALUs, e.g. N, N-1, N-2, etc., to a known device/method, such as the neural network NN of Kamo, so that data patterns can be generated for processing by a plurality of ALUs belonging to the corresponding groups of ALUs (see Kim [0062]; MPEP 2143, KSR Example D).
Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Kamo 20160097853 in view of Culurciello et al. 20180341495.
As to claim 14, Kamo does not but Culurciello teaches:
wherein a storage control unit [vMAX 140] of the control module [compute unit cluster 102] controls the storage module [MAPS cache 132] to read neuron data [map trace data] to be computed for transmit [output] (see [0036], the vMAX 140 generates a single output corresponding to the maximum value of a set of data that are read from the maps cache 132),
reads (e.g. by preloading or forwarding the weight into a cache; see [0034] the destination ID is used by the DDN 150 to forward data to the correct buffer, such as the instruction cache in the fetch unit 106 of the control core 104 or a particular maps cache or weights cache within the compute core 120) synaptic weight data [weight] and store the synaptic weight data [weight] in a memory [weight cache 136] of each operation unit [MAC 
a computational control unit [MAC trace decoder] of the control module [compute unit cluster 102] sends a computational signal [map trace] to be computed to each operation unit and initializes each operation unit [MAC 128A-C] (see each of the MAC 128A-D has the read/write access to the MAPS Cache 132 in [0036], see also MAC trace decoder upon receiving the vector instructions from the control core 104,[0040], fetches the cache lines from the maps cache 132 and forward them to the vMAC 128A-128D, [0041]), and 
wherein the storage control unit [control core 104] of the control module [cluster 102A] sends an instruction (see [0040], the vector instructions received from the control core 104), and 
the storage module [MAPS Cache 132] transmits the neuron data to be computed to each operation unit (see each of the MAC 128A-D has the read/write access to the MAPS Cache 132 in [0036], see also MAC trace decoder upon receiving the vector instructions from the control core 104,[0040], fetches the cache lines from the maps cache 132 and forward them to the vMAC 128A-128D, [0041]), and the computational control unit of the control module sends an instruction [signal to instruct the MACs to output the accumulated result] (see [0041], the 
each operation unit (see each of the MAC 128A-D) receives the neuron data [map trace M1] and multiplies the neuron data [M1] and the corresponding synaptic weight data [weight value] in the memory [weight cache 136] of itself to obtain a product result [multiplication product] (see each of the MAC 128A-D multiplies M1 from the maps cache 132 with the weight value from the weight cache 136 and adds the output from the multiplier 240 to another value M2 from the maps cache 132).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include wherein a storage control unit of the control module controls the storage module to read neuron data to be computed for transmit, reads synaptic weight data and store the synaptic weight data in a memory of each operation unit, and a computational control unit of the control module sends a computational signal to be computed to each operation unit and initializes each operation unit, and wherein the storage control unit of the control module sends an instruction, and the storage module transmits the neuron data to be computed to each operation unit, and the computational control unit of the control module sends an instruction, and each operation unit receives the neuron data and multiplies the neuron data and the corresponding synaptic weight data in the memory of itself to obtain a product result, as claimed (see details of claim mapping above) because one of ordinary skill in the art should be able to recognize the application of a known technique, such as the MAC units and the vector instructions of Culurciello for performing the product of the trace data and the .
Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Kamo 20160097853 in view of Yao et al. 20180032844, and in view of Local Response Normalization (LRN) by oneAPI Deep Neural Network Library (oneDNN) (https://oneapi-src.github.io/oneDNN/dev_guide_lrn.html), used as an extrinsic reference to show the characteristic feature of LRN that Yao does not explicitly show (MPEP 2131.01).
As to claim 17, Kamo teaches multiple operation units [nodes/neurons] to generate an accumulated result (see fig.3, which shows the structure of the connected nodes/neurons, [0084]-[0086], equation 7, where the output is the accumulated sum of the weight times the input Xi at each neuron j).
Kamo does not but Yao teaches wherein during computation for a local response normalization (LRN) layer, as claimed (see Yao, LRN in fig.3, local response normalization; para [0031] teaches that the convolutional layers of the CNN are followed by local response normalization; para [0073] teaches that the LRN data are stored in the memory 703 for use by the processors [graphic processors or CPUs]).
It would have been obvious to one of ordinary skill in the art to include the local response normalization LRN, as claimed because one of ordinary skill in the art should be able to recognize the application of a known technique, such as the LRN as taught in Yao, to a known device/method, such as the neural network NN of Kamo, for the purpose of providing a deep convolutional neural network (see Yao [0031]. MPEP 2143 KSR Example D).
2) and the ALU [CPU/GPU] executes an exponential operation (β) on the accumulated result (∑), as claimed. Local Response Normalization (LRN) is used as extrinsic evidence to show that the characteristic feature of the LRN is to take the exponential function/operation of the accumulated sum of the squared input, which is implemented/executed by a hardware engine, such as a CPU/GPU (see MPEP 2131.01).
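For illustration only, the LRN feature relied on above (squaring the inputs, accumulating the squares, and then raising the accumulated result to a power β) can be sketched as follows. This is a simplified one-dimensional, across-channel form; the function name, the window handling at the edges, and the parameter values k, alpha, beta, and size are illustrative assumptions and are not taken from Yao, from the application, or from any particular library implementation.

```python
def lrn(values, size=3, k=1.0, alpha=1e-4, beta=0.75):
    """Simplified across-channel LRN over a 1-D list of channel values.

    Each output is x / (k + (alpha / size) * sum_of_squares) ** beta,
    where the sum runs over a window of `size` channels centered on x
    (the window is clipped at the ends of the list in this sketch).
    """
    half = size // 2
    out = []
    for c, x in enumerate(values):
        lo, hi = max(0, c - half), min(len(values), c + half + 1)
        sq_sum = sum(v * v for v in values[lo:hi])            # square and accumulate
        out.append(x / (k + alpha / size * sq_sum) ** beta)   # power-of-beta step
    return out


print(lrn([1.0, 2.0, 3.0]))
```

In this sketch the squaring and accumulation steps correspond to the work attributed to the operation units, and the final power step corresponds to what the rejection calls the exponential operation.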
Allowable Subject Matter
Claims 8, 10, 15, 16, 18, 19, and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, pending resolution of all the applicable minor objections and the 112(b) and 112(d) rejections set forth in this action. None of the prior art of record further teaches (partial features shown; see claims for full details):

b) The N*N operation units are arranged in an array with N rows and N columns, the operation units in the first column receive the data to be computed from the storage module, complete square operations, and transmit square values to the right and right diagonally adjacent operation units, the operation units in the first row receive the data to be computed from the storage module, complete square operations, and transmit square values to the right diagonally adjacent operation units, the operation units in the Nth column receive data from the left diagonal and the left operation units and complete accumulation, and the ALU receives the computational result transmitted by the (N*N)th operation unit and executes an exponential operation. (Claim 10)
c) The operation unit in the first row and the first column transmits a computational result rightwards to the operation unit in the first row and the second column, the operation unit in the first row and the second column adds the received computational result and a computational product of itself to obtain a partial sum and transmits it rightwards until to the operation unit in the Nth row and the Nth column, an S-shaped transmit path is formed, and the partial sum is accumulated, wherein N is a positive integer, the ALU in the
d) Each operation unit receives the computational result of its previous operation unit, adds the computational result and the computational product of itself to obtain a partial sum, and continues transmitting the partial sum backwards, wherein the multiple operation units transmit and accumulate the partial sum along the S-shaped path, wherein the ALU of the operation module, after accumulation, receives the computational results of the multiple operation units and executes activation computation, and wherein the activation computational result is written into the storage module. (Claim 16)
e) The operation units of the operation module in the first column receive data to be computed from the storage module, complete square operations and transmit square values to the right and right lower adjacent operation units, the operation units of the operation module in the first row receive the data to be computed from the storage module, complete square operations and transmit square values to the right lower adjacent operation units, and the operation units of the operation module in the Nth column receive data from the left upper and the left operation units, and complete accumulation, wherein N is a positive integer. (Claim 18)
f) The operation units, except those in the first row, the first column, and the Nth column, of the operation module, after receiving the data from the left upper adjacent operation units, transmit the data to the right lower adjacent operation units, and accumulate the data and data transmitted by the left adjacent operation units and transmit accumulated data to the right adjacent operation units. (Claim 19. Claim 20 is dependent from claim 19).
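For illustration only, the S-shaped accumulation recited in items c) and d) above can be sketched as a walk over an N x N array in which each unit adds its own product to the partial sum received from the previous unit. The alternating left-to-right and right-to-left row order is one plausible reading of "S-shaped" assumed for this sketch; the function names and the sample products are hypothetical and are not taken from the application.

```python
def s_path(n):
    """Return (row, col) positions of an n x n array in S-shaped order.

    Assumption: even-numbered rows run left to right and odd-numbered
    rows run right to left (0-based rows).
    """
    path = []
    for r in range(n):
        cols = range(n) if r % 2 == 0 else range(n - 1, -1, -1)
        path.extend((r, c) for c in cols)
    return path


def accumulate_along_path(products):
    """Return the running partial sum after each unit on the S-shaped path."""
    partial = 0
    trace = []
    for r, c in s_path(len(products)):
        partial += products[r][c]   # the unit adds its own product
        trace.append(partial)       # and passes the partial sum onward
    return trace


print(accumulate_along_path([[1, 2], [3, 4]]))
```

The last entry of the returned list is the fully accumulated sum, which in the recited arrangement would be handed to the ALU for the activation computation.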

a) Henry et al. 20180157962 is cited for the teaching of multiplying the weights with the inputs from the previous layer and adding all the products in the artificial neural network (see fig.2, [0099]).
b) Werner et al. 20170097884 is cited for the teaching of a matrix of processing units, ALUs, for writing output to memory or to be used by a subsequent ALU for deep learning applications that can comprise convolution operations, linear contrast operations, local response normalization operations, max pooling operations, etc. (see [0027], [0028]).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DANIEL H PAN whose telephone number is (571)272-4172.  The examiner can normally be reached on M-F 8:30 am -5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aimee Li, can be reached at 571-272-4169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished 


DANIEL H. PAN
Examiner
Art Unit 2182


/DANIEL H PAN/Primary Examiner, Art Unit 2182