Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Remarks
This Office Action is responsive to Applicants' Amendment filed on February 28, 2022, in which claims 1-3, 5-7, 10-12, 15, 16, and 19 are amended. Claims 1-19 are currently pending.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on January 28, 2022 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Response to Arguments
The rejections of claims 1-19 under 35 U.S.C. § 112(b) are hereby withdrawn in view of applicant's amendments and remarks.
Applicant’s arguments with respect to the rejection of claims 1-19 under 35 U.S.C. 103(a), regarding Goyal's teaching of controlling the operation module to perform the convolution operation by inputting each of the plurality of elements to the plurality of processing elements in a predetermined order and sequentially applying the plurality of elements to the input data, have been considered but are not deemed persuasive.  The original citation (FIG. 3 [¶0026] " The weight matrix W of 
Applicant’s arguments with respect to rejection of claims 1-19 under 35 U.S.C. 103(a) based on amendment have been considered and are persuasive. The argument is moot in view of a new ground of rejection set forth below.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: 
“An operation module configured to” in claim 1.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to 
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: 
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


	Claims 1-3, 5, 10-12, 14, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Goyal (US 20170316312 A1) in view of Majumder (“SVD and Neural Network Based Watermarking Scheme”, 2010). 

	Regarding claim 1, Goyal teaches An electronic apparatus for performing machine learning, the electronic apparatus comprising: an operation module configured to include a plurality of processing elements arranged in a predetermined pattern and share data between the plurality of processing elements which are adjacent to each other to perform an operation; and (FIG. 1 [Abstract] "the DLP includes a plurality of tensor engines configured to perform operations for pattern recognition and classification based on a neural network." [¶0024] "the retrieved data is multiplexed to the tensors engines 104 by a multiplexer/crossbar 110" tensor engine interpreted as synonymous with processing element. DLP interpreted as synonymous with operation module.  FIG. 1 shows that data transfer between crossbar and tensor engines is bidirectional.) 
	a processor configured to control the operation module to perform a convolution operation by applying a filter to input data, ([¶0017] "Each tensor engine includes one or more matrix multiplier (MatrixMul) engines each configured to perform a plurality of dense and/or sparse vector-matrix and matrix-matrix multiplication operations, one or more convolutional network (ConvNet) engines" [¶0021] "During its operation, the DLP 102 is configured to accept instructions from a host 103 and submit the instructions to the tensor engines 104...The DLP 102 is also configured to provide deep learning processing results by the DLP 102 back to the host 103 via the DLP interface 112" Host is interpreted as synonymous with processor configured to control the operation module.)
	and control the operation module to perform the convolution operation by inputting each of the plurality of elements to the plurality of processing elements in a predetermined order and sequentially applying the plurality of elements to the input data. (FIG. 3 [¶0026] " The weight matrix W of N1×N2 is stored in column 
	While Goyal teaches using multidimensional filters, and processing said filter in an element-wise fashion, Goyal does not explicitly teach wherein the processor is configured to arrange a plurality of elements in a predetermined order by dividing a two-dimensional filter into the plurality of elements which is one dimensional data.  

Majumder, who teaches a related art of mapping neural network inputs, teaches wherein the processor is configured to arrange a plurality of elements in a predetermined order by dividing a two-dimensional filter into the plurality of elements which is one dimensional data. ([p. 1 §1] "Therefore for noise, filtering or compression mechanisms, it is better to go for the transform domain, despite of higher computing cost to attain higher robustness of the watermark" [p. 2 §2] "The singular value decomposition (SVD) of host image is performed to obtain the matrices U, S, and V. The S matrix consisting of the diagonal values is converted to one dimension via zig zag scan, done in order to add the logo near the most significant Eigen values. This leads to a matrix S'" See also FIG. 1.  Majumder's zig zag scan converts two-dimensional data to one-dimensional data.)

	Goyal and Majumder are both directed towards mapping neural network inputs.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Goyal with the teachings of Majumder by using a zigzag pattern to map 1d inputs to a 2d filter in a bidirectional fashion.  Majumder teaches as a motivation for combination ([p. 1 §1] “In the spatial domain, embedding of watermark is implemented by directly adding it to the data in terms of any particular algorithm. It is faster than the latter, due to its simpler operations and implementation, but is less robust. Therefore for noise, filtering or compression mechanisms, it is better to go for the transform domain, despite of higher computing cost to attain higher robustness of the watermark”).
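For illustration only, the conversion of a two-dimensional array into one-dimensional data by a zig zag scan can be sketched as follows. This is a minimal sketch, not Majumder's implementation: the exact scan order of Majumder is not reproduced in this action, so a JPEG-style anti-diagonal order is assumed, and the function name `zigzag_scan` is hypothetical.

```python
def zigzag_scan(matrix):
    """Flatten a 2D list into a 1D list along anti-diagonals, alternating direction.

    Assumption: JPEG-style ordering; the cited reference may use a different order.
    """
    rows, cols = len(matrix), len(matrix[0])
    out = []
    for d in range(rows + cols - 1):
        # Cells on anti-diagonal d satisfy r + c == d.
        cells = [(r, d - r) for r in range(rows) if 0 <= d - r < cols]
        # Even diagonals are walked from bottom-left to top-right, odd ones the reverse.
        if d % 2 == 0:
            cells.reverse()
        out.extend(matrix[r][c] for r, c in cells)
    return out


two_d = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
one_d = zigzag_scan(two_d)  # [1, 2, 4, 7, 5, 3, 6, 8, 9]
```

The resulting list fixes a predetermined order in which the elements of the two-dimensional array could be supplied one at a time.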

	Regarding claim 2, the combination of Goyal and Majumder teaches The electronic apparatus of claim 1, wherein the processor controls the operation module to perform the convolution operation by applying each of the plurality of elements to two-dimensional or three-dimensional input data. (Goyal [¶0026] "corresponding weights for the vector are also read once in blocks of size B at a time first from the first column and then from the second column, etc. Each time a block of weights is read from the weight matrix, they are multiplied element-wise with the block of the vector, summed, and added by the MatrixMul engine" The input layer shown in FIG. 3 is two-dimensional data.). 
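As an aid to reading the cited passage of Goyal ¶0026, the blocked vector-matrix multiplication it describes can be sketched roughly as below. The sketch assumes plain Python lists and a hypothetical function name `blocked_vecmat`; it models only the read order quoted above (vector read in blocks of size B, weights read column by column) and not Goyal's hardware.

```python
def blocked_vecmat(x, W, B):
    """Compute y = x @ W, reading x in blocks of size B and W column by column."""
    n1, n2 = len(W), len(W[0])
    assert len(x) == n1, "vector length must match the number of weight rows"
    y = [0.0] * n2
    for start in range(0, n1, B):      # the vector is read once, block by block
        block = x[start:start + B]
        for j in range(n2):            # first column, then second column, etc.
            weights = [W[start + k][j] for k in range(len(block))]
            # multiply element-wise with the block, sum, and add to the output
            y[j] += sum(a * w for a, w in zip(block, weights))
    return y


y = blocked_vecmat([1, 2, 3], [[1, 2], [3, 4], [5, 6]], B=2)  # [22.0, 28.0]
```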

	Regarding claim 3, the combination of Goyal and Majumder teaches The electronic apparatus of claim 1, wherein the processor is further configured to input the plurality of elements, except for the plurality of elements having a zero value, to the plurality of processing elements in the predetermined order. (Goyal [¶0026] "FIG. 5A depicts an example of vector-matrix multiplication wherein the vector in dense form of length N1 is read only once in blocks of size B each. The weight matrix W of N1×N2 is stored in column major form" [¶0027] " For scalable matrix-matrix multiplication, the DLP 102 is configured to partition a large dense or sparse matrix into smaller portions and distribute the portions of the matrix across multiple tensor engines 104. In some embodiments, separate Compressed Sparse Row (CSR) or Compressed Sparse Column (CSC) Format can be adopted for the corresponding portion of the large matrix distributed to each of the tensor engines 104." The two-dimensional filter is interpreted as synonymous with weight matrix W.). 

	Regarding claim 5, the combination of Goyal and Majumder teaches The electronic apparatus of claim 1, wherein the processor is further configured to perform an operation of multiplying a first element of the plurality of elements with each of a plurality of first data values belonging to a first row of the input data, perform an operation of multiplying the first elements with each of a plurality of second data values belonging to a second row of the input data, perform an operation of multiplying a second element of the plurality of elements with each of the plurality of first data values, and perform an operation of multiplying the second element with each of the plurality of second data values. (Goyal [¶0026] "The weight matrix W of N1×N2 is stored in column major form, wherein corresponding weights for the vector are also read once in blocks of size B at a time first from the first column and then from the second column, etc. Each time a block of weights is read from the weight matrix, they are multiplied element-wise with the block of the vector" FIG. 3 shows element-wise multiplication of source image data with a convolution filter. Column-wise multiplication is interpreted as being interchangeable with row-wise multiplication.). 
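The element-by-element ordering recited in claims 3 and 5 can be pictured with the following sketch, offered only as an illustration of the claim language and not as a description of either reference. It assumes a row-major ordering of the filter elements, a "valid" output size, and no kernel flip (cross-correlation, as is common in CNNs); the function name `conv2d_by_elements` and the `skip_zeros` flag are hypothetical.

```python
def conv2d_by_elements(data, kernel, skip_zeros=True):
    """Apply each filter element in turn to every row of the input data."""
    H, W = len(data), len(data[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = H - kh + 1, W - kw + 1
    out = [[0] * ow for _ in range(oh)]
    # Divide the 2D filter into a 1D sequence of (row offset, column offset, value).
    elements = [(i, j, kernel[i][j]) for i in range(kh) for j in range(kw)]
    for i, j, w in elements:
        if skip_zeros and w == 0:
            continue  # zero-valued elements are not input (cf. claim 3)
        for r in range(oh):        # first row of data, then second row, etc.
            for c in range(ow):
                out[r][c] += w * data[r + i][c + j]
    return out


data = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]
result = conv2d_by_elements(data, kernel)  # [[6, 8], [12, 14]]
```

Skipping the zero-valued elements leaves the result unchanged, since those elements contribute nothing to the accumulated sums.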

	Claims 10-12 and 14 are substantially similar to claims 1-3 and 5.  Therefore, the rejection applied to claims 1-3 and 5 also applies to claims 10-12 and 14. 

	Claim 19 is directed towards a non-transitory computer-readable recording medium having a program stored thereon which when executed performs the methods of claim 1.  Therefore the rejection applied to claim 1 also applies to claim 19.  Claim 19 mentions additional elements non-transitory computer-readable recording medium and program stored thereon ([¶0020] “In the example of FIG. 1, the system 100 includes a hardware-based programmable deep learning processor (DLP) 102, wherein the DLP 102 further includes at least a plurality of tensor engines (TEs) 104, which are dedicated hardware blocks/components each including one or more microprocessors and on-chip memory units storing software instructions programmed by a user for various machine learning operations. When the software instructions are executed by the microprocessors, each of the hardware components becomes a special purposed 

	Claims 4, 6, 13 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Goyal and Majumder, and in further view of Young (US 20160342890 A1).

	Regarding claim 4, the combination of Goyal and Majumder teaches The electronic apparatus of claim 1.
	However, the combination of Goyal and Majumder does not explicitly teach wherein the processor is further configured to control the operation module to perform the convolution operation by transferring an accumulation of values obtained by multiplying different data values of the input data with each of the plurality of elements to an adjacent processing element.  

Young, who teaches a related system for processing neural networks, teaches The electronic apparatus of claim 1, wherein the processor is further configured to control the operation module to perform the convolution operation by transferring an accumulation of values obtained by multiplying different data values of the input data with each of the plurality of elements to an adjacent processing element. ([¶0022] "The system generates accumulated values from the weight inputs and the activation inputs using a matrix multiplication unit of the special-purpose hardware circuit (step 106)" [¶0035] "On each clock cycle, each cell can 

	Goyal, Majumder, and Young are all directed towards partitioning neural networks for processing.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to transfer the accumulation results in the combination of Goyal and Majumder to parallel processing units as suggested by Young. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Young that the proposed system is capable of ([¶0008] “maximizing throughput in the circuit and avoiding stalling of the circuit. The circuit can efficiently perform the computation even if weight inputs are reused a different number of times at each layer.”).

	Regarding claim 6, the combination of Goyal and Majumder teaches wherein the predetermined direction is a direction in which the second element is disposed based on the first element in the two-dimensional filter. (Goyal FIG. 3 [¶0023] "Here, each kernel is a multi-dimensional" [¶0026] " The weight matrix W of N1×N2 is stored in column major form, wherein corresponding weights for the vector are also read once in blocks of size B at a time first from the first column and then from the second column, etc. Each time a block FIG. 3 shows that multiplication of the convolution filter is relative to an individual pixel of the source image, but that the multiplication is dependent on the pixels surrounding the source pixel.  Goyal also teaches that the 
	However, the combination of Goyal and Majumder does not explicitly teach that, when the operation for the first element is completed and the operation for the second elements starts in the first row, the processor is further configured to shift a plurality of operation values for the first element in a predetermined direction to perform an accumulation for the operation values.

Young, who teaches a related system for processing neural networks, teaches The electronic apparatus of claim 5, wherein when the operation for the first element is completed and the operation for the second elements starts in the first row, the processor is further configured to shift a plurality of operation values for the first element in a predetermined direction to perform an accumulation for the operation values, ([¶0039] "The summation circuitry can sum the product and the accumulated value from the sum in register 404 to generate a new accumulated value. The summation circuitry 410 can then send the new accumulated value to another sum in register located in a bottom adjacent cell. The new accumulated value can be used as an operand for a summation in the bottom adjacent cell." [¶0040] "The cell can also shift the weight input and the activation input to adjacent cells for processing. For example, the weight register 402 can send the weight input to another weight register in the bottom adjacent cell. " Sending the accumulated value and shifting corresponding weights and activations is interpreted as synonymous with shifting a plurality of operation values for the first element in a predetermined direction. See also FIG. 3). 
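The passing of an accumulated value to a bottom adjacent cell, as quoted from Young ¶0039, can be pictured with the simplified sketch below. It is an assumption-laden illustration only: it models a single column of cells, ignores clock timing and the shifting of weights and activations, and uses a hypothetical function name `systolic_column`.

```python
def systolic_column(weights, activations):
    """Pass a running sum down a column of cells, one cell per step."""
    acc = 0      # value arriving at the "sum in" register of the top cell
    trace = []   # accumulated value leaving each cell, top to bottom
    for w, a in zip(weights, activations):
        acc = acc + w * a   # the cell multiplies and adds the incoming accumulated value
        trace.append(acc)   # the new accumulated value goes to the bottom adjacent cell
    return acc, trace


total, trace = systolic_column([1, 2, 3], [4, 5, 6])  # (32, [4, 14, 32])
```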

	Goyal, Majumder, and Young are all directed towards partitioning neural networks for processing.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to transfer the accumulation results in the combination of Goyal and Majumder to parallel processing units as suggested by Young. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Young that the proposed system is capable of ([¶0008] “maximizing throughput in the circuit and avoiding stalling of the circuit. The circuit can efficiently perform the computation even if weight inputs are reused a different number of times at each layer.”).

	Claims 13 and 15 are substantially similar to claims 4 and 6.  Therefore, the rejection applied to claims 4 and 6 also applies to claims 13 and 15.

	Claims 7 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Goyal, Majumder, and Young and in further view of Montero 

	Regarding claim 7, Goyal teaches The electronic apparatus of claim 1,  the predetermined direction is a direction in which the order proceeding in one side direction is based on a certain element in the two-dimensional filter, (FIG. 3 [¶0026] " The weight matrix W of N1×N2 is stored in column major form, wherein corresponding weights for the vector are also read once in blocks of size B at a time first from the first column and then from the second column, etc. Each time a block FIG. 3 
	However, Goyal does not explicitly teach  wherein the processor is further configured to shift an operation value used in each row for the input data in a predetermined direction to perform an accumulation of operation values, and that after moving in the predetermined direction processing proceeds to an element which is adjacent to a corresponding element in a next row or a next column of the element positioned at the end of the proceeding direction, and proceeds in a direction opposite to one side direction in the adjacent element.  

Young teaches wherein the processor is further configured to shift an operation value used in each row for the input data in a predetermined direction to perform an accumulation of operation values, and ([¶0039] "The summation circuitry can sum the product and the accumulated value from the sum in register 404 to generate a new accumulated value. The summation circuitry 410 can then send the new accumulated value to another sum in register located in a bottom adjacent cell. The new accumulated value can be used as an operand for a summation in the bottom adjacent cell." [¶0040] "The cell can also shift the weight input and the activation input to adjacent cells for processing. For example, the weight register 402 can send the weight input to another weight register in the bottom adjacent cell."). 



	While the combination of Goyal, Majumder, and Young teaches proceeding to an element which is adjacent to a corresponding element in a next row or a next column of the element positioned at the end of the proceeding direction, the combination does not explicitly teach that the processing then proceeds in a direction opposite to one side direction in the adjacent element.  

Montero, who teaches a related art of encoding matrices, teaches that the processing then proceeds to an element which is adjacent to a corresponding element in a next row or a next column of the element positioned at the end of the proceeding direction, and proceeds in a direction opposite to one side direction in the adjacent element. (FIG. 2 Any direction is opposite to one side direction. Zig zag matrix traversal involves proceeding to an adjacent element then in an opposite direction.). 
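One possible reading of the traversal recited in claim 7 (proceed in one side direction, step to the adjacent element in the next row, then proceed in the opposite direction) is sketched below. Montero's FIG. 2 is not reproduced in this action and may depict a different (for example, diagonal) zig zag; the row-wise serpentine order and the function name `serpentine_order` are assumptions made for illustration.

```python
def serpentine_order(rows, cols):
    """Return (row, col) positions visited left-to-right, then right-to-left, alternating."""
    order = []
    for r in range(rows):
        # Even rows run in one side direction, odd rows in the opposite direction.
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((r, c) for c in cs)
    return order


path = serpentine_order(2, 3)
# [(0, 0), (0, 1), (0, 2), (1, 2), (1, 1), (1, 0)]
```

Under this reading, each step moves to an element adjacent to the previous one, including the step from the end of one row to the next row.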


	Claim 16 is substantially similar to claim 7.  Therefore the rejection applied to claim 7 also applies to claim 16.

	Claims 8 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Goyal and Majumder, and in further view of Cencini (US 20170264493 A1).

	Regarding claim 8, the combination of Goyal and Majumder teaches The electronic apparatus of claim 1, the processor is further configured to control the plurality of processing elements to perform operations according to a convolutional neural network (CNN) algorithm ([Abstract] "A hardware-based programmable deep learning processor (DLP) is proposed, wherein the DLP comprises with a plurality of accelerators dedicated for deep learning processing. Specifically, the 
	However, the combination of Goyal and Majumder does not explicitly teach that the plurality of processing elements form a network having a structure in which a tree topology network is coupled to a mesh topology network, or performing operations according to a recurrent neural network (RNN) algorithm using the network having the coupled structure.  

Cencini, who teaches a related art of distributing tasks across a network, teaches the plurality of processing elements form a network having a structure in which a tree topology network is coupled to a mesh topology network, and ([¶0160] "In some embodiments, the topology may be a structured topology, such as a tree topology like that shown in FIG. 7, or in other embodiments, the topology may be an unstructured topology, e.g., in other forms of mesh topologies." FIG. 7 shows a network of processing elements.).
	and a recurrent neural network (RNN) algorithm using the network having the coupled structure. ([¶0212] "A variety of different approaches may be implemented by the policy engine 166 to optimize an objective function...such as a recurrent neural network configured to optimize the objective function over time"). 



Claim 17 is substantially similar to claim 8.  Therefore, the rejection applied to claim 8 also applies to claim 17. 
	Claims 9 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Goyal, Majumder, and Cencini and in further view of Zhou (“A C-LSTM Neural Network for Text Classification”, 2015).

	Regarding claim 9, the combination of Goyal, Majumder, and Cencini teaches The electronic apparatus of claim 8, wherein the processor is further configured to control the plurality of processing elements to perform the operation according to the mesh topology network in a convolution layer and a pooling layer of the CNN algorithm (Goyal [¶0023] "one or more pooling (or sub-sampling) layers, each of 
	While the combination of Goyal, Majumder, and Cencini teaches performing operations according to a tree topology network associated with an RNN, the combination does not explicitly teach to perform the operation according to the tree topology network in a fully connected layer of the CNN algorithms and each layer of the RNN algorithm.  

Zhou, who teaches a related system of partitioning neural networks for processing, teaches to perform the operation according to the tree topology network in a fully connected layer of the CNN algorithms and each layer of the RNN algorithm. ([Sec. 2 ¶4] "With the ability of explicitly modeling time-series data, RNNs are being increasingly applied to sentence modeling. For example, Tai et al. (2015) adjusted the standard LSTM to tree-structured topologies and obtained superior results over a sequential LSTM on related tasks. Most of these models use multi-layer CNNs and train 

The combination of Goyal, Majumder, and Cencini is analogous to Zhou, as both are directed towards partitioning a neural network for processing.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the neural network system of the combination of Goyal, Majumder, and Cencini with the teachings of Zhou by using a CNN in combination with an RNN and implementing a tree-based topology. Zhou teaches as motivation for combination ([p. 98 Sec. A] “In this paper, we introduce a new architecture short for C-LSTM by combining CNN and LSTM to model sentences. To benefit from the advantages of both CNN and RNN, we design a simple end-to-end, unified architecture by feeding the output of a one-layer CNN into LSTM.”).
Claim 18 is substantially similar to claim 9.  Therefore the rejection applied to claim 9 also applies to claim 18.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720. The examiner can normally be reached M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 





/SB/Examiner, Art Unit 2124
/LUIS A SITIRICHE/Primary Examiner, Art Unit 2126