Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Remarks
This Office Action is responsive to Applicants' Amendment filed on October 31, 2022, in which claims 1, 2, 4, 10 and 19 are currently amended. Claims 5-6 and 14-15 are canceled. Claims 1-2, 4, and 7-11, 13, and 16-19 are currently pending. 

Response to Arguments
The rejections of claims 1-2, 4, and 7-9 under 35 U.S.C. § 112(b) are hereby withdrawn, as necessitated by Applicant's amendments and remarks.
Applicant's arguments with respect to the rejection of claims 1-2, 4, 7-11, 13, and 16-19 under 35 U.S.C. § 103, based on the amendment, have been considered and are persuasive. The arguments are moot, however, in view of the new ground of rejection set forth below.

Claim Objections
Claims 1, 10, and 19 are objected to because of the following informalities: "perform an operation of multiplying a first elements with" should read "perform an operation of multiplying a first element with". Appropriate correction is required.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: 
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

	Claims 1, 2, 4, 10-11, 13, and 19 are rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Goyal (US20170316312A1) and Ross (US9805304B2), and in further view of Chen (“Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks”, 2016).

	Regarding claim 1, Goyal teaches An electronic apparatus for performing machine learning, the electronic apparatus comprising: an integrated circuit configured to include a plurality of processing elements arranged in a predetermined pattern and share data between the plurality of processing elements which are adjacent to each other to perform an operation; and (FIG. 1 [Abstract] "the DLP includes a plurality of tensor engines configured to perform operations for pattern recognition and classification based on a neural network." [¶0024] "the retrieved data is multiplexed to the tensors engines 104 by a multiplexer/crossbar 110" The tensor engine is interpreted as synonymous with a processing element, and the DLP is interpreted as synonymous with the integrated circuit. FIG. 1 shows that data transfer between the crossbar and the tensor engines is bidirectional.)
	a processor configured to control the integrated circuit to perform a convolution operation by applying a filter to input data,([p. 252 §VIIC] "EIE has the potential to support 1x1 convolution and 3x3 Winograd convolution by turning the channel-wise reduction into an M ×V . Winograd convolution saves 2.25× multiplications than naive convolution [33], and for each Winograd patch the 16 M × V can be scheduled on an EIE."  Host is interpreted as synonymous with processor configured to control the integrated circuit.)
	wherein the processor is configured to divide a two-dimensional filter into a plurality of elements which is one dimensional data, (See FIG. 5A where the two-dimensional weight kernel is divided into one-dimensional vectors of size B.  Two-dimensional filter interpreted as synonymous with kernel.)
	and control the integrated circuit to perform the convolution operation by inputting each of the plurality of elements to the plurality of processing elements in a predetermined order and sequentially applying the plurality of elements to the input data.(FIG. 3 [¶0026] " The weight matrix W of N1×N2 is stored in column major form, wherein corresponding weights for the vector are also read once in blocks of size B at a time first from the first column and then from the second column, etc. Each time a block of weights is read from the weight matrix, they are multiplied element-wise with the block of the vector, summed, and added by the MatrixMul engine" FIG. 3 shows the two dimensional convolution filter while ¶0026 describes the predetermined order in which the elements are input and applied.)
	wherein the processor is further configured to: perform an operation of multiplying a first element of the identified plurality of elements with each of a plurality of first data values belonging to a first row of the input data, ([¶0026] "The weight matrix W of N1×N2 is stored in column major form, wherein corresponding weights for the vector are also read once in blocks of size B at a time first from the first column and then from the second column, etc. Each time a block of weights is read from the weight matrix, they are multiplied element-wise with the block of the vector, summed, and added by the MatrixMul engine 408 as a partial sum to the corresponding output value" [¶0027] "In some embodiments, the MatrixMul engine 408 in each tensor engine 104 is configured to achieve efficient vector-matrix multiplication by minimizing or avoiding data movement for multiplication between a sparse vector and a dense or sparse matrix, wherein only data that corresponds to non-zero values in the sparse vector is loaded into memory 406 of the tensor engine 104 upon request. For scalable matrix-matrix multiplication, the DLP 102 is configured to partition a large dense or sparse matrix into smaller portions and distribute the portions of the matrix across multiple tensor engines 104. In some embodiments, separate Compressed Sparse Row (CSR) or Compressed Sparse Column (CSC) Format can be adopted for the corresponding portion of the large matrix distributed to each of the tensor engines 104. The MatrixMul engine 408 of each tensor engine 104 is then configured to perform a matrix-matrix multiplication on its corresponding portion of the partitioned matrix to speed up the matrix-matrix multiplication." [¶0029] "In the example of FIG. 4, the ConvNet engine 410 in each tensor engine 104 is configured to explore sparsity of the vectors and/or matrices across the spectrum of various convolution layers of the neural network for efficient convolution. During convolution on the network, a rectified linear unit (ReLU), which applies an activation function defined as f(x)=max(0, x) where x is an input to a neuron, is widely used. As a result of such ReLU application, the resulting/output matrices become increasingly sparse as the data processes along the processing pipeline." FIG. 5A shows a first row of the input data and block of size B of said first row of input data interpreted as synonymous with a plurality of first data values.  Block of weights interpreted as synonymous with a first element.)
	perform an operation of multiplying a first elements with each of a plurality of second data values belonging to a second row of the input data, ([¶0026] "The weight matrix W of N1×N2 is stored in column major form, wherein corresponding weights for the vector are also read once in blocks of size B at a time first from the first column and then from the second column, etc. Each time a block of weights is read from the weight matrix, they are multiplied element-wise with the block of the vector, summed, and added by the MatrixMul engine 408 as a partial sum to the corresponding output value," weight vector for second column interpreted as synonymous with plurality of second data values belonging to a second row of the input data.  Vector block interpreted as synonymous with a first element)
	and an operation of multiplying a second element of the identified plurality of elements with each of the plurality of first data values, and perform an operation of multiplying the second element with each of the plurality of second data values, ([¶0026] "The weight matrix W of N1×N2 is stored in column major form, wherein corresponding weights for the vector are also read once in blocks of size B at a time first from the first column and then from the second column, etc. Each time a block of weights is read from the weight matrix, they are multiplied element-wise with the block of the vector, summed, and added by the MatrixMul engine 408 as a partial sum to the corresponding output value...the MatrixMul engine 408 as a partial sum to the corresponding output value, which is updated N1/B times during the vector-matrix multiplication" [¶0027] "In some embodiments, the MatrixMul engine 408 in each tensor engine 104 is configured to achieve efficient vector-matrix multiplication by minimizing or avoiding data movement for multiplication between a sparse vector and a dense or sparse matrix, wherein only data that corresponds to non-zero values in the sparse vector is loaded into memory 406 of the tensor engine 104 upon request. For scalable matrix-matrix multiplication, the DLP 102 is configured to partition a large dense or sparse matrix into smaller portions and distribute the portions of the matrix across multiple tensor engines 104. In some embodiments, separate Compressed Sparse Row (CSR) or Compressed Sparse Column (CSC) Format can be adopted for the corresponding portion of the large matrix distributed to each of the tensor engines 104. The MatrixMul engine 408 of each tensor engine 104 is then configured to perform a matrix-matrix multiplication on its corresponding portion of the partitioned matrix to speed up the matrix-matrix multiplication." [¶0029] "In the example of FIG. 4, the ConvNet engine 410 in each tensor engine 104 is configured to explore sparsity of the vectors and/or matrices across the spectrum of various convolution layers of the neural network for efficient convolution. During convolution on the network, a rectified linear unit (ReLU), which applies an activation function defined as f(x)=max(0, x) where x is an input to a neuron, is widely used. As a result of such ReLU application, the resulting/output matrices become increasingly sparse as the data processes along the processing pipeline."  Goyal explicitly teaches that there are N1/B elements where it would be obvious to one of ordinary skill in the art from FIG. 5A and 5B that N1/B > 1.)
	and wherein the predetermined direction is a direction in which the second element is disposed based on the first element in the two-dimensional filter (FIG. 3 [¶0023] "Here, each kernel is a multi-dimensional" [¶0026] "The weight matrix W of N1×N2 is stored in column major form, wherein corresponding weights for the vector are also read once in blocks of size B at a time first from the first column and then from the second column, etc. Each time a block..." FIG. 3 shows that multiplication of the convolution filter is relative to an individual pixel of the source image, but that the multiplication is dependent on the pixels surrounding the source pixel.  Goyal also teaches that the multiplication is done procedurally column-wise.  Kernel is interpreted as synonymous with filter.).
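
	For illustration only, the blocked vector-matrix multiplication described in the quoted passage of Goyal ¶0026 can be sketched as follows. This is a minimal sketch under stated assumptions, not Goyal's implementation: the function name blocked_vecmat, the list-of-lists matrix layout, and the requirement that N1 be a multiple of B are all hypothetical choices made for the example.

```python
def blocked_vecmat(v, W, B):
    """Compute y = v @ W, reading v and each column of W in blocks of size B.

    v : list of length N1
    W : N1 x N2 matrix as a list of rows (hypothetical layout for this sketch)
    B : block size; N1 is assumed to be a multiple of B
    Returns (y, updates), where updates is the number of times each output
    value received a partial sum (N1 / B).
    """
    n1, n2 = len(W), len(W[0])
    assert len(v) == n1 and n1 % B == 0
    y = [0] * n2
    updates = 0
    for start in range(0, n1, B):              # one block of the vector at a time
        v_block = v[start:start + B]
        for col in range(n2):                  # first column, then second, etc.
            w_block = [W[row][col] for row in range(start, start + B)]
            # multiply element-wise, sum, and add as a partial sum
            y[col] += sum(a * b for a, b in zip(v_block, w_block))
        updates += 1
    return y, updates


if __name__ == "__main__":
    y, updates = blocked_vecmat([1, 2, 3, 4], [[1, 0], [0, 1], [1, 0], [0, 1]], B=2)
    print(y, updates)
```

	In this sketch each output value is updated N1/B times, which corresponds to the statement in the quoted passage that the output value "is updated N1/B times during the vector-matrix multiplication".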
	However, Goyal does not explicitly teach identify a plurality of elements, except for at least one elements having a zero value among the plurality of elements
	wherein when the operation for the first element is completed and the operation for the second elements starts in the first row the processor is further configured to shift a plurality of operation values for the first element in a predetermined direction by a predetermined interval and perform an accumulation for the operation values, 
	and the predetermined interval is determined based on a number of the zero values existing between the first element and the second element in the two-dimensional filter.

	Ross, in the same field of endeavor, teaches identify a plurality of elements, except for at least one elements having a zero value among the plurality of elements([Col. 8 l. 10-20] " If the weight control register 414 stores a non-zero control value, the weight register 402 can ignore the weight input sent by the weight path register 412" Ross explicitly teaches identifying specifically non-zero elements which is interpreted as synonymous with identifying an element except for at least one element having a zero value.  FIG. 8 shows a plurality of weight control value elements except for at least one element having a zero value (see row for Control 806).)
	wherein when the operation for the first element is completed and the operation for the second elements starts in the first row, the processor is further configured to shift a plurality of operation values for the first element in a predetermined direction by a predetermined interval and perform an accumulation for the operation values, ([Col. 10 l. 15-22] "Activation inputs can be shifted in a similar fashion in the other dimension, e.g., from left to right. Once the activation inputs and the weight inputs are in place, the processor can perform a convolution calculation, e.g., by using the multiplication and summation circuitries within the cells, to generate a set of accumulated values to be used in a vector computation unit.")
	and the predetermined interval is determined based on a number of the zero values existing between the first element and the second element in the two-dimensional filter.([Col. 6 l. 6-20] "In some implementations, a host interface, e.g., the host interface 202 of FIG. 2, shifts activation inputs throughout the array 306 along one dimension, e.g., to the right, while shifting weight inputs throughout the array 306 along another dimension, e.g., to the bottom. For example, over one clock cycle, the activation input at cell 314 can shift to an activation register in cell 316, which is to the right of cell 314. Similarly, the weight input at cell 316 can shift to a weight register at cell 318, which is below cell 314." [Col. 10 l. 35-41] "In some implementations, if a control value in a given weight sequencer is non-zero, a weight input in a corresponding cell of the systolic array will shift to an adjacent cell. If the control value in the given weight sequencer is zero, the weight input can be loaded into the corresponding cell and used to compute a product with an activation input in the cell." Ross explicitly teaches that the matrix multiplication structure is a kernel matrix structure for convolution kernels such that a kernel is interpreted as synonymous with a filter of the claimed invention.  Shifting if the control value in the kernel is non-zero is interpreted as synonymous with being based on the number of zeros between a first and second element in the two-dimensional filter.).
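
	For illustration only, one possible reading of the shift-and-accumulate limitation recited above can be sketched for a single filter row applied to a single input row. This is a hedged sketch, not the implementation of Ross or of the claimed invention: the function name sparse_row_conv, the use of Python lists, and the assumption that the shift interval equals one plus the number of zero values between consecutive non-zero filter elements are all hypothetical.

```python
def sparse_row_conv(x, f):
    """Valid 1-D correlation out[i] = sum_j f[j] * x[i + j], skipping zero
    filter elements. Assumes len(x) >= len(f).

    Each non-zero filter element is multiplied with every value of the row.
    Before the next non-zero element is applied, the accumulated values are
    shifted by an interval of 1 + (number of zeros between the two elements).
    """
    n, k = len(x), len(f)
    nonzero = [(j, w) for j, w in enumerate(f) if w != 0]
    if not nonzero:
        return [0] * (n - k + 1)
    acc = [0] * n
    prev_j = nonzero[0][0]
    for j, w in nonzero:
        gap = j - prev_j                       # 0 for the first non-zero element
        if gap:
            acc = [0] * gap + acc[:n - gap]    # shift accumulated values by the interval
        acc = [a + w * xv for a, xv in zip(acc, x)]
        prev_j = j
    # acc[t] now holds the output for index t - prev_j
    return [acc[i + prev_j] for i in range(n - k + 1)]


def naive_row_conv(x, f):
    """Reference computation that multiplies by every filter element, zeros included."""
    k = len(f)
    return [sum(f[j] * x[i + j] for j in range(k)) for i in range(len(x) - k + 1)]


if __name__ == "__main__":
    print(sparse_row_conv([1, 2, 3, 4, 5], [1, 0, 2]))
```

	With the filter row [1, 0, 2], the accumulated values are shifted by two positions between the first and second non-zero elements, because one zero value lies between them.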

	Goyal as well as Ross are directed towards hardware accelerators for convolutional neural networks.  Therefore, Goyal as well as Ross are analogous art in the same field of endeavor.  It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Goyal with the teachings of Ross by having a zero aware shift mechanism for matrix multiplications.  Ross provides as additional motivation for combination ([Col. 1 l. 35-61] "using a sliding window, the software then can shift the kernel to overlay a second portion of activation inputs and calculate another dot product corresponding to another activation input. The software can repeatedly perform this process until each activation input has a corresponding dot product. In some implementations, the dot products are input to an activation function, which generates activation values. The activation values can be combined, e.g., pooled, before being sent to a subsequent layer of the neural network.").  This motivation for combination also applies to the remaining claims which depend on this combination.

	While the combination of Goyal and Ross teaches the claim limitations as drafted, and the references are in the same field of endeavor, the combination of Goyal and Ross does not explicitly teach that zero-valued weight/activation values can be excepted from the plurality of weight/activation values.  While the reference Chen is not necessarily relied upon to teach the instant claim limitations, it is introduced to reinforce the obviousness of excepting weight/activation values in a CNN accelerator.  Chen, in the same field of endeavor, teaches to identify a plurality of elements, except for at least one elements having a zero value among the plurality of elements ([p. 134 §IV] "Fig. 12. PE architecture. The datapaths in red show the data gating logic to skip the processing of zero ifmap data.").

	The combination of Goyal and Ross, as well as Chen, are directed towards hardware accelerators for convolutional neural networks.  Therefore, the combination of Goyal and Ross, as well as Chen, are analogous art in the same field of endeavor.  It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the combination of Goyal and Ross with the teachings of Chen by excepting weight/activation values from the plurality of weight/activation values.  Chen provides as additional motivation for combination ([Abstract] "Eyeriss is an accelerator for state-of-the-art deep convolutional neural networks (CNNs). It optimizes for the energy efficiency of the entire system, including the accelerator chip and off-chip DRAM, for various CNN shapes by reconfiguring the architecture...Eyeriss achieves these goals by using a proposed processing dataflow, called row stationary (RS), on a spatial architecture with 168 processing elements. RS dataflow reconfigures the computation mapping of a given shape, which optimizes energy efficiency by maximally reusing data locally to reduce expensive data movement, such as DRAM accesses. Compression and data gating are also applied to further improve energy efficiency.").  This motivation for combination also applies to the remaining claims which depend on this combination.
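
	For illustration only, the data gating quoted from Chen ("skip the processing of zero ifmap data") can be sketched as a multiply-accumulate loop that bypasses zero input values. This is a minimal software sketch under stated assumptions, not Chen's hardware datapath: the function name gated_mac and the skipped counter are hypothetical.

```python
def gated_mac(ifmap, weights):
    """Multiply-accumulate over paired values, skipping zero ifmap data.

    Returns (acc, skipped), where skipped counts the multiplications that
    were gated off because the input value was zero.
    """
    acc = 0
    skipped = 0
    for a, w in zip(ifmap, weights):
        if a == 0:            # gate: a zero input contributes nothing to the sum
            skipped += 1
            continue
        acc += a * w
    return acc, skipped


if __name__ == "__main__":
    print(gated_mac([0, 3, 0, 2], [5, 1, 7, 4]))
```

	The accumulated result is identical to an ungated multiply-accumulate; only the number of multiplications performed changes.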

	Regarding claim 2, the combination of Goyal, Ross, and Chen teaches The electronic apparatus of claim 1, wherein the processor controls the integrated circuit to perform the convolution operation by applying each of the plurality of elements to two-dimensional or three-dimensional input data. (Goyal [¶0026] "corresponding weights for the vector are also read once in blocks of size B at a time first from the first column and then from the second column, etc. Each time a block of weights is read from the weight matrix, they are multiplied element-wise with the block of the vector, summed, and added by the MatrixMul engine" The input layer of FIG. 3 is two-dimensional data.).
	
	 Regarding claim 4, the combination of Goyal, Ross, and Chen teaches The electronic apparatus of claim 1, wherein the processor is further configured to control the integrated circuit to perform the convolution operation by transferring an accumulation of values obtained by multiplying different data values of the input data with each of the plurality of elements to an adjacent processing element.(Ross [Col. 1 l. 20-40] " Each cell is configured to pass a weight control signal to an adjacent cell, the weight control signal causing circuitry in the adjacent cell to shift or load a weight input for the adjacent cell. A weight path register configured to store the weight input shifted to the cell; a weight register coupled to the weight path register; a weight control register configured to determine whether to store the weight input in the weight register; an activation register configured to store an activation input and configured to send the activation input to another activation register in a first adjacent cell along the first dimension;" [Col. 10 l. 15-22] "Activation inputs can be shifted in a similar fashion in the other dimension, e.g., from left to right. Once the activation inputs and the weight inputs are in place, the processor can perform a convolution calculation, e.g., by using the multiplication and summation circuitries within the cells, to generate a set of accumulated values to be used in a vector computation unit.").	

	Claims 10-11 and 13 are substantially similar to claims 1-2 and 4.  Therefore, the rejections applied to claims 1-2 and 4 also apply to claims 10-11 and 13. 

	Claim 19 is directed towards a non-transitory computer-readable recording medium having a program stored thereon which when executed performs the methods of claim 1.  Therefore the rejection applied to claim 1 also applies to claim 19.  Claim 19 mentions additional elements non-transitory computer-readable recording medium and program stored thereon (Goyal [¶0020] “In the example of FIG. 1, the system 100 includes a hardware-based programmable deep learning processor (DLP) 102, wherein the DLP 102 further includes at least a plurality of tensor engines (TEs) 104, which are dedicated hardware blocks/components each including one or more microprocessors and on-chip memory units storing software instructions programmed by a user for various machine learning operations. When the software instructions are executed by the microprocessors, each of the hardware components becomes a special purposed hardware component for practicing certain deep learning functions as discussed in detail below”).  

	Claims 7 and 16 are rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Goyal, Ross, Chen, and Mishra (US20160100193A1).

	 Regarding claim 7, the combination of Goyal, Ross, and Chen teaches The electronic apparatus of claim 1, wherein the processor is further configured to shift an operation value used in each row for the input data in a predetermined direction to perform an accumulation of operation values, and(Ross [Col. 10 l. 15-22] "Activation inputs can be shifted in a similar fashion in the other dimension, e.g., from left to right. Once the activation inputs and the weight inputs are in place, the processor can perform a convolution calculation, e.g., by using the multiplication and summation circuitries within the cells, to generate a set of accumulated values to be used in a vector computation unit.")
	wherein the predetermined direction is a direction in which the order proceeding in one side direction is based on a certain element in the two-dimensional filter, (Goyal FIG. 3 [¶0026] "The weight matrix W of N1×N2 is stored in column major form, wherein corresponding weights for the vector are also read once in blocks of size B at a time first from the first column and then from the second column, etc. Each time a block..." FIG. 3 shows that multiplication of the convolution filter is relative to an individual pixel of the source image, but that the multiplication is dependent on the pixels surrounding the source pixel.  Goyal also teaches that the multiplication is done procedurally column-wise.).
	However, the combination of Goyal, Ross, and Chen does not explicitly teach proceeds to an element which is adjacent to a corresponding element in a next row or a next column of the element positioned at the end of the proceeding direction, and proceeds in a direction opposite to one side direction in the adjacent element.

	Mishra, in the same field of endeavor, teaches proceeds to an element which is adjacent to a corresponding element in a next row or a next column of the element positioned at the end of the proceeding direction, and proceeds in a direction opposite to one side direction in the adjacent element.([¶0067] "The input buffer 402 may store an N×N input block of input values that are to be transformed. If the transform module architecture 400 is used to implement a transform module, then the input buffer 402 may store residual pixel values of a residual block. If the transform module architecture 400 is used to implement an inverse transform module, then the input buffer 402 may store transform coefficients (e.g., in a frequency domain) of a transform block. The input buffer 402 may be implemented as any of a variety of buffers (e.g., as an inverse zig-zag buffer)." [¶0105] " Furthermore, the rows may be iterated from top-to-bottom or from bottom-to-top. The transform-dependent coefficients may selected such that the columns are iterated from left-to-right or from right-to-left. Zig-zag, alternating, or other scan patterns may also be used.").
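
	For illustration only, one possible reading of the traversal order recited in claim 7 can be sketched as a row-wise back-and-forth (serpentine) scan that flattens a two-dimensional filter into one-dimensional data. This is a hedged sketch and not Mishra's scan: the function name serpentine_flatten is hypothetical, and the row-wise serpentine order is an assumption that differs from the diagonal zig-zag scan commonly used in video coding.

```python
def serpentine_flatten(filt):
    """Flatten a 2-D filter (list of rows) into a 1-D list.

    Even-indexed rows are read left-to-right. On reaching the end of a row,
    the scan moves to the adjacent element in the next row and proceeds in
    the opposite direction.
    """
    out = []
    for r, row in enumerate(filt):
        out.extend(row if r % 2 == 0 else reversed(row))
    return out


if __name__ == "__main__":
    print(serpentine_flatten([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

	In this order every consecutive pair of elements in the flattened list is adjacent in the original two-dimensional filter.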

	The combination of Goyal, Ross, and Chen as well as Mishra are directed towards accelerating neural networks.  Therefore, the combination of Goyal, Ross, and Chen as well as Mishra are analogous art in the same field of endeavor.  It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the combination of Goyal, Ross, and Chen with the teachings of Mishra by using a zigzag transform to convert the 2D matrix into 1D vector data. Mishra gives as an additional motivation for combination ([¶0120] "When lower performance is required, the cycle count budget may be decreased. The architecture may accordingly scale by reducing the input buffer read bandwidth, K, and the number of multipliers in each PE. When this reduction is performed, the associated die area may be reduced. The scalability may reduce or eliminate the requirement of redesigning an architecture for different performance requirements. Accordingly, both time and cost may be saved through a scalable architecture in accordance with the disclosed principles.").  

Claim 16 is substantially similar to claim 7.  Therefore, the rejection applied to claim 7 also applies to claim 16.

	Claims 8 and 17 are rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Goyal, Ross, Chen, and Cencini (US 20170264493 A1).

	Regarding claim 8, the combination of Goyal, Ross, and Chen teaches the processor is further configured to control the plurality of processing elements to perform operations according to a convolutional neural network (CNN) algorithm (Goyal [Abstract] "A hardware-based programmable deep learning processor (DLP) is proposed, wherein the DLP comprises with a plurality of accelerators dedicated for deep learning processing. Specifically, the DLP includes a plurality of tensor engines configured to perform operations for pattern recognition and classification based on a neural network...one or more convolutional network (ConvNet) engines each configured to perform a plurality of efficient convolution operations on sparse or dense matrices").
	However, the combination of Goyal, Ross, and Chen does not explicitly teach The electronic apparatus of claim 1, wherein the plurality of processing elements form a network having a structure in which a tree topology network is coupled to a mesh topology network,
	and a recurrent neural network (RNN) algorithm using the network having the coupled structure.

	Cencini, in the same field of endeavor, teaches The electronic apparatus of claim 1, wherein the plurality of processing elements form a network having a structure in which a tree topology network is coupled to a mesh topology network, ([¶0160] "In some embodiments, the topology may be a structured topology, such as a tree topology like that shown in FIG. 7, or in other embodiments, the topology may be an unstructured topology, e.g., in other forms of mesh topologies." FIG. 7 shows a network of processing elements.)
	and a recurrent neural network (RNN) algorithm using the network having the coupled structure.([¶0212] "A variety of different approaches may be implemented by the policy engine 166 to optimize an objective function...such as a recurrent neural network configured to optimize the objective function over time").

	The combination of Goyal, Ross, and Chen as well as Cencini are directed towards accelerating neural networks.  Therefore, the combination of Goyal, Ross, and Chen as well as Cencini are analogous art in the same field of endeavor.  It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the combination of Goyal, Ross, and Chen with the teachings of Cencini by combining the network representations.  Cencini provides as additional motivation for combination ([¶0161] “systems with rigid, predefined, unchangeable roles may be relatively sensitive to failure by any one computing device, as often those systems require human intervention to replace that one computing device or otherwise reconfigure the system. In contrast, some embodiments may be fault tolerant and resilient to failures by computing devices, applications therein, and network.”).  This motivation for combination also applies to the claims dependent on this combination.

Claim 17 is substantially similar to claim 8.  Therefore, the rejection applied to claim 8 also applies to claim 17. 

	Claims 9 and 18 are rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Goyal, Ross, Chen, Cencini, and Zhou (“A C-LSTM Neural Network for Text Classification”, 2015).

	 Regarding claim 9, the combination of Goyal, Ross, Chen, and Cencini teaches The electronic apparatus of claim 8, wherein the processor is further configured to control the plurality of processing elements to perform the operation according to the mesh topology network in a convolution layer and a pooling layer of the CNN algorithm (Goyal [¶0023] "one or more pooling (or sub-sampling) layers, each of which is configured to aggregate information/data amongst a set of neighbors of a neuron of the current layer and one or more classification layers, each of which is configured to perform a linear or multi-layer perceptron (MLP) operation on the FC neural network and apply a non-linear activation function to output from the neuron." [¶0029] "In the example of FIG. 4, the ConvNet engine 410 in each tensor engine 104 is configured to explore sparsity of the vectors and/or matrices across the spectrum of various convolution layers of the neural network for efficient convolution." Goyal teaches that the pooling layer, the fully connected (FC) layer, and convolution layers of the CNN algorithm operate with respect to the topology of the neural network.  Cencini teaches using a recurrent neural network and the network topologies.).
	However, the combination of Goyal, Ross, Chen, and Cencini does not explicitly teach and perform the operation according to the tree topology network in a fully connected layer of the CNN algorithms and each layer of the RNN algorithm.

	Zhou, in the same field of endeavor, teaches and perform the operation according to the tree topology network in a fully connected layer of the CNN algorithms and each layer of the RNN algorithm.([Sec. 2 ¶4] "With the ability of explicitly modeling time-series data, RNNs are being increasingly applied to sentence modeling. For example, Tai et al. (2015) adjusted the standard LSTM to tree-structured topologies and obtained superior results over a sequential LSTM on related tasks. Most of these models use multi-layer CNNs and train CNNs and RNNs separately or throw the output of a fully connected layer of CNN into RNN as inputs." Zhou explicitly teaches using a CNN in combination with an RNN and making use of a tree based topology, as well as teaching the advantages of doing so.).

	The combination of Goyal, Ross, Chen, and Cencini, as well as Zhou, is directed towards accelerating neural networks.  Therefore, the combination of Goyal, Ross, Chen, and Cencini and Zhou are analogous art in the same field of endeavor.  It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the combination of Goyal, Ross, Chen, and Cencini with the teachings of Zhou by using a CNN in combination with an RNN and implementing a tree-based topology. Zhou teaches, as motivation for the combination ([p. 98 Sec. A] “In this paper, we introduce a new architecture short for C-LSTM by combining CNN and LSTM to model sentences. To benefit from the advantages of both CNN and RNN, we design a simple end-to-end, unified architecture by feeding the output of a one-layer CNN into LSTM.”).

Claim 18 is substantially similar to claim 9.  Therefore, the rejection applied to claim 9 also applies to claim 18.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720. The examiner can normally be reached M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached on (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SB/Examiner, Art Unit 2124
/VIKER A LAMARDO/Primary Examiner, Art Unit 2126