DETAILED ACTION
Introduction
This office action is in response to Applicant’s submission filed on 12/04/2020. Claims 1-8 are pending in the application and have been examined.
	
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-8 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a mathematical calculation without significantly more.
Claim 1 recites that intermediate data are updated to correspond to a convolution result of the updated row data of the input data.
The limitation that intermediate data are updated to correspond to a convolution result of the updated row data of the input data, as drafted, is a process that, under its broadest reasonable interpretation, covers mathematical calculations/optimizations/operations on "row data," "input data," and "intermediate data." For example, “updated” in the context of this claim encompasses the calculation of the convolution result of intermediate data based on the input data. Similarly, the limitation that an addressing sequence of the updated input data is adjusted to perform an operation on the updated input data, as drafted, is a process that, under its broadest reasonable interpretation, covers mathematical calculations/optimizations/operations on "addressing" and "input data." For example, “adjusted to perform” in the context of this claim encompasses the computation of the address of the storage location of the input data. If a claim limitation, under its broadest reasonable interpretation, covers mathematical computation, then it falls within the “mathematical formula or calculation” grouping of abstract ideas. Accordingly, the claim recites a mathematical concept.
This judicial exception is not integrated into a practical application. In particular, the claim recites only one additional element: using a speech feature reuse-based storage method. This additional element does not integrate the mathematical concept into a practical application because it does not impose any meaningful limits on practicing the mathematical computation. The claim is directed to a mathematical concept. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Similarly, claim 2 is directed to computing the addressing sequence based on a circular shift, which, under its broadest reasonable interpretation, covers mathematical computation and therefore falls within the “mathematical formula or calculation” grouping of abstract ideas. Accordingly, the claim recites a mathematical concept. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Similarly, claims 3-7 are directed to whether the updated row number of the input data is equal or not equal to a convolution step size, which, under its broadest reasonable interpretation, covers mathematical computation and therefore falls within the “mathematical formula or calculation” grouping of abstract ideas. Accordingly, the claims recite a mathematical concept. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 8 is directed to a method of computing a keyword-spotting convolutional neural network pooling layer based on claim 1. Under its broadest reasonable interpretation, the claim covers mathematical computation and therefore falls within the “mathematical formula or calculation” grouping of abstract ideas. Accordingly, the claim recites a mathematical concept. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
	
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-8 are rejected under 35 U.S.C. 103 as being unpatentable over S. Zheng et al., "An Ultra-Low Power Binarized Convolutional Neural Network-Based Speech Recognition Processor With On-Chip Self-Learning," in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 66, no. 12, pp. 4648-4661, Dec. 2019 in view of Ding, W., Huang, Z., Huang, Z., Tian, L., Wang, H., & Feng, S. (2019). Designing efficient accelerator of depthwise separable convolutional neural network on FPGA. Journal of Systems Architecture, 97, 278-286.
Regarding claim 1, Zheng teaches a speech feature reuse-based storing and calculating compression method for a keyword-spotting CNN, 
when each frame of input data arrives (see Zheng, pg. 4653, sect IV A, Fig. 6a, each frame (40-dimension feature vector) is combined with 10 contiguous frames into an 11×40 sized feature map. Therefore, two consecutive feature maps have 10 frames in common (shaded)), wherein a part of rows of data of a previous frame of input data is replaced with updated row data of input data of a current frame (see Zheng, pg. 4653, sect IV A, We address this problem by exploiting frame-level data reuse. The immediate data in 4 CONV layers generated from the starting input speech feature map (Fmap 1) are buffered as follows. First 3 layers buffer the last two rows of their output feature map, and the final layer buffers the whole output feature map except the oldest row), an addressing sequence of the updated input data is adjusted to perform an operation on the updated input data and a convolution kernel in an arrival sequence of the input data (see Zheng, pg. 4653, sect IV A, In cooperation with the above frame-level activation reuse, we propose a memory partitioning technique to ensure conflict-free data loading, as illustrated in Fig. 6(d). In the proposed computing data-flow, features from 3 rows of input feature maps (across 32/64 channels) are needed for XNOR computation in one cycle. Therefore, parallel memory access of multiple rows is demanded in BCNN computation. We partition the feature buffer into 3 banks, ensuring the potential to provide multiple data stream. The mapping of data in the memory is decided by its row index modulo 3, e.g. the 4th row is stored in bank 4%3 = 1. As shown in the Fig. 6(b), in this way, data from arbitrary 3 consecutive rows can be read out without conflict, e.g., rows 3,4,5 are loaded from bank 0,1,2, respectively), and intermediate data are updated to correspond to a convolution result of the updated row data of the input data under the two conditions that the updated row number of the input data is equal to a convolution step size (see Zheng, pg. 4653, sect IV A, The immediate data in 4 CONV layers generated from the starting input speech feature map (Fmap 1) are buffered as follows. First 3 layers buffer the last two rows of their output feature map, and the final layer buffers the whole output feature map except the oldest row. For the consequent speech inputs (Fmap 2), only the newest 3 frames (frame 10–12 in Fig. 6(c)) of the input feature maps are used to compute the non-overlapping output row. And this row is combined with 2 buffered rows of the first layer to generate the non-overlapping row for next layer. The buffered features are updated by adding the newly generated row and discarding the oldest row of each channel). However, Zheng fails to teach the condition that the updated row number of the input data is not equal to the convolution step size.
Ding teaches the condition that the updated row number of the input data is not equal to the convolution step size (see Ding, pg. 282, sect. 4.3, To reduce the number of data access, this paper proposes a method of horizontal row priority computing pattern as shown in Fig. 9. The input tile matrix and the M = 10 tile matrices in the weight matrix are performed simultaneously every cycle, and each tile is calculated in parallel, which can speed up the matrix calculation and reduce the data read times; the parallel calculation operation is interpreted as the updated row number not being equal to the convolution step size).
Zheng and Ding are considered to be analogous to the claimed invention because they relate to depthwise separable CNNs, which are orthogonal to model compression. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Zheng on quantizing network parameters with low bit-width at the algorithmic level with the teachings of Ding on a depthwise separable convolutional neural network accelerator in which all layers work concurrently in a pipelined fashion, in order to improve system throughput and performance (see Ding, pg. 279, sect. 1).
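For illustration only, the frame-level data reuse quoted from Zheng in the rejection of claim 1 can be sketched as follows. This is a simplified model and not Zheng's implementation: each output "row" is reduced to a single number by assuming the kernel spans the full feature width, and the function names `conv_rows` and `incremental_update` are hypothetical.

```python
def row_response(rows, kernel):
    """Dot product of kernel rows with the same number of input rows."""
    return sum(
        sum(x * w for x, w in zip(row, k_row))
        for row, k_row in zip(rows, kernel)
    )

def conv_rows(rows, kernel):
    """Full recomputation: one output per position of the kernel along the row (time) axis."""
    kh = len(kernel)
    return [row_response(rows[i:i + kh], kernel) for i in range(len(rows) - kh + 1)]

def incremental_update(prev_out, rows, kernel):
    """Reuse: the window advanced by one frame, so drop the oldest output
    and compute only the newest one from the last len(kernel) rows."""
    newest = row_response(rows[-len(kernel):], kernel)
    return prev_out[1:] + [newest]

# Twelve hypothetical frames of two features each; windows of 11 frames.
frames = [[float(i), float(i + 1)] for i in range(12)]
kernel = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
first = conv_rows(frames[0:11], kernel)
second = incremental_update(first, frames[1:12], kernel)
```

Under these assumptions the incremental result matches a full recomputation over the second window while evaluating the kernel only once.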
Regarding claim 2, Zheng in view of Ding teaches the speech feature reuse-based storing and calculating compression method for the keyword-spotting CNN according to claim 1. Zheng further teaches wherein adjusting the addressing sequence of the updated input data comprises circularly shifting the data addressing sequence down by m bits, m being the updated row number of the input data (see Zheng, pg. 4653, sect IV A, We partition the feature buffer into 3 banks, ensuring the potential to provide multiple data stream. The mapping of data in the memory is decided by its row index modulo 3, e.g. the 4th row is stored in bank 4%3 = 1. As shown in the Fig. 6(b), in this way, data from arbitrary 3 consecutive rows can be read out without conflict, e.g., rows 3,4,5 are loaded from bank 0,1,2, respectively).
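As a hedged illustration of the claim 2 limitation, the sketch below shows one way a circular shift of an addressing sequence by m positions can keep the read order aligned with arrival order when the m oldest rows are overwritten in place. The helper names `rotate_addresses` and `push_rows` are hypothetical, and the shift direction is an assumption, since neither the claim language nor the cited passage fixes it.

```python
def rotate_addresses(addresses, m):
    """Circularly shift the read order by m positions."""
    m %= len(addresses)
    return addresses[m:] + addresses[:m]

def push_rows(buffer, addresses, new_rows):
    """Overwrite the oldest slots in place, then rotate the read order.
    Assumes len(new_rows) <= len(buffer)."""
    for slot, row in zip(addresses, new_rows):
        buffer[slot] = row
    return rotate_addresses(addresses, len(new_rows))

buffer = ["r0", "r1", "r2"]          # physical slots 0, 1, 2
order = [0, 1, 2]                    # read order, oldest row first
order = push_rows(buffer, order, ["r3"])            # m = 1
after_one = [buffer[a] for a in order]
order = push_rows(buffer, order, ["r4", "r5"])      # m = 2
after_three = [buffer[a] for a in order]
```

No stored row is moved; only the addressing sequence changes, so the logical view remains oldest-to-newest.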
Regarding claim 3, Zheng in view of Ding teaches the speech feature reuse-based storing and calculating compression method for the keyword-spotting CNN according to claim 1. Zheng further teaches that, when the updated row number of the input data is equal to the convolution step size, updating the intermediate data to correspond to the convolution result of the updated row data of the input data specifically comprises directly updating the intermediate data to be the convolution result obtained after the addressing sequence of the input data is adjusted (see Zheng, pg. 4653, sect IV B, in the convolution of speech feature maps, apart from spatial locality, temporal locality is also observed, as shown in Fig. 6 (a). Each frame (40-dimension feature vector) is combined with 10 contiguous frames into an 11×40 sized feature map. Therefore, two consecutive feature maps have 10 frames in common (shaded). In cooperation with the above frame-level activation reuse, we propose a memory partitioning technique to ensure conflict-free data loading, as illustrated in Fig. 6(d). In the proposed computing data-flow, features from 3 rows of input feature maps (across 32/64 channels) are needed for XNOR computation in one cycle. We partition the feature buffer into 3 banks, ensuring the potential to provide multiple data stream. The mapping of data in the memory is decided by its row index modulo 3, e.g. the 4th row is stored in bank 4%3 = 1. As shown in the Fig. 6(b), in this way, data from arbitrary 3 consecutive rows can be read out without conflict, e.g., rows 3,4,5 are loaded from bank 0,1,2, respectively).
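The bank mapping quoted from Zheng (row index modulo 3) can be checked with a short sketch. The names `bank_of` and `conflict_free` are hypothetical; only the modulo-3 rule and the examples (row 4 in bank 1; rows 3, 4, 5 in banks 0, 1, 2) come from the cited passage.

```python
NUM_BANKS = 3  # Zheng partitions the feature buffer into 3 banks

def bank_of(row_index):
    """Bank that holds a given row under the row-index-modulo-3 mapping."""
    return row_index % NUM_BANKS

def conflict_free(first_row, count=NUM_BANKS):
    """True if `count` consecutive rows starting at first_row land in distinct banks."""
    banks = [bank_of(r) for r in range(first_row, first_row + count)]
    return len(set(banks)) == count

banks_for_rows_3_to_5 = [bank_of(r) for r in (3, 4, 5)]
all_windows_ok = all(conflict_free(start) for start in range(100))
```

Because any three consecutive integers have distinct residues modulo 3, every 3-row window can be read in a single cycle without two reads hitting the same bank.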
Regarding claim 4, Zheng in view of Ding teaches the speech feature reuse-based storing and calculating compression method for the keyword-spotting CNN according to claim 1. Ding further teaches that, when the updated row number of the input data is not equal to the convolution step size, updating the intermediate data to correspond to the convolution result of the updated row data of the input data specifically comprises reserving all convolution calculation intermediate results of the input data between adjacent repeated input feature values (see Ding, pg. 283, sect. 5.2 (1), in CNN, because of the difference of computation and the number of channels of the feature maps for every layer, the parallelization parameter pn and pm for each layer is configured differently. Besides, we also adopt input data reuse in the design as shown in Fig. 12. Multiple filters are applied to the same feature map, so the input feature map activations are used multiple times across filters; the reuse of input data across multiple filters shown in Fig. 12 is interpreted as reserving all convolution calculation intermediate results of the input data between adjacent repeated input feature values).
Regarding claim 5, Zheng in view of Ding teaches the speech feature reuse-based storing and calculating compression method for the keyword-spotting CNN according to claim 3. Zheng further teaches wherein when the updated row number of the input data is equal to the convolution step size, the stored row number of the input data is compressed into a size of a first dimension of a convolution kernel of this layer, and a convolution operation result of each step is compressed into the size of the first dimension of the convolution kernel of the layer (see Zheng, pg. 4653, sect IV B, Fig. 6 (b) illustrates that when they are convolved with 3×3 kernels of the first convolutional layer, the output feature maps will have 8 rows in common (out of 9 rows in total). Similar phenomenon can be observed on all the subsequent layers & Zheng, pg. 4650 Table 1 Statistics of Proposed BCNN, shows the compression to the first dimension of the convolution kernel).
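The row counts quoted from Zheng (9 output rows, 8 of them shared between consecutive feature maps) follow from simple arithmetic, sketched below. A stride of 1 with no padding is an assumption here; it is consistent with an 11-row input and a 3-row kernel yielding 9 rows, but the cited passage does not state it. The function names are hypothetical.

```python
def output_rows(input_rows, kernel_rows, stride=1):
    """Number of output rows of an unpadded convolution along the row axis."""
    return (input_rows - kernel_rows) // stride + 1

def shared_output_rows(input_rows, kernel_rows, new_rows, stride=1):
    """Output rows left unchanged when the input window advances by new_rows
    (assumes new_rows is a multiple of stride)."""
    total = output_rows(input_rows, kernel_rows, stride)
    return max(total - new_rows // stride, 0)

total = output_rows(11, 3)              # 11x40 feature map, 3x3 kernel
shared = shared_output_rows(11, 3, 1)   # one new frame arrives
```

With these inputs only one output row per feature map needs to be recomputed, which is the redundancy the frame-level reuse exploits.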
Regarding claim 6, Zheng in view of Ding teaches the speech feature reuse-based storing and calculating compression method for the keyword-spotting CNN according to claim 4. Zheng further teaches wherein when the updated row number of the input data is not equal to the convolution step size, data storage of an input layer is compressed into a size of a first dimension of a convolution kernel of this layer (see Zheng, pg. 4653, sect IV B, For the sake of reducing energy consumption of weight accessing, we consider the compression of BCNN weights), the intermediate data of each convolution layer is stored as K times of the size of the first dimension of the convolution kernel of this layer, K being a ratio of the convolution step size to the updated row number of the input data (see Zheng, pg. 4624, sect. IV B, A 2b flag table is designed to record the bank types and direct the accessing of these hybrid banks. Each hybrid bank owns a separate address generator. In each cycle, the 2-4 decoder indicates which bank to read, and the address generator provides the exact address; the address decoder of the hybrid banks is interpreted as the K ratio).
Regarding claim 7, Zheng in view of Ding teaches the speech feature reuse-based storing and calculating compression method for the keyword-spotting CNN according to claim 6. Zheng further teaches wherein the convolution operation result of each step is stored into first to K-th intermediate data memories in sequence (see Zheng, pg. 4653, sect IV A, The immediate data in 4 CONV layers generated from the starting input speech feature map (Fmap 1) are buffered as follows. First 3 layers buffer the last two rows of their output feature map, and the final layer buffers the whole output feature map except the oldest row. For the consequent speech inputs (Fmap 2), only the newest 3 frames (frame 10–12 in Fig. 6(c)) of the input feature maps are used to compute the non-overlapping output row. And this row is combined with 2 buffered rows of the first layer to generate the non-overlapping row for next layer. The buffered features are updated by adding the newly generated row and discarding the oldest row of each channel. Parallel memory access of multiple rows is demanded in BCNN computation. We partition the feature buffer into 3 banks, ensuring the potential to provide multiple data stream; the operation of Fig. 6 is interpreted as storing the intermediate data of each step in sequence).
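The buffering behavior quoted from Zheng (keep the last two rows, add the newly generated row, discard the oldest) resembles a fixed-depth first-in, first-out buffer. The following is a minimal sketch under that reading; the class `RowBuffer` and its methods are hypothetical and do not represent Zheng's hardware.

```python
from collections import deque

class RowBuffer:
    """Keeps the most recent `depth` rows; appending to a full buffer drops the oldest."""

    def __init__(self, depth):
        self.rows = deque(maxlen=depth)

    def push(self, row):
        self.rows.append(row)

    def window(self, newest):
        """Buffered rows followed by the newly computed row, oldest first."""
        return list(self.rows) + [newest]

buf = RowBuffer(depth=2)        # first 3 layers buffer the last two rows
for row in ("row_a", "row_b", "row_c"):
    buf.push(row)
kept = list(buf.rows)
next_layer_input = buf.window("row_d")
```

The three-row window returned by `window` corresponds to the new row being combined with the two buffered rows to produce the next layer's non-overlapping row.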
Regarding claim 8, Zheng in view of Ding teaches the method according to claim 1 as indicated earlier. Zheng further teaches a speech feature reuse-based storing and calculating compression method for a keyword-spotting convolutional neural network pooling layer, wherein it is achieved by using the method according to claim 1 (see Zheng, pg. 4650, Fig. 2; pg. 4652, Fig. 4).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Huimin Li, Xitian Fan, Li Jiao, Wei Cao, Xuegong Zhou and Lingli Wang, "A high performance FPGA-based accelerator for large-scale convolutional neural networks," 2016 26th International Conference on Field Programmable Logic and Applications (FPL), 2016, pp. 1-9, teaches, given the kernel size (Size_kernel) and the corresponding shifting stride (stride), the numbers of multipliers, accumulators and multiplexers required for 1-D and 2-D PEs to implement a CONV layer (see Li, pg. 4, section IV B).
Chen et al. (US Patent Application Publication 2019/0197083) teaches a lightweight neural network, MobileNet, which uses depthwise separable convolutions: instead of fusing channels when calculating convolutions (e.g., with a 3*3 or larger convolution kernel), it uses depthwise (also known as channel-wise) and 1*1 pointwise convolutions to decompose the convolution, such that the speed and model size are optimized and the calculation accuracy is basically kept (see Chen, [0033]).
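To illustrate why the decomposition described in Chen reduces model size, the sketch below compares weight counts for a standard convolution and a depthwise-plus-pointwise pair. The formulas are the standard ones for these layer types with biases ignored; the channel counts (32 in, 64 out) and the function names are illustrative assumptions and are not taken from Chen.

```python
def standard_conv_weights(k, c_in, c_out):
    """Weights in a k x k convolution that fuses all input channels per output channel."""
    return k * k * c_in * c_out

def depthwise_separable_weights(k, c_in, c_out):
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution mixes channels
    return depthwise + pointwise

standard = standard_conv_weights(3, 32, 64)
separable = depthwise_separable_weights(3, 32, 64)
reduction = standard / separable
```

For this example the separable form needs 2336 weights against 18432 for the standard form, a reduction of roughly 7.9 times.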
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NANDINI SUBRAMANI whose telephone number is (571)272-3916. The examiner can normally be reached Monday - Friday, 2:00 pm - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh M Mehta, can be reached on (571)272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/NANDINI SUBRAMANI/            Examiner, Art Unit 2656
/EDGAR X GUERRA-ERAZO/            Primary Examiner, Art Unit 2656