DETAILED ACTION
This Office action is in response to Application No. 15917076, filed on 11/17/2017. Claims 1-12 are presented for examination and are currently pending. Applicant’s arguments have been carefully and respectfully considered.

Response to Arguments
2.	On page 7 of the remarks, the applicant argued that “smaller matrix (B) [0043] will have smaller number of memory elements in the row direction on the 11th line from the bottom of page 4, larger matrix (A) [0043] will have larger number of memory elements in row direction on the 9th line from the bottom of page 4, and larger matrix (A) [0043] will have larger number of memory elements in column direction on the 2nd line from the bottom of page 4. It is not clear to the Applicant what the Office Action the matrix A or B is compared with.”
	The Office respectfully disagrees with the applicant's arguments above. The Office Action compared matrix A, as the larger matrix, with matrix B, as the smaller matrix (see 602, Fig. 6 of Cadambi). A larger matrix has a larger number of memory elements in the row and column directions than a smaller matrix. Cadambi teaches a larger matrix A [0043], which is interpreted as the first array having rows and columns to store elements in the PE local stores. Cadambi teaches a smaller matrix B [0043], which is interpreted as the third array having rows and columns to store elements in the PE local stores. Referring to Fig. 8c of Cadambi, the elements of matrix A (the first array) are stored in local store 102 using 2 rows and 2 columns of matrix A. Cadambi teaches that the third storage is PE local store 106 [0046], where elements are stored in 1 row and 1 column of matrix B (the third array) due to parallelization. Therefore, the 1-row, 1-column matrix B (third array) is smaller than matrix A (first array) in both directions.
	On page 8 of the remarks, the applicant argued that “Incidentally, the Examiner states in the 6th line from the bottom of page 3, regarding the first storage device, "elements in an array [0034]." However, paragraph [0034] relates to the smart storage 110, but does not relate to the input buffer 102.”
	The Office respectfully disagrees with the applicant's arguments above. The section of the Office Action that states “elements in an array [0034]” refers to the fact that elements can be stored in an array, that is, elements can be stored in a matrix in the storage area. Local stores 102 and 106 and smart storage 110 are all storage areas in which elements can be stored. Cadambi teaches “smart storage 110 that selects the top k elements in an array, given a comparison function, is shown. The storage includes a logic component, shown illustratively as a filter 302. The logic component determines whether to store a given input and, if so, where in the memory to store it” [0034].
	On page 8 of the remarks, the applicant argued that “Similarly, the Examiner states in the 7th line from the top of page 4, regarding the third storage device, "elements in an array [0034]." However, the paragraph relates to the smart storage 110, but does not relate to the local store 106.”
	The Office respectfully disagrees with the applicant's arguments above. The section of the Office Action that states “elements in an array [0034]” refers to the fact that elements can be stored in an array, that is, elements can be stored in a matrix in the storage area. Local stores 102 and 106 and smart storage 110 are all storage areas in which elements can be stored.
	On page 8 of the remarks, the applicant argued that “Cadambi neither discloses nor suggests that the number of rows in the matrix "B" is smaller than the number of rows in the matrix "A" and the number of columns in the matrix "B" is smaller than the number of columns in the matrix "A"”.
	The Office respectfully disagrees with the applicant's arguments above. Cadambi teaches a larger matrix A [0043], which is interpreted as the first array having rows and columns to store elements in the PE local stores. Cadambi teaches a smaller matrix B [0043], which is interpreted as the third array having rows and columns to store elements in the PE local stores. Referring to Fig. 8c of Cadambi, the elements of matrix A (the first array) are stored in local store 102 using 2 rows and 2 columns of matrix A. Cadambi teaches that the third storage is PE local store 106 [0046], where elements are stored in 1 row and 1 column of matrix B (the third array) due to parallelization. Therefore, the 1-row, 1-column matrix B (third array) is smaller than matrix A (first array) in both directions.
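	As a non-limiting illustration of the comparison described above, the following Python sketch checks whether one array is smaller than another in both the row direction and the column direction. The shapes (2 rows by 2 columns for matrix A, 1 row by 1 column for matrix B) follow the reading of Fig. 8c stated above; the function name `smaller_in_both_directions` is hypothetical and does not appear in Cadambi or in the claims.

```python
def smaller_in_both_directions(third_shape, first_shape):
    """Return True when the third array has fewer rows AND fewer columns
    than the first array. Shapes are (rows, columns) tuples."""
    third_rows, third_cols = third_shape
    first_rows, first_cols = first_shape
    return third_rows < first_rows and third_cols < first_cols


# Shapes as read from Fig. 8c in the response above (illustrative only).
matrix_a_shape = (2, 2)  # first array, held in local store 102
matrix_b_shape = (1, 1)  # third array, held in PE local store 106

print(smaller_in_both_directions(matrix_b_shape, matrix_a_shape))  # True
```

	The check requires both inequalities to hold, which mirrors the claim language requiring a smaller number of memory elements in the first direction and in the second direction.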
	On page 8 of the remarks, the applicant argued that “the Examiner states in the last paragraph of page 12 with reference to Inoue, "matrix 22 and matrix 24 are read out from the system memory 20 [0035] ensuring the speeds of arithmetic processes can be improved to a significant extent.", and Inoue does not disclose the advantageous effect of the high-speed process based on the data transfer from the external storage device to the internal storage device. Therefore, claim 10 would never be obvious from the disclosures of Cadambi, Minoya and Inoue.”
	The Office respectfully disagrees with the applicant's arguments above. None of the claims recites data transfer from the external storage device to the internal storage device; there are no instances of internal storage or data transfer in the claim limitations. Claim 10 recites only an external storage device, and the claims do not recite any data transfer involving the external storage device. Inoue teaches a system memory 20, which is an external storage device to multi-core processor 40, which consists of processing elements 410-1 to 410-N of Fig. 1.
	The Office withdraws the 112(b) rejection of claim 7 in view of the amendments made to claim 7.


Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

3.	Claims 1 and 3-7 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Cadambi et al. (US20110119467).

	Regarding claim 1, Cadambi teaches an arithmetic processing device (MAPLE processing core, and each core 100 has p=N·M processing elements (PEs) 108 [0027], and each PE 108 performs arithmetic logic unit (ALU) and multiply operations, as well as a multiply-accumulate operation in a single cycle [0031]) comprising:
	a first storage device (local stores 102 [0046])
	including at least one first array having memory elements (elements in an array [0034], matrix A (as first array) [0046])
	arranged in a first direction (row-wise direction of matrix A [0046])
	and a second direction intersecting with the first direction; (column-wise direction of matrix A [0046])
	a second storage device (smart memory block 110 [0046]) 
	including at least one second array (output matrix C of smart memory block 110 [0046], matrix-matrix multiplications {A} x {B} = {C}, where {C} is the output matrix (as second array)[0060])
	having memory elements arranged in the first direction; (rows of the output matrix [0046])
	a third storage device (processing element (PE) local store 106 [0046]) 
	including at least one third array having memory elements (elements in an array [0034], matrix B (as third array) [0046])
	arranged in the first and second directions, (row-wise direction of matrix B and column-wise direction of matrix B [0046])
	the third array having a smaller number of memory elements arranged in the first direction (smaller matrix (B) [0043] will have smaller number of memory elements in the row direction)
	than the memory elements of the first array, arranged in the first direction, (larger matrix (A) [0043] will have larger number of memory elements in row direction, see Fig. 8B, where the B matrix is split into multiple portions stored with elements 106) 
	and having a smaller number of memory elements arranged in the second direction (smaller matrix (B) [0043] and columns of B are split into fractional portions [0063] see Fig. 8b)
	than the memory elements of the first array, arranged in the second direction; (larger matrix (A) [0043] will have larger number of memory elements in column direction)
	and a first process layer, (one layer [0059], chain 1, COMPUTE, Fig. 5; COMPUTE phase is followed by a STORE phase where the outputs are collected and stored in the memory block [0037])
	using data stored in the memory elements of the third array, (matrix B [0060]) 
	to perform a convolution process (core computation in one layer is the convolution and convolutions are expressed in matrix operations [0060])
	to data stored in the memory elements of the first array, (matrix A [0060]) 
	and to store a result of the convolution process in the memory elements of the second array. (output matrix C [0060]; sends its output to its respective smart memory block 110 [0028]; outputs are collected and stored in the memory block 110 [0037])
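	As a non-limiting illustration of how a convolution can be expressed as a matrix operation, as the mapping above relies on for [0060], the following Python sketch computes the same result two ways: by a direct sliding window and by flattening each window into a row and multiplying by the flattened kernel. This is a generic sketch under stated assumptions and is not asserted to be Cadambi's implementation. The function names `conv_direct` and `conv_as_matmul`, the 3x3 input, and the 2x2 kernel are hypothetical values chosen only for the example, and "convolution" is taken in the neural-network sense (no kernel flip).

```python
def conv_direct(first, third):
    """Slide the kernel (third array) over the input (first array)."""
    k_rows, k_cols = len(third), len(third[0])
    out_rows = len(first) - k_rows + 1
    out_cols = len(first[0]) - k_cols + 1
    return [
        [
            sum(first[i + u][j + v] * third[u][v]
                for u in range(k_rows) for v in range(k_cols))
            for j in range(out_cols)
        ]
        for i in range(out_rows)
    ]


def conv_as_matmul(first, third):
    """Same result, expressed as (matrix of flattened windows) x (kernel vector)."""
    k_rows, k_cols = len(third), len(third[0])
    out_rows = len(first) - k_rows + 1
    out_cols = len(first[0]) - k_cols + 1
    # Each row of `windows` is one flattened window of the first array.
    windows = [
        [first[i + u][j + v] for u in range(k_rows) for v in range(k_cols)]
        for i in range(out_rows) for j in range(out_cols)
    ]
    kernel_vec = [third[u][v] for u in range(k_rows) for v in range(k_cols)]
    flat = [sum(w * k for w, k in zip(row, kernel_vec)) for row in windows]
    # Reshape the flat product into the second (output) array.
    return [flat[r * out_cols:(r + 1) * out_cols] for r in range(out_rows)]


first_array = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # illustrative input
third_array = [[1, 0], [0, 1]]                    # illustrative kernel
second_array = conv_as_matmul(first_array, third_array)
print(second_array)  # [[6, 8], [12, 14]]
```

	In this example the kernel has fewer rows and fewer columns than the input, and the result is stored in a separate output array, which corresponds to the roles assigned above to the third, first, and second arrays, respectively.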
	Regarding claim 3, Cadambi teaches the arithmetic processing device according to claim 1, wherein the second array (output matrix, Fig. 7)
	has a smaller number of memory elements arranged in the first direction (two memory elements arranged in the row direction C11 and C12, Fig. 7)
	than the memory elements of the first array, arranged in the first direction. (four memory elements arranged in the row direction, Fig. 7, A11 to A14)

	Regarding claim 4, Cadambi teaches the arithmetic processing device according to claim 1, wherein the first process layer performs the convolution process along the first direction. (Both rows are streamed in together; therefore all four output elements are computed simultaneously, making it twice as fast [0061])

	Regarding claim 5, Cadambi teaches the arithmetic processing device according to claim 1, wherein the second storage device (memory block 110 [0033])
	includes a plurality of second arrays. (operations in large arrays [0033])

	Regarding claim 6, Cadambi teaches the arithmetic processing device according to claim 1, wherein the first storage device includes m (m≥1) first arrays (a plurality of input LS 102 stores, Fig. 8c (each local store 102 having a row-wise direction and column-wise direction matrix [0046] of array elements [0034]))
	and the third storage device includes m third arrays. (a plurality of LS 106 stores, Fig. 1 (each local store 106 having a row-wise direction and column-wise matrix [0046] of array elements [0034]))

	Regarding claim 7, Cadambi teaches the arithmetic processing device according to claim 6, wherein the third storage device further includes m (m≥1) fourth arrays each having memory elements arranged in the first and second directions, (multiple LS 106 stores, Fig. 1 (each local store 106-N-1 to 106-N-N having a row-wise direction and column-wise matrix [0046] of array elements [0034]))
	the fourth arrays having an equal number of memory elements arranged in the first and second directions to the memory elements of the third array, arranged in the first and second directions, respectively, (third and fourth array represent kernel values/weights, [0060], kernel data in the PE private local stores 106; multiple kernel matrices teach third and fourth array stored in the third storage device (local stores 106))
	the second storage device includes two second arrays, (smart memory block 110-1 and smart memory block 110-2, Fig. 1 (output matrix C of each smart memory block 110 [0046])) and
	the first process layer stores a result of a convolution process using the third array in one of the two second arrays (stores in smart memory block 110-1, Fig. 1)
	and stores a result of a convolution process using the fourth arrays in the other of the two second arrays (smart memory block 110-2, Fig. 1).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.



4.	Claims 2, 8, and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Cadambi et al. (US20110119467) in view of Minoya et al. (US20150331832).

	Regarding claim 2, Cadambi teaches the arithmetic processing device according to claim 1. However, Cadambi does not explicitly teach wherein the memory elements of the second array are arranged one-dimensionally only in the first direction.
	Minoya teaches wherein the memory elements (the normalized data Nn1, [0030])
	of the second array (Nn1-Nn4, Fig. 2)
	are arranged one-dimensionally only in the first direction. (arranged one-dimensionally in the row direction, Fig. 2, [0051])
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the device of Cadambi to incorporate the teachings of Minoya for the benefit of performing pooling processing on processing result data to improve the recognition rate and the extraction processing of feature quantities (Minoya, [0006]).
	 
	Regarding claim 8, Cadambi teaches the arithmetic processing device according to claim 1, and teaches data stored in the memory elements of the second array (smart memory block 110 [0046]).
	However, Cadambi does not explicitly teach further comprising: a fourth storage device including at least one fifth array having memory elements arranged in the first and second directions; and a second process layer to perform a pooling process on that data and to store a result of the pooling process in the memory elements of the fifth array.
	Minoya teaches further comprising: a fourth storage device (arithmetic block 201(4) Fig. 9 [0052] and the arithmetic block stores data [0038])
	including at least one fifth array having memory elements arranged in the first and second directions; (5x5 pixels [0029] with 5 rows and 5 columns)
	and a second process layer to perform a pooling process, (pooling portion 204 [0052], Fig. 9)
	and to store a result of the pooling process in the memory elements of the fifth array. (205c stores pooling data generated by the pooling portion 204 in the arithmetic block 201(4) [0054] including 5x5 pixels [0029] with 5 rows and 5 columns)
	The same motivation to combine as dependent claim 2 applies here.

	Regarding claim 9, Cadambi teaches the arithmetic processing device according to claim 1, and teaches at least one fifth array having memory elements arranged in the first and second directions; (output matrix C [0060], as fifth array)
	at least one sixth array having memory elements arranged in the first and second directions; (matrix B (as sixth array), row-wise direction and column-wise direction of matrix B [0046])
	and a second process layer, (one layer [0059], chain 2, COMPUTE, Fig. 5; COMPUTE phase is followed by a STORE phase where the outputs are collected and stored in the memory block [0037])
	using data stored in the memory elements of the sixth array, (matrix B, as sixth array [0060])
	to perform a convolution process (core computation in one layer is the convolution and convolutions are expressed in matrix operations [0060])
	to data stored in the memory elements of the second array, (smart memory block 110 [0046])
	and to store a result of the convolution process in the memory elements of the fifth array. (output matrix C [0060]; sends its output to its respective smart memory block 110 [0028]; outputs are collected and stored in the memory block 110 [0037])
	However, Cadambi does not explicitly teach further comprising: a fourth storage device and a fifth storage device.
	Minoya teaches further comprising: a fourth storage device (arithmetic block 201(4) Fig. 9, [0052] and the arithmetic block stores data [0038])
	a fifth storage device (arithmetic block 201(5) Fig. 9, [0052] and the arithmetic block stores data [0038])
	The same motivation to combine as dependent claim 2 applies here.

5.	Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Cadambi et al. (US20110119467) in view of Inoue (US20080183793).

	Regarding claim 10, Cadambi teaches an arithmetic processing device (MAPLE processing core, and each core 100 has p=N·M processing elements (PEs) 108 [0027], and each PE 108 performs arithmetic logic unit (ALU) and multiply operations, as well as a multiply-accumulate operation in a single cycle [0031]) comprising:
	a readout device (host (as readout device) can also write to the memory banks via a bank-bank interconnection network [0036], Fig. 4; transfer data and programs between the host and MAPLE [0056])
	that reads out at least part of data (code is retrieved from bulk storage during execution [0025])
	a first storage device (processing element (PE) local store 106 [0046]) 
	including at least one second array having memory elements (elements in an array [0034], matrix A (as second array))
	arranged in the first and second directions, (row-wise direction of matrix A and column-wise direction of matrix A [0046])
	the at least part of data read out by the readout device being stored in the second array; (training or test vectors directly from the host (as readout device) to the PE local stores 106 [0067])
	a third storage device (smart memory block 110 [0046])
	including at least one third array having memory elements arranged in the first and second directions; (output matrix C of smart memory block 110 [0046], matrix-matrix multiplications {A} x {B} = {C}, where {C} is the output matrix (as third array)[0060] having rows and columns, Fig. 7)
	a fourth storage device (local store 102 [0046])
	including at least one fourth array having memory elements (elements in an array [0034], matrix B (as fourth array))
	arranged in the first and second directions; (row-wise and column-wise direction of matrix B [0046])
	and a process layer, (one layer [0059], chain 1, COMPUTE, Fig. 5; COMPUTE phase is followed by a STORE phase where the outputs are collected and stored in the memory block [0037])
	using data stored in the memory elements of the fourth array, (matrix B [0060])
	to perform a convolution process (core computation in one layer is the convolution and convolutions are expressed in matrix operations [0060])
	to data stored in the memory elements of the second array, (matrix A [0060])
	and to store a result of the convolution process in the memory elements of the third array. (output matrix [0046], matrix-matrix multiplications {A} x {B} = {C}, where {C} is the output matrix (as third array) [0060], Fig. 7; outputs are collected and stored in the memory block 110 [0037])
	Cadambi does not explicitly teach an external storage device including at least one first array having memory elements arranged in a first direction and a second direction intersecting with the first direction; 
	Inoue teaches an external storage device including at least one first array having memory elements arranged in a first direction and a second direction intersecting with the first direction; (system memory 20 is a main storage and system memory 20 is externally connected to the multi-core processor 40 and system memory 20 stores matrix 24 and matrix 26 [0026] having first and second directions, Fig. 2)
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the device of Cadambi to incorporate the teachings of Inoue for the benefit of an arithmetic device [0026] wherein matrix 22 and matrix 24 are read out from the system memory 20 [0035], ensuring that the speeds of arithmetic processes can be improved to a significant extent (Inoue, [0076]).


6.	Claims 11 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Cadambi et al. (US20110119467) in view of Inoue (US20080183793) and further in view of Minoya et al. (US20150331832).

	Regarding claim 11, Cadambi modified by Inoue teaches the arithmetic processing device according to claim 10. However, they do not explicitly teach wherein the second array has an equal number of memory elements arranged in the first direction to the memory elements of the first array, arranged in the first direction, and has an equal number of memory elements arranged in the second direction to the memory elements of the first array, arranged in the second direction.
	Minoya teaches wherein the second array has an equal number of memory elements arranged in the first direction (second array in Fig. 2, as second 5x5 array having 5 rows)
	to the memory elements of the first array, arranged in the first direction, (first array in Fig. 2, as first 5x5 array having 5 rows)
	and has an equal number of memory elements arranged in the second direction (second array in Fig. 2, as second 5x5 array having 5 columns)
	to the memory elements of the first array, arranged in the second direction. (first array in Fig. 2, as first 5x5 array having 5 columns)
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the device of Cadambi modified by Inoue to incorporate the teachings of Minoya for the benefit of performing pooling processing on processing result data to improve the recognition rate and the extraction processing of feature quantities (Minoya, [0006]).

	Regarding claim 12, Cadambi modified by Inoue teaches the arithmetic processing device according to claim 10. However, they do not explicitly teach wherein the second array has an equal number of memory elements arranged in the first direction to the memory elements of the first array, arranged in the first direction, and has an equal number of memory elements arranged in the second direction to the memory elements of the fourth array, arranged in the second direction.
	Minoya teaches wherein the second array has an equal number of memory elements arranged in the first direction (second array in Fig. 2, as second 5x5 array having 5 rows)
	to the memory elements of the first array, arranged in the first direction, (first array in Fig. 2, as first 5x5 array having 5 rows)
	and has an equal number of memory elements arranged in the second direction (second array in Fig. 2, as second 5x5 array having 5 columns)
	to the memory elements of the fourth array, arranged in the second direction (fourth array in Fig. 2, as fourth 5x5 array having 5 columns)
	The same motivation to combine as dependent claim 11 applies here.

Conclusion
	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to MORIAM MOSUNMOLA GODO whose telephone number is (571)272-8670. The examiner can normally be reached Monday-Friday 7:30am-5:30pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B. Zhen can be reached on (571)272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/M.G./Examiner, Art Unit 2121


/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121