DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This office action is in response to RCE filed on 5/6/2021. Claims 1 and 4 have been amended. Claims 15 and 16 have been cancelled. Claims 1-14 are pending.
The amendments have overcome the previous 35 U.S.C. 112(b) rejections; however, upon further consideration of the current amendments, new 35 U.S.C. 112(b) rejections have been made and are provided in more detail below.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 5/6/2021 has been entered.

Response to Arguments
	Applicant’s arguments on remarks pages 12-13 against the prior art rejections that previously recited “input buffer” in claim 1 limitation “utilizing input data from the plurality of stripes [of an input buffer] in a multiply accumulate (MAC) array” is not 

	Applicant’s arguments for claim 1 in view of the current amendment “looping back the intermediate result from the at least one of the plurality of stripes of the result buffer to at least one of the plurality of stripes of the input data buffer forming a chain process” have been fully considered but are moot in light of the new grounds of rejection made in view of Deisher et al. (20180121796, pub. May 3, 2018), provided in more detail below.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-14 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claim 1 recites the limitation "the input buffer" in line 10.  There is insufficient antecedent basis for this limitation in the claim.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 1 is rejected under 35 U.S.C. 103 as being unpatentable over Park et al. (20180032859, pub. Feb. 1, 2018), hereinafter “Park”, in view of Deisher et al. (20180121796, pub. May 3, 2018), hereinafter “Deisher”.

Regarding independent claim 1, Park discloses:
A method of hierarchical structuring a multilevel memory in a convolutional neural network (see Park, Fig 3, par. [0057]: operation of an accelerator by layers and memory input/output in a convolutional neural network), comprising:
partitioning a memory into a plurality of sections (see Park, Fig 4, par. [0065]: PTs 410 that are arranged in the form of a 2D array); 
partitioning the plurality of sections into a plurality of stripes (see Park, Fig 4, par. [0066]: The PT 410 is configured through clustering of a plurality of PEs, and see par. [0067]: Each PE 420 includes operation units 421, 423, and 425 that are necessary to process most layers of the convolutional neural network and a PE buffer 427); 
utilizing input data from the plurality of stripes in a multiply accumulate (MAC) array (see Park, Fig 4, par. [0067]: the PE 420 may include a MAC unit 421 for a convolution operation, a non-linear unit 423 for a non-linear operation, and a normalization unit 425 for a normalization operation, and see par. [0068]: The MAC unit 421 may perform a convolution operation whereby input image data is multiplied by constant values and all the resultant values are accumulatively added); 
outputting an intermediate result from the multiply accumulate (MAC) array to at least one of the plurality of stripes of a result buffer (see Park, Fig 4, par. [0067]: the PE 420 may include a PE buffer 427 for data accumulation and data internal reuse, and see Fig 5 item 530, par. [0077]: The operation unit may generate intermediate data by performing a first operation with respect to the input data, and may store the intermediate data in the second buffer 530, and see par. [0080]: the operation unit 520 may process the convolution layer for the input data, generate first intermediate data as the result of the processing, and store the intermediate data in the second buffer 530);
looping back the intermediate result from the at least one of the plurality of stripes of the result buffer to at least one of the plurality of stripes of the input data buffer (see Park, Fig 5, items 520 and 530, par. [0079]: The operation unit 520 may process the non-linear layer for the intermediate data that is fed back from the second buffer 530, generate the output data as the result of the processing, and store the output data in the second buffer 530, and see par. [0080]: The operation unit 520 may process the non-linear layer for the first intermediate data that is fed back from the second buffer 530, generate second intermediate data as the result of the processing, and store the second intermediate data in the second buffer 530.  The operation unit 520 may process the normalization layer for the second intermediate data that is fed back from the second buffer 530, generate the output data as the result of the processing, and store the output data in the second buffer 530) …
… outputting a final result from the at least one of the plurality of stripes of the result buffer to at least one of the plurality of stripes of an output buffer (see Park, Fig 5, par. [0081]: The second buffer 530 may output the output data to any one of any one of the plurality of PEs, the pooling unit, and the external memory, and see par. [0082]: The accelerator may further include a pooling unit that receives plural pieces of output data that are transmitted from the plurality of PEs, and performs a pooling operation with respect to the plural pieces of output data to transmit the output data to a third buffer).

Park does not explicitly disclose:
… forming a chain process; and … 

However, Deisher discloses:
… forming a chain process (see Deisher, par. [0104]: a process 700 provides a method for neural network layer descriptor chain setup, and Process 900 below provides a detailed method of developing the layer descriptor chains for the setup, and see par. [0107]: the process 700 then may include "determine layer descriptors for individual neural network layers" 706.  Specifically, this includes an NN refinement and layer descriptor chain forming process, and see par. [0114]: The process 700 may include "compute 8-bit scale factor" 718, and scale factors to input to the MAC for 8-bit weight, and see par. [0183]: process 900 may include "obtain next sub-chain primitive layer of macro layer" 924, and loops to analyze the next potential sub-chain layer with operations 916 to 922); and …

Park and Deisher are analogous arts, because they are about neural network systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Park with the feature of an NN refinement and layer descriptor chain forming process as disclosed by Deisher, with the motivation to provide a flexible neural network accelerator, as disclosed by Deisher in par. [0030].
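
For illustration only, the data flow cited from Park above (a MAC operation whose intermediate result is stored in a result buffer, fed back for the non-linear and normalization operations, and then output) might be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Park or Deisher: the names mac, relu, normalize, and run_chain are hypothetical, and the choice of ReLU and of max-magnitude normalization is assumed purely to make the example concrete.

```python
# Illustrative sketch only. All names are hypothetical; none come from Park or Deisher.

def mac(inputs, weights):
    """Multiply each input by its weight and accumulate the products."""
    acc = 0.0
    for x, w in zip(inputs, weights):
        acc += x * w
    return acc


def relu(values):
    """Assumed non-linear operation applied to data fed back from the result buffer."""
    return [max(0.0, v) for v in values]


def normalize(values):
    """Assumed normalization: divide by the largest magnitude, if it is non-zero."""
    peak = max((abs(v) for v in values), default=0.0)
    return [v / peak for v in values] if peak else list(values)


def run_chain(input_stripes, weights):
    """Run the MAC, non-linear, and normalization stages through one result buffer."""
    # Stage 1: MAC over each input stripe; intermediate results land in the result buffer.
    result_buffer = [mac(stripe, weights) for stripe in input_stripes]
    # Stages 2 and 3: each stage reads the result buffer back and stores its output there.
    for stage in (relu, normalize):
        result_buffer = stage(result_buffer)
    # The final result is copied from the result buffer to the output buffer.
    output_buffer = list(result_buffer)
    return output_buffer


final = run_chain([[1.0, 2.0], [3.0, 4.0], [-1.0, -1.0]], [1.0, 0.5])
```

In this sketch each list in input_stripes stands in for one stripe of input data, and the repeated reuse of result_buffer stands in for the feedback path from the second buffer 530 described in Park, par. [0079]-[0080].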

Claims 2-3 are rejected under 35 U.S.C. 103 as being unpatentable over Park (20180032859, pub. Feb. 1, 2018), in view of Deisher et al. (20180121796, pub. May 3, 2018), further in view of Kim (20120275246, pub. Nov. 1, 2012), hereinafter “Kim”, and further in view of Louie et al. (20130151913, pub. Jun. 13, 2013), hereinafter “Louie”.

Regarding claim 2, the combination of Park and Deisher discloses all the claimed limitations as set forth in the rejection of claim 1 above.

The combination of Park and Deisher further discloses:
partitioning a multiply accumulate (MAC) weight buffer section (see Park, Fig 4 items 420 and 421, par. [0067]: each PE 420 may include a MAC unit 421 for a convolution operation); 
partitioning an input data buffer section (see Park, Fig 4, items 420 and 427, par. [0067]: each PE 420 may include a PE buffer 427 for data accumulation and data internal reuse);  
partitioning a result buffer section (see Park, Fig 4 item 420, and see Fig 5 item 530, par. [0077]: The operation unit may generate intermediate data by performing a first operation with respect to the input data, and may store the intermediate data in the second buffer 530, and see par. [0080]: the operation unit 520 may process the convolution layer for the input data, generate first intermediate data as the result of the processing, and store the intermediate data in the second buffer 530); 
partitioning an output buffer section (see Park, Fig 5, par. [0081]: The second buffer 530 may output the output data to any one of any one of the plurality of PEs, the pooling unit, and the external memory, and see par. [0082]: The accelerator may further include a pooling unit that receives plural pieces of output data that are transmitted from the plurality of PEs, and performs a pooling operation with respect to the plural pieces of output data to transmit the output data to a third buffer); 
partitioning a shared data buffer section (see Park, Fig 4 items 410 and 440: each PT connected to output buffer, and see par. [0072]: The resultant values of the operations by the PEs 420 are temporarily stored in the out buffer 440); …


The combination of Park and Deisher does not disclose:
… partitioning a bit test weight memory section; 
partitioning a bit test data memory section; and
partitioning an external random access memory section.

However, Kim discloses:
… partitioning a bit test data memory section (see Kim, Fig 2, par. [0029]: The write control unit 240 controls the write drivers 220_1 to 220M so that the test data is divided in two or more time periods and written in the memory banks 230_1 to 230_M in response to a test control signal TM); and …

Park, Deisher, and Kim are analogous arts, because they are about memory systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Park and Deisher, with the feature of Kim as described above, with the motivation of reducing instantaneously consumed current by not simultaneously driving write drivers and input buffers in a multi-test of semiconductor chips, as disclosed by Kim in par. [0007].

The combination of Park, Deisher, and Kim does not disclose:
… partitioning a bit test weight memory section; …
… partitioning an external random access memory section.

However, Louie discloses:
partitioning a bit test weight memory section (see Louie, Fig 1, par. [0020]: determining (204) a base block size for testing a memory drive (212) may be carried out, for example, by determining the total capacity of the memory drive (212) and dividing the memory drive (212) into a predetermined number of logical blocks.  For example, if the total capacity of a memory drive is 10 Gigabytes and the predetermined number of logical blocks desired is 1000, the base block size for testing a memory drive (212) will be set as 0.01 Gigabytes); …
… partitioning an external random access memory section (see Louie, Fig 1, par. [0011]: Stored in RAM (168) is a drive self test module (212), a module of computer program instructions for testing memory drives according to embodiments of the present invention, and see par. [0015]: Also stored in RAM (168) is an operating system (154)).

Park, Deisher, Kim, and Louie are analogous arts, because they are about memory systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Park, Deisher, and Kim, with the feature of Louie as described above, with the motivation for expedited memory drive self test, as disclosed by Louie in par. [0002].
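
For illustration only, the base block size computation quoted from Louie, par. [0020], above (total capacity divided by a predetermined number of logical blocks) might be sketched as follows. This is a minimal sketch, not code from Louie; the function name base_block_size and its parameter names are hypothetical.

```python
# Illustrative sketch only. Function and parameter names are hypothetical.

def base_block_size(total_capacity_gb, num_blocks):
    """Return the size of one logical block, in gigabytes."""
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")
    return total_capacity_gb / num_blocks


# Values taken from the example quoted from Louie: 10 Gigabytes, 1000 logical blocks.
size_gb = base_block_size(10, 1000)
```

With the values from the quoted example, the call corresponds to the 0.01 Gigabyte base block size stated in Louie.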

Regarding claim 3, the combination of Park, Deisher, Kim, and Louie further discloses:
partitioning the input data buffer section into a plurality of input buffer data stripes (see Park, Fig 4, items 420 and 427, par. [0067]: each PE 420 may include a PE buffer 427 for data accumulation and data internal reuse); 
partitioning the result buffer section into a plurality of result buffer stripes (see Park, Fig 4 item 420, and see Fig 5 item 530, par. [0077]: The operation unit may generate intermediate data by performing a first operation with respect to the input data, and may store the intermediate data in the second buffer 530, and see par. [0080]: the operation unit 520 may process the convolution layer for the input data, generate first intermediate data as the result of the processing, and store the intermediate data in the second buffer 530); 
partitioning the output buffer section into a plurality of output buffer stripes (see Park, Fig 5, par. [0081]: The second buffer 530 may output the output data to any one of any one of the plurality of PEs, the pooling unit, and the external memory, and see par. [0082]: The accelerator may further include a pooling unit that receives plural pieces of output data that are transmitted from the plurality of PEs, and performs a pooling operation with respect to the plural pieces of output data to transmit the output data to a third buffer); 
partitioning the shared data buffer section into a plurality of shared data stripes (see Park, Fig 4 items 410 and 440: each PT connected to output buffer, and see par. [0072]: The resultant values of the operations by the PEs 420 are temporarily stored in the out buffer 440); and 
partitioning the bit test data memory section into a plurality of bit test data memory stripes (see Kim, Fig 2, par. [0029]: The write control unit 240 controls the write drivers 220_1 to 220M so that the test data is divided in two or more time periods and written in the memory banks 230_1 to 230_M in response to a test control signal TM).

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Park (20180032859, pub. Feb. 1, 2018), in view of Deisher et al. (20180121796, pub. May 3, 2018), further in view of Kim (20120275246, pub. Nov. 1, 2012), further in view of Louie (20130151913, pub. Jun. 13, 2013), and further in view of Maaninen (20150199963, pub. Jul. 16, 2015), hereinafter “Maaninen”.

Regarding claim 4, the combination of Park, Deisher, Kim, and Louie discloses all the claimed limitations as set forth in the rejection of claim 3 above.

The combination of Park, Deisher, Kim, and Louie does not disclose:
receiving bit test weights and input data from the external random access memory section; 
decompressing the received bit test weights from the bit test weight memory section; and  
storing the decompressed bit test weights from the bit test weight memory section in the multiply accumulate (MAC) weight buffer section.

However, Maaninen discloses:
receiving bit test weights and input data from the external random access memory section (see Maaninen, Fig 4, par. [0043]: Hardware accelerator 312 may also include various buffers or registers (e.g., RAMs 13-18) to store, for example, weights, bias terms, activation function co-efficients, input data, and intermediate output data, etc. The weights may be double buffered, for example, to allow parallel data decompression and MAC operations by decompressor unit 19 and MAC unit 10); 
decompressing the received bit test weights from the bit test weight memory section (see Maaninen, Fig 4, par. [0043]: Data transceiver/decompressor unit 19 may include decompression circuitry configured to decompress any compressed weights and bias terms that may be included in the data supplied by application 320); and  
storing the decompressed bit test weights from the bit test weight memory section in the multiply accumulate (MAC) weight buffer section (see Maaninen, Fig 4, par. [0046]: The compressed weight and bias terms may be decompressed by decompressor 19 and stored or buffered in RAMs 14-16 for use by MAC unit 10 for the matrix multiply and accumulate operations).

Park, Deisher, Kim, Louie, and Maaninen are analogous arts, because they are about memory systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Park, Deisher, Kim, and Louie, with the feature of Maaninen as described above, with the motivation of overcoming inconsistent connection quality and background noise, as disclosed by Maaninen in par. [0003].

 
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Park (20180032859, pub. Feb. 1, 2018), in view of Deisher et al. (20180121796, pub. May 3, 2018), further in view of Kim (20120275246, pub. Nov. 1, 2012), further in view of Louie (20130151913, pub. Jun. 13, 2013), and further in view of Kim (20180129935, pub. May 10, 2018), hereinafter “Kim [2]”.

Regarding claim 10, the combination of Park, Deisher, Kim, and Louie discloses all the claimed limitations as set forth in the rejection of claim 3 above.

The combination of Park, Deisher, Kim, and Louie does not disclose:
partitioning the plurality of input buffer data stripes into a plurality of input buffer data tiles; 
partitioning the plurality of result buffer stripes into a plurality of result buffer tiles; 
partitioning the plurality of output buffer stripes into a plurality of output buffer tiles;
partitioning the plurality of shared data stripes into a plurality of shared data tiles; and 
partitioning the plurality of bit test data stripes into a plurality of bit test data memory tiles.

However, Kim [2] discloses:
partitioning the plurality of input buffer data stripes into a plurality of input buffer data tiles (see Kim [2], Fig 4, par. [0072]: the input buffer device 110 may load an input tile Din_T that is a part of input data Din.  At this point, the input tile Din_T may have a size of TnxTwxTh. Tn denotes the number of channels of the input tile Din_T, Tw denotes a width of the input tile Din_T, and Th denotes a height of the input tile Din_T);  
partitioning the plurality of result buffer stripes into a plurality of result buffer tiles (see Kim [2], Fig 4, par. [0073]: The MAC core 121 may use a plurality of kernels KER_1 to KER_M from the weight kernel buffer device 140 to perform convolution computations on the input tile Din_T loaded to the input buffer device 110); 
partitioning the plurality of output buffer stripes into a plurality of output buffer tiles (see Kim [2], Fig 4, par. [0074]: The generated output tile Dout_T may be loaded to the output buffer device 130.  In example embodiment, the output tile Dout_T may have as a size of TmxTcxTr.  Tm denotes the number of channels of the output tile Dout_T, Tc denotes a width of the output tile Dout_T, and Tr denotes a height of the output tile Dout_T);
partitioning the plurality of shared data stripes into a plurality of shared data tiles (see Kim [2], Fig 4, par. [0074]: The generated output tile Dout_T may be loaded to the output buffer device 130.  In example embodiment, the output tile Dout_T may have as a size of TmxTcxTr.  Tm denotes the number of channels of the output tile Dout_T, Tc denotes a width of the output tile Dout_T, and Tr denotes a height of the output tile Dout_T); and 
partitioning the plurality of bit test data stripes into a plurality of bit test data memory tiles (see Kim [2], Fig 4, par. [0074]: The generated output tile Dout_T may be loaded to the output buffer device 130.  In example embodiment, the output tile Dout_T may have as a size of TmxTcxTr.  Tm denotes the number of channels of the output tile Dout_T, Tc denotes a width of the output tile Dout_T, and Tr denotes a height of the output tile Dout_T).

Park, Deisher, Kim, Louie, and Kim [2] are analogous arts, because they are about memory systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Park, Deisher, Kim, and Louie, with the feature of Kim [2] as described above, with the motivation of improving an overall performance by reducing a computation performance time, as disclosed by Kim [2] in par. [0006].
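
For illustration only, the tile dimensions quoted from Kim [2] above (an input tile of size Tn x Tw x Th, where Tn is the number of channels, Tw the width, and Th the height) might be related to a full input volume as sketched below. This is a minimal sketch, not code from Kim [2]: the function names tile_elements and tiles_needed, the example dimensions, and the assumption that partial tiles at the edges are counted as whole tiles are all hypothetical.

```python
# Illustrative sketch only. Names and example dimensions are hypothetical.
from math import prod


def tile_elements(tile):
    """Number of data elements in one tile of shape (channels, width, height)."""
    return prod(tile)


def tiles_needed(shape, tile):
    """Number of tiles needed to cover shape, rounding up in each dimension."""
    if len(shape) != len(tile):
        raise ValueError("shape and tile must have the same number of dimensions")
    if any(t <= 0 for t in tile):
        raise ValueError("tile dimensions must be positive")
    # -(-d // t) is integer ceiling division, so a partial edge tile counts as one tile.
    return prod(-(-d // t) for d, t in zip(shape, tile))


# Assumed example: a 64 x 224 x 224 input covered by 16 x 32 x 32 tiles.
count = tiles_needed((64, 224, 224), (16, 32, 32))
per_tile = tile_elements((16, 32, 32))
```

The same two functions would apply unchanged to an output tile of size Tm x Tc x Tr, since only the tuple of dimensions differs.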

Examiner’s Note
	Claims 5-9 and 11-14 do not have prior art rejections because the prior art, either alone or in combination, does not disclose the claimed features of claims 5 and 11. However, these claims are rejected under 35 U.S.C. 112(b) by virtue of their dependency on base claim 1 as described above.




Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SAZZAD HOSSAIN whose telephone number is (571)272-9841.  The examiner can normally be reached on MON-FRI 10AM-6PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, April Y Blair can be reached on (571) 270-1014.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/APRIL Y BLAIR/
Supervisory Patent Examiner, Art Unit 2111