DETAILED ACTION

Status of Application
	The Preliminary Amendment filed 03/23/2022 has been entered.
Claims 25-48 are pending in the present application.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 12/10/2021 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 25 and 39 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 and 15, respectively, of U.S. Patent No. 11,226,816 B2. Although the claims at issue are not identical, they are not patentably distinct from each other because the claims perform the same function using merely different terminology.
Please note that, in the interest of time, the examiner has selected only one independent claim from each of the instant application and the U.S. Patent for the comparison below.
Instant Application
	Claim 25. A memory module comprising:
	a memory die comprising a dynamic random access memory (DRAM) bank comprising:
	an array of DRAM cells; and
	an in-memory compute (IMC) module comprising:
		an arithmetic logic unit (ALU); and
	a memory controller configured to:
	receive, from a host processor, an operand and an instruction;
	determine, based on the instruction, a data layout from a plurality of data layouts;
	supply the operand to the DRAM bank in accordance with the data layout; and
	control the IMC module of the DRAM bank to perform an ALU operation on the operand in accordance with the instruction.
U.S. Patent No. 11,226,816 B2
	Claim 1. A memory module comprising:
	a memory die comprising a dynamic random access memory (DRAM) bank comprising:
	an array of DRAM cells arranged in pages, comprising DRAM cells, the DRAM cells storing bit values;
	a row buffer configured to store values of an open page of the pages;
	an input/output (IO) module; and
	an in-memory compute (IMC) module comprising:
		an arithmetic logic unit (ALU) configured to receive operands from the row buffer or the IO module and to compute an output based on the operands and a selected ALU operation of a plurality of ALU operations; and
		a result register configured to store the output computed by the ALU; and
	a memory controller configured to:
	receive, from a host processor, a first operand, a second operand, and an instruction;
	determine, based on the instruction, a data layout from a plurality of data layouts;
	supply the first operand and the second operand to the DRAM bank in accordance with the data layout; and
	control the IMC module of the DRAM bank to perform an ALU operation of the plurality of ALU operations on the first operand and the second operand in accordance with the instruction.


Claims 26, 28-38, 40, and 42-48 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 2, 4-14, 16, and 18-24 respectively, of U.S. Patent No. 11,226,816 B2.
Claims 27 and 41 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 5 and 19, respectively, of U.S. Patent No. 11,226,816 B2.




Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 25 and 39 are rejected under 35 U.S.C. 103 as being unpatentable over Rao, U.S. Patent No. 5,953,738, in view of Wang et al. (hereinafter Wang), U.S. Patent No. 10,185,499 B1, and further in view of Simionescu et al. (hereinafter Simionescu), U.S. Patent No. 9,542,101 B2.
Referring to claims 25 and 39, taking claim 25 as exemplary, Rao discloses a memory module comprising: 
a memory die [fig. 4] comprising a dynamic random access memory (DRAM) bank [fig. 4, element 401a; col. 8, lines 28-33] comprising: 
an array of DRAM cells [fig. 4, element 402]; and 
an in-memory compute (IMC) module [fig. 4, element 414] comprising: 
an arithmetic logic unit (ALU) [fig. 4, ALUs 414]; and 
a memory controller [fig. 4, control and configuration circuitry 407] configured to: 
receive, from a host processor [fig. 1A, CPU 101], an operand [col. 9, lines 8-15, Control circuitry 407 receives conventional DRAM control signals and clocks from an external source, such as processor 101 or core logic 105 in system 100 or CPUs 201 in multiprocessing systems 200A-200C. These signals include a synchronous clock (SCLK), a row address strobe (/RAS), a column address strobe (/CAS), read/write select (R/W) and output enable (/OE), along with data (DQ) and addresses (A_dd); col. 3, line 67 – col. 4, line 10, “a memory is disclosed which includes an array of dynamic random access memory cells, an operand register for storing a received operand for selective use in an arithmetic logic operation, and a results register for storing a result from an arithmetic logic operation. An arithmetic logic unit is included for selectively performing an operation on data retrieved from said array of dynamic random access memory cells and an operand retrieved from the operand register, a result of the operation output by the arithmetic logic unit for selective storage in the result register”; also see fig. 1A, where CPU 101 sends DATA to 105, which is then sent to 106; data sent to 106 is received by element 407 (fig. 4) and can then be operated on by the ALUs within the banks];
control the IMC module of the DRAM bank to perform an ALU operation on the operand in accordance with an instruction [col. 3, line 67 – col. 4, line 10, “a memory is disclosed which includes an array of dynamic random access memory cells, an operand register for storing a received operand for selective use in an arithmetic logic operation, and a results register for storing a result from an arithmetic logic operation. An arithmetic logic unit is included for selectively performing an operation on data retrieved from said array of dynamic random access memory cells and an operand retrieved from the operand register, a result of the operation output by the arithmetic logic unit for selective storage in the result register”; col. 9, lines 60-66, Under the control of bits written into mode register 416, ALUs 414 of one or more banks 401 can operate on bits output from DRAM column decoder/sense amplifiers 405 of the given bank or banks 401. The available operations include AND, OR, NOR, NAND, INVERT, SHIFT, ADD and SUBTRACT; fig. 4, controlling the ALUs to perform an ALU operation in accordance with bits written into mode register 416 (equivalent to an instruction)].
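For illustration only, the operation selection described in the cited passage of Rao (a mode register value choosing among AND, OR, NOR, NAND, INVERT, SHIFT, ADD and SUBTRACT) can be sketched as follows. The opcode encoding, the 8-bit word width, the shift direction, and every identifier in the sketch are assumptions made for this example and are not taken from Rao.

```python
from enum import Enum

WORD_MASK = 0xFF  # assumed 8-bit word width, for illustration only


class Op(Enum):
    # Operation names follow the list cited from Rao; the numeric
    # encoding is hypothetical.
    AND = 0
    OR = 1
    NOR = 2
    NAND = 3
    INVERT = 4
    SHIFT = 5
    ADD = 6
    SUBTRACT = 7


def alu(mode_register: int, array_data: int, operand: int) -> int:
    """Apply the operation selected by the mode register value.

    array_data models bits read from the DRAM array; operand models a
    value held in an operand register.
    """
    op = Op(mode_register)  # raises ValueError for an unknown encoding
    if op is Op.AND:
        result = array_data & operand
    elif op is Op.OR:
        result = array_data | operand
    elif op is Op.NOR:
        result = ~(array_data | operand)
    elif op is Op.NAND:
        result = ~(array_data & operand)
    elif op is Op.INVERT:
        result = ~array_data  # operand is ignored
    elif op is Op.SHIFT:
        result = array_data << operand  # assumed: left shift by operand
    elif op is Op.ADD:
        result = array_data + operand
    else:  # Op.SUBTRACT
        result = array_data - operand
    # Truncate to the assumed word width (wraps on overflow/underflow).
    return result & WORD_MASK
```

In this sketch the mode register value plays the role that the rejection treats as equivalent to an instruction: changing it changes which operation is applied, without changing the data path.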
Rao does not explicitly disclose the memory controller receiving the instruction from a host processor.
However, Wang discloses the memory controller [fig. 5, element 530] receiving the instruction from a host processor [fig. 5, host controller 102 equivalent to a host processor; col. 15, lines 51-63, Data plane 512 stores the payload of a software driver that decodes commands and initiates transactions or data transformation operations using information provided on system bus 110 and by control plane 513. IMDB applications other applications can control operations of the CDIMM 502 by sending commands to the control plane 513, which the CDIMM 502 accepts and uses to operate on data stored in the data plane 512. Some transactions or data transformation operations from data plane 512 using commands from control plane 513 are launched on processor cores 514 and off-load engines 515. For example, processor cores 514 can execute a data transformation operation for dataset objects stored in the plurality of DRAM devices 519].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize the teachings of Wang in the invention of Rao, to implement the memory controller receiving the instruction from a host processor, in order to provide computing performance speed-ups [Wang, col. 6, lines 29-35].
The modified Rao does not explicitly disclose the memory controller being configured to: determine, based on the instruction, a data layout from a plurality of data layouts; and
supply the operand to the DRAM bank in accordance with the data layout.
However, Simionescu discloses determining, based on the instruction, a data layout from a plurality of data layouts [claim 16, “receiving, with a storage controller, a set of parameters that define a layout for distributing data across a set of N solid state storage modules”]; and
supplying the operand to the DRAM bank in accordance with the data layout [claim 16, col. 4, lines 4-10, layout for distributing data across a set of N modules].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize the teachings of Simionescu in the invention of the modified Rao, to implement determining, based on the instruction, a data layout from a plurality of data layouts, and supplying the operand to the DRAM bank in accordance with the data layout, in order to avoid complex serialization and remapping when writing data to storage modules [Simionescu, col. 5, lines 48-51].
Claims 28 and 42 are rejected under 35 U.S.C. 103 as being unpatentable over Rao, in view of Wang and Simionescu, as applied to claims 25 and 39 above, and further in view of Xiang et al. (hereinafter Xiang), U.S. Publication No. 2010/0026697 A1.
Referring to claims 28 and 42, taking claim 28 as exemplary, the modified Rao does not explicitly disclose the memory module of claim 25, wherein the operand is divided into a plurality of first tiles and a second operand is divided into a plurality of second tiles, each tile comprising a plurality of values, and
wherein the data layouts comprise a same page (SR) data layout, wherein the memory controller stores one or more of the first tiles and one or more of the second tiles in a same page of the DRAM cells.
However, Xiang discloses wherein the operand is divided into a plurality of first tiles and a second operand is divided into a plurality of second tiles, each tile comprising a plurality of values [paragraph 61, grid of data may be partitioned into adjacent pages of tiles], and
wherein the data layouts comprise a same page (SR) data layout, wherein the memory controller stores one or more of the first tiles and one or more of the second tiles in a same page of the DRAM cells [paragraph 61, Thus a page of tiles may be accessed within one page of DRAM memory without accessing other pages of DRAM memory].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize the teachings of Xiang in the invention of the modified Rao, to implement the limitations wherein the operand is divided into a plurality of first tiles and a second operand is divided into a plurality of second tiles, each tile comprising a plurality of values, and wherein the data layouts comprise a same page (SR) data layout, wherein the memory controller stores one or more of the first tiles and one or more of the second tiles in a same page of the DRAM cells, because accessing one page of memory is more efficient than accessing multiple pages of memory [Xiang, paragraph 61].
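For illustration only, the same page (SR) data layout recited in claim 28 can be sketched as follows: each operand is divided into tiles, and tiles of both operands are placed in the same page so that a single page access reaches both. The tile size, the even split of each page between the two operands, and every identifier in the sketch are assumptions made for this example; they are not taken from the claims or from Xiang.

```python
from typing import Dict, List

Tile = List[int]


def split_into_tiles(values: List[int], tile_size: int) -> List[Tile]:
    """Divide an operand into tiles, each holding up to tile_size values."""
    if tile_size < 1:
        raise ValueError("tile_size must be at least 1")
    return [values[i:i + tile_size] for i in range(0, len(values), tile_size)]


def same_page_layout(
    first_tiles: List[Tile],
    second_tiles: List[Tile],
    tiles_per_page: int,
) -> List[Dict[str, List[Tile]]]:
    """Place tiles of both operands together in each page.

    Assumption: half of every page is reserved for first-operand tiles
    and half for second-operand tiles.
    """
    if tiles_per_page < 2 or tiles_per_page % 2 != 0:
        raise ValueError("tiles_per_page must be an even number >= 2")
    half = tiles_per_page // 2
    pages = []
    for start in range(0, max(len(first_tiles), len(second_tiles)), half):
        pages.append({
            "first": first_tiles[start:start + half],
            "second": second_tiles[start:start + half],
        })
    return pages


# Hypothetical operands: eight values each, two values per tile.
a_tiles = split_into_tiles(list(range(8)), tile_size=2)
b_tiles = split_into_tiles(list(range(100, 108)), tile_size=2)
pages = same_page_layout(a_tiles, b_tiles, tiles_per_page=4)
```

With these hypothetical inputs, each page holds two tiles of the first operand alongside two tiles of the second, so an operation on paired tiles needs only one open page.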
Claims 35-37 are rejected under 35 U.S.C. 103 as being unpatentable over Rao, in view of Wang and Simionescu, as applied to claim 25 above, and further in view of Sundaram et al. (hereinafter Sundaram), U.S. Publication No. 2019/0310911 A1.
Referring to claim 35, the modified Rao does not explicitly disclose the memory module of claim 25, the IMC module of the DRAM bank further comprising a hardware buffer configured to buffer the output computed by the ALU.
However, Sundaram discloses the IMC module of the DRAM bank further comprising a hardware buffer configured to buffer the output computed by the ALU [fig. 1, memory 104 contains in-memory compute unit(s) 130; paragraph 21, resultant data is written to the output matrix C where matrix C 4 is temporarily stored as matrix data in the scratch pads 132; paragraph 32, the compute device 100 can then write the result of the in-memory compute operation to the memory media 110; hence memory media 110 is equivalent to the claimed “hardware buffer”].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize the teachings of Sundaram in the invention of the modified Rao, to implement the IMC module of the DRAM bank further comprising a hardware buffer configured to buffer the output computed by the ALU, in order to provide increased efficiency of in-memory compute operations and decrease consumed energy [Sundaram, paragraphs 1, 13].
Referring to claim 36, the modified Rao does not explicitly disclose the memory module of claim 35, wherein the hardware buffer is at least four times the size of a result register of the IMC module.
However, it has been held that “where the only difference between the prior art and the claims was a recitation of relative dimensions of the claimed device and a device having the claimed relative dimensions would not perform differently than the prior art device, the claimed device was not patentably distinct from the prior art device” [MPEP § 2144.04(IV)(A)]. In this case, the only difference between Sundaram’s hardware buffer 110 and the claims is a recitation of the hardware buffer having the claimed relative dimensions. Hence, the claimed hardware buffer having the claimed relative dimensions would not perform differently than the prior art device. The hardware buffer being at least four times the size of the result register is an obvious variant.
Referring to claim 37, the modified Rao does not explicitly disclose the memory module of claim 25, the IMC module of the DRAM bank further comprising an accumulator, the accumulator comprising an accumulator register configured to store an accumulated value, the accumulator being configured to: 
receive the output computed by the ALU; and 
update the accumulator register with the sum of the accumulated value and the output.
However, Sundaram discloses the IMC module of the DRAM bank further comprising an accumulator, the accumulator comprising an accumulator register configured to store an accumulated value [paragraph 31, “The matrix multiplication operation(s) may include one or more matrix multiply-accumulate operations (e.g., multiplying an input matrix A by a weight matrix B and accumulating into an output matrix C)”; paragraph 21, resultant data is written to the output matrix C where matrix C 4 is temporarily stored as matrix data in the scratch pads 132], the accumulator being configured to:
receive the output computed by the ALU [paragraphs 31, 21, resultant data is written to the output matrix C where matrix C 4 is temporarily stored as matrix data in the scratch pads 132; Tensor Unit 130 comprising Compute Logic 136]; and
update the accumulator register with the sum of the accumulated value and the output [paragraph 31, “The matrix multiplication operation(s) may include one or more matrix multiply-accumulate operations (e.g., multiplying an input matrix A by a weight matrix B and accumulating into an output matrix C)”; paragraph 21, resultant data is written to the output matrix C where matrix C 4 is temporarily stored as matrix data in the scratch pads 132].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize the teachings of Sundaram in the invention of the modified Rao, to implement the IMC module of the DRAM bank further comprising an accumulator, the accumulator comprising an accumulator register configured to store an accumulated value, the accumulator being configured to: 
receive the output computed by the ALU; and update the accumulator register with the sum of the accumulated value and the output, in order to provide increased efficiency of in-memory compute operations and decrease consumed energy [Sundaram, paragraphs 1, 13].
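For illustration only, the accumulator behavior recited in claim 37 (receive the ALU output, then update the accumulator register with the sum of the accumulated value and that output) can be sketched as follows. The class and function names are hypothetical, and the use of a multiplication as the ALU operation is an assumption chosen to mirror the multiply-accumulate example cited from Sundaram.

```python
from typing import Sequence


class Accumulator:
    """Minimal model of an accumulator with a single accumulator register."""

    def __init__(self) -> None:
        self.register = 0  # accumulated value, initially zero

    def update(self, alu_output: int) -> int:
        """Add the ALU output to the accumulated value and store the sum."""
        self.register = self.register + alu_output
        return self.register


def multiply_accumulate(a: Sequence[int], b: Sequence[int]) -> int:
    """Accumulate element-wise products of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    acc = Accumulator()
    for x, y in zip(a, b):
        alu_output = x * y  # stands in for the output computed by the ALU
        acc.update(alu_output)
    return acc.register


dot = multiply_accumulate([1, 2, 3], [4, 5, 6])
```

Here the register holds a running sum across ALU outputs, which is the distinction from a result register that holds only the most recent output.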
Claim 38 is rejected under 35 U.S.C. 103 as being unpatentable over Rao, in view of Wang and Simionescu, as applied to claim 25 above, and further in view of Stocksdale et al. (hereinafter Stocksdale), U.S. Publication No. 2018/0032437 A1.
Referring to claim 38, the modified Rao does not explicitly disclose the memory module of claim 25, wherein the memory module is a high bandwidth memory (HBM) module comprising a stack of memory dies connected by through silicon vias, the stack of memory dies comprising the memory die.
However, Stocksdale discloses wherein the memory module is a high bandwidth memory (HBM) module comprising a stack of memory dies connected by through silicon vias, the stack of memory dies comprising the memory die [paragraph 35, The physical configuration of an HBM stack 105 may include a logic die 110, and a three dimensional DRAM or "DRAM stack" 115, including a plurality of DRAM dies (e.g., 8 such dies) stacked on top of the logic die 110. Interconnections are formed within the stack with through-silicon vias (TSVs)].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize the teachings of Stocksdale in the invention of the modified Rao, to implement the limitation wherein the memory module is a high bandwidth memory (HBM) module comprising a stack of memory dies connected by through silicon vias, the stack of memory dies comprising the memory die, because conditional execution is faster when performed entirely in the HBM stack than when it involves the host processor [Stocksdale, paragraph 51].


Allowable Subject Matter
Claims 26-27, 29-34, 40-41, and 43-48 are objected to as being dependent upon a rejected base claim, but would be allowable if the nonstatutory double patenting rejection is overcome AND if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter: The prior art of record taken alone or in combination fails to teach and/or fairly suggest wherein the data layouts comprise: a one operand (1OP) data layout, wherein a first operand is read from the DRAM cells and the operand comprises a second operand, the second operand being supplied directly from the host processor to the IMC of the DRAM bank, in combination with other recited limitations in claim 26.
The prior art of record taken alone or in combination fails to teach and/or fairly suggest wherein the IMC module further comprises an operand register, and wherein the memory controller is further configured to: store a first tile of the one or more of the first tiles in the operand register; and perform the ALU operation on the first tile stored in the operand register and each of the one or more second tiles stored in the same page of the array of DRAM cells as the first tile, in combination with other recited limitations in claim 29.
The prior art of record taken alone or in combination fails to teach and/or fairly suggest wherein the operand is divided into a plurality of first tiles and a second operand is divided into a plurality of second tiles, each tile comprising a plurality of values, and wherein the data layouts comprise a different page (DR) data layout wherein the memory controller stores a subset of the first tiles in a first page of the array of DRAM cells and a subset of the second tiles in a second page of the array of DRAM cells, in combination with other recited limitations in claim 33.
The prior art of record taken alone or in combination fails to teach and/or fairly suggest wherein the data layouts comprise: a one operand (1OP) data layout, wherein a first operand is written to the DRAM cells and the operand comprises a second operand, the second operand being supplied directly from a host processor to the IMC module of the DRAM bank, in combination with other recited limitations in claim 40.
The prior art of record taken alone or in combination fails to teach and/or fairly suggest wherein the IMC module further comprises an operand register, and wherein the memory controller is further configured to: store a first tile of the one or more of the first tiles in the operand register; and perform the ALU operation on the first tile stored in the operand register and each of the one or more second tiles stored in the same page of the array of DRAM cells as the first tile, in combination with other recited limitations in claim 43.
The prior art of record taken alone or in combination fails to teach and/or fairly suggest wherein the operand is divided into a plurality of first tiles and a second operand is divided into a plurality of second tiles, each tile comprising a plurality of values, and wherein the data layouts comprise a different page (DR) data layout wherein the memory controller stores a subset of the first tiles in a first page of the array of DRAM cells and a subset of the second tiles in a second page of the array of DRAM cells, in combination with other recited limitations in claim 47.
Claims 27, 30-32, 34, 41, 43-46, and 48 are objected to by virtue of their dependency.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FARLEY J ABAD whose telephone number is (571) 270-3425. The examiner can normally be reached Mon-Thurs 8 AM - 7 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Idriss Alrobaye, can be reached at (571) 270-1023. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Farley Abad/
Primary Examiner, Art Unit 2181