DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, 365(c), or 386(c) is acknowledged. The prior-filed application is Application No. 16/158,593, filed on October 12, 2018. 

Specification
The disclosure is objected to because of the following informalities:
In ¶¶ 36 and 41, “memory region B (111)” may be amended to “memory region B (113)” to correct a typographical error according to FIG. 1 of the drawings and ¶ 37 of the specification.  (Emphasis added.)
In ¶¶ 49 and 92-93, “processing device (108)” may be amended to “processing device (109)” to correct a typographical error according to FIG. 1 of the drawings and ¶ 39 of the specification.  (Emphasis added.)
In ¶¶ 64-65 and 67, “another memory region C (127)” may be amended to “another memory region D (127)” to correct a typographical error according to FIG. 3 of the drawings.  (Emphasis added.)
Appropriate correction is required.
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.

Claim Objections
Claims 1-3, 8-9, 12, 14, 17-18, and 20 are objected to because of the following informalities:
In claim 1, lines 19-22, “memory access to the first memory regions and the second memory region to the arithmetic compute element matrix in facilitating the computation of the list of results and memory access to the third memory region” may be amended to “a memory access to the first memory regions and the second memory region to the arithmetic compute element matrix in facilitating a computation of the list of results and a memory access to the third memory region” to correct a grammatical error.  (Emphasis added.)  
In claim 2, lines 2-3, “dynamic random access memory (DRAM), cross point memory, or flash memory, or any combination therein” may be amended to “a dynamic random access memory (DRAM), a cross point memory, or a flash memory, or any combination thereof” to correct a grammatical error.  (Emphasis added.)  
In claim 3, lines 1-2, “the plurality of memory regions is formed” may be amended to “the plurality of memory regions are formed” to correct a grammatical error.  (Emphasis added.)  
In claim 8, lines 3-4, “the plurality of data sets that can be processed in parallel” may be amended to “the plurality of data sets that are processed in parallel” for clarity and to avoid ambiguity.  (Emphasis added.)  The term “can” in the limitation does not clearly and unambiguously define features that are essential to the invention.  The term renders the limitation indefinite because it denotes that processing the plurality of data sets in parallel is optional.  For examination purposes, the limitation is interpreted as “the plurality of data sets that are processed in parallel”.
In claim 9, line 3, “a list of results” may be amended to “the list of results” to follow proper antecedent basis.  (Emphasis added.)  
In claim 12, lines 18-24, “a third memory region in the plurality of memory regions; … memory access to the first memory regions and the second memory region to the arithmetic compute element matrix in facilitating the computation of the list of results and memory access to the third memory region” may be amended to “a third memory region in a plurality of memory regions; … a memory access to the plurality of first memory regions and the second memory region to the arithmetic compute element matrix in facilitating the computation of the list of results and a memory access to the third memory region” to correct a grammatical error and to follow proper antecedent basis.  (Emphasis added.)  
In claim 14, lines 1-2, “identifying a computation” may be amended to “identifying the computation” to follow proper antecedent basis.  (Emphasis added.)  
In claim 17, line 1, “the computing of the output” may be amended to “the computing of the list of results” to follow proper antecedent basis.  (Emphasis added.)  
In claim 20, lines 4-5, “the data loaded into the third memory” may be amended to “the input data loaded into the third memory region” to follow proper antecedent basis.  (Emphasis added.)  
Other claims (e.g., claim 18, lines 24-27) containing the same informalities as those identified above, but not listed here, should be amended for the same reasons.
Appropriate correction is required.  

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
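As an informal illustration only, and not as part of the legal standard, the three-prong test and the two rebuttable presumptions described above can be sketched as a small decision function. The `Limitation` class, its field names, and the example values are hypothetical; whether a term is a generic placeholder or recites sufficient structure remains a legal determination.

```python
from dataclasses import dataclass


@dataclass
class Limitation:
    """Hypothetical record of the findings made for one claim limitation."""
    term: str
    uses_means_or_step: bool            # literal "means" or "step" appears
    uses_generic_placeholder: bool      # nonce term standing in for "means"
    has_functional_language: bool       # e.g. linked by "for" or "configured to"
    recites_sufficient_structure: bool  # structure, material, or acts for the function


def presumed_112f(lim: Limitation) -> bool:
    """Starting presumption: 112(f) applies only if "means" or "step" is used."""
    return lim.uses_means_or_step


def interpreted_under_112f(lim: Limitation) -> bool:
    """Apply prongs (A), (B), and (C); all three must be met."""
    prong_a = lim.uses_means_or_step or lim.uses_generic_placeholder
    prong_b = lim.has_functional_language
    prong_c = not lim.recites_sufficient_structure
    return prong_a and prong_b and prong_c


# Assumed findings for illustration: a nonce term with functional language
# and no sufficient structure rebuts the presumption against 112(f).
placeholder = Limitation("module configured to store data", False, True, True, False)
# Assumed findings: "means" is used, but sufficient structure is recited,
# which rebuts the presumption in favor of 112(f).
structured = Limitation("means for storing, comprising a register file", True, False, True, True)

print(presumed_112f(placeholder), interpreted_under_112f(placeholder))
print(presumed_112f(structured), interpreted_under_112f(structured))
```

In this sketch the presumption only sets the starting point; the outcome is always decided by the three prongs taken together.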
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitations use a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function, and the generic placeholder is not preceded by a structural modifier.  Such claim limitations are:
“arithmetic compute element matrix” in claims 1, 3, 12, 18, and 20,
“communication interface” in claims 1, 12, and 18,
“state machine” in claims 7-8, and
“processing device” in claims 18-20.
The disclosure of the application describes corresponding structures for “arithmetic compute element matrix”, “communication interface”, and “processing device” in paragraphs [0025] and [0028] of the specification.  However, the disclosure does not describe a corresponding structure for “state machine” anywhere in the specification.
Because these claim limitations are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitations to avoid them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function so as to avoid them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claims 1-17 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-17 of U.S. Patent No. 11,157,213 B2. Although the claims at issue are not identical, they are not patentably distinct from each other for the reasons shown below.

U.S. Patent No. 11,157,213 B2
Claim 7.  An integrated circuit memory device, comprising:
a plurality of memory regions;
an arithmetic compute element matrix coupled to access the plurality of memory regions in parallel; and
a communication interface coupled to the arithmetic compute element matrix and configured to receive a first request;
wherein, in response to the first request, the arithmetic compute element matrix is configured to access a plurality of lists of operands stored in first memory regions in the plurality of memory regions, generate a list of results from the plurality of lists of operands, and store the list of results in a second memory region in the plurality of memory regions;
wherein, during a time period after the first request and before completion of storing the list of results into the second memory region,
the communication interface is configured to receive a second request to access a third memory region in the plurality of memory regions; and
in response to the second request and during the time period, the integrated circuit memory device is configured to provide, in parallel, memory access to the first memory regions and the second memory region to the arithmetic compute element matrix in facilitating a computation of the list of results and memory access to the third memory region to service the second request through the communication interface, and
further load input data into the third memory region via the second request during the time period in which the list of results are computed in the arithmetic compute element matrix;
wherein after the time period, in response to a third request, the arithmetic compute element matrix is configured to compute a next list of results using the data loaded into the third memory region via the second request;
wherein during the time period in which the next list of results is computed by the arithmetic compute element matrix, the communication interface is configured to receive a fourth request to access the list of results computed responsive to the first request;
wherein the integrated circuit memory device is encapsulated within an integrated circuit package.

Instant Application 17/483,786
Claim 1.  An integrated circuit memory device, comprising:
a plurality of memory regions;
an arithmetic compute element matrix coupled to access the plurality of memory regions in parallel; and
a communication interface coupled to the arithmetic compute element matrix and configured to receive a first request;
wherein, in response to the first request, the arithmetic compute element matrix is configured to access a plurality of lists of operands stored in first memory regions in the plurality of memory regions, generate a list of results from the plurality of lists of operands, and store the list of results in a second memory region in the plurality of memory regions;
wherein, during a time period after the first request and before completion of storing the list of results into the second memory region,
the communication interface is configured to receive a second request to access a third memory region in the plurality of memory regions; and
in response to the second request and during the time period, the integrated circuit memory device is configured to provide, in parallel, memory access to the first memory regions and the second memory region to the arithmetic compute element matrix in facilitating the computation of the list of results and memory access to the third memory region to service the second request through the communication interface; and
wherein the integrated circuit memory device is encapsulated within an integrated circuit package.


Although claim 1 of the instant application is directed to an integrated circuit memory device that is not identical to an integrated circuit memory device of claim 7 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.
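The anticipation reasoning above can be pictured, very roughly, as a subset check: each limitation of claim 1 of the instant application also appears in claim 7 of the patent, which recites further limitations. The sketch below is a simplification under stated assumptions: short hypothetical labels stand in for the claim limitations compared in the table, minor wording differences (e.g., “a computation” versus “the computation”) are ignored, and the check does not replace the legal analysis.

```python
# Hypothetical short labels standing in for the limitations compared above.
APPLICATION_CLAIM_1 = {
    "plurality of memory regions",
    "arithmetic compute element matrix",
    "communication interface receives first request",
    "compute list of results from lists of operands",
    "second request to third memory region during time period",
    "parallel access for computation and second request",
    "encapsulated within integrated circuit package",
}

# Patent claim 7 recites the same limitations plus three more.
PATENT_CLAIM_7 = APPLICATION_CLAIM_1 | {
    "load input data into third memory region during time period",
    "third request computes next list of results",
    "fourth request accesses list of results during next computation",
}


def unmatched_limitations(examined: set, reference: set) -> set:
    """Limitations of the examined claim that the reference claim lacks."""
    return examined - reference


def is_anticipated(examined: set, reference: set) -> bool:
    """True when the reference claim recites every examined limitation."""
    return not unmatched_limitations(examined, reference)


print(is_anticipated(APPLICATION_CLAIM_1, PATENT_CLAIM_7))
print(sorted(PATENT_CLAIM_7 - APPLICATION_CLAIM_1))
```

The check is one-directional: the narrower patent claim covers the broader application claim in this sketch, but not the reverse.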

U.S. Patent No. 11,157,213 B2
Claim 8. The integrated circuit memory device of claim 7, wherein the plurality of memory regions provides dynamic random access memory (DRAM), cross point memory, or flash memory, or any combination therein.

Instant Application 17/483,786
Claim 2.  The integrated circuit memory device of claim 1, wherein the plurality of memory regions provides dynamic random access memory (DRAM), cross point memory, or flash memory, or any combination therein.

The table immediately above contains only the relevant portions of the claim of the instant application and the claim of the patent for comparison purposes. Please refer to the previously presented tables, which compare the claims from which the claim of the instant application and the claim of the patent depend.

Although claim 2 of the instant application is directed to the integrated circuit memory device that is not identical to the integrated circuit memory device of claim 8 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.

U.S. Patent No. 11,157,213 B2
Claim 9.  The integrated circuit memory device of claim 8, wherein the plurality of memory regions is formed on a first integrated circuit die; and the arithmetic compute element matrix is formed on a second integrated circuit die different from the first integrated circuit die.

Instant Application 17/483,786
Claim 3.  The integrated circuit memory device of claim 2, wherein the plurality of memory regions is formed on a first integrated circuit die; and the arithmetic compute element matrix is formed on a second integrated circuit die different from the first integrated circuit die.

The table immediately above contains only the relevant portions of the claim of the instant application and the claim of the patent for comparison purposes. Please refer to the previously presented tables, which compare the claims from which the claim of the instant application and the claim of the patent depend.

Although claim 3 of the instant application is directed to the integrated circuit memory device that is not identical to the integrated circuit memory device of claim 9 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.

U.S. Patent No. 11,157,213 B2
Claim 10.  The integrated circuit memory device of claim 9, further comprising:
a set of through-silicon vias (TSVs) coupled between the first integrated circuit die and the second integrated circuit die to connect the arithmetic compute element matrix to the plurality of memory regions.

Instant Application 17/483,786
Claim 4.  The integrated circuit memory device of claim 3, further comprising:
a set of through-silicon vias (TSVs) coupled between the first integrated circuit die and the second integrated circuit die to connect the arithmetic compute element matrix to the plurality of memory regions.

The table immediately above contains only the relevant portions of the claim of the instant application and the claim of the patent for comparison purposes. Please refer to the previously presented tables, which compare the claims from which the claim of the instant application and the claim of the patent depend.

Although claim 4 of the instant application is directed to the integrated circuit memory device that is not identical to the integrated circuit memory device of claim 10 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.

U.S. Patent No. 11,157,213 B2
Claim 11.  The integrated circuit memory device of claim 9, further comprising:
wires encapsulated within the integrated circuit package and coupled between the first integrated circuit die and the second integrated circuit die to connect the arithmetic compute element matrix to the plurality of memory regions.

Instant Application 17/483,786
Claim 5.  The integrated circuit memory device of claim 3, further comprising:
wires encapsulated within the integrated circuit package and coupled between the first integrated circuit die and the second integrated circuit die to connect the arithmetic compute element matrix to the plurality of memory regions.

The table immediately above contains only the relevant portions of the claim of the instant application and the claim of the patent for comparison purposes. Please refer to the previously presented tables, which compare the claims from which the claim of the instant application and the claim of the patent depend.

Although claim 5 of the instant application is directed to the integrated circuit memory device that is not identical to the integrated circuit memory device of claim 11 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.

U.S. Patent No. 11,157,213 B2
Claim 12.  The integrated circuit memory device of claim 7, wherein the arithmetic compute element matrix comprises:
an array of arithmetic logic units configured to perform an operation on a plurality of data sets in parallel, wherein each of the data sets includes one data element from each of the lists of operands.

Instant Application 17/483,786
Claim 6.  The integrated circuit memory device of claim 1, wherein the arithmetic compute element matrix comprises:
an array of arithmetic logic units configured to perform an operation on a plurality of data sets in parallel, wherein each of the data sets includes one data element from each of the lists of operands.

The table immediately above contains only the relevant portions of the claim of the instant application and the claim of the patent for comparison purposes. Please refer to the previously presented tables, which compare the claims from which the claim of the instant application and the claim of the patent depend.

Although claim 6 of the instant application is directed to the integrated circuit memory device that is not identical to the integrated circuit memory device of claim 12 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.

U.S. Patent No. 11,157,213 B2
Claim 13.  The integrated circuit memory device of claim 12, wherein the arithmetic compute element matrix comprises:
a state machine configured to control the array of arithmetic logic units to perform different computations identified by different codes of operations.

Instant Application 17/483,786
Claim 7.  The integrated circuit memory device of claim 6, wherein the arithmetic compute element matrix comprises:
a state machine configured to control the array of arithmetic logic units to perform different computations identified by different codes of operations.

The table immediately above contains only the relevant portions of the claim of the instant application and the claim of the patent for comparison purposes. Please refer to the previously presented tables, which compare the claims from which the claim of the instant application and the claim of the patent depend.

Although claim 7 of the instant application is directed to the integrated circuit memory device that is not identical to the integrated circuit memory device of claim 13 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.

U.S. Patent No. 11,157,213 B2
Claim 14.  The integrated circuit memory device of claim 13, wherein the state machine is further configured to control the array of arithmetic logic units to perform computations for the lists of operands that have more data sets than the plurality of data sets processed in parallel by the array of arithmetic logic units.

Instant Application 17/483,786
Claim 8.  The integrated circuit memory device of claim 7, wherein the state machine is further configured to control the array of arithmetic logic units to perform computations for the lists of operands that have more data sets than the plurality of data sets that can be processed in parallel by the array of arithmetic logic units.

The table immediately above contains only the relevant portions of the claim of the instant application and the claim of the patent for comparison purposes. Please refer to the previously presented tables, which compare the claims from which the claim of the instant application and the claim of the patent depend.

Although claim 8 of the instant application is directed to the integrated circuit memory device that is not identical to the integrated circuit memory device of claim 14 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.

U.S. Patent No. 11,157,213 B2
Claim 15.  The integrated circuit memory device of claim 13, wherein the arithmetic compute element matrix further comprises:
a cache memory configured to store the list of results generated in parallel by the array of arithmetic logic units.

Instant Application 17/483,786
Claim 9.  The integrated circuit memory device of claim 7, wherein the arithmetic compute element matrix further comprises:
a cache memory configured to store a list of results generated in parallel by the array of arithmetic logic units.

The table immediately above contains only the relevant portions of the claim of the instant application and the claim of the patent for comparison purposes. Please refer to the previously presented tables, which compare the claims from which the claim of the instant application and the claim of the patent depend.

Although claim 9 of the instant application is directed to the integrated circuit memory device that is not identical to the integrated circuit memory device of claim 15 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.

U.S. Patent No. 11,157,213 B2
Claim 16. The integrated circuit memory device of claim 15, wherein the third memory region is the same as the second memory region.

Instant Application 17/483,786
Claim 10.  The integrated circuit memory device of claim 9, wherein the third memory region is the same as the second memory region.

The table immediately above contains only the relevant portions of the claim of the instant application and the claim of the patent for comparison purposes. Please refer to the previously presented tables, which compare the claims from which the claim of the instant application and the claim of the patent depend.

Although claim 10 of the instant application is directed to the integrated circuit memory device that is not identical to the integrated circuit memory device of claim 16 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.

U.S. Patent No. 11,157,213 B2
Claim 17. The integrated circuit memory device of claim 15, wherein the third memory region is different from the second memory region.

Instant Application 17/483,786
Claim 11.  The integrated circuit memory device of claim 9, wherein the third memory region is different from the second memory region.

The table immediately above contains only the relevant portions of the claim of the instant application and the claim of the patent for comparison purposes. Please refer to the previously presented tables, which compare the claims from which the claim of the instant application and the claim of the patent depend.

Although claim 11 of the instant application is directed to the integrated circuit memory device that is not identical to the integrated circuit memory device of claim 17 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.

U.S. Patent No. 11,157,213 B2
Claim 1.  A method implemented in an integrated circuit memory device, the method comprising:
storing a plurality of lists of operands in a plurality of first memory regions of the integrated circuit memory device;
receiving, in a communication interface of the integrated circuit memory device, a first request;
in response to the first request,
accessing, by an arithmetic compute element matrix of the integrated circuit memory device, the plurality of first memory regions in parallel;
computing, by the arithmetic compute element matrix, a list of results from the plurality of lists of operands stored in the plurality of first memory regions; and
storing, into a second memory region of the integrated circuit memory device, the list of results;
during a time period after the first request and before completion of the storing of the list of results into the second memory region, receiving, in the communication interface, a second request to access a third memory region in a plurality of memory regions;
in response to the second request and during the time period, providing, in parallel and by the integrated circuit memory device, memory access to the first memory regions and the second memory region to the arithmetic compute element matrix in facilitating a computation of the list of results and memory access to the third memory region to service the second request through the communication interface;
loading input data into the third memory region via the second request during the time period in which the list of results are computed in the arithmetic compute element matrix;
wherein after the time period, in response to a third request, computing a next list of results using the data loaded into the third memory region via the second request; and
wherein during the time period in which the next list of results is computed by the arithmetic compute element matrix, receiving a fourth request to access the list of results computed responsive to the first request.

Instant Application 17/483,786
Claim 12.  A method implemented in an integrated circuit memory device, the method comprising:
storing a plurality of lists of operands in a plurality of first memory regions of the integrated circuit memory device;
receiving, in a communication interface of the integrated circuit memory device, a first request;
in response to the first request,
accessing, by an arithmetic compute element matrix of the integrated circuit memory device, the plurality of first memory regions in parallel;
computing, by the arithmetic compute element matrix, a list of results from the plurality of lists of operands stored in the plurality of first memory regions; and
storing, into a second memory region of the integrated circuit memory device, the list of results;
during a time period after the first request and before completion of the storing of the list of results into the second memory region, receiving, in the communication interface, a second request to access a third memory region in the plurality of memory regions; and
in response to the second request and during the time period, providing, in parallel and by the integrated circuit memory device, memory access to the first memory regions and the second memory region to the arithmetic compute element matrix in facilitating the computation of the list of results and memory access to the third memory region to service the second request through the communication interface.


Although claim 12 of the instant application is directed to a method that is not identical to a method of claim 1 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.

U.S. Patent No. US 11/157,213 B2
Instant Application 17/483,786
Claim 2.  The method of claim 1, wherein the first request is a memory access command configured to access a memory location in the integrated circuit memory device.
Claim 13.  The method of claim 12, wherein the first request is a memory access command configured to access a memory location in the integrated circuit memory device.


The table immediately above contains only the relevant portions of the claim of the instant application and the claim of the patent for comparison purposes. Please refer to the previously presented tables, which compare the claims from which the claim of the instant application and the claim of the patent depend.

Although claim 13 of the instant application is directed to the method that is not identical to the method of claim 2 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.

U.S. Patent No. US 11/157,213 B2
Instant Application 17/483,786
Claim 3.  The method of claim 2, 
wherein the memory location stores a code identifying the computation to be performed by the arithmetic compute element matrix to generate the list of results.
Claim 14.  The method of claim 13, 
wherein the memory location stores a code identifying a computation to be performed by the arithmetic compute element matrix to generate the list of results.


The table immediately above contains only the relevant portions of the claim of the instant application and the claim of the patent for comparison purposes. Please refer to the previously presented tables, which compare the claims from which the claim of the instant application and the claim of the patent depend.

Although claim 14 of the instant application is directed to the method that is not identical to the method of claim 3 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.

U.S. Patent No. US 11/157,213 B2
Instant Application 17/483,786
Claim 4. The method of claim 3, 
wherein the memory location is predefined to store the code.
Claim 15.  The method of claim 14, 
wherein the memory location is predefined to store the code.


The table immediately above contains only the relevant portions of the claim of the instant application and the claim of the patent for comparison purposes. Please refer to the previously presented tables, which compare the claims from which the claim of the instant application and the claim of the patent depend.

Although claim 15 of the instant application is directed to the method that is not identical to the method of claim 4 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.

U.S. Patent No. US 11/157,213 B2
Instant Application 17/483,786
Claim 5. The method of claim 3, 
wherein the second request is a memory read command, or a memory write command, or any combination thereof.
Claim 16.  The method of claim 14, 
wherein the second request is a memory read command, or a memory write command, or any combination thereof.


The table immediately above contains only the relevant portions of the claim of the instant application and the claim of the patent for comparison purposes. Please refer to the previously presented tables, which compare the claims from which the claim of the instant application and the claim of the patent depend.

Although claim 16 of the instant application is directed to the method that is not identical to the method of claim 5 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.

U.S. Patent No. US 11/157,213 B2
Instant Application 17/483,786
Claim 6.  The method of claim 1, wherein the computing of an output comprises:
performing an operation on a plurality of data sets in parallel to generate a plurality of results respectively, wherein each of the data sets includes one data element from each of the lists of operands.
Claim 17.  The method of claim 12, wherein the computing of the output comprises: 
performing an operation on a plurality of data sets in parallel to generate a plurality of results respectively, wherein each of the data sets includes one data element from each of the lists of operands.


The table immediately above contains only the relevant portions of the claim of the instant application and the claim of the patent for comparison purposes. Please refer to the previously presented tables, which compare the claims from which the claim of the instant application and the claim of the patent depend.

Although claim 17 of the instant application is directed to the method that is not identical to the method of claim 6 of the patent, the claim of the instant application is not patentably distinct from the claim of the patent because the claim of the instant application is anticipated by the claim of the patent.

A rejection based on double patenting of the “same invention” type finds its support in the language of 35 U.S.C. 101 which states that “whoever invents or discovers any new and useful process... may obtain a patent therefor...” (Emphasis added). Thus, the term “same invention,” in this context, means an invention drawn to identical subject matter. See Miller v. Eagle Mfg. Co., 151 U.S. 186 (1894); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Ockert, 245 F.2d 467, 114 USPQ 330 (CCPA 1957).
A statutory type (35 U.S.C. 101) double patenting rejection can be overcome by canceling or amending the claims that are directed to the same invention so they are no longer coextensive in scope. The filing of a terminal disclaimer cannot overcome a double patenting rejection based upon 35 U.S.C. 101.

Claim 20 is rejected under 35 U.S.C. 101 as claiming the same invention as that of claim 18 of prior U.S. Patent No. US 11/157,213 B2. This is a statutory double patenting rejection.

U.S. Patent No. US 11/157,213 B2
Instant Application 17/483,786
Claim 18.  A computing apparatus, comprising:
a processing device;
a memory device encapsulated within an integrated circuit package; and
a communication connection between the memory device and the processing device;
wherein the memory device comprises:
a plurality of memory regions;
an arithmetic compute element matrix coupled to access the plurality of memory regions in parallel; and
a communication interface coupled to the arithmetic compute element matrix and configured to receive a first request from the processing device through the communication connection;
wherein, in response to the first request, the arithmetic compute element matrix is configured to access a plurality of lists of operands stored in first memory regions in the plurality of memory regions, generate a list of results from the plurality of lists of operands, and store the list of results in a second memory region in the plurality of memory regions;
wherein, during a time period after the first request and before completion of storing the list of results into the second memory region, the communication interface is configured to receive from the processing device, a second request to access a third memory region in the plurality of memory regions;
wherein, in response to the second request and during the time period, the memory device is configured to provide, in parallel, memory access to the first memory regions and the second memory region to the arithmetic compute element matrix in facilitating a computation of the list of results and memory access to the third memory region to service the second request through the communication interface;



wherein the processing device is configured to load input data into the third memory region via the second request during the time period in which the list of results are computed in the arithmetic compute element matrix;



wherein after the time period, the processing device is configured to send a third request over the communication connection to the memory device; in response to the third request, the arithmetic compute element matrix computes a next list of results using the data loaded into the third memory region via the second request; and, during the time period in which the next list of results is computed by the arithmetic compute element matrix, the processing device sends a fourth request to the memory device to access the list of results computed responsive to the first request.
Claim 18.  A computing apparatus, comprising: 
a processing device;
a memory device encapsulated within an integrated circuit package; and
a communication connection between the memory device and the processing device;
wherein the memory device comprises: 
a plurality of memory regions;
an arithmetic compute element matrix coupled to access the plurality of memory regions in parallel; and
a communication interface coupled to the arithmetic compute element matrix and configured to receive a first request from the processing device through the communication connection;
wherein, in response to the first request, the arithmetic compute element matrix is configured to access a plurality of lists of operands stored in first memory regions in the plurality of memory regions, generate a list of results from the plurality of lists of operands, and store the list of results in a second memory region in the plurality of memory regions;
wherein, during a time period after the first request and before completion of storing the list of results into the second memory region, the communication interface is configured to receive from the processing device, a second request to access a third memory region in the plurality of memory regions; and
wherein, in response to the second request and during the time period, the memory device is configured to provide, in parallel, memory access to the first memory regions and the second memory region to the arithmetic compute element matrix in facilitating the computation of the list of results and memory access to the third memory region to service the second request through the communication interface.

Claim 19.  The computing apparatus of claim 18, 
wherein the processing device is configured to load input data into the third memory region via the second request during the time period in which the list of results are computed in the arithmetic compute element matrix.

Claim 20.  The computing apparatus of claim 19, 
wherein after the time period, the processing device is configured to send a third request over the communication connection to the memory device; in response to the third request, the arithmetic compute element matrix computes a next list of results using the data loaded into the third memory via the second request; and, during a time period in which the next list of results is computed by the arithmetic compute element matrix, the processing device sends a fourth request to the memory device to access the list of results computed responsive to the first request.


Claim 20 of the instant application is directed to a computing apparatus that is the same invention as a computing apparatus of claim 18 of the patent.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 

Regarding claim 1, the claim recites:
	An integrated circuit memory device, comprising: 
	(a) a plurality of memory regions;
(b) an arithmetic compute element matrix coupled to access the plurality of memory regions in parallel; and
(c) a communication interface coupled to the arithmetic compute element matrix and configured to receive a first request;
(d) wherein, in response to the first request, the arithmetic compute element matrix is configured to access a plurality of lists of operands stored in first memory regions in the plurality of memory regions, generate a list of results from the plurality of lists of operands, and store the list of results in a second memory region in the plurality of memory regions;
wherein, during a time period after the first request and before completion of storing the list of results into the second memory region,
(e) the communication interface is configured to receive a second request to access a third memory region in the plurality of memory regions; and
(f) in response to the second request and during the time period, the integrated circuit memory device is configured to provide, in parallel, memory access to the first memory regions and the second memory region to the arithmetic compute element matrix in facilitating the computation of the list of results and memory access to the third memory region to service the second request through the communication interface; and
(g) wherein the integrated circuit memory device is encapsulated within an integrated circuit package.
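For illustration only, the concurrent behavior recited in limitations (d)-(f) can be pictured with a rough software analogy. This is a minimal sketch and not the claimed hardware: the region names "A", "B", "X", and "C", the sample values, and the functions `compute_results` and `service_second_request` are all assumptions made for the example, and Python threads only approximate parallel access.

```python
import threading

# Hypothetical memory regions, modelled as plain Python lists.
regions = {
    "A": [1, 2, 3],  # first memory regions: lists of operands
    "B": [4, 5, 6],
    "X": [0, 0, 0],  # second memory region: list of results
    "C": [0, 0, 0],  # third memory region: target of the second request
}

def compute_results():
    """First request: read operands from A and B, store results in X."""
    for i, (a, b) in enumerate(zip(regions["A"], regions["B"])):
        regions["X"][i] = a * b

def service_second_request(data):
    """Second request: write new input data into C."""
    regions["C"][:] = data

worker = threading.Thread(target=compute_results)
worker.start()
# Issued before the worker is joined, so it may overlap with the computation.
service_second_request([7, 8, 9])
worker.join()

print(regions["X"], regions["C"])
```

Because the computation touches only A, B, and X while the second request touches only C, the two activities do not interfere with each other in this sketch.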

Step 1: 
The claim recites an integrated circuit memory device, which is a machine that is a statutory category of invention.

Step 2A Prong One: 
Limitation (d) of the claim recites “wherein, in response to the first request, the arithmetic compute element matrix is configured to … generate a list of results from the plurality of lists of operands”.  The limitation falls within the “Mathematical Concepts” grouping of abstract ideas because the arithmetic compute element matrix computes the list of results from the plurality of lists of operands by performing a mathematical calculation on the operands.  For example, the results can be calculated using the mathematical formula Xi = Ai × Bi for i = 1, 2, ..., n (see specification [0036]).  If a claim limitation, under its broadest reasonable interpretation, covers mathematical relationships, mathematical formulas or equations, or mathematical calculations, then the claim limitation falls within the “Mathematical Concepts” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
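For illustration only, the formula Xi = Ai × Bi cited from specification [0036] can be written out as a short calculation. This is a minimal sketch under stated assumptions: the function name `elementwise_product` and the sample operand values are hypothetical and do not come from the application.

```python
def elementwise_product(a, b):
    """Return the list X where X[i] = A[i] * B[i] for equal-length lists."""
    if len(a) != len(b):
        raise ValueError("operand lists must have the same length")
    return [x * y for x, y in zip(a, b)]

# Hypothetical operand lists A and B, chosen only for the example.
list_a = [1, 2, 3]
list_b = [4, 5, 6]
list_x = elementwise_product(list_a, list_b)
print(list_x)
```

Each result depends only on the pair of operands at the same index, which is why the calculation can be stated entirely as a mathematical formula.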

Step 2A Prong Two: 
Besides the abstract ideas, the claim recites additional elements of “an arithmetic compute element matrix coupled to access the plurality of memory regions in parallel; and a communication interface coupled to the arithmetic compute element matrix and configured to receive a first request; wherein, in response to the first request, the arithmetic compute element matrix is configured to access a plurality of lists of operands stored in first memory regions in the plurality of memory regions, … and store the list of results in a second memory region in the plurality of memory regions; wherein, during a time period after the first request and before completion of storing the list of results into the second memory region, the communication interface is configured to receive a second request to access a third memory region in the plurality of memory regions; and in response to the second request and during the time period, the integrated circuit memory device is configured to provide, in parallel, memory access to the first memory regions and the second memory region to the arithmetic compute element matrix in facilitating the computation of the list of results and memory access to the third memory region to service the second request through the communication interface” in limitations (b)-(f).  
The additional elements of “a plurality of memory regions”, “an arithmetic compute element matrix coupled to access the plurality of memory regions in parallel; … in response to the first request, the arithmetic compute element matrix is configured to access a plurality of lists of operands stored in first memory regions in the plurality of memory regions, … and store the list of results in a second memory region in the plurality of memory regions; … in response to the second request and during the time period, the integrated circuit memory device is configured to provide, in parallel, memory access to the first memory regions and the second memory region to the arithmetic compute element matrix in facilitating the computation of the list of results and memory access to the third memory region to service the second request through the communication interface; and wherein the integrated circuit memory device is encapsulated within an integrated circuit package” are limitations that do no more than generally link a judicial exception to a particular technological environment.  The parallel access to the plurality of memory regions, the access to the plurality of lists of operands, the storage of the list of results, and the parallel memory access to the first memory regions and the second memory region merely use the memory device to access and store data so that the arithmetic compute element matrix can perform the operations for the computation of the list of results identified above as the abstract idea.
The plurality of memory regions, the arithmetic compute element matrix, the communication interface coupled to the arithmetic compute element matrix, and the integrated circuit memory device encapsulated within an integrated circuit package are also additional elements that are used as tools to access the plurality of memory regions, receive a first request, access a plurality of lists of operands, store the list of results, receive a second request, and provide, in parallel, memory access to the first memory regions and the second memory region to carry out limitations (b)-(f).  However, the arithmetic compute element matrix, the communication interface, and the integrated circuit memory device are recited so generically (i.e., no details whatsoever are provided other than that each is an “arithmetic compute element matrix”, “communication interface”, or “integrated circuit memory device”) that they represent no more than mere instructions to apply the judicial exception on a computer.  The additional elements can also be viewed as nothing more than an attempt to generally link the use of the judicial exception to the technological environment of an arithmetic compute element matrix, a communication interface, and an integrated circuit memory device in an integrated circuit package.  
Furthermore, the limitations “receive a first request” and “receive a second request” in limitations (c) and (e) are also operations that, under their broadest reasonable interpretation, amount to mere data gathering, since the receiving operation is characteristic of receiving data itself.  Such data gathering is a form of insignificant extra-solution activity added to a judicial exception.
Even when viewed in combination, these additional elements do not integrate the recited judicial exception into a practical application, and the claim is directed to the judicial exception.

Step 2B: 
The claim as a whole does not amount to significantly more than the recited exception.  The claim has other additional elements.  
The additional elements include the plurality of memory regions, the arithmetic compute element matrix, the communication interface, and the integrated circuit memory device to perform limitations (b)-(f).  As explained previously, the arithmetic compute element matrix, the communication interface, and the integrated circuit memory device are at best the equivalent of merely adding the words “apply it” to the judicial exception.  Mere instructions to apply an exception cannot provide an inventive concept.  Furthermore, the plurality of memory regions, the arithmetic compute element matrix, the communication interface, and the integrated circuit memory device are used as a tool to perform the otherwise mathematical concept.  
Furthermore, these additional elements amount to no more than mere generic computer functions to apply the exception using generic computing elements.  Mere generic computer functions to apply an exception using generic computing elements cannot provide an inventive concept.  
In addition, the limitations “receive a first request” and “receive a second request” are insignificant extra-solution activity added to a judicial exception, as explained above, since the receiving operation is characteristic of receiving data itself.  These limitations are also elements or computer functions that the courts have recognized as well-understood, routine, conventional activity in particular fields when claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity, such as “[r]eceiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362” (MPEP 2106.05(d)(II)).  
The additional elements also include “access the plurality of memory regions”, “access a plurality of lists of operands stored in first memory regions in the plurality of memory regions”, “store the list of results”, and “provide, in parallel, memory access to the first memory regions and the second memory region”.  These are elements associated with storing or retrieving data that the courts have recognized as well-understood, routine, conventional activity in particular fields, such as “[s]toring and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93” (MPEP 2106.05(d)(II)).
Even when considered in combination, these additional elements represent mere instructions to apply an exception, a tool to perform the otherwise mathematical concept, and mere generic computer functions to apply the exception using generic computing elements, which do not provide an inventive concept. 
Therefore, the claim is not eligible.

Regarding claims 2-11, the claims are dependent on claim 1 and include all the limitations of claim 1. Therefore, the dependent claims recite the same abstract idea as claim 1.

Further regarding claim 2, the claim recites a limitation of “wherein the plurality of memory regions provides dynamic random access memory (DRAM), cross point memory, or flash memory, or any combination therein”.  The limitation includes an additional element that can be viewed as nothing more than an attempt to generally link the use of the judicial exceptions to the technological environment of memory devices.
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.

Further regarding claim 3, the claim recites a limitation of “wherein the plurality of memory regions is formed on a first integrated circuit die; and the arithmetic compute element matrix is formed on a second integrated circuit die different from the first integrated circuit die”.  The limitation includes an additional element that can be viewed as nothing more than an attempt to generally link the use of the judicial exceptions to the technological environment of integrated circuit dies.
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.

Further regarding claim 4, the claim recites a limitation of “a set of through-silicon vias (TSVs) coupled between the first integrated circuit die and the second integrated circuit die to connect the arithmetic compute element matrix to the plurality of memory regions”.  The limitation includes an additional element that can be viewed as nothing more than an attempt to generally link the use of the judicial exceptions to the technological environment of through-silicon vias (TSVs) that are used in semiconductor fabrication for providing connectivity among integrated circuit dies.  
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.

Further regarding claim 5, the claim recites a limitation of “wires encapsulated within the integrated circuit package and coupled between the first integrated circuit die and the second integrated circuit die to connect the arithmetic compute element matrix to the plurality of memory regions”.  The limitation includes an additional element that can be viewed as nothing more than an attempt to generally link the use of the judicial exceptions to the technological environment of wires in an integrated circuit package that is used in semiconductor fabrication for providing connectivity among integrated circuit dies. 
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.

Further regarding claim 6, the claim recites a limitation of “an array of arithmetic logic units configured to perform an operation on a plurality of data sets in parallel, wherein each of the data sets includes one data element from each of the lists of operands”.  The limitation falls within the “Mathematical Concepts” grouping of abstract ideas because the operation requires the arithmetic logic units to perform a mathematical calculation on the plurality of data sets.  For example, the computation can be performed using a mathematical formula (see specification [0036]).  If a claim limitation, under its broadest reasonable interpretation, covers mathematical relationships, mathematical formulas or equations, or mathematical calculations, then the claim limitation falls within the “Mathematical Concepts” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
The limitation includes the array of arithmetic logic units, which is an additional element that can be viewed as nothing more than an attempt to generally link the use of the judicial exceptions to the technological environment of an arithmetic logic unit used for mathematical operations.
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.

Further regarding claim 7, the claim recites a limitation of “a state machine configured to control the array of arithmetic logic units to perform different computations identified by different codes of operations”.  The limitation falls within the “Mathematical Concepts” grouping of abstract ideas because the different computations identified by different codes of operations require the arithmetic logic units to perform mathematical calculations on the data sets.  For example, the computation can be performed using a mathematical formula (see specification [0036]).  If a claim limitation, under its broadest reasonable interpretation, covers mathematical relationships, mathematical formulas or equations, or mathematical calculations, then the claim limitation falls within the “Mathematical Concepts” grouping of abstract ideas.
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.
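For illustration only, selecting among different computations by a code of operation can be sketched as a lookup followed by an element-by-element calculation. This is a minimal sketch under stated assumptions: the codes "ADD" and "MUL", the table `OPERATIONS`, and the function `run_operation` are hypothetical, since the claim does not define specific codes.

```python
import operator

# Hypothetical codes of operations; the claim does not name any.
OPERATIONS = {
    "ADD": operator.add,
    "MUL": operator.mul,
}

def run_operation(code, a, b):
    """Look up the computation named by `code` and apply it per data set."""
    op = OPERATIONS[code]
    return [op(x, y) for x, y in zip(a, b)]

print(run_operation("ADD", [1, 2], [3, 4]))
print(run_operation("MUL", [1, 2], [3, 4]))
```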

Further regarding claim 8, the claim recites a limitation of “wherein the state machine is further configured to control the array of arithmetic logic units to perform computations for the lists of operands that have more data sets than the plurality of data sets that can be processed in parallel by the array of arithmetic logic units”.  The limitation falls within the “Mathematical Concepts” grouping of abstract ideas because the computations for the lists of operands require the arithmetic logic units to perform mathematical calculations on the data sets.  For example, the computation can be performed using a mathematical formula (see specification [0036]).  If a claim limitation, under its broadest reasonable interpretation, covers mathematical relationships, mathematical formulas or equations, or mathematical calculations, then the claim limitation falls within the “Mathematical Concepts” grouping of abstract ideas.
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.
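For illustration only, processing lists of operands that hold more data sets than can be handled in one parallel pass can be sketched as repeated passes over fixed-size batches. This is a minimal sketch under stated assumptions: the function `compute_in_batches`, the parameter `width` (standing in for the number of data sets handled per pass), and the use of multiplication as the operation are all hypothetical.

```python
def compute_in_batches(a, b, width):
    """Multiply two operand lists, handling at most `width` data sets per pass."""
    if width < 1:
        raise ValueError("width must be at least 1")
    results = []
    for start in range(0, len(a), width):
        # One pass: the next `width` (or fewer) data sets.
        batch_a = a[start:start + width]
        batch_b = b[start:start + width]
        results.extend(x * y for x, y in zip(batch_a, batch_b))
    return results

# Five data sets processed two at a time, i.e. three passes.
print(compute_in_batches([1, 2, 3, 4, 5], [2, 2, 2, 2, 2], width=2))
```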

Further regarding claim 9, the claim recites a limitation of “a cache memory configured to store a list of results generated in parallel by the array of arithmetic logic units”.  The limitation includes an additional element of “a cache memory configured to store a list of results”.  The cache memory is at best the equivalent of merely adding the words “apply it” to the judicial exception.  Mere instructions to apply an exception cannot provide an inventive concept.  Furthermore, the cache memory is used as a tool to store data for the performance of the otherwise mathematical concept.  
The limitation includes “store a list of results”, which is an additional element that is associated with storing or retrieving data that the courts have recognized as well-understood, routine, conventional activity in particular fields, such as “[s]toring and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93” (MPEP 2106.05(d)(II)).
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.

Further regarding claim 10, the claim recites a limitation of “wherein the third memory region is the same as the second memory region”.  The limitation includes an additional element that can be viewed as nothing more than an attempt to generally link the use of the judicial exceptions to the technological environment of memory devices.
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.

Further regarding claim 11, the claim recites a limitation of “wherein the third memory region is different from the second memory region”.  The limitation includes an additional element that can be viewed as nothing more than an attempt to generally link the use of the judicial exceptions to the technological environment of memory devices.
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.

Claim 12 recites a method comprising steps that correspond to the functions performed by the device of claim 1.  Accordingly, claim 12 is also rejected for the same reasons as set forth for claim 1 above.

Regarding claims 13-17, the claims are dependent on claim 12 and include all the limitations of claim 12. Therefore, the dependent claims recite the same abstract idea as claim 12.

Further regarding claim 13, the claim recites a limitation of “wherein the first request is a memory access command configured to access a memory location in the integrated circuit memory device”.  The limitation includes an additional element of “access a memory location in the integrated circuit memory device”.  This is an element associated with storing or retrieving data that the courts have recognized as well-understood, routine, conventional activity in particular fields, such as “[s]toring and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93” (MPEP 2106.05(d)(II)).
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.

Further regarding claim 14, the claim recites a limitation of “wherein the memory location stores a code identifying a computation to be performed by the arithmetic compute element matrix to generate the list of results”.  The limitation includes an additional element of “stores a code identifying a computation to be performed by the arithmetic compute element matrix to generate the list of results”.  This is an element associated with storing data that the courts have recognized as well-understood, routine, conventional activity in particular fields, such as “[s]toring and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93” (MPEP 2106.05(d)(II)).
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.

Further regarding claim 15, the claim recites a limitation of “wherein the memory location is predefined to store the code”.  The limitation includes an additional element that can be viewed as nothing more than an attempt to generally link the use of the judicial exceptions to the technological environment of memory devices.  Furthermore, “store the code” is an element associated with storing data that the courts have recognized as well-understood, routine, conventional activity in particular fields, such as “[s]toring and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93” (MPEP 2106.05(d)(II)).
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.

Further regarding claim 16, the claim recites a limitation of “wherein the second request is a memory read command, or a memory write command, or any combination thereof”.  The limitation includes an additional element that can be viewed as nothing more than an attempt to generally link the use of the judicial exceptions to the technological environment of memory devices.  
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.

Further regarding claim 17, the claim recites a limitation of “performing an operation on a plurality of data sets in parallel to generate a plurality of results respectively, wherein each of the data sets includes one data element from each of the lists of operands”.  The limitation falls into a group of mathematical concepts of abstract ideas because the operation requires a computation from the plurality of data sets using a mathematical calculation of the data sets.  For example, the computation can be performed using a mathematical formula (see specification [0036]).  If a claim limitation, under its broadest reasonable interpretation, covers mathematical relationships, mathematical formulas or equations, or mathematical calculations, then the claim limitation falls within the “Mathematical Concepts” grouping of abstract ideas.
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.

Claim 18 recites a computing apparatus comprising elements for carrying out substantially the same steps in claim 1.  Accordingly, claim 18 is also rejected for the same reasons as set forth for those in claim 1 above.

Regarding claims 19-20, the claims are dependent on claim 18 and include all the limitations of claim 18. Therefore, the dependent claims recite the same abstract idea of claim 18.

Further regarding claim 19, the claim recites a limitation of “wherein the processing device is configured to load input data into the third memory region via the second request during the time period in which the list of results are computed in the arithmetic compute element matrix”.  The limitation includes an additional element of “the processing device is configured to load input data into the third memory region”.  This is an element associated with storing data that the courts have recognized as well-understood, routine, conventional activity in particular fields, such as “[s]toring and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93” (MPEP 2106.05(d)(II)).
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.

Further regarding claim 20, the claim recites a limitation of “in response to the third request, the arithmetic compute element matrix computes a next list of results using the data loaded into the third memory via the second request”.  The limitation falls into a group of mathematical concepts of abstract ideas because the computation for the next list of results requires a computation by the arithmetic compute element matrix using a mathematical calculation.  For example, the computation can be performed using a mathematical formula (see specification [0036]).  If a claim limitation, under its broadest reasonable interpretation, covers mathematical relationships, mathematical formulas or equations, or mathematical calculations, then the claim limitation falls within the “Mathematical Concepts” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
In addition, the claim recites a limitation of “wherein after the time period, the processing device is configured to send a third request over the communication connection to the memory device; … during a time period in which the next list of results is computed by the arithmetic compute element matrix, the processing device sends a fourth request to the memory device to access the list of results computed responsive to the first request”.  The limitation includes “send a third request over the communication connection to the memory device” and “sends a fourth request to the memory device”, which is an element or a computer function that the courts have recognized as well-understood, routine, conventional activity in particular fields when it is claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity, such as “[r]eceiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362” (MPEP 2106.05(d)(II)).
Accordingly, the claim recites an abstract idea, and thus is not patent eligible.

Claims 1-20 are therefore not drawn to eligible subject matter as they are directed to an abstract idea without significantly more. 

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 7-11 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Regarding claim 7, the limitation “a state machine configured to control the array of arithmetic logic units” invokes 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. There is insufficient disclosure of the corresponding structure, material, or acts for performing the entire claimed function because no association between the structure and the function can be found throughout the specification (see, e.g., paragraph [0039]). Therefore, the claim fails to comply with the written description requirement and is rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph.

Regarding dependent claims 8-11, the dependent claims are also rejected since the dependent claims depend on claim 7 and do not overcome the deficiency thereof for the reasons stated above in the rejections of claim 7.

Further regarding claim 8, the limitation “the state machine is further configured to control the array of arithmetic logic units” invokes 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. There is insufficient disclosure of the corresponding structure, material, or acts for performing the entire claimed function because no association between the structure and the function can be found throughout the specification (see, e.g., paragraph [0039]). Therefore, the claim fails to comply with the written description requirement and is rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph.

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 7-11 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

Regarding claim 7, the limitation “a state machine configured to control the array of arithmetic logic units” invokes 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. There is insufficient disclosure of the corresponding structure, material, or acts for performing the entire claimed function because no association between the structure and the function can be found throughout the specification (see, e.g., paragraph [0039]). Therefore, the claim is indefinite and is rejected under 35 U.S.C. 112(b) or pre-AIA  35 U.S.C. 112, second paragraph.

Regarding dependent claims 8-11, the dependent claims are also rejected since the dependent claims depend on claim 7 and do not overcome the deficiency thereof for the reasons stated above in the rejections of claim 7.

Further regarding claim 8, the limitation “the state machine is further configured to control the array of arithmetic logic units” invokes 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. There is insufficient disclosure of the corresponding structure, material, or acts for performing the entire claimed function because no association between the structure and the function can be found throughout the specification (see, e.g., paragraph [0039]). Therefore, the claim is indefinite and is rejected under 35 U.S.C. 112(b) or pre-AIA  35 U.S.C. 112, second paragraph.

Applicant may:
(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph; 
(b)        Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(c)        Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: 
(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(b)        Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-2, 6-9, and 11-19 are rejected under 35 U.S.C. 103 as being unpatentable over Nurvitadhi et al. (US 2019/0042251 A1), hereinafter “Nurvitadhi”, in view of Yu et al. (US 2020/0004514 A1), hereinafter “Yu”.

	Regarding claim 1, Nurvitadhi teaches:
An integrated circuit memory device (FIGs. 1-2; ¶ 28, “Using the system 10, a designer may implement a circuit design functionality on an integrated circuit, such as a reconfigurable programmable logic device 12 [integrated circuit memory device], such as a field programmable gate array (FPGA)”; ¶ 29, “the programmable logic device 12 [integrated circuit memory device] may include a fabric die 22 that communicates with a base die 24. The base die 24 may perform compute-in-memory arithmetic computations in the memory of the base die 24, while the fabric die 22 may be used for general purposes”), comprising: 
a plurality of memory regions (FIG. 3; ¶ 43, “The FPGA 40 of FIG. 3 is shown to be sectorized, meaning that programmable logic resources may be distributed through a number of discrete programmable logic sectors 48 (e.g., region, portion)”; ¶ 43, “there may be N regions of sector-aligned memory 92 [memory regions] that can be accessible by N corresponding fabric sectors 80 at the same time (e.g., in parallel). … The sector-aligned memory 92 is shown in FIG. 6 as vertically stacked memory. This may allow a large amount of memory to be located within the base die 24”); 
an arithmetic compute element matrix coupled to access the plurality of memory regions in parallel (FIGs. 6, 8, 9A-9B, 10; ¶ 43, “there may be N regions of sector-aligned memory 92 [memory regions] that can be accessible by N corresponding fabric sectors 80 at the same time (e.g., in parallel). … The sector-aligned memory 92 is shown in FIG. 6 as vertically stacked memory. This may allow a large amount of memory to be located within the base die 24”; ¶ 47, “the on-chip memory 126 may include memory banks divided into multiple memory sectors 136, which may include dedicated blocks of random access memory (RAM), such as the sector-aligned memory 92 [memory regions]. Some of the sector-aligned memory 92 may be integrated with compute-in-memory circuitry 71 [arithmetic compute element matrix]. The compute-in-memory circuitry 71 [arithmetic compute element matrix] associated with sector-aligned memory 92 [memory regions] may have a corresponding controller 138 … the controller 138 may control a sequence of compute-in-memory operations using multiple integrated sector-aligned memory 92 [memory regions] units and compute-in-memory circuitry 71 [arithmetic compute element matrix]”; ¶ 50, “the application 122 may communicate with the base die 24 to scatter specific different data to multiple instances of the compute-in-memory circuitry 71 [arithmetic compute element matrix] via multiple interfaces, performing a parallel scatter operation, as shown in FIG. 9B. In this manner, the compute-in-memory circuitry 71 may receive the multiple different data in parallel. Thus, the interconnect paths between the dies 22, 24, the multiple sectors 136, and the sector-aligned memory 92 [memory regions] may allow the compute-in-memory circuitry 71 to efficiently receive scattered data from the application 122 to perform calculations in the sector-aligned memory 92”; ¶ 54, “FIG. 10 depicts using the compute-in-memory circuitry 71 [arithmetic compute element matrix] to perform a compute-in-memory operation that may be used for tensor operations. Tensors are data structures, such as matrices and vectors, that may be used to calculate arithmetic operations. Particularly, dot product of vectors and matrices (matrix multiplication) may be used for deep learning or training an algorithm”); and 
a communication interface coupled to the arithmetic compute element matrix and configured to receive a first request (FIGs. 2, 7; ¶ 44, “the on-chip memory 126 stores computational data 131 that may be used in computations by the compute-in-memory circuitry 71 [arithmetic compute element matrix] to carry out requests [first request] by the application 122. The application 122 may communicate with the on-chip memory 126 via an interconnect 132 [communication interface], which may represent the silicon bridge 36 of FIG. 2”); 
wherein, in response to the first request, the arithmetic compute element matrix is configured to access a plurality of lists of operands stored in first memory regions in the plurality of memory regions, generate a list of results from the plurality of lists of operands, and store the list of results in a second memory region in the plurality of memory regions (FIGs. 7, 10; ¶ 44, “much of the data may reside in on-chip memory 126 (e.g., which may represent memory of the sector-aligned memory 92) in the base die 24 (which may be understood to be off-chip from the fabric die 22) and/or in off-chip memory 127 located elsewhere. In the example of FIG. 7, the on-chip memory 126 stores computational data 131 that may be used in computations by the compute-in-memory circuitry 71 [arithmetic compute element matrix] to carry out requests [first request] by the application 122”; ¶ 56, “the controller 138 may control the compute-in-memory circuitry 71 [arithmetic compute element matrix] to perform the arithmetic computations. In this example, the compute-in-memory circuitry 71 is operated as a dot product engine (DPE) 142. The dot product engine 142 may compute the dot product of vectors and matrices [lists of operands] stored in the sector-aligned memory 91 [first memory regions]”; ¶ 57, “After the data is received by the dot product engine 142 and/or dot product has been computed, the dot product engine 142 may send the computed data [list of results] to the sector-aligned memory 92 [memory regions] to store the data for future use or additional computations”; note that the first memory regions are where the vectors and matrices [lists of operands] are stored in the sector-aligned memory 91); 
wherein, during a time period after the first request and before completion of storing the list of results into the second memory region (FIG. 11A; ¶ 58, “To illustrate the type of dot product operations that may be performed using the compute-in-memory architecture described above, FIG. 11A shows a sequence of computations to perform matrix operations and FIG. 11B shows a sequence of computations to perform convolution operations. In FIG. 11A, multiple vectors may be simultaneously sent from the application 122 to the base die 24. As shown, the base die 24 memory may be grouped into multiple sectors 136, such as a first sector 150 (sector 0) and a second sector 152 (sector 1). The application 122 may send [first request] a first vector input 154 (Vi0) to a first sector 0 aligned memory 158 and second sector 0 aligned memory 159”; ¶ 59, “the dot product engines 142 corresponding to sector 0 aligned memories 158, 159 may compute a product of the first vector input 154 and the first matrix 162, and a product of the first vector input 154 and the second matrix 164, to determine M0,0 and M0,1. These partial computations may be gathered or accumulated by the accumulator 144, and reduced using the techniques described above, and read to the application 122 to be stored as a partial sum, first vector output 166, Vo0.”; note that a time period is after the application 122 sends [first request] a first vector input 154 (Vi0) to a first sector 0 aligned memory 158 and second sector 0 aligned memory 159 and before the generating and the sending of the first vector output 166, Vo0 to the application as shown in FIG. 11A), 
the communication interface is configured to receive a second request to access a third memory region in the plurality of memory regions (FIG. 11A; ¶ 44, “The application 122 may communicate with the on-chip memory 126 via an interconnect 132 [communication interface], which may represent the silicon bridge 36 of FIG. 2”; ¶ 58, “The application may also send [second request] a second vector input (Vi1) to first sector 1 aligned memory 160 and a second sector 1 aligned memory 161. The sector 0 aligned memories 158, 159 and the sector 1 aligned memories 160, 161 may already store matrix data, such as a first matrix 162 (M0) and a second matrix 164 (M1)”; ¶ 59, “the dot product engines 142 corresponding to sector 1 aligned memories 160, 161 [third memory region] may compute a product of the second vector input 156 and the first matrix 162, and a product of the second vector input 156 and the second matrix 164, to determine M1,0 and M1,1”); and 
in response to the second request and during the time period, the integrated circuit memory device is configured to provide, in parallel, memory access to the first memory regions and the second memory region to the arithmetic compute element matrix in facilitating the computation of the list of results and memory access to the third memory region to service the second request through the communication interface (FIG. 11A; ¶ 43, “As mentioned above, to facilitate the use of the sector-aligned memory 92 [memory regions], the embedded NOC 100 or another interconnect may enable communication between memory components of the sector-aligned memory 92 [memory regions] of the base die 24 and the sectors 48 or other components (e.g., CRAM) of the fabric die 22”; ¶ 44, “The application 122 may communicate with the on-chip memory 126 via an interconnect 132 [communication interface], which may represent the silicon bridge 36 of FIG. 2”; ¶ 56, “the controller 138 may control the compute-in-memory circuitry 71 [arithmetic compute element matrix] to perform the arithmetic computations. In this example, the compute-in-memory circuitry 71 is operated as a dot product engine (DPE) 142. The dot product engine 142 may compute the dot product of vectors and matrices [lists of operands] stored in the sector-aligned memory 91 [first memory regions]”; ¶ 57, “After the data is received by the dot product engine 142 and/or dot product has been computed, the dot product engine 142 may send the computed data [list of results] to the sector-aligned memory 92 [memory regions] to store the data for future use or additional computations. Additionally or alternatively, the dot product engine 142 may send the computed data to an accumulator 148”; ¶ 58, “multiple vectors may be simultaneously [parallel] sent from the application 122 to the base die 24. As shown, the base die 24 memory may be grouped into multiple sectors 136, such as a first sector 150 (sector 0) and a second sector 152 (sector 1). The application 122 may send [first request] a first vector input 154 (Vi0) to a first sector 0 aligned memory 158 and second sector 0 aligned memory 159. The application may also send [second request] a second vector input (Vi1) to first sector 1 aligned memory 160 and a second sector 1 aligned memory 161. The sector 0 aligned memories 158, 159 and the sector 1 aligned memories 160, 161 [third memory region] may already store matrix data, such as a first matrix 162 (M0) and a second matrix 164 (M1)”; ¶ 63, “The programmable logic device 12 [integrated circuit memory device] may be, or may be a component of, a data processing system. … The data processing system 220 may include several different packages or may be contained within a single package on a single package substrate.”; note that the time period is after the application 122 sends [first request] a first vector input 154 (Vi0) to a first sector 0 aligned memory 158 and second sector 0 aligned memory 159 and before the generating and the sending of the first vector output 166, Vo0 to the application as shown in FIG. 11A.  The second memory region is a portion of the sector-aligned memory 92 [memory regions] where the computed data [list of results] are stored in the sector-aligned memory 92 [memory regions]); and 
wherein the integrated circuit memory device is encapsulated within an integrated circuit package (FIG. 13; ¶ 28; ¶ 63, “The programmable logic device 12 [integrated circuit memory device] may be, or may be a component of, a data processing system. … The data processing system 220 may include several different packages or may be contained within a single package on a single package substrate.”, ¶ 65).  

	Nurvitadhi teaches a time period.  Nevertheless, Nurvitadhi does not explicitly teach a time period after the first request and before completion of storing the list of results into the second memory region.

However, Yu teaches:
a time period after the first request and before completion of storing the list of results into the second memory region (FIGs. 8, 11B; ¶ 132, “The data loading engine 831 can execute a data loading instruction that loads data for performing neural network computation from the external memory to the internal buffer. The loaded data may include parameter data and feature map data. The parameter data may include weight data (e.g., convolution kernels) and other parameters such as biases. The feature map data may include input image data, and may also include intermediate calculation results of the respective convolutional layers. The data operation engine 832 can execute a data operation instruction that reads the weight data and the feature map data from the internal buffer 820 to perform an operation and stores the operational result back to the internal buffer 820. The data storage engine 833 can then execute a data storage instruction that stores the operational result from internal buffer 820 back to the external memory 840”; ¶ 133, “the acquired instructions for neural network computation may include: a data loading instruction that loads data for neural network computation from the external memory to the internal buffer, the data for neural network computation includes parameter data and feature map data; a data operation instruction [first request] that reads the parameter data and the feature map data from the internal buffer to perform an operation and stores the result of the operation back to the internal buffer [second memory region]; and a data storage instruction [second request] that stores the operational result from the internal buffer back to the external memory”; ¶ 135, “in a neural network specialized processor, the execution of the subsequent instruction [second request] may be started using other engines before the execution of the current instruction [first request] is completed, as shown in FIG. 11B. Thus, the overall computational efficiency of the computing system is improved by temporally partially superimposing the execution of the instructions that originally have dependency relationships”; note that a time period includes a duration of execution of the data operation instruction [first request], and the data storage instruction [second request] as the subsequent instruction that stores the operational result from the internal buffer back to the external memory using the data storage engine 833 occurs before the data operation instruction [first request] as the current instruction executed by the data operation engine 832 is completed as shown in FIG. 11B).

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Nurvitadhi to incorporate the teachings of Yu, that is, to provide the integrated circuit device of Nurvitadhi, which has circuitry to perform arithmetic computations in memory of a first integrated circuit die accessible by a separate integrated circuit die and which may be used for artificial intelligence (AI) matrix multiplication operations, with the high parallelism computing system for artificial intelligence applications of Yu, in which the data loading engine 831, the data operation engine 832, and the data storage engine 833 implement respective instruction functions under the scheduling of the internal instruction reading and distribution module 810.  Doing so with the device of Nurvitadhi would make full use of the parallel execution capability of each module in the computing platform, improving system computing efficiency and optimizing high parallelism computation.  (Yu, ¶¶ 3-5) 

Regarding claim 12, the claimed method comprises substantially the same steps or elements as those in claim 1.  Accordingly, the claim is also rejected for the same reasons as set forth for those in claim 1 above.

	Regarding claim 2, the combination of Nurvitadhi and Yu teaches the integrated circuit memory device of claim 1.

Nurvitadhi further teaches:
wherein the plurality of memory regions provides dynamic random access memory (DRAM), cross point memory, or flash memory, or any combination therein (FIG. 13; ¶ 63, “The programmable logic device 12 may be, or may be a component of, a data processing system. For example, the programmable logic device 12 may be a component of a data processing system 220, shown in FIG. 13. … The memory and/or storage circuitry 224 may include random access memory (RAM), read-only memory (ROM), one or more hard drives, flash memory”).  

Regarding claim 6, the combination of Nurvitadhi and Yu teaches the integrated circuit memory device of claim 1.

Nurvitadhi further teaches:
wherein the arithmetic compute element matrix comprises: 
an array of arithmetic logic units configured to perform an operation on a plurality of data sets in parallel (FIG. 11A; ¶ 58, “To illustrate the type of dot product operations that may be performed using the compute-in-memory architecture described above, FIG. 11A shows a sequence of computations to perform matrix operations and FIG. 11B shows a sequence of computations to perform convolution operations. In FIG. 11A, multiple vectors may be simultaneously sent from the application 122 to the base die 24”), wherein each of the data sets includes one data element from each of the lists of operands (FIG. 11A; ¶ 59, “the dot product engines 142 corresponding to sector 0 aligned memories 158, 159 may compute a product of the first vector input 154 and the first matrix 162, and a product of the first vector input 154 and the second matrix 164, to determine M0,0 and M0,1.”).  

Regarding claim 7, the combination of Nurvitadhi and Yu teaches the integrated circuit memory device of claim 6.

Nurvitadhi further teaches:
wherein the arithmetic compute element matrix comprises: 
a state machine configured to control the array of arithmetic logic units to perform different computations identified by different codes of operations (FIG. 8; ¶ 47, “The compute-in-memory circuitry 71 associated with sector-aligned memory 92 may have a corresponding controller 138 (e.g., a state machine, an instruction set architecture (ISA) based processor, a reduced instruction set computer (RISC) processor, or the like). The controller 138 may be used to move computational data 131 between the sectors 136 and dies 22, 24. … the controller 138 may control a sequence of compute-in-memory operations using multiple integrated sector-aligned memory 92 units and compute-in-memory circuitry 71. In this manner, the fabric die 22 may offload application-specific commands to the compute-in-memory circuitry 71 in the base die 24”).  

Regarding claim 8, the combination of Nurvitadhi and Yu teaches the integrated circuit memory device of claim 7.

Nurvitadhi further teaches:
wherein the state machine is further configured to control the array of arithmetic logic units to perform computations for the lists of operands that have more data sets than the plurality of data sets that can be processed in parallel by the array of arithmetic logic units (FIG. 11A; ¶ 58, “multiple vectors may be simultaneously sent from the application 122 to the base die 24. As shown, the base die 24 memory may be grouped into multiple sectors 136, such as a first sector 150 (sector 0) and a second sector 152 (sector 1). The application 122 may send a first vector input 154 (Vi0) to a first sector 0 aligned memory 158 and second sector 0 aligned memory 159.”; ¶ 59, “the dot product engines 142 corresponding to sector 0 aligned memories 158, 159 may compute a product of the first vector input 154 and the first matrix 162, and a product of the first vector input 154 and the second matrix 164, to determine M0,0 and M0,1.”; note that for each sector (such as sector 0), there are only 2 dot product engines 142 to compute a product of the first vector input 154 and the first matrix 162, and a product of the first vector input 154 and the second matrix 164, but there are a total of 2 vectors 150 and 151 for the data sets to be processed, and thus one sector is not enough to process both vectors).  

Regarding claim 9, the combination of Nurvitadhi and Yu teaches the integrated circuit memory device of claim 7.

Nurvitadhi further teaches:
wherein the arithmetic compute element matrix further comprises: 
a cache memory configured to store a list of results generated in parallel by the array of arithmetic logic units (FIGs. 7, 10; ¶ 57, “After the data is received by the dot product engine 142 and/or dot product has been computed, the dot product engine 142 may send the computed data [list of results] to the sector-aligned memory 92 [memory regions] to store the data for future use or additional computations”; note that the computed data [list of results] is stored in the sector-aligned memory 92 [memory regions], which is thus considered a cache memory).  

Regarding claim 11, the combination of Nurvitadhi and Yu teaches the integrated circuit memory device of claim 9.

Nurvitadhi further teaches:
wherein the third memory region is different from the second memory region (FIG. 11A; ¶ 57, “After the data is received by the dot product engine 142 and/or dot product has been computed, the dot product engine 142 may send the computed data [list of results] to the sector-aligned memory 92 [memory regions] to store the data for future use or additional computations. Additionally or alternatively, the dot product engine 142 may send the computed data to an accumulator 148”; ¶ 58, “To illustrate the type of dot product operations that may be performed using the compute-in-memory architecture described above, FIG. 11A shows a sequence of computations to perform matrix operations and FIG. 11B shows a sequence of computations to perform convolution operations. In FIG. 11A, multiple vectors may be simultaneously sent from the application 122 to the base die 24. As shown, the base die 24 memory may be grouped into multiple sectors 136, such as a first sector 150 (sector 0) and a second sector 152 (sector 1). The application 122 may send [first request] a first vector input 154 (Vi0) to a first sector 0 aligned memory 158 and second sector 0 aligned memory 159 [second memory region]”; ¶ 59, “the dot product engines 142 corresponding to sector 1 aligned memories 160, 161 [third memory region] may compute a product of the second vector input 156 and the first matrix 162, and a product of the second vector input 156 and the second matrix 164, to determine M1,0 and M1,1”; note that the first sector 0 aligned memory 158 and the second sector 0 aligned memory 159 [second memory region] are different from the sector 1 aligned memories 160, 161 [third memory region]).  

Regarding claim 13, the combination of Nurvitadhi and Yu teaches the method of claim 12.

Nurvitadhi further teaches:
wherein the first request is a memory access command configured to access a memory location in the integrated circuit memory device (FIGs. 9A-9E; ¶ 48, “To illustrate some different application-specific compute-in-memory calculations that may be performed using the integrated sector-aligned memory 92 and compute-in-memory circuitry 71 architecture, FIGS. 9A, 9B, 9C, 9D, and 9E depict various operation sequences that may support the computations, such as gather and scatter operations. Briefly, gather and scatter operations are two data-transfer operations, transferring a number of data items by reading from (gathering) or writing to (scattering) a given location.”).  

Regarding claim 14, the combination of Nurvitadhi and Yu teaches the method of claim 13.

Nurvitadhi further teaches:
wherein the memory location stores a code identifying a computation to be performed by the arithmetic compute element matrix to generate the list of results (FIG. 11A; ¶ 60, “The sector-aligned memories 172, 174, 176, 178 may store functions, f1 and f2, which may be used by the dot product engines 142 for convolution computations”).  

Regarding claim 15, the combination of Nurvitadhi and Yu teaches the method of claim 14.

Nurvitadhi further teaches:
wherein the memory location is predefined to store the code (FIGs. 11A-11B; ¶ 60, “The sector-aligned memories 172, 174, 176, 178 may store functions, f1 and f2, which may be used by the dot product engines 142 for convolution computations”; note that since locations in the sector-aligned memories 172, 174, 176, 178 must be determined prior to storing the functions, f1 and f2, the locations are considered as predefined).  

Regarding claim 16, the combination of Nurvitadhi and Yu teaches the method of claim 14.

Nurvitadhi further teaches:
wherein the second request is a memory read command, or a memory write command, or any combination thereof (FIGs. 9A-9E, 11A; ¶ 44, “The application 122 may communicate with the on-chip memory 126 via an interconnect 132 [communication interface], which may represent the silicon bridge 36 of FIG. 2”; ¶ 58, “The application may also send [second request] a second vector input (Vi1) to first sector 1 aligned memory 160 and a second sector 1 aligned memory 161. The sector 0 aligned memories 158, 159 and the sector 1 aligned memories 160, 161 may already store matrix data, such as a first matrix 162 (M0) and a second matrix 164 (M1)”; ¶ 48, “To illustrate some different application-specific compute-in-memory calculations that may be performed using the integrated sector-aligned memory 92 and compute-in-memory circuitry 71 architecture, FIGS. 9A, 9B, 9C, 9D, and 9E depict various operation sequences that may support the computations, such as gather and scatter operations. Briefly, gather and scatter operations are two data-transfer operations, transferring a number of data items by reading from (gathering) or writing to (scattering) a given location.”).  

Regarding claim 17, the combination of Nurvitadhi and Yu teaches the method of claim 12.

Nurvitadhi further teaches:
wherein the computing of the output comprises: 
performing an operation on a plurality of data sets in parallel to generate a plurality of results respectively (FIGs. 7, 10, 11A; ¶ 56, “the controller 138 may control the compute-in-memory circuitry 71 [arithmetic compute element matrix] to perform the arithmetic computations. In this example, the compute-in-memory circuitry 71 is operated as a dot product engine (DPE) 142. The dot product engine 142 may compute the dot product of vectors and matrices [lists of operands] stored in the sector-aligned memory 91 [first memory regions]”; ¶ 57, “After the data is received by the dot product engine 142 and/or dot product has been computed, the dot product engine 142 may send the computed data [results] to the sector-aligned memory 92 [memory regions] to store the data for future use or additional computations”; ¶ 58, “To illustrate the type of dot product operations that may be performed using the compute-in-memory architecture described above, FIG. 11A shows a sequence of computations to perform matrix operations and FIG. 11B shows a sequence of computations to perform convolution operations. In FIG. 11A, multiple vectors may be simultaneously sent from the application 122 to the base die 24”), wherein each of the data sets includes one data element from each of the lists of operands (FIG. 11A; ¶ 59, “the dot product engines 142 corresponding to sector 0 aligned memories 158, 159 may compute a product of the first vector input 154 and the first matrix 162, and a product of the first vector input 154 and the second matrix 164, to determine M0,0 and M0,1.”).  

	Regarding claim 18, Nurvitadhi teaches:
A computing apparatus, comprising: 
a processing device (FIGs. 1-2; ¶ 29, “the programmable logic device 12 may include a fabric die 22 [processing device] that communicates with a base die 24”); 
a memory device encapsulated within an integrated circuit package (FIG. 13; ¶ 28; ¶ 30, “the programmable logic device 12 includes the fabric die 22 and the base die 24 [memory device] that are connected to one another via microbumps 26”; ¶ 63, “The programmable logic device 12 may be, or may be a component of, a data processing system. … The data processing system 220 may include several different packages or may be contained within a single package on a single package substrate.”; ¶ 65); and 
a communication connection between the memory device and the processing device (FIGs. 1-2; ¶ 29, “the programmable logic device 12 may include a fabric die 22 [processing device] that communicates with a base die 24 [memory device]”; ¶ 30, “The base die 24 [memory device] may attach to a package substrate 32 via C4 bumps 34. In the example of FIG. 2, two pairs of fabric die 22 [processing device] and base die 24 [memory device] are shown communicatively connected to one another via a silicon bridge 36 (e.g., an embedded multi-die interconnect bridge (EMIB)) and microbumps 38 at a silicon bridge interface 39.”; ¶ 30, “the programmable logic device 12 includes the fabric die 22 [processing device] and the base die 24 [memory device] that are connected to one another via microbumps 26”); 
wherein the memory device comprises: 
a plurality of memory regions (FIG. 3; ¶ 43, “The FPGA 40 of FIG. 3 is shown to be sectorized, meaning that programmable logic resources may be distributed through a number of discrete programmable logic sectors 48 (e.g., region, portion)”; ¶ 43, “there may be N regions of sector-aligned memory 92 [memory regions] that can be accessible by N corresponding fabric sectors 80 at the same time (e.g., in parallel). … The sector-aligned memory 92 is shown in FIG. 6 as vertically stacked memory. This may allow a large amount of memory to be located within the base die 24”); 
an arithmetic compute element matrix coupled to access the plurality of memory regions in parallel (FIGs. 6, 8, 9A-9B, 10; ¶ 43, “there may be N regions of sector-aligned memory 92 [memory regions] that can be accessible by N corresponding fabric sectors 80 at the same time (e.g., in parallel). … The sector-aligned memory 92 is shown in FIG. 6 as vertically stacked memory. This may allow a large amount of memory to be located within the base die 24”; ¶ 47, “the on-chip memory 126 may include memory banks divided into multiple memory sectors 136, which may include dedicated blocks of random access memory (RAM), such as the sector-aligned memory 92 [memory regions]. Some of the sector-aligned memory 92 may be integrated with compute-in-memory circuitry 71 [arithmetic compute element matrix]. The compute-in-memory circuitry 71 [arithmetic compute element matrix] associated with sector-aligned memory 92 [memory regions] may have a corresponding controller 138 … the controller 138 may control a sequence of compute-in-memory operations using multiple integrated sector-aligned memory 92 [memory regions] units and compute-in-memory circuitry 71 [arithmetic compute element matrix]”; ¶ 50, “the application 122 may communicate with the base die 24 to scatter specific different data to multiple instances of the compute-in-memory circuitry 71 [arithmetic compute element matrix] via multiple interfaces, performing a parallel scatter operation, as shown in FIG. 9B. In this manner, the compute-in-memory circuitry 71 may receive the multiple different data in parallel. Thus, the interconnect paths between the dies 22, 24, the multiple sectors 136, and the sector-aligned memory 92 [memory regions] may allow the compute-in-memory circuitry 71 to efficiently receive scattered data from the application 122 to perform calculations in the sector-aligned memory 92”; ¶ 54, “FIG. 10 depicts using the compute-in-memory circuitry 71 [arithmetic compute element matrix] to perform a compute-in-memory operation that may be used for tensor operations. Tensors are data structures, such as matrices and vectors, that may be used to calculate arithmetic operations. Particularly, dot product of vectors and matrices (matrix multiplication) may be used for deep learning or training an algorithm”); and 
a communication interface coupled to the arithmetic compute element matrix and configured to receive a first request from the processing device through the communication connection (FIGs. 2, 7; ¶ 30, “The base die 24 [memory device] may attach to a package substrate 32 via C4 bumps 34. In the example of FIG. 2, two pairs of fabric die 22 [processing device] and base die 24 [memory device] are shown communicatively connected to one another via a silicon bridge 36 (e.g., an embedded multi-die interconnect bridge (EMIB)) and microbumps 38 at a silicon bridge interface 39.”; ¶ 44, “A circuit design define an application 122 (e.g., an accelerator function such as an artificial intelligence (AI) function) that may involve a large amount of data, as in the example shown in FIG. 7. In this case, much of the data may reside in on-chip memory 126 (e.g., which may represent memory of the sector-aligned memory 92) in the base die 24 (which may be understood to be off-chip from the fabric die 22 [processing device]) … the on-chip memory 126 stores computational data 131 that may be used in computations by the compute-in-memory circuitry 71 [arithmetic compute element matrix] to carry out requests [first request] by the application 122. The application 122 may communicate with the on-chip memory 126 via an interconnect 132 [communication interface], which may represent the silicon bridge 36 of FIG. 2”; note that the application 122 is in the fabric die 22 [processing device] as illustrated in FIG. 7); 
wherein, in response to the first request, the arithmetic compute element matrix is configured to access a plurality of lists of operands stored in first memory regions in the plurality of memory regions, generate a list of results from the plurality of lists of operands, and store the list of results in a second memory region in the plurality of memory regions (FIGs. 7, 10; ¶ 44, “much of the data may reside in on-chip memory 126 (e.g., which may represent memory of the sector-aligned memory 92) in the base die 24 (which may be understood to be off-chip from the fabric die 22) and/or in off-chip memory 127 located elsewhere. In the example of FIG. 7, the on-chip memory 126 stores computational data 131 that may be used in computations by the compute-in-memory circuitry 71 [arithmetic compute element matrix] to carry out requests [first request] by the application 122”; ¶ 56, “the controller 138 may control the compute-in-memory circuitry 71 [arithmetic compute element matrix] to perform the arithmetic computations. In this example, the compute-in-memory circuitry 71 is operated as a dot product engine (DPE) 142. The dot product engine 142 may compute the dot product of vectors and matrices [lists of operands] stored in the sector-aligned memory 91 [first memory regions]”; ¶ 57, “After the data is received by the dot product engine 142 and/or dot product has been computed, the dot product engine 142 may send the computed data [list of results] to the sector-aligned memory 92 [memory regions] to store the data for future use or additional computations”; note that the first memory region is where the vectors and matrices [lists of operands] are stored in the sector-aligned memory 91; a second memory region is a portion of the sector-aligned memory 92 [memory regions] where the computed data [list of results] are stored); 
wherein, during a time period after the first request and before completion of storing the list of results into the second memory region (FIG. 11A; ¶ 58, “To illustrate the type of dot product operations that may be performed using the compute-in-memory architecture described above, FIG. 11A shows a sequence of computations to perform matrix operations and FIG. 11B shows a sequence of computations to perform convolution operations. In FIG. 11A, multiple vectors may be simultaneously sent from the application 122 to the base die 24. As shown, the base die 24 memory may be grouped into multiple sectors 136, such as a first sector 150 (sector 0) and a second sector 152 (sector 1). The application 122 may send [first request] a first vector input 154 (Vi0) to a first sector 0 aligned memory 158 and second sector 0 aligned memory 159”; ¶ 59, “the dot product engines 142 corresponding to sector 0 aligned memories 158, 159 may compute a product of the first vector input 154 and the first matrix 162, and a product of the first vector input 154 and the second matrix 164, to determine M0,0 and M0,1. These partial computations may be gathered or accumulated by the accumulator 144, and reduced using the techniques described above, and read to the application 122 to be stored as a partial sum, first vector output 166, Vo0.”; note that a time period is after the application 122 sends [first request] a first vector input 154 (Vi0) to a first sector 0 aligned memory 158 and second sector 0 aligned memory 159 and before the generating and the sending of the first vector output 166, Vo0 to the application as shown in FIG. 11A), the communication interface is configured to receive from the processing device, a second request to access a third memory region in the plurality of memory regions (FIG. 11A; ¶ 44, “The application 122 [in the fabric die 22, which is the processing device] may communicate with the on-chip memory 126 via an interconnect 132 [communication interface], which may represent the silicon bridge 36 of FIG. 2”; ¶ 58, “The application may also send [second request] a second vector input (Vi1) to first sector 1 aligned memory 160 and a second sector 1 aligned memory 161. The sector 0 aligned memories 158, 159 and the sector 1 aligned memories 160, 161 may already store matrix data, such as a first matrix 162 (M0) and a second matrix 164 (M1)”; ¶ 59, “the dot product engines 142 corresponding to sector 1 aligned memories 160, 161 [third memory region] may compute a product of the second vector input 156 and the first matrix 162, and a product of the second vector input 156 and the second matrix 164, to determine M1,0 and M1,1”); and 
wherein, in response to the second request and during the time period, the memory device is configured to provide, in parallel, memory access to the first memory regions and the second memory region to the arithmetic compute element matrix in facilitating the computation of the list of results and memory access to the third memory region to service the second request through the communication interface (FIG. 11A; ¶ 30, “The base die 24 [memory device] may attach to a package substrate 32 via C4 bumps 34. In the example of FIG. 2, two pairs of fabric die 22 [processing device] and base die 24 [memory device] are shown communicatively connected to one another via a silicon bridge 36 (e.g., an embedded multi-die interconnect bridge (EMIB)) and microbumps 38 at a silicon bridge interface 39.”; ¶ 43, “As mentioned above, to facilitate the use of the sector-aligned memory 92 [memory regions], the embedded NOC 100 or another interconnect may enable communication between memory components of the sector-aligned memory 92 [memory regions] of the base die 24 and the sectors 48 or other components (e.g., CRAM) of the fabric die 22”; ¶ 44, “The application 122 may communicate with the on-chip memory 126 via an interconnect 132 [communication interface], which may represent the silicon bridge 36 of FIG. 2”; ¶ 56, “the controller 138 may control the compute-in-memory circuitry 71 [arithmetic compute element matrix] to perform the arithmetic computations. In this example, the compute-in-memory circuitry 71 is operated as a dot product engine (DPE) 142. The dot product engine 142 may compute the dot product of vectors and matrices [lists of operands] stored in the sector-aligned memory 91 [first memory regions]”; ¶ 57, “After the data is received by the dot product engine 142 and/or dot product has been computed, the dot product engine 142 may send the computed data [list of results] to the sector-aligned memory 92 [memory regions] to store the data for future use or additional computations. Additionally or alternatively, the dot product engine 142 may send the computed data to an accumulator 148”; ¶ 58, “multiple vectors may be simultaneously [parallel] sent from the application 122 to the base die 24. As shown, the base die 24 memory may be grouped into multiple sectors 136, such as a first sector 150 (sector 0) and a second sector 152 (sector 1). The application 122 may send [second request] a first vector input 154 (Vi0) to a first sector 0 aligned memory 158 and second sector 0 aligned memory 159. The application may also send [second request] a second vector input (Vi1) to first sector 1 aligned memory 160 and a second sector 1 aligned memory 161. The sector 0 aligned memories 158, 159 and the sector 1 aligned memories 160, 161 [third memory region] may already store matrix data, such as a first matrix 162 (M0) and a second matrix 164 (M1)”; note that the time period is after the application 122 sends [first request] a first vector input 154 (Vi0) to a first sector 0 aligned memory 158 and second sector 0 aligned memory 159 and before the generating and the sending of the first vector output 166, Vo0 to the application as shown in FIG. 11A; the second memory region is a portion of the sector-aligned memory 92 [memory regions] where the computed data [list of results] are stored in the sector-aligned memory 92 [memory regions]).  

Nurvitadhi teaches a time period.  Nevertheless, Nurvitadhi does not explicitly teach a time period after the first request and before completion of storing the list of results into the second memory region.

However, Yu teaches:
a time period after the first request and before completion of storing the list of results into the second memory region (FIGs. 8, 11B; ¶ 132, “The data loading engine 831 can execute a data loading instruction that loads data for performing neural network computation from the external memory to the internal buffer. The loaded data may include parameter data and feature map data. The parameter data may include weight data (e.g., convolution kernels) and other parameters such as biases. The feature map data may include input image data, and may also include intermediate calculation results of the respective convolutional layers. The data operation engine 832 can execute a data operation instruction that reads the weight data and the feature map data from the internal buffer 820 to perform an operation and stores the operational result back to the internal buffer 820. The data storage engine 833 can then execute a data storage instruction that stores the operational result from internal buffer 820 back to the external memory 840”; ¶ 133, “the acquired instructions for neural network computation may include: a data loading instruction that loads data for neural network computation from the external memory to the internal buffer, the data for neural network computation includes parameter data and feature map data; a data operation instruction [first request] that reads the parameter data and the feature map data from the internal buffer to perform an operation and stores the result of the operation back to the internal buffer [second memory region]; and a data storage instruction [second request] that stores the operational result from the internal buffer back to the external memory”; ¶ 135, “in a neural network specialized processor, the execution of the subsequent instruction [second request] may be started using other engines before the execution of the current instruction [first request] is completed, as shown in FIG. 11B. Thus, the overall computational efficiency of the computing system is improved by temporally partially superimposing the execution of the instructions that originally have dependency relationships”; note that a time period includes a duration of execution of the data operation instruction [first request], and the data storage instruction [second request] as the subsequent instruction that stores the operational result from the internal buffer back to the external memory using the data storage engine 833 occurs before the data operation instruction [first request] as the current instruction executed by the data operation engine 832 is completed as shown in FIG. 11B).

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Nurvitadhi to incorporate the teachings of Yu to provide an integrated circuit device having circuitry to perform arithmetic computations in memory of a first integrated circuit die accessible by a separate integrated circuit die of Nurvitadhi that may be used for artificial intelligence (AI) matrix multiplication operations, with a high parallelism computing system for artificial intelligence applications of Yu having the data loading engine 831, the data operation engine 832, and the data storage engine 833 implement respective instruction functions under the scheduling of internal instruction reading and distribution module 810.  Doing so with the device of Nurvitadhi would make full use of the parallel execution capability of each module in the computing platform to improve the system computing efficiency to optimize high parallelism computation.  (Yu, ¶¶ 3-5)

Regarding claim 19, the combination of Nurvitadhi and Yu teaches the computing apparatus of claim 18.

Nurvitadhi further teaches:
wherein the processing device is configured to load input data into the third memory region via the second request during the time period in which the list of results are computed in the arithmetic compute element matrix (FIG. 11A; ¶ 44, “The application 122 [in the fabric die 22, which is the processing device] may communicate with the on-chip memory 126 via an interconnect 132 [communication interface], which may represent the silicon bridge 36 of FIG. 2”; ¶ 56, “the controller 138 may control the compute-in-memory circuitry 71 [arithmetic compute element matrix] to perform the arithmetic computations. In this example, the compute-in-memory circuitry 71 is operated as a dot product engine (DPE) 142. The dot product engine 142 may compute the dot product of vectors and matrices [lists of operands] stored in the sector-aligned memory 91 [first memory regions]”; ¶ 57, “After the data is received by the dot product engine 142 and/or dot product has been computed, the dot product engine 142 may send the computed data [list of results] to the sector-aligned memory 92 [memory regions] to store the data for future use or additional computations. Additionally or alternatively, the dot product engine 142 may send the computed data [list of results] to an accumulator 148”; ¶ 58, “The application may also send [second request] a second vector input (Vi1) [input data] to first sector 1 aligned memory 160 and a second sector 1 aligned memory 161. The sector 0 aligned memories 158, 159 and the sector 1 aligned memories 160, 161 may already store matrix data, such as a first matrix 162 (M0) and a second matrix 164 (M1)”; ¶ 59, “the dot product engines 142 corresponding to sector 1 aligned memories 160, 161 [third memory region] may compute a product of the second vector input 156 and the first matrix 162, and a product of the second vector input 156 and the second matrix 164, to determine M1,0 and M1,1”).  

Claims 3-4 are rejected under 35 U.S.C. 103 as being unpatentable over Nurvitadhi et al. (US 2019/0042251 A1), hereinafter “Nurvitadhi”, in view of Yu et al. (US 2020/0004514 A1), hereinafter “Yu”, as applied to claim 2 above, and further in view of Jayasena et al. (US 2015/0106574 A1), hereinafter “Jayasena”.

Regarding claim 3, the combination of Nurvitadhi teaches the integrated circuit memory device of claim 2.

	The combination of Nurvitadhi does not teach wherein the plurality of memory regions is formed on a first integrated circuit die; and the arithmetic compute element matrix is formed on a second integrated circuit die different from the first integrated circuit die.

However, Jayasena teaches:
wherein the plurality of memory regions is formed on a first integrated circuit die; and the arithmetic compute element matrix is formed on a second integrated circuit die different from the first integrated circuit die (FIG. 1; ¶ 31, “processor 102, logic 104 [arithmetic compute element matrix], and memory 106 [plurality of memory regions] are each implemented using one or more integrated circuit dies (or, more simply, “dies”). In other words, processor 102, logic 104, and memory 106 are implemented as semiconductor integrated circuits that are fabricated on one or more corresponding dies”; note that a first integrated circuit die is for the memory 106 [plurality of memory regions] and a second integrated circuit die is for the logic 104 [arithmetic compute element matrix]).  

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Nurvitadhi to incorporate the teachings of Jayasena to provide an integrated circuit device having circuitry to perform arithmetic computations in memory of a first integrated circuit die accessible by a separate integrated circuit die of Nurvitadhi that may be used for artificial intelligence (AI) matrix multiplication operations, with the memory die processing circuits and the logic die processing circuits used to offload a portion of the operations from the processor of Jayasena.  Doing so with the device of Nurvitadhi would be beneficial because, in comparison to existing computing devices, the processor is freed to perform other computational operations and a communication link between the processor, the logic die, and/or the memory die may carry less traffic, which generally improves the performance and energy efficiency of the computing device.  (Jayasena, ¶ 22)

Regarding claim 4, the combination of Nurvitadhi teaches the integrated circuit memory device of claim 3.

Jayasena further teaches:
further comprising: 
a set of through-silicon vias (TSVs) coupled between the first integrated circuit die and the second integrated circuit die to connect the arithmetic compute element matrix to the plurality of memory regions (FIG. 7; ¶ 31, supra; ¶ 51, “the dies [e.g., including the first integrated circuit die and the second integrated circuit die] in stack 700 are communicatively coupled using through-silicon vias (TSVs)”; note that “to connect the arithmetic compute element matrix to the plurality of memory regions” is an intended use limitation and thus is not given weight since intended use limitations would not distinguish a claimed apparatus from a prior art apparatus that satisfies all the structural limitations of the claimed apparatus and so the intended use limitation does not impose any limit on the interpretation of the claim).  

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Nurvitadhi to incorporate the teachings of Jayasena to provide an integrated circuit device having circuitry to perform arithmetic computations in memory of a first integrated circuit die accessible by a separate integrated circuit die of Nurvitadhi that may be used for artificial intelligence (AI) matrix multiplication operations, with the memory die processing circuits and the logic die processing circuits used to offload a portion of the operations from the processor of Jayasena.  Doing so with the device of Nurvitadhi would be beneficial because, in comparison to existing computing devices, the processor is freed to perform other computational operations and a communication link between the processor, the logic die, and/or the memory die may carry less traffic, which generally improves the performance and energy efficiency of the computing device.  (Jayasena, ¶ 22)

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Nurvitadhi et al. (US 2019/0042251 A1), hereinafter “Nurvitadhi”, in view of Yu et al. (US 2020/0004514 A1), hereinafter “Yu”, and Jayasena et al. (US 2015/0106574 A1), hereinafter “Jayasena”, as applied to claim 3 above, and further in view of Ye et al. (US 2016/0148918 A1), hereinafter “Ye”.

Regarding claim 5, the combination of Nurvitadhi teaches the integrated circuit memory device of claim 3.

	The combination of Nurvitadhi does not teach further comprising: wires encapsulated within the integrated circuit package and coupled between the first integrated circuit die and the second integrated circuit die to connect the arithmetic compute element matrix to the plurality of memory regions.

However, Ye teaches:
further comprising: 
wires encapsulated within the integrated circuit package and coupled between the first integrated circuit die and the second integrated circuit die to connect the arithmetic compute element matrix to the plurality of memory regions (FIG. 2; ¶ 14, “The memory device 100 can further include a package casing 115 comprising an encapsulant 116 that at least partially encapsulates the memory packages 108 and the wire bonds 142.”; ¶ 15, “FIG. 2 is a cross-sectional view of a memory package 108 … The package substrate 202 can include a plurality of first bond pads 208 a and a plurality of second bond pads 208 b. The first bond pads 208 a can be coupled (e.g., wire bonded) to corresponding bond pads 209 a (one identified) of a first group of the semiconductor dies 200 (e.g., two sets of four dies) [e.g., including the first integrated circuit die and the second integrated circuit die]”; note that “to connect the arithmetic compute element matrix to the plurality of memory regions” is an intended use limitation and thus is not given weight since intended use limitations would not distinguish a claimed apparatus from a prior art apparatus that satisfies all the structural limitations of the claimed apparatus and so the intended use limitation does not impose any limit on the interpretation of the claim).  

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Nurvitadhi to incorporate the teachings of Ye to provide an integrated circuit device having circuitry to perform arithmetic computations in memory of a first integrated circuit die accessible by a separate integrated circuit die of Nurvitadhi that may be used for artificial intelligence (AI) matrix multiplication operations, with the memory device 100 having a package casing 115 comprising an encapsulant 116 that at least partially encapsulates the memory packages 108 and the wire bonds 142 of Ye.  Doing so with the device of Nurvitadhi would increase product yields because individual components can be tested before assembly.  (Ye, ¶ 25)

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Nurvitadhi et al. (US 2019/0042251 A1), hereinafter “Nurvitadhi”, in view of Yu et al. (US 2020/0004514 A1), hereinafter “Yu”, as applied to claim 9 above, and further in view of Elliott et al. (US 6,279,088 B1), hereinafter “Elliott”.

Regarding claim 10, the combination of Nurvitadhi teaches the integrated circuit memory device of claim 9.

The combination of Nurvitadhi does not teach wherein the third memory region is the same as the second memory region.

However, Elliott teaches:
wherein the third memory region is the same as the second memory region (FIG. 2; col. 6, ln. 7-10, “The processor elements 12 can then store the result of the process instruction back into the same memory elements [e.g., including the second memory region and the third memory region] as provided the sensed bits, all in one cycle”).  

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Nurvitadhi to incorporate the teachings of Elliott to provide an integrated circuit device having circuitry to perform arithmetic computations in memory of a first integrated circuit die accessible by a separate integrated circuit die of Nurvitadhi that may be used for artificial intelligence (AI) matrix multiplication operations, with the processors located on the same chip as the memory of Elliott.  Doing so with the device of Nurvitadhi would exploit the extremely wide data path and high data bandwidth available at the sense amplifiers.  (Elliott, col. 2, ln. 38-41)

Allowable Subject Matter
Claim 20 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, provided that the claim rejections and the claim objections above are addressed.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Cargnini et al. (US 2018/0189230 A1) discloses, in one example, a device includes a non-volatile memory divided into a plurality of selectable locations, wherein the selectable locations are grouped into a plurality of data lines; one or more processing units (PUs) coupled to the non-volatile memory, each of the PUs associated with a data line of the plurality of data lines, the one or more processing units comprising one or more reconfigurable PUs, the one or more PUs configured to: manipulate, based on one or more instruction sets, data in an associated data line to generate results that are stored in selectable locations of the associated data line reserved to store results of the manipulation; determine which of the instruction sets are most frequently used by the one or more PUs to manipulate data; and reconfigure the one or more reconfigurable PUs to manipulate data using the determined most frequently used instruction sets.
Chang et al. (US 2017/0255390 A1) discloses a 3D-stacked memory device including: a base die including a plurality of switches to direct data flow and a plurality of arithmetic logic units (ALUs) to compute data; a plurality of memory dies stacked on the base die; and an interface to transfer signals to control the base die.
Choe (US 2018/0300271 A1) discloses an electronic system that includes a serial system bus interface having a root complex and an end point, a command bus and a data bus coupled to the serial system bus interface, a memory device coupled to the data bus, and a direct memory access (DMA) controller coupled to both the command bus and the data bus to directly access the memory device in response to request commands which are transmitted from the root complex to the end point. The DMA controller includes a command queue in which the request commands stand by.
Voronkov et al. (US 2017/0324829 A1) discloses a cache that can be stored in a user partitioned region of storage and utilized to reduce the amount of time required to present content responsive to content requests is described. A request for content associated with a region of a user interface can be received and data corresponding to a list item in a cache can be accessed. Content associated with the data can be presented in the region of the user interface via a same presentation as a most recent presentation of the content. At a time subsequent to when the content is initially presented in the region, new data associated with the list item can be retrieved. In examples where the new data corresponds to updated data, the presentation can be modified based partly on the updated data and the new data can be written to the cache in a location corresponding to the list item.
Baum et al. (US 2018/0285727 A1) discloses a novel and useful neural network (NN) processing core adapted to implement artificial neural networks (ANNs) and incorporating processing circuits having compute and local memory elements. The NN processor is constructed from self-contained computational units organized in a hierarchical architecture. The homogeneity enables simpler management and control of similar computational units, aggregated in multiple levels of hierarchy. Computational units are designed with minimal overhead as possible, where additional features and capabilities are aggregated at higher levels in the hierarchy. On-chip memory provides storage for content inherently required for basic operation at a particular hierarchy and is coupled with the computational resources in an optimal ratio. Lean control provides just enough signaling to manage only the operations required at a particular hierarchical level. Dynamic resource assignment agility is provided which can be adjusted as required depending on resource availability and capacity of the device.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Tong B Vo whose telephone number is (571)272-7568.  The examiner can normally be reached on M-F 9:00 AM - 5:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Charles Rones, can be reached on (571)272-4085.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 





/TONG B. VO/Examiner, Art Unit 2136