DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Application/Amendment/Argument
This office action is in response to the amendment filed on 12/15/2020.  
Claims 1-20 are presented for further consideration. 
Continued Examination
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 01/07/2021 has been entered.
Response to Arguments
Remark 1: Applicant argues on pages 9-10 that Walker and Kim do not teach or suggest “request an internal processing operation from the external processor or any external device”, and that “Walker simply suggests that a command is transmitted from the external processor 32 to the memory control 48 and compute engine 38 for execution”. In particular, Applicant contends that “Walker's brief and single reference to transfer of "commands from the memory device 34 to the external processor 32" in paragraph [0029]” does not teach the feature at issue.
The examiner respectfully disagrees and submits that:
Walker teaches in [0017] that instructions, and the data on which the instructions will be executed, may be sent by an external processor (e.g., a memory controller) to an internal processor (e.g., ALU circuitry). The instructions and/or data may first be stored in a memory array to be retrieved when the internal processor is available to execute the instructions.
Walker further teaches in [0038] that, depending on the instruction and/or the data to be operated by a compute engine 38, processing efficiency may also be increased by distributing the execution of instructions (or parts of instructions) across multiple compute engines 38… the external processor 32 may substantially control the division of an instruction into operations and the designation of the operations to one or more compute engines 38.
Furthermore, Kim teaches a similar type of multiple-computing-core operation (i.e., the fourth option) in [0032]: the fourth option is virtualized MPC mode 124E; in this mode, FE 104 forwards the data request directly to MPC 106...allows for MPC 106 to process virtualized threads requested from external cores as part of a hybrid and multi-core system.
Applicant’s arguments that the combination of Walker and Kim fails to teach or suggest “the command being a request for an internal processing operation of the memory device”, “in response to determining that memory-stored internal processing information includes the command indicating a type of the internal processing operation”, and “the internal processing operation command being subsequently received by the memory device 34 from the external processor 32” are therefore unpersuasive.
Specification
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.
Claim Rejections - 35 USC § 103
In the event a determination of the status of the application as subject to AIA 35 U.S.C. 102, 103, and 112 (or as subject to pre-AIA 35 U.S.C. 102, 103, and 112) is incorrect, any correction of the statutory basis for a rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-9 and 11-19 are rejected under 35 U.S.C. 103 as being unpatentable over Walker et al. (US 2011/0093662; hereinafter Walker), in view of Kim (US 2015/0046660), further in view of Agam et al. (US 2012/0246401; hereinafter Agam).
Regarding independent claim 1, Walker teaches a memory device comprising:
a memory cell array configured to store internal processing information; and
a processor-in-memory (PIM) configured to perform an internal processing operation based on the internal processing information ([0017], instructions, and the data on which the instructions will be executed, may be sent by an external processor (e.g., a memory controller) to an internal processor (e.g., ALU circuitry). The instructions and/or data may first be stored in a memory array to be retrieved when the internal processor is available to execute the instructions. Once the instruction(s) have been executed, the internal processor may store the results of the instruction(s) in a memory component, the memory array, or to any other suitable output; 
[0019]-[0021], one or more processors, such as ALUs, may be packaged with a memory device. For example, the memory may be a processor-in-memory (PIM), and may include ALUs embedded on a memory device (e.g., a memory array), which may store instructions and data to be executed by the ALUs and the results from the executed instructions; [0027], a memory system 30 may include a memory device 34… have an internal processor such as the compute engine 38;
[0038], Depending on the instruction and/or the data to be operated by a compute engine 38, processing efficiency may also be increased by distributing the execution of instructions (or parts of instructions) across multiple compute engines 38… the external processor 32 may substantially control the division of an instruction into operations and the designation of the operations to one or more compute engines 38).
Walker does not explicitly teach based on an internal processing mode.
In an analogous art of internal memory processor, Kim teaches a processor-in-memory (PIM) configured to perform an internal processing operation based on …an internal processing mode ([0025], MPE 102 is capable of supporting a plurality of memory types which are coupled to MPC 106 via one or more data buses... MPC 106 registers memory types, size, and real performance; 
[0028], If the selected option is cache modes 1 or 2 (124A, 124B), or parametric programmable cache mode (124C), then the data request is a cache search request which is performed by controller component 118. If the selected option is MPC for Cache-Thru-MPC mode (124D) or virtualized MPC mode (124E), then the data request is forwarded to MPC 106 for processing. Note that Kim teaches options of four operation modes as shown in [0029]-[0032]; 
[0031], third option is cache-thru-MPC mode 124D. In this mode, FE 104 forwards the cache request directly to MPC 106. MPC 106 is configured to use its own caching strategy based on the request. MPC 106 performs the cache operation itself and returns the result to the sender of the request; 
[0032], fourth option is virtualized MPC mode 124E; in this mode, FE 104 forwards the data request directly to MPC 106...allows for MPC 106 to process virtualized threads requested from external cores as part of a hybrid and multi-core system; [0038], Virtualized MPC component 166 operates as a generic processor with a large and fast memory. The goal of virtualized processing by virtualized MPC component 166 is to reduce the number of transactions that are required to be processed; [0039], MPC 106 directly performs the processor's operation, making the processing more efficient).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, with the teachings of Walker and Kim before them, to improve Walker’s processor in memory, which performs internal processing operations based on instructions first stored in the memory array, with Kim’s processor in memory that is configured by selecting one of four operation modes. The motivation for doing so would be to obtain the benefits of dynamically reconfiguring a memory having a processor in memory to support a plurality of memory types, reducing the number of transactions, and making the processing more efficient (Kim, [0025]; [0038], The goal of virtualized processing by virtualized MPC component 166 is to reduce the number of transactions that are required to be processed... to combine many simple read/write operations into one operation which is performed by the MPC. Virtualized MPC component 166 is pre-programmed to handle custom types of transactions which are performed on the memory side; [0039], MPC 106 directly performs the processor's operation, making the processing more efficient).
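For illustration only, the mode-dependent routing quoted from Kim [0028] above can be sketched as follows. This is a minimal sketch and not Kim's implementation: the names Mode, MPC_MODES and route_request are hypothetical, and the mode is taken as an input because the predefined selection algorithm mentioned in Kim [0005] is not reproduced in this action.

```python
from enum import Enum


class Mode(Enum):
    """Operational modes, labelled with the reference numerals quoted from Kim."""
    CACHE_1 = "124A"
    CACHE_2 = "124B"
    PROGRAMMABLE_CACHE = "124C"
    CACHE_THRU_MPC = "124D"
    VIRTUALIZED_MPC = "124E"


# Modes for which the quoted passage says the request is forwarded to MPC 106.
MPC_MODES = {Mode.CACHE_THRU_MPC, Mode.VIRTUALIZED_MPC}


def route_request(mode: Mode) -> str:
    """Return the component that handles a data request in the given mode.

    Hypothetical helper: cache modes are served as a cache search by
    controller component 118; the two MPC modes are forwarded to MPC 106.
    """
    if mode in MPC_MODES:
        return "MPC 106"
    return "controller component 118"
```

Under these assumptions, a request arriving in virtualized MPC mode is handled by the memory-side processor, while a request in either fixed cache mode is handled as a cache search.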
The combination of Walker and Kim further teaches wherein, in response to determining that the internal processing information comprises an internal processing operation command indicating a type of the internal processing operation (Kim, [0025], MPC 106 registers memory types, size, and real performance; [0031], MPC 106 is configured to use its own caching strategy based on the request. MPC 106 performs the cache operation itself and returns the result to the sender of the request; [0038], virtualized MPC mode 124E is to combine many simple read/write operations into one operation which is performed by the MPC. Virtualized MPC component 166 is pre-programmed to handle custom types of transactions which are performed on the memory side), the PIM requests the internal processing operation from an external device  (
Walker, [0017], instructions, and the data on which the instructions will be executed, may be sent by an external processor (e.g., a memory controller) to an internal processor (e.g., ALU circuitry). The instructions and/or data may first be stored in a memory array to be retrieved when the internal processor is available to execute the instructions;
[0029], the external processor 32 and the memory device 34 may also be operably coupled by a control interface 46, which may allow the transfer of commands between the external processor 32 and the memory device 34, including commands from the memory device 34 to the external processor 32;
[0038], Depending on the instruction and/or the data to be operated by a compute engine 38, processing efficiency may also be increased by distributing the execution of instructions (or parts of instructions) across multiple compute engines 38… the external processor 32 may substantially control the division of an instruction into operations and the designation of the operations to one or more compute engines 38).
Although Walker teaches sending commands between the PIM and the external processor through an interface ([0029], the external processor 32 and the memory device 34 may also be operably coupled by a control interface 46, which may allow the transfer of commands between the external processor 32 and the memory device 34, including commands from the memory device 34 to the external processor 32), Walker and Kim do not explicitly teach doing so through dedicated pins between the external device and the memory device.
In an analogous art of in-memory processors, Agam teaches that the PIM receives from the external device the internal processing command through dedicated pins between the external device and the memory device (Fig. 1 & [0023], host processor 14 commands internal processor 12 to start processing through a signal line (e.g., command 16)).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, with the teachings of Agam, Walker and Kim before them, to improve Walker and Kim’s external processor, which sends a request to the MPC to enter the internal processing mode, with Agam’s dedicated command line to start internal processing. The motivation for incorporating a dedicated signal line to activate internal processing would be to improve pipelining, because multiple instructions and/or data may occur substantially simultaneously (Walker, [0021]).
Regarding independent claim 14, Walker teaches a method of operating a memory device comprising a memory cell array and a processor-in-memory (PIM) configured to perform an internal processing operation ([0019]-[0021], the memory may be a processor-in-memory (PIM), and may include ALUs embedded on a memory device (e.g., a memory array), which may store instructions and data to be executed by the ALUs and the results from the executed instructions; [0027], a memory system 30 may include a memory device 34… have an internal processor such as the compute engine 38), the method comprising: ...
(Claim recites substantially the same limitations as in claim 1, and is therefore rejected for the same reasons set forth in the analysis of claim 1).
Additionally, the combination of Walker and Kim teaches entering an internal processing mode under control of an external device; and performing, by the PIM, the internal processing operation in the internal processing mode based on internal processing information stored in the memory cell array (
Kim, [0005], receiving a data request, selecting an operational mode based on the data request and a predefined selection algorithm, and processing the data request based on the selected operational mode; [0039]-[0040], processor A1 has a memory intensive job it is performing. Processor A1 {external device} sends a request to MPC 106 to perform a virtualized thread for processor A1… Status reporting component 168 transmits status information to external entities. For example, MPC 106 wants other processor cores and MPCs to know that MPC 106 is not just a cache block, but also a memory processing core (MPC); [0036], MPC is capable of reconfiguring cache dynamically in real-time based on the demand from the applications originating from different external general processor cores; [0025], MPC 106 registers memory types, size, and real performance;
Walker, [0017], instructions, and the data on which the instructions will be executed, may be sent by an external processor (e.g., a memory controller) to an internal processor (e.g., ALU circuitry). The instructions and/or data may first be stored in a memory array to be retrieved when the internal processor is available to execute the instructions;
[0038], Depending on the instruction and/or the data to be operated by a compute engine 38, processing efficiency may also be increased by distributing the execution of instructions (or parts of instructions) across multiple compute engines 38… the external processor 32 may substantially control the division of an instruction into operations and the designation of the operations to one or more compute engines 38).
Regarding claim(s) 2 and 15, the combination of Walker, Kim and Agam further teaches wherein the internal processing operation command comprises an internal processing read command and an internal processing write command associated with the internal processing operation (Walker, [0031], the compute engine 38 may be capable of accessing the memory array 36, including retrieving information from, and storing information …fetch unit 50 may sequence the memory array 36 states according to the command information (e.g., open or close a bank according to read and write commands) {internal processing operation}; 
[0017], instructions, and the data on which the instructions will be executed, may be sent by an external processor (e.g., a memory controller) to an internal processor (e.g., ALU circuitry). The instructions and/or data may first be stored in a memory array to be retrieved when the internal processor is available to execute the instructions;
[0038], Depending on the instruction and/or the data to be operated by a compute engine 38, processing efficiency may also be increased by distributing the execution of instructions (or parts of instructions) across multiple compute engines 38… the external processor 32 may substantially control the division of an instruction into operations and the designation of the operations to one or more compute engines 38;
Under the broadest reasonable interpretation, an internal processing read command and an internal processing write command are output to the external processor to be divided and designated to one or more compute engines for distributing the execution of the instructions (or parts of instructions) across multiple compute engines 38).
Regarding claim(s) 3, the combination of Walker, Kim and Agam further teaches a command/address buffer configured to receive a command issued from the external device (Walker, [0017], instructions, and the data on which the instructions will be executed, may be sent by an external processor (e.g., a memory controller) to an internal processor (e.g., ALU circuitry). The instructions and/or data may first be stored in a memory array to be retrieved when the internal processor is available to execute the instructions;
[0038], Depending on the instruction and/or the data to be operated by a compute engine 38, processing efficiency may also be increased by distributing the execution of instructions (or parts of instructions) across multiple compute engines 38… the external processor 32 may substantially control the division of an instruction into operations and the designation of the operations to one or more compute engines 38), wherein the PIM comprises a command detector configured to determine whether the received command comprises the internal processing operation command (Kim, [0039], processor A1 has a memory intensive job it is performing. Processor A1 {external device} sends a request to MPC 106 to perform a virtualized thread for processor A1; [0032], the fourth option is virtualized MPC mode 124E in this mode, FE 104 forwards the data request directly to MPC 106… a request for data processing;
Walker, [0031], The process of retrieving and storing information between the compute engine 38 and the memory array 36 …The sequencer 40 may sequence the instructions sent by the external processor 32 to the memory array 36 and store the data retrieved from the memory array 36 in a memory component such as the buffer 42. The sequencer 40 may pass the address and command information for accessing the memory array 36 to the fetch unit 50, and the fetch unit 50 may sequence the memory array 36 states according to the command information (e.g., open or close a bank according to read and write commands). In one embodiment, the memory control 48 may decode the command and address bits).
Regarding claim(s) 4, the combination of Walker, Kim and Agam further teaches wherein, in response to determining that the received command comprises the internal processing operation command, the PIM performs a data communication operation with the memory cell array according to the internal processing operation command (Kim, [0039], Processor A1 {external device} sends a request to MPC 106 to perform a virtualized thread for processor A1 {received command comprises the internal processing operation command}; [0036], MPC is capable of reconfiguring cache dynamically in real-time based on the demand from the applications originating from different external general processor cores;
Walker, [0038], Depending on the instruction and/or the data to be operated by a compute engine 38, processing efficiency may also be increased by distributing the execution of instructions (or parts of instructions) across multiple compute engines 38… the external processor 32 may substantially control the division of an instruction into operations and the designation of the operations to one or more compute engines 38).
Regarding claim(s) 16, the claim recites substantially the same limitations as claims 3 and 4, and is therefore rejected for the same reasons set forth in the analysis of claims 3 and 4.
Regarding claim(s) 5, the combination of Walker, Kim and Agam further teaches wherein the internal processing operation command is generated during the internal processing operation performed by the PIM based on the internal processing operation information, and comprises an internal processing read command and an internal processing write command associated with the internal processing operation (Walker, [0031], the compute engine 38 may be capable of accessing the memory array 36, including retrieving information from, and storing information (e.g., retiring results) in the memory array 36…fetch unit 50 may sequence the memory array 36 states according to the command information (e.g., open or close a bank according to read and write commands) {internal processing operation}; [0040], intermediate results from one operation {internal processing operation} performed by one compute engine 38 may be transferred as operands for a different operation {internal processing read command and an internal processing write command} to be performed by another compute engine 38. Under the broadest reasonable interpretation, the intermediate results from one operation {internal processing operation} are generated during the internal processing operation performed by the PIM), and
wherein the PIM comprises a command queue configured to store and output to the external device at least one of the internal processing read command and the internal processing write command (Walker, [0032], intermediate results may be stored in memory components such as the buffer 42 or memory registers {command queue} coupled to the compute engine 38. In one or more embodiments, a compute engine 38 may access the buffer 42 {command queue} for the intermediate results to perform subsequent operations).
Regarding claim(s) 6 and 17, the combination of Walker, Kim and Agam further teaches wherein the PIM is configured to receive from the external device an internal processing mode signal instructing entry into the internal processing mode (Kim, [0039], processor A1 has a memory intensive job it is performing. Processor A1 {external device} sends a request {internal processing mode signal} to MPC 106 to perform a virtualized thread for processor A1 {received command comprises the internal processing operation command}; [0036], MPC is capable of reconfiguring cache dynamically in real-time based on the demand from the applications originating from different external general processor cores).
Regarding claim(s) 7 and 18, the combination of Walker, Kim and Agam further teaches wherein the PIM is configured to receive from the external device the internal processing mode signal through signal lines of a first channel performing a predetermined protocol between the external device and the memory device (Kim, [0039], processor A1 has a memory intensive job it is performing. Processor A1 {external device} sends a request {internal processing mode signal} to MPC 106 to perform a virtualized thread for processor A1 {received command comprises the internal processing operation command}; Figs. 3B & 5, a bus from block 120 corresponds to {signal lines of a first channel}; [0005], receiving a data request, selecting an operational mode based on the data request and a predefined selection algorithm, and processing the data request based on the selected operational mode).
Regarding claim(s) 8 and 19, in view of Walker and Kim, Agam teaches wherein the PIM is configured to receive from the external device the internal processing mode signal through a dedicated signal line of a second channel between the external device and the memory device (Fig. 1 & [0023], host processor 14 commands internal processor 12 to start processing through a signal line (e.g., command 16)).
Regarding claim(s) 9, the combination of Walker, Kim and Agam further teaches wherein the PIM is configured to output the internal processing operation command to the external device through a dedicated signal line of a second channel between the external device and the memory device (Kim, [0007], a front end coupled to cache memory by at least one bus {corresponding to a first channel and a second channel}, comprising: an input/output component configured to receive a data request and return a response to the sender).
Regarding claim(s) 11, the combination of Walker, Kim and Agam further teaches wherein the internal processing operation comprises at least one of an operation of reading internal processing data comprising at least one of reference data, source data, destination data and target data, and an operation of writing a result of the other internal processing operation to the memory cell array (Walker, [0031], The compute engine 38 may be capable of accessing the memory array 36, including retrieving information {source data} from, and storing information {destination data} in the memory array 36; [0032], a compute engine 38 may access the buffer 42 for the intermediate results to perform subsequent operations; [0033], The buffer 42 may also include additional buffers, such as a data buffer or a simple buffer, which may provide denser storage, and may store intermediate or final results of executed instructions).
Regarding claim(s) 12, the combination of Walker, Kim and Agam further teaches wherein the internal processing operation comprises at least one of data search, data transfer, data addition and data swap with respect to the memory cell array according to the internal processing operation command received from the external device (Kim, [0030], Like cache modes 1 and 2 (124A, 124B), a cache search is performed based on the request and the caching logic of programmable cache mode 124C; [0036], MPC 106 monitors cache behavior, analyzes cache behavior, and is capable of reconfiguring cache dynamically in real-time based on the demand from the applications originating from different external general processor cores; Walker, [0035], multiple transfers between the memory array 36 and the buffer 42 may occur substantially simultaneously by coupling one or more buffers 42 a-42 d to one or more groups of memory cells; [0005], the ALU circuitry may add, subtract, multiply, or divide one operand from another, or may subject one or more operands to logic operations, such as AND, OR, XOR, and NOT logic functions).
Regarding claim(s) 13, the combination of Walker, Kim and Agam further teaches wherein, in a normal mode, the memory device is configured to receive a data transaction command from the external device through signal lines of a first channel performing a predetermined protocol between the external device and the memory device, and perform a data communication operation between the external device and the memory cell array according to the received data transaction command (Kim, [0005], The system includes receiving a data request, selecting an operational mode based on the data request and a predefined selection algorithm, and processing the data request based on the selected operational mode; [0027] & Fig. 5, the mode is selected based on requests received from the external processor by the internal processor via a first bus; [0029], The first option {normal mode} includes cache mode 1 and cache mode 2...Controller component 118 performs the cache search based on the request and the selected mode…Cache modes 1 and 2 (124A, 124B) are each pre-programmed modes having several fixed cache operation scenarios and logic. Each is configured using commonly used settings relating to how the cache is arranged internally to store the cached data to increase the effectiveness of the cache; Fig. 5, bus between block 120 and 104 corresponds to signal lines of first channel).
Claim(s) 10 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Walker et al. (US 2011/0093662; hereinafter Walker), in view of Kim (US 2015/0046660) and Agam et al. (US 2012/0246401; hereinafter Agam), further in view of Hill et al. (US 2015/0293864; hereinafter Hill).
Regarding claim(s) 10 and 20, Walker, Kim and Agam do not teach generating or transmitting a completion signal indicating completion of the internal processing operation.
In an analogous art of in-memory processor, Hill teaches the PIM is configured to generate a completion signal indicating completion of the internal processing operation, and transmit the completion signal to the external device to exit the internal processing mode ([0078], once an internal operation has completed, the status register (e.g., in register block 702) can be updated, and in addition the serial output pin (SO) can be appropriately driven to indicate completion of execution of the internal operation on the serial memory device).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, with the teachings of Hill, Walker, Kim and Agam before them, to improve Walker, Kim and Agam’s internal processor, which performs internal processing in response to a host request, with Hill’s updating of the status register and driving of the serial output pin to generate a completion signal indicating completion of the internal processing operation. The motivation for generating and transmitting the completion signal to the host would be to improve performance of the host by allowing it to perform other tasks while the internal processor performs the internal operation, because the host does not need to perform periodic polling to check for completion of the internal operation (Hill, [0078], In this way, the host (e.g., 402) need not perform periodic polling of the serial memory device).
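For illustration only, the completion signaling relied upon from Hill [0078] can be sketched as follows. This is a minimal sketch under stated assumptions, not Hill's circuit: the class SerialMemoryDevice, the attribute status_done, the on_complete callback, and the summation used as the internal operation are all hypothetical stand-ins for the status register, the serial output pin (SO), and the internal operation.

```python
class SerialMemoryDevice:
    """Hypothetical model of a device that reports completion to its host."""

    def __init__(self, on_complete):
        # Stand-in for the completion bit of the status register.
        self.status_done = False
        # Stand-in for driving the serial output pin (SO) toward the host.
        self._on_complete = on_complete

    def run_internal_operation(self, operands):
        """Run a placeholder internal operation, then signal completion."""
        self.status_done = False
        result = sum(operands)   # placeholder for the internal operation
        self.status_done = True  # update the status register
        self._on_complete()      # notify the host; no host polling required
        return result


# The host records notifications instead of repeatedly reading the status.
notifications = []
device = SerialMemoryDevice(on_complete=lambda: notifications.append("done"))
result = device.run_internal_operation([1, 2, 3])
```

In this sketch the host is notified exactly once per operation, which corresponds to the stated motivation that the host need not poll the device while the internal operation runs.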
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TRACY C. CHAN whose telephone number is (571)272-9992.  The examiner can normally be reached on Monday - Friday 9 AM to 5 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, ADAM M. QUELER can be reached on 571-272-4140.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/TRACY C CHAN/            Primary Examiner, Art Unit 2137