DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 8-9, and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Diesher (US 2018/0121796; filed 2016, different assignee), Jung (US 2017/0285968; filed 2016, different assignee), and Drysdale (US 2018/0095750; filed 2016, different assignee).
1. A memory device, comprising: 
a memory cell circuit; a memory interface circuit including a host region and a neural network processor (NNP) region; (Diesher teaches: “CPU is free to perform other operations or to enter low power sleep state for example). The CPU may be or may be polling for the completion of the NNA operation.”  Diesher paragraph 0035.  See Diesher figure 1 showing the “target” area as the location with the completion information.  See also Diesher showing a “master” area in the main memory interface transferring data to the NN layer execution buffers.) configured to receive a read command and a write command from a host, and to control the memory cell circuit in response to the read command and the write command; (“The processor 250 may process instructions and may send data to, and receive data from, a volatile memory 248 which may be on-board, on-die or on-chip relative to the SoC, and may be RAM such as DRAM or SRAM and so forth. The processor 250 may control data flow with the memory 248 via a memory controller 252 and a bus unit (here called a root hub) 246. The processor 250 also may have data transmitted between the memory 248 and other components in the system including components on the NNA 202 as described below.” Diesher paragraph 0047.  
The previously cited art does not expressly teach a read and a write command.  
Jung teaches: “[0058] The accelerator controller 340 manages data movements between the LWP 310 and flash backbone 320 in the accelerator 300 or data movements between the host and the flash backbone 320 of the accelerator 300, and manages conversion between page access and word or byte access. Upon receiving a data read request from the host or the LWP 310, the accelerator controller 340 reads corresponding data from the buffer subsystem 330 and transfers them to host or the LWP 310 if the data have been already stored in the buffer subsystem 330. If the corresponding data are not stored in the buffer subsystem 330, the accelerator controller 340 converts the data in the flash backbone 320 to data by word or byte and stores them in the buffer subsystem 330, and reads the data from the buffer subsystem 330 and transfers them to the host or the LWP 310. Upon receiving a data write request from the host or the LWP 310, the accelerator controller 340 writes corresponding data to the buffer subsystem 330, and converts the data written to the buffer subsystem 330 to data by page and transfers them to the flash backbone 320.”  Jung paragraph 0058.
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the teaching of Jung as an instance of (A) Combining prior art elements according to known methods to yield predictable results.  The prior art included each element claimed, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference (as shown in the cited art); One of ordinary skill in the art could have combined the elements as claimed by known methods, and that in combination, each element merely performs the same function as it does separately (the secondary reference merely adds the steps of host commands sent from the host to the accelerator); One of ordinary skill in the art would have recognized that the results of the combination were predictable (the result of combining the host commands of the secondary references with the primary reference would be predictable). See MPEP § 2143(I)(A).) and a neural network processor configured to receive a command instructing a neural network processing operation from the host, to perform the neural network processing operation using weights of synapses in a neural network in response to the command, and to control the memory circuit to read or write data while performing the neural network processing operation, (“Thus, the NNA is a small, flexible, low-power hardware co-processor that runs neural network forward propagation in parallel with a host CPU (e.g., where the CPU is free to perform other operations or to enter low power sleep state for example).”  Diesher paragraph 0035.  “The processor 250 may control data flow with the memory 248 via a memory controller 252 and a bus unit (here called a root hub) 246. 
The processor 250 also may have data transmitted between the memory 248 and other components in the system including components on the NNA 202 as described below.” Diesher paragraph 0047. “Turning to the NNA 202, the NNA 202 may have a DMA unit (or engine or just DMA) 208, memory management unit (MMU) 210, interrupt generation logic 212, and main memory interface 214 to move data among more external memory 248 and the other memories on the NNA 202. The DMA 208 performs data read/write operations to avoid using the CPU time while the MMU 210 assists with addressing the data in the memory and buffers so that paging schemes or other similar memory storage techniques can be used to increase memory transaction efficiency.” Diesher paragraph 0049. Note that all references are to continuous operations.
The previously cited art does not expressly teach receipt of a command for a neural network processing system. 
Drysdale teaches: “In one embodiment, when accelerator 0 receives input data from the processor (e.g., CPU) (e.g., or other device) it is to begin processing when it has both input buffers and output buffers available. . . . Accelerator 1 may then begin processing data from accelerator 0 assuming it has output buffers to store data to (for example, with output buffers provided to accelerator 1 by the processor (e.g., CPU), e.g., with command packets coming from the processor).”  Drysdale paragraph 0046.  
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the teaching of the secondary references as an instance of (A) Combining prior art elements according to known methods to yield predictable results. The prior art included each element claimed, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference (as shown in the cited art); One of ordinary skill in the art could have combined the elements as claimed by known methods, and that in combination, each element merely performs the same function as it does separately (merely having the host send a message to use the neural network does not change the functionality of the network); One of ordinary skill in the art would have recognized that the results of the combination were predictable (the result of a host command to use the network is predictable). See MPEP § 2143(I)(A). wherein the neural network processor comprises: a command queue configured to receive the command instructing the neural network processing operation, and to store the command, the command being provided by the memory interface circuit; (Drysdale teaches: “In one embodiment, when accelerator 0 receives input data from the processor (e.g., CPU) (e.g., or other device) it is to begin processing when it has both input buffers and output buffers available. . . . Accelerator 1 may then begin processing data from accelerator 0 assuming it has output buffers to store data to (for example, with output buffers provided to accelerator 1 by the processor (e.g., CPU), e.g., with command packets coming from the processor).” Drysdale paragraph 0046.) a control circuit configured to perform the neural network processing operation according to the command stored in the command queue; (This is obvious over the combination of cited art. 
The motivation to combine above applies. Diesher teaches: “The process 1020 then may include "run neural network by inputting data to fixed function hardware" 1030. As described above, the fixed function hardware may include parallel logic blocks, and by one example 48 of the blocks.” Diesher paragraph 0220. See also Diesher figure 10A. Drysdale teaches: “In one embodiment, when accelerator 0 receives input data from the processor (e.g., CPU) (e.g., or other device) it is to begin processing when it has both input buffers and output buffers available. . . . Accelerator 1 may then begin processing data from accelerator 0 assuming it has output buffers to store data to (for example, with output buffers provided to accelerator 1 by the processor (e.g., CPU), e.g., with command packets coming from the processor).” Drysdale paragraph 0046.) a global buffer, (See Diesher figure 2 showing NN Buffers 256 (and other buffers including memory 214 that also read on a “global buffer”). Note that naming a buffer “global” does not require steps to be performed or limit to a particular structure.) the control circuit controlling the global buffer to temporarily store first data; a direct memory access (DMA) controller, the control circuit controlling the DMA controller to control second data input to the memory cell circuit, third data output from the memory cell circuit, or both; (“Turning to the NNA 202, the NNA 202 may have a DMA unit (or engine or just DMA) 208, memory management unit (MMU) 210, interrupt generation logic 212, and main memory interface 214 to move data among more external memory 248 and the other memories on the NNA 202. The DMA 208 performs data read/write operations to avoid using the CPU time while the MMU 210 assists with addressing the data in the memory and buffers so that paging schemes or other similar memory storage techniques can be used to increase memory transaction efficiency.” Diesher paragraph 0049. See also Diesher figure 2.) 
and a processing element array configured to process an arithmetic operation using the first data from the global buffer, the second data from the DMA controller, the third data from the DMA controller, or a combination thereof, (Diesher teaches: “The external memory 248 also may have one or more pre-allocated NN buffers (or application buffers) 256 including buffers for a matrix of input values, weights, scale factors, bias values, and other constants. These NN buffers 256 initially hold the data for the neural network before running the neural network or at least before a layer associated with the data is being processed. Eventually, the data in the NN buffers 256 are read by the NNA 202 to be placed into the internal buffers 238 to be used to compute NN outputs as explained below. The data for each layer in the NN buffers 256, such as the input values, scale factors, weights, and other data[.]”  Diesher paragraph 0048.  “As mentioned, the NN layer execution buffers (or internal buffers) 238 hold data to be placed into the path way 240 including the input buffer to hold input values of a layer but also a weight buffer to hold weights of a layer, and a constant/bias buffer to hold constants or bias values of a layer. The internal buffers 238 also may have a sum buffer to accumulate intermediate or temporary sums that may be accumulated when placed into the sum buffer to compute a final sum (or sum output) to provide to an activation function circuit 254.”  Diesher paragraph 0054.) wherein the control circuit controls the DMA controller to store the first data in the global buffer, the first data being the weights. (Diesher teaches: “The external memory 248 also may have one or more pre-allocated NN buffers (or application buffers) 256 including buffers for a matrix of input values, weights, scale factors, bias values, and other constants. 
These NN buffers 256 initially hold the data for the neural network before running the neural network or at least before a layer associated with the data is being processed. Eventually, the data in the NN buffers 256 are read by the NNA 202 to be placed into the internal buffers 238 to be used to compute NN outputs as explained below. The data for each layer in the NN buffers 256, such as the input values, scale factors, weights, and other data[.]”  Diesher paragraph 0048.  
8. The memory device of claim 1, wherein 
the processing element array includes a plurality of processing elements, each comprising: a register storing fifth data; a computing circuit configured to generate an operation (“Thus, the registers 216 may be considered a state machine since it holds the current state (in the form of the layer descriptor) of the current layer being processed in the neural network. The registers 216 may hold the layer descriptor data in registers 258 in a fixed state while processing is being performed on the associated layer (or while the system is idle). The processor 250 may initiate the processing of the NN by having the DMA 208 place the first layer descriptor data from external memory 248 into the layer descriptor register 258, where the layer descriptors thereafter will be placed in the registers 216 one layer descriptor at a time, and in an order as found in the external memory 248 for the neural network. The registers 216 may be controlled by a register access control 218. It will be understood that the layer descriptor registers 258 may be part of the NN execution core 204 whether or not the layer descriptor registers 258 are a part of the register 216.”  Diesher paragraph 0050.
The previously cited art does not expressly state that the execution cores are duplicated.  
It would have been obvious to one of ordinary skill in the art before the effective filing date to duplicate the processors as a mere duplication of parts.  “[M]ere duplication of parts has no patentable significance unless a new and unexpected result is produced.”  MPEP § 2144.04.  No new and unexpected result appears to be produced by using a plurality of processors.)
9. The memory device of claim 8, wherein 
the arithmetic operation includes one or more of an addition operation, a multiplication operation, and an accumulation operation.  (“The internal buffers 238 also may have a sum buffer to accumulate intermediate or temporary sums that may be accumulated when placed into the sum buffer to compute a final sum (or sum output) to provide to an activation function circuit 254.”  Diesher paragraph 0054.)
13. A memory system, comprising: 
a host; and a memory device configured to perform a read operation according to a read command provided from the host, to perform a write operation according to a write command provided from the host, and to perform a neural network processing operation according to a neural network processing command provided from the host, wherein the memory device includes: a memory cell circuit including a host region and a neural network processor (NNP) region; a memory interface circuit configured to control the memory cell circuit according to the read command and the write command; and a neural network processor configured to perform the neural network processing operation in response to the neural network processing command, and to control the memory cell circuit to read or write data while performing the neural network processing operation, wherein the host region is used by the host and the NNP region is used by the neural network processor when the neural network processor performs the neural network processing operation, and wherein the neural network processing operation includes a training operation of a neural network, an inference operation, or both. (See rejection of claim 1.)
Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Diesher, Jung, Drysdale, and Alberta (CMPE401 2008).
7. The memory device of claim 1, wherein 
the neural network processor further comprises a first in first out (FIFO) queue configured to temporarily store fourth data output from the DMA controller, and to provide the fourth data to the processing element array, the fourth data including the second data, the third (“The NNA 202 also may have a sequencer and buffer control 206, as well as the NN layer execution buffers 238 and the data path 240, which generally and collectively may be referred to as an NN execution core 204 along with the DMA 208 and MMU 210 since these components can be considered active components that perform the main operations to run the neural network and may be run as a single power domain on the NNA 202.”  Diesher paragraph 0051.  
The previously cited art does not expressly state that the buffer is a FIFO buffer.
Alberta teaches: “Both ping-pong buffers and first-in first-out (FIFO) buffers can be used to absorb short-term mismatches in the data rates of a producer process with a consumer process. . . . The advantage of a FIFO buffer is that it is a very simple way of temporarily adding delay to an ordered stream of data. No addresses need to be provided at the write and read ports of the FIFO. No complicated mechanism is required to manage the storage of data inside the buffer; the only complexity is that the read and write pointers need to be wrapped around to avoid going beyond the fixed boundaries of the buffer.” Alberta page 2, 4th full paragraph. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the teaching of Alberta as an instance of (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; The prior art contained a “base” device (method, or product) upon which the claimed invention can be seen as an “improvement” (the inclusion of a FIFO buffer is an improvement because the FIFO buffer is easy to implement as a queue and requires less overhead). The prior art contained a known technique that is applicable to the base device (method, or product) (the technique of FIFO buffering is applicable to the buffer of the prior art). One of ordinary skill in the art would have recognized that applying the known technique would have yielded predictable results and resulted in an improved system. See MPEP § 2143(I)(D).)

Response to Arguments
Applicant's arguments filed 01/21/2022 have been fully considered but they are not persuasive.
Rejections under § 112:
All rejections under this section are withdrawn in view of applicant's amendments.
Rejections under § 103:
Applicant's arguments regarding the prior art rejections are moot because they are directed to references not relied upon in this action.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL M KNIGHT whose telephone number is (571)272-8646.  The examiner can normally be reached on Monday - Friday 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Reginald Bragdon, can be reached at (571) 272-4204. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


PAUL M. KNIGHT
Examiner
Art Unit 2139



/PAUL M KNIGHT/Examiner, Art Unit 2139