DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/1/2022 has been entered.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to the claims and remarks entered via the request for continued examination (RCE) filed on 12/1/2022.

Response to Amendment
Claims 1, 16 and 24 were amended by applicant, and no claims were cancelled or added in the amendment. As such, claims 1-25 are pending and have been examined.

Response to Arguments
Applicant's response filed 12/1/2022 did not include any amendments or arguments addressing the objections to claims 1-15 and 24-25. As documented below, objections to claims 1-15 and 24-25 remain.
Applicant's arguments with respect to the rejections of claims 1-25 under 35 U.S.C. 103 have been fully considered, but are moot because the arguments do not apply to the combination of references used in the current rejections.
With reference to amended independent claims 1, 16 and 24, applicant states “claim 1 is directed to hybrid artificial intelligence (AI) processing system and has been amended to include the feature, ‘the first memory circuit comprising static random access memory (SRAM),’ and ‘the second memory circuit comprising static random access memory (SRAM).’ Independent claim 16 is directed to analog in-memory neural network (NN) layer including similar features. Independent claims 24 is directed to an artificial intelligence (AI) processing system including similar features. (See Applicant's original specification, e.g., Paragraph 0021.)” (applicant’s remarks, page 8).
Applicant further asserts “Applicant understands Yakopcic as disclosing resistive memories and NOT static random access memory (SRAM), as is taught and claimed by Applicant. Furthermore, the foundation of the neuromorphic circuits of Yakopcic are based on the resistive memories, and there is no indication whatsoever that it would not have been obvious or even possible to use SRAM memories in the system of Yakopcic in the manner taught and claimed by Applicant.” (applicant’s remarks, page 9).
Accordingly, applicant apparently argues that:
1) the newly added first and second memory circuits “comprising static random access memory (SRAM)” and resistive RAM (RRAM) are somehow mutually exclusive, or that SRAM is not combinable with the “resistive memories 410(a-n)” disclosed in example embodiments in the primary Yakopcic reference (see, e.g., Yakopcic, column 23, lines 54-56); and
2) the newly presented claim limitations that were added to claims 1, 16 and 24, i.e., first and second memory circuits each “comprising static random access memory (SRAM)” are not disclosed or taught in the portions of the Yakopcic and Chi references previously applied to claims 1, 16 and 24.
Regarding argument 1) above, the examiner respectfully disagrees with this assertion, which is supported neither by applicant’s specification nor by the teachings of Yakopcic and the other applied references. In particular, the very same portion of applicant’s specification relied upon by applicant to support the above-noted amendments (e.g., paragraph 21, see applicant’s remarks page 8) explicitly discloses that “In some embodiments, the memory circuits 320 and 350 are implemented as static random access memory (SRAM) … Other embodiments may employ other memory technologies, whether volatile or non-volatile, such as dynamic RAM (DRAM), resistive RAM (RRAM), and magnetoresistive RAM (MRAM)” and “memory circuits 320 and 350 may be part of, for example, an on-chip processor cache or a computing system's main memory board, or any other memory facility.” Therefore, contrary to applicant’s seeming assertion that the recited “static random access memory (SRAM)” included in the claimed memory circuits and resistive RAM (RRAM) are somehow mutually exclusive, or that such SRAM is not combinable with the “resistive memories 410(a-n)” disclosed in exemplary embodiments in the primary Yakopcic reference, applicant’s specification suggests that the claimed memory circuits may comprise resistive memory (RRAM) and/or static RAM (SRAM). Contrary to applicant’s assertion regarding Yakopcic, this reference does not foreclose the use of other types of memory. For instance, Yakopcic discloses that embodiments “may include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computing device). For example, … read only memory (ROM); random access memory (RAM); magnetic disk storage media; optical storage media; flash memory devices; electrical, optical, acoustical or other forms of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.), and others.” (see, Yakopcic, col. 4, line 57-col. 5, line 3).
Further, applicant fails to point to any portion of Yakopcic, or any other cited reference, that discloses, teaches or even suggests that its resistive memory or RRAM cannot be used or combined with SRAM, as taught in the Deisher reference, as discussed below. 
Regarding applicant’s argument 2), that the newly presented claim limitations added to claims 1, 16 and 24 in the amendment filed on 12/1/2022, i.e., “a first memory circuit configured to store a subset of the digital weighting factors, the first memory circuit comprising static random access memory (SRAM); a second memory circuit configured to store the input data, the second memory circuit comprising static random access memory (SRAM);” are not disclosed or taught in the portions of the Yakopcic and Chi references applied to claims 1, 16 and 24 in the previous Office action, the examiner points to the new combination of references (Yakopcic, Chi and Deisher) now applied to claims 1, 16 and 24 and to the discussion of Yakopcic, Chi and Deisher below. The examiner notes that Deisher was previously applied to claims 12 and 15 in the prior Office action.
With reference to the limitations “a first memory circuit configured to store a subset of the digital weighting factors, the first memory circuit comprising static random access memory (SRAM)” recited in amended claims 1, 16 and 24, the examiner points to Yakopcic column 17, lines 56-57, which explicitly discloses that a “vector represents an image and the matrix includes a set of weighted values that are to be applied to the image” [i.e., subset of weighted values/weighting factors to be applied to an image], to column 23, lines 54-56, which discloses “In executing each of the neural network algorithms, the weights of each of the resistive memories 410(a-n) may be determined” [i.e., subset of weights for each memory circuit 410a-n] and to column 25, lines 57-59 disclosing that “pixel feature maps after being generated may then be stored in a digital storage layer as the output of the first convolution layer.” [i.e., a first memory circuit stores the subset of weights]. The examiner further points to column 4, line 64-column 5, line 1 of Yakopcic, which discloses that “a machine-readable medium may include … random access memory (RAM)” [i.e., memory circuit including random access memory/RAM].
Regarding “a second memory circuit configured to store the input data, the second memory circuit comprising static random access memory (SRAM)” recited in amended claims 1, 16 and 24, the examiner points to column 18, lines 26-33 and column 25, lines 35-38 and 57-58 of Yakopcic, which explicitly disclose that an "image is a two-dimensional image depicted by the image matrix" [i.e., the image data is data associated with the NN layer] and a "neuromorphic circuit 1000 includes … resistive memories 410(a-n) … a digital storage layer" [i.e., resistive memories 410a-n include a second memory circuit for storing the input data]. The examiner also points to column 4, line 62-column 5, line 1 of Yakopcic, which further discloses that “A machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine … a machine-readable medium may include … random access memory (RAM); … flash memory devices;” [i.e., another, second memory circuit/device includes random access memory/RAM].
With continued reference to the above-noted first and second memory circuits “comprising static random access memory (SRAM)” limitations, the examiner points to paragraphs 47-48 and 281 of Deisher, which explicitly disclose that “The NN system 200 may have at least one processor 250 which … may send data to, and receive data from, a volatile memory 248 which may be on-board, on-die or on-chip relative to the SoC, and may be RAM such as DRAM or SRAM … processor 250 may retrieve or transmit data to other external (off-die or off-chip) volatile memory (such as cache and/or RAM) or … memory 248 or another memory … processor 250 may retrieve or transmit data to other external (off-die or off-chip) volatile memory” [i.e., neural network/NN system with first and second memory chips/circuits including SRAM], “NN buffers 256 could be, or at least partially be, held external to the SoC 200 on volatile … memory forming memory 248” [i.e., neural network/NN system with other volatile memory/second memory chip/circuit including SRAM] and “Memory 1412 may be implemented as a volatile memory device such as … Static RAM (SRAM).” [i.e., first and second memory chips/circuits/devices comprising SRAM].
As further detailed below, the combination of Yakopcic, Chi and Deisher (Yakopcic in view of Chi and further in view of Deisher) teaches all the limitations of amended independent claims 1, 16 and 24, as well as the limitations of dependent claims 2-15, 17-23 and 25.
Applicant’s amendments have necessitated the claim rejections under 35 U.S.C. 103 discussed below.

Claim Objections
Claims 1-15 and 24-25 are objected to because of the following informalities: 
In claim 1, the word “and” is missing between “a second memory circuit configured to store the input data;” and “a cross bit line processor” in the penultimate and last limitations of the claim. The examiner suggests that one way to address this objection would be to amend the last 5 lines of claim 1 to recite “a second memory circuit configured to store the input data; and
a cross bit line processor (CBLP) to calculate a sequence of analog dot products, each of the analog dot products calculated between one of the first sequence of analog vectors corresponding to a column of the first memory circuit and one of the second sequence of analog vectors corresponding to a column of the second memory circuit.” Appropriate correction is required.
In claim 2, the word “and” is missing between “the corresponding NN layer;” and “a second bit line processor (BLP)” in the penultimate and last limitations of the claim. The examiner suggests that one way to address this objection would be to amend the last 3 lines of claim 2 to recite “the second memory circuit to store the data associated with the corresponding NN layer; and 
a second bit line processor (BLP) associated with the second memory circuit, the second BLP to generate a second sequence of analog vectors of analog voltage values”. Appropriate correction is required.
In independent claim 24, the word “and” is missing between “a second memory circuit configured to store the input data;” and “a cross bit line processor” in the penultimate and last limitations of the claim. The examiner suggests that one way to address this objection would be to amend the last 6 lines of claim 24 to recite “a second memory circuit configured to store the input data; and
a cross bit line processor (CBLP) to calculate a sequence of analog dot products, each of the analog dot products calculated between one of the first sequence of analog vectors corresponding to a column of the first memory circuit and one of the second sequence of analog vectors corresponding to a column of the second memory circuit.” Appropriate correction is required.
Claims 2-15, which depend directly or indirectly from claim 1, are objected to under the same rationale as base claim 1. 
Also, claims 3-11, which depend directly or indirectly from claim 2, are objected to under the same rationale as claim 2. 
Lastly, claim 25, which depends directly from claim 24, is objected to under the same rationale as claim 24. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-25 are rejected under 35 U.S.C. 103 as being unpatentable over Yakopcic et al. (U.S. Patent No. 10,176,425 B2, hereinafter “Yakopcic”) in view of non-patent literature Chi, et al. ("PRIME: A Novel Processing-In-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory." ACM SIGARCH Computer Architecture News 44.3 (2016): 27-39, hereinafter “Chi”) and further in view of Deisher et al. (U.S. Patent Application Pub. No. 2018/0121796 A1, hereinafter “Deisher”).
With respect to claim 1, Yakopcic discloses the invention as claimed including a hybrid artificial intelligence (AI) processing system (see, e.g., column 2, lines 57-58, “The present invention also provides an analog neuromorphic system”, column 17, lines 65-68, "the analog neuromorphic circuit 400 may be incorporated into digital signal processing applications" and column 23, lines 49-53, “the analog neuromorphic circuit 400 may be incorporated into analog neuromorphic configurations to execute popular neural network algorithms" [i.e., a hybrid analog-digital neuromorphic/neural network/AI system]) comprising:
a central processing unit (CPU) (see, e.g., column 2, line 67, “A controller is configured”, column 4, lines 60-62, “instructions stored on a machine-readable medium, which may be read and executed by one or more processors.” and column 17, lines 65-68, "digital signal processing applications" [i.e., a digital/conventional processor/CPU to execute instructions]); and
an AI processor (see, e.g., FIG. 1 depicting neuromorphic processing device 100 [i.e., an AI processor] that is electrically and communicatively coupled to other components such as a processor/CPU via lines 140 and 180 and column 6, lines 33-34, “an analog neuromorphic processing device 100” [i.e., neuromorphic processing device 100/AI processor]) …, the AI processor to perform analog in-memory computations based on (1) digital neural network (NN) weighting factors provided by the CPU and (2) input data provided by the CPU (see, e.g., column 6, lines 10-14, 35-40 and 50-54, “simultaneous execution of addition and multiplication operations in an analog circuit” [i.e., perform analog computations], “The analog neuromorphic processing device 100 includes a plurality of input voltages 140(a-n) that are applied to a plurality of respective inputs of the analog neuromorphic processing device 100 and the analog neuromorphic processing device 100 then generates a plurality of output signals 180”, “resistive memories are also of nano-scale sizes that enable a significant amount of resistive memories to be configured within the analog neuromorphic processing device 100” [i.e., the AI processor/neuromorphic processing device 100 performs analog in-memory computations based on input data], column 17, lines 49-58, “The analog neuromorphic circuit 400 is capable of executing dot product operations in numerous applications such as but not limited to neural applications, image recognition, image processing, digital signal processing … the analog neuromorphic circuit 400 may be incorporated into image processing applications where the vector represents an image and the matrix includes a set of weighted values” [i.e., and based on digital NN weighting factors] and column 25, lines 57-59, “feature maps after being generated may then be stored in a digital storage layer” [i.e., provided by the CPU]), wherein the AI processor comprises:
a first memory circuit configured to store a subset of the digital weighting factors (see, e.g., column 17, lines 56-57, column 23, lines 54-56 and column 25, lines 57-59, “vector represents an image and the matrix includes a set of weighted values that are to be applied to the image” [i.e., subset of weighted values/weighting factors to be applied to an image], “In executing each of the neural network algorithms, the weights of each of the resistive memories 410(a-n) may be determined” [i.e., subset of weights for each memory circuit 410a-n], “pixel feature maps after being generated may then be stored in a digital storage layer as the output of the first convolution layer.” [i.e., a first memory circuit stores the subset of weights]), the first memory circuit comprising … random access memory (see, e.g., column 4, line 64-column 5, line 1, “a machine-readable medium may include … random access memory (RAM)” [i.e., memory circuit including random access memory/RAM]);
a second memory circuit configured to store the input data (see, e.g., column 18, lines 26-33, "image is a two-dimensional image depicted by the image matrix" [i.e., the image data is data associated with the NN layer] and column 25, lines 35-38 and 57-58, "neuromorphic circuit 1000 includes … resistive memories 410(a-n) … a digital storage layer" [i.e., resistive memories 410a-n include a second memory circuit for storing the input data]), the second memory circuit comprising … random access memory (see, e.g., column 4, line 62-column 5, line 1, “A machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine … a machine-readable medium may include … random access memory (RAM); … flash memory devices;” [i.e., another, second memory circuit/device includes random access memory/RAM]); and
a … processor … to calculate a sequence of analog dot products, each of the analog dot products calculated between one of the first sequence of analog vectors corresponding to a column of the first memory circuit and one of the second sequence of analog vectors corresponding to a column of the second memory circuit (see, e.g., column 17, lines 26-52, “The dot-product operation with a vector, such as the example vector in Equation 1, and a matrix, such as the example matrix in Equation 2, may then be executed incorporating the analog neuromorphic circuit 400. … where the dot product operation is executed with the dot-product operation values 470(a-n) generated as output values of the dot product operation. … conversion of the resistance values for the resistive memories 410(a-n) to represent the non-binary values included in a matrix, the analog neuromorphic circuit 400 is able to execute dot product operations … neuromorphic circuit 400 is capable of executing dot product operations in numerous applications such as but not limited to neural applications” [i.e., the dot product is calculated between analog column and row vectors of two matrices corresponding to columns of first and second memory circuits in resistive memories 410(a-n), calculate a sequence of analog dot products], and column 20, lines 28-35, “The output configuration 500 includes the first op-amp configuration 520 and the second op-amp configuration 530 that may be positioned at the output of each column of the analog neuromorphic circuit 400 to both scale the output voltage signal 510 to a value on the non-linear smooth function 610 between "0" and "1" and does so by incorporating a neuron function such as an activation function and/or a thresholding function.” [i.e., a processor/op-amp configuration of the analog neuromorphic circuit 400 to perform analog calculations]).
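For illustration only, and without characterizing the actual circuitry of any applied reference, the vector-matrix dot product operation described in the passages quoted above can be sketched in software as follows. The function name, variable names and numeric values are hypothetical and are not drawn from Yakopcic; the sketch assumes ideal memory cells, so that each column output is simply the sum over rows of input voltage multiplied by cell conductance.

```python
# Illustrative sketch only. All names and values are hypothetical
# and are not taken from Yakopcic or any other applied reference.

def crossbar_dot_products(input_voltages, conductances):
    """Return one dot product per column of the conductance matrix.

    input_voltages: list of row voltages (the input vector).
    conductances:   list of rows, each a list of per-column conductances
                    (standing in for the weight matrix held in the cells).
    """
    n_rows = len(conductances)
    if n_rows != len(input_voltages):
        raise ValueError("one input voltage is required per row")
    n_cols = len(conductances[0])
    outputs = []
    for col in range(n_cols):
        # Assumed ideal behavior: the column output is the sum over rows
        # of V_row * G[row][col], i.e., a dot product for that column.
        total = sum(input_voltages[row] * conductances[row][col]
                    for row in range(n_rows))
        outputs.append(total)
    return outputs


voltages = [1.0, 2.0, 3.0]
weights = [[1.0, 0.0],
           [0.0, 1.0],
           [2.0, 2.0]]
column_outputs = crossbar_dot_products(voltages, weights)
print(column_outputs)  # [7.0, 8.0]
```

In this sketch every column yields its own dot product from the same input vector, which is the sense in which a sequence of dot products is produced from one applied set of inputs.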
Although Yakopcic substantially discloses the claimed invention, Yakopcic is not relied on to explicitly disclose an AI processor coupled to the CPU and
a cross bit line processor (CBLP) to calculate a sequence of analog dot products, each of the analog dot products calculated between one of the first sequence of analog vectors … and one of the second sequence of analog vectors.
In the same field, analogous art Chi teaches an AI processor coupled to the CPU (see, e.g., FIG. 3 – showing PRIME architecture with an AI processor coupled to the CPU and pages 32, “when PRIME is accelerating NN computation, CPU can still access the memory and work in parallel” and 34, “When LRN layers are applied PRIME requires the help of CPU for LRN computation” [i.e., an AI processor in the PRIME architecture works in parallel with the CPU and is communicatively coupled to the CPU]) and
a cross bit line processor (CBLP) to calculate a sequence of analog dot products, each of the analog dot products calculated between one of the first sequence of analog vectors … and one of the second sequence of analog vectors (paragraphs 24-25 of applicant’s specification state “The CBLP circuit 340 is configured to calculate a sequence of analog dot products.” and “the CBLP circuit 340 performs the analog multiply portion of the dot product operation by timing current integration over a capacitor. Circuit 340 may be configured as a capacitor in series with a switch.” Therefore, “a cross bit line processor (CBLP)”, under the BRI, in light of the specification, is any processor, functional unit, capacitor, circuitry or circuit that is capable of performing analog calculations) (see, e.g., pages 31, “in order to allow FF subarrays to switch bitlines between memory and computation modes, we attach a multiplexer to each bitline to control the switch … We enable an FF subarray to access any physical location in a Buffer subarray to accommodate the random memory access pattern in NN computation (e.g., in the connection of two convolutional layers).” [i.e., bitline processing using vectors corresponding to first and second memory circuits] and 34, “To implement synapse composing, the high-bit and low-bit parts of the synaptic weights are stored in adjacent bitlines of the corresponding crossbar array … (as shown in Figure 4A); the output currents are accumulated at the bitlines. … the input vector is the voltages [i.e., sequences of vectors for inputs] … we execute the dot products of {ai} and six sets of weights … in ReRAM cells, and execute the dot product of the inputs and the weights to obtain the mean value of n inputs.” [i.e., a cross-bitline unit/circuit performs analog calculations and calculates a series of analog dot products between sequences of vectors]).
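For illustration only, the arithmetic underlying the quoted description of storing high-bit and low-bit parts of a weight in adjacent bitlines can be sketched as follows. This is not Chi's implementation; the function names, the 4-bit split and the restriction to non-negative integer weights are assumptions made solely for this sketch. It shows that accumulating the high-part and low-part dot products separately and then recombining them gives the same result as a dot product over the full weights.

```python
# Illustrative sketch only. The names, the 4-bit split and the use of
# non-negative integer weights are assumptions, not details from Chi.

def split_weight(weight, low_bits=4):
    """Split a non-negative integer weight into (high part, low part)."""
    high = weight >> low_bits
    low = weight & ((1 << low_bits) - 1)
    return high, low


def composed_dot_product(inputs, weights, low_bits=4):
    """Dot product computed from separately accumulated high and low parts."""
    high_sum = 0  # stands in for the accumulation on the high-bit bitline
    low_sum = 0   # stands in for the accumulation on the low-bit bitline
    for x, w in zip(inputs, weights):
        high, low = split_weight(w, low_bits)
        high_sum += x * high
        low_sum += x * low
    # Recombine: the high-part sum carries a scale factor of 2**low_bits.
    return (high_sum << low_bits) + low_sum


inputs = [1, 2, 3]
weights = [200, 17, 255]
direct = sum(x * w for x, w in zip(inputs, weights))
composed = composed_dot_product(inputs, weights)
print(direct, composed)  # 999 999
```

The equality holds because each weight equals its high part times 2**low_bits plus its low part, and the dot product is linear in the weights.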
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yakopcic to incorporate the teachings of Chi to provide “a novel PIM [Processing-in-memory] architecture, called PRIME, to accelerate NN [neural network] applications in ReRAM based main memory” where the architecture for “processing in ReRAM-based main memory, PRIME … directly leverages ReRAM cells to perform computation without the need for extra PUs.” [processing units] (See, e.g., Chi, Abstract and page 30, section III). Doing so would have allowed Yakopcic to use Chi’s architecture to achieve “significant performance improvement and energy saving. Our experimental results show that, compared with a state-of-the-art neural processing unit design, PRIME improves the performance” because the architecture “efficiently accelerates NN computation by leveraging ReRAM’s computation capability and the PIM architecture”, as suggested by Chi (See, e.g., Chi, Abstract and page 30, section III).
Yakopcic in view of Chi substantially teaches the claimed invention.
However, Yakopcic in view of Chi is not relied on to teach a first memory circuit comprising static random access memory (SRAM); and
the second memory circuit comprising static random access memory (SRAM).
In the same field, analogous art Deisher teaches a first memory circuit comprising static random access memory (SRAM) (see, e.g., paragraphs 47, “The NN system 200 may have at least one processor 250 which … may send data to, and receive data from, a volatile memory 248 which may be on-board, on-die or on-chip relative to the SoC, and may be RAM such as DRAM or SRAM” [i.e., neural network/NN system with a first memory chip/circuit including SRAM] and 281, “Memory 1412 may be implemented as a volatile memory device such as … Static RAM (SRAM).” [i.e., a first memory chip/circuit comprising SRAM]); and 
the second memory circuit comprising static random access memory (SRAM) (see, e.g., paragraphs 47, “The NN system 200 may have at least one processor 250 which … may send data to, and receive data from, a volatile memory 248 which may be on-board, on-die or on-chip relative to the SoC, and may be RAM such as DRAM or SRAM … processor 250 may retrieve or transmit data to other external (off-die or off-chip) volatile memory (such as cache and/or RAM) or … memory 248 or another memory … processor 250 may retrieve or transmit data to other external (off-die or off-chip) volatile memory”, 48, “NN buffers 256 could be, or at least partially be, held external to the SoC 200 on volatile … memory forming memory 248” [i.e., neural network/NN system with other volatile memory/second memory chip/circuit including SRAM] and 281, “Memory 1412 may be implemented as a volatile memory device such as … Static RAM (SRAM).” [i.e., second memory chip/circuit comprising SRAM]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Deisher with Yakopcic in view of Chi in order to provide an “NN system 200 [that] may be a system on a chip (SoC) that has an NN Accelerator (NNA)” and includes a “processor 250 [that] may process instructions and may send data to, and receive data from, a volatile memory 248 which may be on-board, on-die or on-chip relative to the SoC, and may be RAM such as DRAM or SRAM and so forth. The processor 250 may control data flow with the memory” and “The processor 250 may retrieve or transmit data to other external (off-die or off-chip) volatile memory (such as cache and/or RAM) or non-volatile memory whether as memory 248 or another memory” for storing “the layer data within a layer as arranged in the memory” and storing “an input array” and a “weight matrix” (See, e.g., Deisher, paragraphs 46-47, 59 and 62). Doing so would have allowed Yakopcic in view of Chi to use Deisher’s NN system and NN accelerator components to achieve a “substantial reduction in the use of memory transactions and bandwidth to upload the same weight matrix multiple times for different groups”, as suggested by Deisher (See, e.g., Deisher, paragraphs 46-47 and 62).

Regarding claim 2, as discussed above, Yakopcic in view of Chi and Deisher teaches the system of claim 1.
Yakopcic further discloses wherein the AI processor comprises one or more NN layers comprising a corresponding NN layer (see, e.g., column 8, lines 28-29, "layering of the analog neuromorphic processing device 100 with other similar analog neuromorphic circuits” and column 23, line 48, “in a given layer of a CNN system” [i.e., NN layers that include a given/corresponding layer]) that includes:
a first digital access circuit to receive, from the CPU, a subset of the weighting factors, the subset associated with the corresponding NN layer (applicant’s specification merely repeats the claim language and states “Digital access circuits 310 are configured to receive, from the CPU, weighting factors w(l) 220 which correspond to the weights 120, or a subset of those weights associated with the NN layer (l). Digital access circuits 310 are also configured to receive input data associated with the NN layer (l).” – see paragraphs 19-20, 27-28, 67 and 81. Therefore, a “digital access circuit”, under the BRI, in light of the specification, is any digital circuit or circuitry that is capable of receiving weights, weighting factors or input data associated with a neural network (NN) layer) (see, e.g., column 9, lines 48-52, “combined weight 295 as shown in FIG. 2 as representative of the combined weight for the input voltage 240a is shown as Wj, in FIG. 3. Similar combined weights for the input voltage 240b and the input voltage 240n”, column 17, lines 49-57 and 65-67, “neuromorphic circuit 400 is capable of executing dot product operations in numerous applications such as but not limited to neural applications, image recognition, image processing, digital signal processing … neuromorphic circuit 400 may be incorporated into image processing applications where the vector represents an image and the matrix includes a set of weighted values that are to be applied to the image” [i.e., a subset of weights/weighting factors associated with a neuromorphic layer/NN layer], column 23, lines 54-55, “executing each of the neural network algorithms, the weights of each of the resistive memories 410(a-n)" [i.e., circuitry/resistive memories receive the subset of weights associated with the NN layer] and column 25, lines 57-59, “stored in a digital storage layer as the output of the first convolution layer” [i.e., computing system for digital processing includes a digital circuit]),
the first memory circuit (see, e.g., column 23, lines 54-56, “In executing each of the neural network algorithms, the weights of each of the resistive memories 410(a-n) may be determined” [i.e., subset of weights for each memory circuit 410a-n] and column 25, lines 57-59, “pixel feature maps after being generated may then be stored in a digital storage layer as the output of the first convolution layer.” [i.e., the first memory circuit stores the subset of weights]);
a first … processor … associated with the first memory circuit, the first [processor] … generate the first sequence of analog vectors of analog voltage values (see, e.g., column 18, lines 37-42 and 59-61, " the controller 405 may convert the image matrix xex into the vector values included in the vector xex that are applied as the input voltages 440(a-n) and the vector values included in the vector -xex that are applied as the complemented input voltages 460( a-n )”, “controller 405 may then convert the kernel matrix kex into kex + and kex - which are similar to w+ and w- discussed above", “The output configuration 500 to convert the output voltage signal 510 to the non-binary values represented by the dot-product operation value 470a and the complemented dot-product operation value 450a” [i.e., a first processor/controller 405 converts the kernel, which is a matrix of weights, to generate a sequence of analog matrices/vectors of analog voltage values/input voltages 440a-n]);
a second digital access circuit to receive input data associated with the corresponding NN layer (as indicated above, a “digital access circuit”, under the BRI, in light of the specification, is any digital circuit or circuitry that is capable of receiving weights, weighting factors or input data associated with a neural network (NN) layer) (see, e.g., column 10, lines 17-25, “neuromorphic circuit 200 may also be scaled to include additional layers of neurons … to the extent that the neural network configuration 300 can execute learning algorithms. For example, a neural network configuration with a significant number of input”, column 18, lines 26-33, "image is a two-dimensional image depicted by the image matrix" [i.e., the image data is input data associated with the NN layer] and column 25, lines 56-59, “24x24 pixel feature maps after being generated may then be stored in a digital storage layer as the output of the first convolution layer.” [i.e., a second digital circuit in neuromorphic circuit 200 receives input data associated with the NN layer]);
the second memory circuit to store the data associated with the corresponding NN layer (see, e.g., column 18, lines 26-33, "image is a two-dimensional image depicted by the image matrix" [i.e., the image data is data associated with the NN layer] and column 25, lines 35-38 and 57-58, "neuromorphic circuit 1000 includes … resistive memories 410(a-n) … a digital storage layer" [i.e., resistive memories 410a-n include the second memory circuit for storing the data]);
a second [processor] … associated with the second memory circuit, the second [processor] to generate a second sequence of vectors of analog voltage values, each of the second sequence of vectors associated with a column of the second memory circuit (see, e.g., column 18, lines 26-33, “the controller 405 may convert the image matrix xex into a vector such that the vector values may then be applied to the analog neuromorphic circuit 400 as the input voltages 440(a-n) and the complemented input voltages 460(a-n).” [i.e., generate a second sequence of vectors/vector values of analog input voltage values 440, 460], column 19, lines 27-29, “the analog neuromorphic circuit 400 generates an output voltage signal 510. The output voltage signal 510 is generated from each input voltage 440(a-n)” and column 20, lines 11-15, “The output configuration 500 to convert the output voltage signal 510 to the non-binary values represented by the dot-product operation value 470a and the complemented dot-product operation value 450a” [i.e., output configuration 500 converts the output voltage signal 510 based on analog voltage values/output voltage signal 510 associated with input voltages/a column of the second memory circuit]).
Although Yakopcic substantially discloses the claimed invention, Yakopcic is not relied on to explicitly disclose a first bit line processor (BLP) associated with the first memory circuit, the first BLP to generate a first sequence of vectors of analog voltage values, each of the first sequence of vectors associated with a column of the first memory circuit; …
a second bit line processor (BLP) associated with the second memory circuit, the second BLP to generate a second sequence of analog vectors of analog voltage values, each of the second sequence of vectors associated with a column of the second memory circuit; and
a cross bit line processor (CBLP) to calculate a sequence of analog dot products, each of the analog dot products calculated between one of the first sequence of vectors and one of the second sequence of vectors.
In the same field, analogous art Chi teaches a first bit line processor (BLP) associated with the first memory circuit, the first BLP to generate a first sequence of vectors of analog voltage values, each of the first sequence of vectors associated with a column of the first memory circuit (applicant’s specification repeats the claim language in paragraphs 22, 27, 30, 67 and 81 and states “The first BLP circuit 330, associated with the first memory circuit 320, is configured to generate a first sequence of vectors of analog voltage values.” in paragraph 22. Therefore, “a first bit line processor (BLP)”, under the BRI, in light of the specification, is any processor, functional unit, circuitry or circuit that is capable of performing analog calculations or generating vectors/matrices based on analog voltage values) (see, e.g., page 29, “execute the neural networks in Figure 2(a). The input data ai is represented by analog input voltages … Then the current flowing to the end of each bitline is viewed … After sensing the current on each bitline, the neural networks adopt a nonlinear function unit to complete the execution. Implementing NNs with ReRAM crossbar arrays requires specialized peripheral circuit design.” [i.e., a bitline unit/circuit performs analog calculations based on analog voltage values associated with a column of a first ReRAM/memory circuit in the crossbar array]); …
a second bit line processor (BLP) associated with the second memory circuit, the second BLP to generate a second sequence of vectors of analog voltage values, each of the second sequence of vectors associated with a column of the second memory circuit (applicant’s specification repeats the claim language in paragraphs 19, 23, 30, 67 and 81 and states “The second BLP circuit 330, associated with the second memory circuit 350, is configured to generate a second sequence of vectors of analog voltage values.” in paragraph 23. Therefore, “a second BLP”, under the BRI, in light of the specification, is any second processor, functional unit, circuitry or circuit that is capable of performing analog calculations or generating vectors/matrices based on analog voltage values) (see, e.g., pages 29, “current flowing to the end of each bitline is viewed … After sensing the current on each bitline, the neural networks adopt a nonlinear function unit to complete the execution. Implementing NNs with ReRAM crossbar arrays requires specialized peripheral circuit design.”, 31, “in order to allow FF subarrays to switch bitlines between memory and computation modes, we attach a multiplexer to each bitline to control the switch” and 34, “the input vector is the voltages” [i.e., a second bitline unit/circuit performs analog calculations/generates a sequence of vectors based on analog voltage values associated with a second memory circuit/ReRAM in the crossbar array]); and
a cross bit line processor (CBLP) to calculate a sequence of analog dot products, each of the analog dot products calculated between one of the first sequence of vectors and one of the second sequence of vectors (as indicated above, “a cross bit line processor (CBLP)”, under the BRI, in light of the specification, is any processor, functional unit, capacitor, circuitry or circuit that is capable of performing analog calculations) (see, e.g., pages 31, “in order to allow FF subarrays to switch bitlines between memory and computation modes, we attach a multiplexer to each bitline to control the switch” and 34, “To implement synapse composing, the high-bit and low-bit parts of the synaptic weights are stored in adjacent bitlines of the corresponding crossbar array … (as shown in Figure 4 A ); the output currents are accumulated at the bitlines. … the input vector is the voltages [i.e., sequences of vectors for inputs] … we execute the dot products of {ai} and six sets of weights … in ReRAM cells, and execute the dot product of the inputs and the weights to obtain the mean value of n inputs.” [i.e., a cross-bitline unit/circuit performs analog calculations and calculates a series of analog dot products between sequences of vectors]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yakopcic to incorporate the teachings of Chi to provide “a novel PIM [Processing-in-memory] architecture, called PRIME, to accelerate NN [neural network] applications in ReRAM based main memory” where the architecture for “processing in ReRAM-based main memory, PRIME … directly leverages ReRAM cells to perform computation without the need for extra PUs.” [processing units] (See, e.g., Chi, Abstract and page 30, section III). Doing so would have allowed Yakopcic to use Chi’s architecture to achieve “significant performance improvement and energy saving. Our experimental results show that, compared with a state-of-the-art neural processing unit design, PRIME improves the performance” because the architecture “efficiently accelerates NN computation by leveraging ReRAM’s computation capability and the PIM architecture”, as suggested by Chi (See, e.g., Chi, Abstract and page 30, section III).
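For illustration only, and not as part of the grounds of rejection, the bitline accumulation described in the cited passages of Chi (input data applied as voltages, output currents accumulated at each bitline) can be sketched as a set of dot products, one per bitline. The function name, the use of plain Python lists and the numeric values below are hypothetical and do not appear in Yakopcic, Chi or applicant's specification.

```python
def bitline_currents(input_voltages, conductances):
    """Return one accumulated current per bitline (column) of a crossbar.

    Assumption for illustration: conductances[i][j] models the cell that
    joins wordline i to bitline j, and input_voltages[i] drives wordline i.
    Each bitline current is then the dot product of the input vector with
    one column of the conductance (weight) matrix.
    """
    if len(input_voltages) != len(conductances):
        raise ValueError("one input voltage is needed per wordline")
    n_bitlines = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(input_voltages, conductances))
        for j in range(n_bitlines)
    ]


# Hypothetical example values: three wordlines, two bitlines.
voltages = [1.0, 2.0, 3.0]
weights = [[1.0, 0.0],
           [0.0, 1.0],
           [2.0, 2.0]]
currents = bitline_currents(voltages, weights)
print(currents)
```

In this sketch each column of the weight matrix plays the role of one vector in the first sequence and the input voltages play the role of a vector in the second sequence, so the returned list corresponds to a sequence of dot products.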

Regarding claim 5, as discussed above, Yakopcic in view of Chi and Deisher teaches the system of claim 2.
Yakopcic further discloses wherein data associated with one of the NN layers is a subset of the input data provided by the CPU (see, e.g., column 4, lines 60-62, “instructions … which may be read and executed by one or more processors.” [i.e., provided by a processor/CPU], column 10, lines 17-25, “neuromorphic circuit 200 may also be scaled to include additional layers of neurons … to the extent that the neural network configuration 300 can execute learning algorithms. For example, a neural network configuration with a significant number of input”, column 17, lines 54-56, “neuromorphic circuit 400 may be incorporated into image processing applications where the vector represents an image … a set of weighted values that are to be applied to the image” [i.e., a subset of image data/weights associated with a neuromorphic layer/NN layer], column 18, lines 26-33, "image is a two-dimensional image depicted by the image matrix" [i.e., image data is a subset of input data associated with a NN layer] and column 25, lines 56-59, “24x24 pixel feature maps after being generated may then be stored in a digital storage layer as the output of the first convolution layer.”).

Regarding claim 6, as discussed above, Yakopcic in view of Chi and Deisher teaches the system of claim 2.
Yakopcic further discloses wherein the data associated with one of the NN layers is a result of the analog in-memory computations generated by another of the NN layers (as indicated above, “the data associated with one of the NN layers” has been interpreted as any data associated with one of the previously-introduced “one or more NN layers”) (see, e.g., column 25, line 51 - column 26, line 23, “Thus, six different 24x24 pixel feature maps are generated by the analog neuromorphic circuit 1000 … The six different 24x24 pixel feature maps after being generated may then be stored in a digital storage layer as the output of the first convolution layer. … the operational control flow 900 executes a smoothing layer and subsamples the data for each feature map that is stored after completing the first convolution layer … The pixel size of each feature map may then be decreased with a subsampling operation where a portion of the averaged pixels for each feature map are selected such that the important data of each feature map is carried forward as outputs of the analog neuromorphic circuit 1100” [i.e., successive layers are computed, the feature maps are the result of the analog in-memory computations generated by another of the NN layers]).

Regarding claim 7, as discussed above, Yakopcic in view of Chi and Deisher teaches the system of claim 2.
Yakopcic further discloses wherein the NN layers further include a third digital access circuit to provide a result of the analog in-memory computations to the CPU or to another of the NN layers (as indicated above, a “digital access circuit” is any digital circuit or circuitry that is capable of receiving weights, weighting factors or input data associated with a neural network (NN) layer) (see, e.g., column 25, lines 56-59, “24x24 pixel feature maps after being generated may then be stored in a digital storage layer as the output of the first convolution layer.” [i.e., a third digital access circuit to provide the result/output of the computations of the first convolution layer to another of the NN layers]).

Regarding claim 10, as discussed above, Yakopcic in view of Chi and Deisher teaches the system of claim 2.
Yakopcic further discloses perform thresholding on the sequence of analog dot products (see, e.g., column 20, lines 21-36, “output voltage signal 510 may be converted to the non-binary values represented by the dot-product operation value 470a and the complemented dot-product operation value 450a … by incorporating a neuron function such as an activation function and/or a thresholding function.” [i.e., perform thresholding on the sequence of dot products]).
Although Yakopcic substantially discloses the claimed invention, Yakopcic is not relied on to explicitly disclose wherein at least one of the NN layers further includes a Rectified Linear Unit (ReLU) to perform thresholding on the sequence of analog dot products.
In the same field, analogous art Chi teaches wherein at least one of the NN layers further includes a Rectified Linear Unit (ReLU) to perform thresholding on the sequence of analog dot products (see, e.g., pages 31, “The modified column multiplexer incorporates … a nonlinear threshold (sigmoid) unit”, “we add a hardware unit to support ReLU function, a function in the convolution layer of CNN.”, “Our circuit design supports two activation functions: sigmoid and ReLU. Sigmoid is implemented by the sigmoid unit in Figure 4 B, and ReLU is implemented by the ReLU unit.” and 34, “we execute the dot products of {ai} … execute the dot product of the inputs and the weights” [i.e., a layer of the CNN includes a ReLU unit to perform thresholding on the analog calculations/sequence of analog dot products]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yakopcic to incorporate the teachings of Chi to provide “a novel PIM [Processing-in-memory] architecture, called PRIME, to accelerate NN [neural network] applications in ReRAM based main memory” where the architecture for “processing in ReRAM-based main memory, PRIME … directly leverages ReRAM cells to perform computation without the need for extra PUs.” [processing units] (See, e.g., Chi, Abstract and page 30, section III). Doing so would have allowed Yakopcic to use Chi’s architecture to achieve “significant performance improvement and energy saving. Our experimental results show that, compared with a state-of-the-art neural processing unit design, PRIME improves the performance” because the architecture “efficiently accelerates NN computation by leveraging ReRAM’s computation capability and the PIM architecture”, as suggested by Chi (See, e.g., Chi, Abstract and page 30, section III).
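For illustration only, and not as part of the grounds of rejection, the thresholding performed by a Rectified Linear Unit on a sequence of dot products can be sketched as follows. The function name and the sample values are hypothetical; the sketch shows only the standard ReLU definition, max(0, x), and is not taken from the hardware unit described in Chi.

```python
def relu(values):
    """Apply the standard ReLU function, max(0, x), to each value.

    Assumption for illustration: the dot products are already available
    as ordinary floating-point numbers rather than analog voltages.
    """
    return [max(0.0, v) for v in values]


# Hypothetical sequence of dot products.
dot_products = [-1.5, 0.0, 2.25, -0.1, 4.0]
thresholded = relu(dot_products)
print(thresholded)
```

Negative dot products are clamped to zero and non-negative dot products pass through unchanged, which is the sense in which ReLU acts as a thresholding function.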

Regarding claim 11, as discussed above, Yakopcic in view of Chi and Deisher teaches the system of claim 2.
Although Yakopcic substantially discloses the claimed invention, Yakopcic is not relied on to explicitly disclose wherein at least one of the NN layers further includes a pooling logic circuit to perform maximum pooling on the thresholded sequence of analog dot products.
In the same field, analogous art Chi teaches wherein at least one of the NN layers further includes a pooling logic circuit to perform maximum pooling on the thresholded sequence of analog dot products (see, e.g., FIG. 4 C – showing “4-1 max pooling function units” and pages 31, “a circuit to support 4-1 max pooling is included” and 34, “To implement max pooling function, we adopt 4:1 max pooling hardware in Figure 4 C , which is able to support n:1 max pooling … we execute the dot products of {ai}” [i.e., pooling unit/logic circuit to perform max pooling on the thresholded sequence of analog dot products]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yakopcic to incorporate the teachings of Chi to provide “a novel PIM [Processing-in-memory] architecture, called PRIME, to accelerate NN [neural network] applications in ReRAM based main memory” where the architecture for “processing in ReRAM-based main memory, PRIME … directly leverages ReRAM cells to perform computation without the need for extra PUs.” [processing units] (See, e.g., Chi, Abstract and page 30, section III). Doing so would have allowed Yakopcic to use Chi’s architecture to achieve “significant performance improvement and energy saving. Our experimental results show that, compared with a state-of-the-art neural processing unit design, PRIME improves the performance” because the architecture “efficiently accelerates NN computation by leveraging ReRAM’s computation capability and the PIM architecture”, as suggested by Chi (See, e.g., Chi, Abstract and page 30, section III).
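For illustration only, and not as part of the grounds of rejection, n:1 maximum pooling over a sequence of thresholded values can be sketched as follows, with n = 4 matching the 4:1 case quoted from Chi. The function name, the non-overlapping grouping, the handling of a short final group and the sample values are assumptions made for this sketch and are not taken from the cited references.

```python
def max_pool(values, n=4):
    """Return the maximum of each consecutive, non-overlapping group of n values.

    Assumption for illustration: if the input length is not a multiple
    of n, the final (shorter) group is pooled over the values that remain.
    """
    if n <= 0:
        raise ValueError("pool size n must be positive")
    return [max(values[i:i + n]) for i in range(0, len(values), n)]


# Hypothetical thresholded (non-negative) dot products, pooled 4:1.
thresholded_values = [0.0, 2.0, 1.0, 0.5, 3.0, 0.0, 0.0, 1.5]
pooled = max_pool(thresholded_values, n=4)
print(pooled)
```

Each group of four inputs is reduced to its largest member, so eight thresholded values yield two pooled outputs.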

Regarding claim 12, as discussed above, Yakopcic in view of Chi and Deisher teaches the system of claim 2. 
Yakopcic in view of Chi substantially teaches the claimed invention.
However, Yakopcic in view of Chi is not relied on to teach wherein the CPU is an x86-architecture processor.
In the same field, analogous art Deisher teaches wherein the CPU is an x86-architecture processor (as indicated above, “an x86-architecture processor” has been interpreted as any CISC CPU or processor in the x86 family of CPUs and processors or any CPU or processor capable of executing x86 instructions such as x86 assembly language) (see, e.g., paragraph 280, “Processor 1410 may be implemented as a Complex Instruction Set Computer (CISC) or Reduced Instruction Set Computer (RISC) processors; x86 instruction set compatible processors, multi-core, or any other microprocessor or central processing unit (CPU).”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Deisher with Yakopcic in view of Chi to provide an “NN system 200 [that] may be a system on a chip (SoC) that has an NN Accelerator (NNA)” and includes a “processor 250 [that] may process instructions and may send data to, and receive data from, a volatile memory 248 which may be on-board, on-die or on-chip relative to the SoC, and may be RAM such as DRAM or SRAM and so forth. The processor 250 may control data flow with the memory” and “The processor 250 may retrieve or transmit data to other external (off-die or off-chip) volatile memory (such as cache and/or RAM) or non-volatile memory whether as memory 248 or another memory” for storing “the layer data within a layer as arranged in the memory” and storing “an input array” and a “weight matrix” (See, e.g., Deisher, paragraphs 46-47, 59 and 62). Doing so would have allowed Yakopcic in view of Chi to use Deisher’s NN system and NN accelerator components to achieve a “substantial reduction in the use of memory transactions and bandwidth to upload the same weight matrix multiple times for different groups”, as suggested by Deisher (See, e.g., Deisher, paragraphs 46-47 and 62).

Regarding claim 13, as discussed above, Yakopcic in view of Chi and Deisher teaches the system of claim 2.
Yakopcic further discloses wherein the CPU is to generate the digital weighting factors for training of the AI processor (see, e.g., column 9, lines 24-26, “The analog neuromorphic circuits also have learning capability … so that the analog neuromorphic circuits may successfully execute learning algorithms” [i.e., learning/training of the AI processor/neuromorphic circuits], column 17, lines 49-63, “The analog neuromorphic circuit 400 is capable of executing dot product operations in numerous applications such as but not limited to neural applications, image recognition, image processing, digital signal processing [i.e., digital data] … the matrix includes a set of weighted values that are to be applied to the image to improve the quality of the image [i.e., digital weights/weighting factors] … Through numerous iterations, the resistance values of the resistive memories 410(a-n) may be adjusted until the resistance values accurately represent the weighted values included in the matrix and the dot-product operation values 470(a-n) generated by the analog neuromorphic circuit 400” and column 23, lines 54-56, “In executing each of the neural network algorithms, the weights of each of the resistive memories 410(a-n) may be determined as described in detail above by the controller 405” [i.e., processor/CPU executes the NN algorithms and determines/generates the weights for training the AI processor]).

Regarding claim 14, as discussed above, Yakopcic in view of Chi and Deisher teaches the system of claim 1.
Examiner’s Note: claim 14, as drafted, depends from claim 1. If applicant intended for claim 14 to be an independent claim, the examiner suggests that one way to do so is to amend the last portion of claim 14 to explicitly recite the limitations of claim 1 instead of the current recitation of an “integrated circuit or chip set comprising the system of claim 1”.
Yakopcic further discloses an integrated circuit or chip set (see, e.g., column 6, lines 57-61, “the analog neuromorphic processing device 100 has significant computational efficiency while maintaining the size of the analog neuromorphic processing device 100 to a chip that may easily be positioned on a circuit board.” and column 7, lines 55-57, “The scaling of the resistive memories into additional neurons may be done within the analog neuromorphic processing device 100 such as within a single chip. However, the analog neuromorphic processing device 100 may also be scaled with other analog neuromorphic circuits contained in other chips” [i.e., an integrated circuit or chip set]) comprising the system of claim 1 (as indicated above, Yakopcic in view of Chi and Deisher teaches the system of claim 1, see above citations to Yakopcic, Chi and Deisher regarding the limitations of claim 1).

Regarding claim 15, as discussed above, Yakopcic in view of Chi and Deisher teaches the system of claim 1. 
Examiner’s Note: claim 15, as drafted, depends from claim 1. If applicant intended for claim 15 to be an independent claim, the examiner suggests that one way to do so is to amend the last portion of claim 15 to explicitly recite the limitations of claim 1 instead of the current recitation of a “virtual assistant comprising the system of claim 1.”
Yakopcic in view of Chi and Deisher teaches the limitation “comprising the system of claim 1” (as indicated above, Yakopcic in view of Chi and Deisher teaches the system of claim 1, see above citations to Yakopcic, Chi and Deisher regarding the limitations of claim 1).
Yakopcic in view of Chi substantially teaches the claimed invention.
However, Yakopcic in view of Chi is not relied on to teach a virtual assistant.
In the same field, analogous art Deisher teaches a virtual assistant (aside from merely repeating the claim language in paragraphs 80 and 102 of applicant’s specification, the sole mention of any “virtual assistant” in applicant’s specification is in paragraph 15, which states “The disclosed techniques are particularly well-suited to AI platforms, but also can be implemented on a broad range of platforms including laptops, tablets, smart phones, workstations, video conferencing systems, gaming systems, smart home control systems, robots, and personal or so-call virtual assistants (such as those that respond to a wake-up phrase).” Therefore, “a virtual assistant” under the BRI, in light of the specification, is any platform, system or personal device, including a device that responds to a wake-up phrase, e.g., a smart phone or personal digital assistant that responds to a phrase such as “Hey Siri”) (see, e.g., paragraphs 277 and 298, “system 1400 may be incorporated into a microphone, … handheld computer, palmtop computer, personal digital assistant (PDA), cellular telephone, combination cellular telephone/PDA, television, smart device (e.g., smart phone, smart tablet”, “may include any device with an audio subsystem such as a personal computer (PC), laptop computer … personal digital assistant (PDA), cellular telephone, combination cellular telephone/PDA, television, smart device (e.g., smart phone, smart tablet or smart television), mobile internet device (MID), messaging device, data communication device, and so forth, and any other … computer that may accept audio commands.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Deisher with Yakopcic in view of Chi to provide an “NN system 200 [that] may be a system on a chip (SoC) that has an NN Accelerator (NNA)” and includes a “processor 250 [that] may process instructions and may send data to, and receive data from, a volatile memory 248 which may be on-board, on-die or on-chip relative to the SoC, and may be RAM such as DRAM or SRAM and so forth. The processor 250 may control data flow with the memory” and “The processor 250 may retrieve or transmit data to other external (off-die or off-chip) volatile memory (such as cache and/or RAM) or non-volatile memory whether as memory 248 or another memory” for storing “the layer data within a layer as arranged in the memory” and storing “an input array” and a “weight matrix” (See, e.g., Deisher, paragraphs 46-47, 59 and 62). Doing so would have allowed Yakopcic in view of Chi to use Deisher’s NN system and NN accelerator components to achieve a “substantial reduction in the use of memory transactions and bandwidth to upload the same weight matrix multiple times for different groups”, as suggested by Deisher (See, e.g., Deisher, paragraphs 46-47 and 62).

With respect to independent claim 16, Yakopcic discloses the invention as claimed including an analog in-memory neural network (NN) layer (see, e.g., column 23, line 48, and column 24, lines 9-10, "in a given layer of a CNN system”, “each layer of the neural network executing the feature extractor”) comprising:
a first digital access circuit to receive, from a central processing unit (CPU), digital weighting factors associated with the NN layer (as indicated above, a “digital access circuit”, under the BRI, in light of the specification, is any digital circuit or circuitry that is capable of receiving weights, weighting factors or input data associated with a neural network (NN) layer) (see, e.g., column 9, lines 48-52, “combined weight 295 as shown in FIG. 2 as representative of the combined weight for the input voltage 240a is shown as Wj , in FIG. 3. Similar combined weights for the input voltage 240b and the input voltage 240n”, column 17, lines 49-57 and 65-67, “neuromorphic circuit 400 is capable of executing dot product operations in numerous applications such as but not limited to neural applications, image recognition, image processing, digital signal processing … neuromorphic circuit 400 may be incorporated into image processing applications where the vector represents an image and the matrix includes a set of weighted values that are to be applied to the image” [i.e., digital weights/weighting factors associated with a neuromorphic layer/NN layer], column 23, lines 54-55, “executing each of the neural network algorithms, the weights of each of the resistive memories 410(a-n)" [i.e., circuitry/resistive memories receive the subset of weights associated with the NN layer] and column 25, lines 57-59, “stored in a digital storage layer as the output of the first convolution layer” [i.e., computing system for digital processing includes a digital circuit]);
a first memory circuit to store the weighting factors (see, e.g., column 17, lines 56-57, “vector represents an image and the matrix includes a set of weighted values that are to be applied to the image” [i.e., weighted values/weighting factors to be applied to an image], column 23, lines 54-56, “In executing each of the neural network algorithms, the weights of each of the resistive memories 410(a-n) may be determined” [i.e., weights for each memory circuit 410a-n] and column 25, lines 57-59, “pixel feature maps after being generated may then be stored in a digital storage layer as the output of the first convolution layer.” [i.e., a first memory circuit stores the weights]), the first memory circuit comprising … random access memory (see, e.g., column 4, line 64-column 5, line 1, “a machine-readable medium may include … random access memory (RAM)” [i.e., memory circuit including random access memory/RAM]);
a first … processor … associated with the first memory circuit, the first [processor] to generate a first sequence of analog vectors of analog voltage values, each of the first sequence of vectors associated with a column of the first memory circuit (see, e.g., column 18, lines 37-42 and 59-61, " the controller 405 may convert the image matrix xex into the vector values included in the vector xex that are applied as the input voltages 440(a-n) and the vector values included in the vector -xex that are applied as the complemented input voltages 460( a-n )”, “controller 405 may then convert the kernel matrix kex into kex + and kex - which are similar to w+ and w- discussed above", “The output configuration 500 to convert the output voltage signal 510 to the non-binary values represented by the dot-product operation value 470a and the complemented dot-product operation value 450a” [i.e., processor/controller 405 converts the kernel, which is a matrix of weights, to generate a sequence of analog matrices/vectors of analog voltage values/input voltages 440a-n associated with the first memory circuit]);
a second digital access circuit to receive input data associated with the NN layer (as indicated above, a “digital access circuit”, under the BRI, in light of the specification, is any digital circuit or circuitry that is capable of receiving weights, weighting factors or input data associated with a neural network (NN) layer) (see, e.g., column 10, lines 17-25, “neuromorphic circuit 200 may also be scaled to include additional layers of neurons … to the extent that the neural network configuration 300 can execute learning algorithms. For example, a neural network configuration with a significant number of input”, column 18, lines 26-33, "image is a two-dimensional image depicted by the image matrix" [i.e., the image data is input data associated with the NN layer] and column 25, lines 56-59, “24x24 pixel feature maps after being generated may then be stored in a digital storage layer as the output of the first convolution layer.” [i.e., a second digital circuit in neuromorphic circuit 200 receives input data associated with the NN layer]);
a second memory circuit to store the data associated with the NN layer (see, e.g., column 18, lines 26-33, "image is a two-dimensional image depicted by the image matrix" [i.e., the image data is data associated with the NN layer] and column 25, lines 35-38 and 57-58, "neuromorphic circuit 1000 includes … resistive memories 410(a-n) … a digital storage layer" [i.e., resistive memories 410a-n include a second memory circuit for storing the data]), the second memory circuit comprising … random access memory (see, e.g., column 4, line 62-column 5, line 1, “A machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine … a machine-readable medium may include … random access memory (RAM); … flash memory devices;” [i.e., another, second memory circuit/device includes random access memory/RAM]);
a second [processor] … associated with the second memory circuit, the second [processor] to generate a second sequence of analog vectors of analog voltage values, each of the second sequence of analog vectors associated with a column of the second memory circuit (see, e.g., column 18, lines 26-33, “the controller 405 may convert the image matrix xex into a vector such that the vector values may then be applied to the analog neuromorphic circuit 400 as the input voltages 440(a-n) and the complemented input voltages 460(a-n).” [i.e., generate a second sequence of analog vectors/vector values of analog input voltage values 440, 460], column 19, lines 27-29, “the analog neuromorphic circuit 400 generates an output voltage signal 510. The output voltage signal 510 is generated from each input voltage 440(a-n)” and column 20, lines 11-15, “The output configuration 500 to convert the output voltage signal 510 to the non-binary values represented by the dot-product operation value 470a and the complemented dot-product operation value 450a” [i.e., output configuration 500 converts the output voltage signal 510 based on analog voltage values/output voltage signal 510 associated with input voltages/a column of the second memory circuit]); and
a … processor … to calculate a sequence of analog dot products, each of the analog dot products calculated between one of the first sequence of analog vectors and one of the second sequence of analog vectors (see, e.g., column 17, lines 26-52, “The dot-product operation with a vector, such as the example vector in Equation 1, and a matrix, such as the example matrix in Equation 2, may then be executed incorporating the analog neuromorphic circuit 400. … where the dot product operation is executed with the dot-product operation values 470(a-n) generated as output values of the dot product operation. … conversion of the resistance values for the resistive memories 410(a-n) to represent to represent the non-binary values included in a matrix, the analog neuromorphic circuit 400 is able to execute dot product operations … The analog neuromorphic circuit 400 is capable of executing dot product operations in numerous applications such as but not limited to neural applications” [i.e., the dot product is calculated between analog column and row vectors of two matrices, calculate a sequence of analog dot products] and column 20, lines 28-35, “The output configuration 500 includes the first op-amp configuration 520 and the second op-amp configuration 530 that may be positioned at the output of each column of the analog neuromorphic circuit 400 to both scale the output voltage signal 510 to a value on the non-linear smooth function 610 between "0" and "1" and does so by incorporating a neuron function such as an activation function and/or a thresholding function.” [i.e., a processor/op-amp configuration of the analog neuromorphic circuit 400 to perform analog calculations]).
Although Yakopcic substantially discloses the claimed invention, Yakopcic is not relied on to explicitly disclose a first bit line processor (BLP) associated with the first memory circuit, the first BLP to generate a first sequence of vectors of analog voltage values, each of the first sequence of vectors associated with a column of the first memory circuit; …
a second bit line processor (BLP) associated with the second memory circuit, the second BLP to generate a second sequence of vectors of analog voltage values, each of the second sequence of vectors associated with a column of the second memory circuit; and
a cross bit line processor (CBLP) to calculate a sequence of analog dot products, each of the analog dot products calculated between one of the first sequence of vectors and one of the second sequence of vectors.
In the same field, analogous art Chi teaches a first bit line processor (BLP) associated with the first memory circuit, the first BLP to generate a first sequence of vectors of analog voltage values, each of the first sequence of vectors associated with a column of the first memory circuit (as indicated above, “a first bit line processor (BLP)”, under the BRI, in light of the specification, is any processor, functional unit, circuitry or circuit that is capable of performing analog calculations or generating vectors/matrices based on analog voltage values) (see, e.g., page 29, “execute the neural networks in Figure 2(a). The input data ai is represented by analog input voltages … Then the current flowing to the end of each bitline is viewed … After sensing the current on each bitline, the neural networks adopt a nonlinear function unit to complete the execution. Implementing NNs with ReRAM crossbar arrays requires specialized peripheral circuit design.” [i.e., a bitline unit/circuit performs analog calculations based on analog voltage values associated with a column of a first ReRAM/memory circuit in the crossbar array]); …
a second bit line processor (BLP) associated with the second memory circuit, the second BLP to generate a second sequence of vectors of analog voltage values, each of the second sequence of vectors associated with a column of the second memory circuit (as indicated above, “a second BLP”, under the BRI, in light of the specification, is any second processor, functional unit, circuitry or circuit that is capable of performing analog calculations or generating vectors/matrices based on analog voltage values) (see, e.g., pages 29, “current flowing to the end of each bitline is viewed … After sensing the current on each bitline, the neural networks adopt a nonlinear function unit to complete the execution. Implementing NNs with ReRAM crossbar arrays requires specialized peripheral circuit design.”, 31, “in order to allow FF subarrays to switch bitlines between memory and computation modes, we attach a multiplexer to each bitline to control the switch” and 34, “the input vector is the voltages” [i.e., a second bitline unit/circuit performs analog calculations/generates a sequence of vectors based on analog voltage values associated with a second memory circuit/ReRAM in the crossbar array]); and
a cross bit line processor (CBLP) to calculate a sequence of analog dot products, each of the analog dot products calculated between one of the first sequence of vectors and one of the second sequence of vectors (as indicated above, “a cross bit line processor (CBLP)”, under the BRI, in light of the specification, is any processor, functional unit, capacitor, circuitry or circuit that is capable of performing analog calculations) (see, e.g., pages 31, “in order to allow FF subarrays to switch bitlines between memory and computation modes, we attach a multiplexer to each bitline to control the switch” and 34, “To implement synapse composing, the high-bit and low-bit parts of the synaptic weights are stored in adjacent bitlines of the corresponding crossbar array … (as shown in Figure 4 A ); the output currents are accumulated at the bitlines. … the input vector is the voltages [i.e., sequences of vectors for inputs] … we execute the dot products of {ai} and six sets of weights … in ReRAM cells, and execute the dot product of the inputs and the weights to obtain the mean value of n inputs.” [i.e., a cross-bitline unit/circuit performs analog calculations and calculates a series of analog dot products between sequences of vectors]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yakopcic to incorporate the teachings of Chi to provide “a novel PIM [Processing-in-memory] architecture, called PRIME, to accelerate NN [neural network] applications in ReRAM based main memory” where the architecture for “processing in ReRAM-based main memory, PRIME … directly leverages ReRAM cells to perform computation without the need for extra PUs.” [processing units] (See, e.g., Chi, Abstract and page 30, section III). Doing so would have allowed Yakopcic to use Chi’s architecture to achieve “significant performance improvement and energy saving. Our experimental results show that, compared with a state-of-the-art neural processing unit design, PRIME improves the performance” because the architecture “efficiently accelerates NN computation by leveraging ReRAM’s computation capability and the PIM architecture”, as suggested by Chi (See, e.g., Chi, Abstract and page 30, section III).
Yakopcic in view of Chi substantially teaches the claimed invention.
However, Yakopcic in view of Chi is not relied on to teach a first memory circuit comprising static random access memory (SRAM); and
the second memory circuit comprising static random access memory (SRAM).
In the same field, analogous art Deisher teaches a first memory circuit comprising static random access memory (SRAM) (see, e.g., paragraphs 47, “The NN system 200 may have at least one processor 250 which … may send data to, and receive data from, a volatile memory 248 which may be on-board, on-die or on-chip relative to the SoC, and may be RAM such as DRAM or SRAM” [i.e., neural network/NN system with a first memory chip/circuit including SRAM] and 281, “Memory 1412 may be implemented as a volatile memory device such as … Static RAM (SRAM).” [i.e., a first memory chip/circuit comprising SRAM]); and 
the second memory circuit comprising static random access memory (SRAM) (see, e.g., paragraphs 47, “The NN system 200 may have at least one processor 250 which … may send data to, and receive data from, a volatile memory 248 which may be on-board, on-die or on-chip relative to the SoC, and may be RAM such as DRAM or SRAM … processor 250 may retrieve or transmit data to other external (off-die or off-chip) volatile memory (such as cache and/or RAM) or … memory 248 or another memory … processor 250 may retrieve or transmit data to other external (off-die or off-chip) volatile memory”, 48, “NN buffers 256 could be, or at least partially be, held external to the SoC 200 on volatile … memory forming memory 248” [i.e., neural network/NN system with other volatile memory/second memory chip/circuit including SRAM] and 281, “Memory 1412 may be implemented as a volatile memory device such as … Static RAM (SRAM).” [i.e., second memory chip/circuit comprising SRAM]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Deisher with Yakopcic in view of Chi in order to provide an “NN system 200 [that] may be a system on a chip (SoC) that has an NN Accelerator (NNA)” and includes a “processor 250 [that] may process instructions and may send data to, and receive data from, a volatile memory 248 which may be on-board, on-die or on-chip relative to the SoC, and may be RAM such as DRAM or SRAM and so forth. The processor 250 may control data flow with the memory” and “The processor 250 may retrieve or transmit data to other external (off-die or off-chip) volatile memory (such as cache and/or RAM) or non-volatile memory whether as memory 248 or another memory” for storing “the layer data within a layer as arranged in the memory” and storing “an input array” and a “weight matrix” (See, e.g., Deisher, paragraphs 46-47, 59 and 62). Doing so would have allowed Yakopcic in view of Chi to use Deisher’s NN system and NN accelerator components to achieve a “substantial reduction in the use of memory transactions and bandwidth to upload the same weight matrix multiple times for different groups”, as suggested by Deisher (See, e.g., Deisher, paragraphs 46-47 and 62).

Regarding claims 3 and 17, as discussed above, Yakopcic in view of Chi and Deisher teaches the system of claim 2 and the NN layer of claim 16.
Yakopcic further discloses wherein the analog voltage values of the first sequence of vectors are generated in parallel and the analog voltage values of the second sequence of vectors are generated in parallel (see, e.g., column 5, line 57 - column 6, line 32, “Each resistive memory may apply a resistance to each input voltage so that each input voltage is multiplied by each resistance. The positioning of each resistive memory at each intersection of the wire grid enables the multiplying of each input voltage by the resistance of each resistive memory to be done in parallel. The multiplication in parallel enables multiple multiplication operations to be executed simultaneously. … The addition of each current to generate the accumulative currents is also done in parallel … The addition in parallel also enables multiple addition operations to be executed simultaneously. The simultaneous execution of addition and multiplication operations in an analog circuit” [i.e., analog input voltage values of sequences of vectors are generated in parallel/simultaneously]).
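For illustration only, the parallel multiply-and-accumulate behavior described in the cited passage can be modeled numerically as follows. This sketch is not taken from Yakopcic: the function name crossbar_column_currents, the use of conductance values (the reciprocal of the resistances in the quoted text) and the numeric inputs are assumptions chosen for the example. Software evaluates the columns one after another, whereas the cited analog circuit is described as performing these operations simultaneously.

```python
def crossbar_column_currents(input_voltages, conductances):
    """Return one accumulated current per column of a hypothetical crossbar.

    conductances[i][j] is the assumed conductance of the cell at row i,
    column j. Each cell contributes voltage * conductance, and the
    contributions along a column are summed into one output current.
    """
    n_rows = len(conductances)
    if len(input_voltages) != n_rows:
        raise ValueError("one input voltage is required per row")
    n_cols = len(conductances[0])
    return [
        sum(input_voltages[i] * conductances[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]


# Hypothetical values: three input voltages applied to a 3x2 crossbar.
voltages = [1.0, 2.0, 3.0]
conductances = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
currents = crossbar_column_currents(voltages, conductances)
print(currents)
```

Each entry of the result is the dot product of the input voltage vector with one column of cell values, which is the per-column quantity the cited passage describes as an accumulative current.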

Regarding claims 4 and 18, as discussed above, Yakopcic in view of Chi and Deisher teaches the system of claim 2 and the NN layer of claim 16.
Yakopcic further discloses wherein the analog dot products, of the sequence of analog dot products, are calculated in parallel (see, e.g., column 5, line 57 - column 6, line 32, “The positioning of each resistive memory at each intersection of the wire grid enables the multiplying of each input voltage by the resistance of each resistive memory to be done in parallel. The multiplication in parallel enables multiple multiplication operations to be executed simultaneously.” and column 25, lines 47-50, “Each of the six dot-product operation values 470(a-n) and the complemented dot-product operation values 450(a-n) represent the six different feature maps generated in parallel” [i.e., multiplication operations – including dot products of the sequence of dot products are calculated in parallel/simultaneously]).

Regarding claims 8 and 19, as discussed above, Yakopcic in view of Chi and Deisher teaches the system of claim 2 and the NN layer of claim 16.
Yakopcic further discloses wherein the NN layer is a convolutional NN layer (see, e.g., FIG. 8 – depicting “CONVOLUTION” layers 830 in convolutional neural network/CNN 800 and column 24, lines 13-15, “The feature extractor 810 includes the combination of two different types layers that are the convolution layers 830(a-n)” and column 25, lines 56-59, “pixel feature maps after being generated may then be stored in a digital storage layer as the output of the first convolution layer.”).

Regarding claims 9 and 20, as discussed above, Yakopcic in view of Chi and Deisher teaches the system of claim 2 and the NN layer of claim 16.
Yakopcic further discloses wherein at least one of the NN layers is a fully connected NN layer (see, e.g., FIG. 8 – depicting “FULLY CONNECTED LAYER” in convolutional neural network/CNN 800 and column 24, lines 31-33, “The outputs of the last layer of the conventional CNN 800 are then input to a fully connected network that is the classifier 820.”).

Regarding claim 21, as discussed above, Yakopcic in view of Chi and Deisher teaches the NN layer of claim 16.
Yakopcic further discloses perform thresholding on the sequence of analog dot products (see, e.g., column 20, lines 21-36, “output voltage signal 510 may be converted to the non-binary values represented by the dot-product operation value 470a and the complemented dot-product operation value 450a … by incorporating a neuron function such as an activation function and/or a thresholding function.” [i.e., perform thresholding on the sequence of dot products]).
Although Yakopcic substantially discloses the claimed invention, Yakopcic is not relied on to explicitly disclose wherein at least one of the NN layers further includes a Rectified Linear Unit (ReLU) to perform thresholding on the sequence of analog dot products, and the NN layer further includes a pooling logic circuit to perform maximum pooling on the thresholded sequence of analog dot products.
In the same field, analogous art Chi teaches wherein at least one of the NN layers further includes a Rectified Linear Unit (ReLU) to perform thresholding on the sequence of analog dot products (see, e.g., pages 31, “The modified column multiplexer incorporates … a nonlinear threshold (sigmoid) unit”, “we add a hardware unit to support ReLU function, a function in the convolution layer of CNN.”, “Our circuit design supports two activation functions: sigmoid and ReLU. Sigmoid is implemented by the sigmoid unit in Figure 4 B, and ReLU is implemented by the ReLU unit.” and 34, “we execute the dot products of {ai} … execute the dot product of the inputs and the weights” [i.e., a layer of the CNN includes a ReLU unit to perform thresholding on the analog calculations/sequence of analog dot products]) and the NN layer further includes a pooling logic circuit to perform maximum pooling on the thresholded sequence of analog dot products (see, e.g., FIG. 4 C – showing “4-1 max pooling function units” and pages 31, “a circuit to support 4-1 max pooling is included” and 34, “To implement max pooling function, we adopt 4:1 max pooling hardware in Figure 4 C , which is able to support n:1 max pooling … we execute the dot products of {ai}” [i.e., pooling unit/logic circuit to perform max pooling on the thresholded sequence of analog dot products]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yakopcic to incorporate the teachings of Chi to provide “a novel PIM [Processing-in-memory] architecture, called PRIME, to accelerate NN [neural network] applications in ReRAM based main memory” where the architecture for “processing in ReRAM-based main memory, PRIME … directly leverages ReRAM cells to perform computation without the need for extra PUs.” [processing units] (See, e.g., Chi, Abstract and page 30, section III). Doing so would have allowed Yakopcic to use Chi’s architecture to achieve “significant performance improvement and energy saving. Our experimental results show that, compared with a state-of-the-art neural processing unit design, PRIME improves the performance” because the architecture “efficiently accelerates NN computation by leveraging ReRAM’s computation capability and the PIM architecture”, as suggested by Chi (See, e.g., Chi, Abstract and page 30, section III).
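For illustration only, the thresholding and maximum pooling operations recited in claim 21 can be modeled numerically as follows. This sketch is not taken from Chi or Yakopcic: the function names relu and max_pool, the sample dot-product values and the software form are assumptions chosen for the example. The window of four mirrors the 4:1 max pooling quoted from Chi.

```python
def relu(values):
    """Threshold each value at zero (Rectified Linear Unit)."""
    return [max(0.0, v) for v in values]


def max_pool(values, window=4):
    """Keep the largest value in each consecutive group of `window` values."""
    return [max(values[i:i + window]) for i in range(0, len(values), window)]


# Hypothetical sequence of eight dot-product results.
dot_products = [-1.0, 2.0, 0.5, -3.0, 4.0, -2.0, 1.0, 0.0]
thresholded = relu(dot_products)
pooled = max_pool(thresholded)
print(pooled)
```

The pooling step operates on the thresholded values rather than on the raw dot products, matching the order of operations recited in the claim.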

Regarding claim 22, as discussed above, Yakopcic in view of Chi and Deisher teaches the NN layer of claim 16.
Examiner’s Note: claim 22, as drafted, depends from claim 16. If applicant intended for claim 22 to be an independent claim, the examiner suggests that one way to do so is to amend the last portion of claim 22 to explicitly recite the limitations of claim 16 instead of the current recitation of a “multi-layer analog neural network comprising one or more cascaded NN layers of claim 16.”
Yakopcic further discloses a multi-layer analog neural network (see, e.g., FIG. 8 – showing a multi-layer neural network with “CONVOLUTION” layers 830 in a convolutional neural network/CNN 800 and column 25, line 51 - column 26, line 23, “six different 24x24 pixel feature maps are generated by the analog neuromorphic circuit 1000 … The six different 24x24 pixel feature maps after being generated may then be stored in a digital storage layer as the output of the first convolution layer. … the operational control flow 900 executes a smoothing layer and subsamples the data for each feature map that is stored after completing the first convolution layer … The pixel size of each feature map may then be decreased with a subsampling operation where a portion of the averaged pixels for each feature map are selected such that the important data of each feature map is carried forward as outputs of the analog neuromorphic circuit 1100” [i.e., successive analog neural network layers are computed, the feature maps are the result of another layer]) comprising one or more cascaded NN layers of claim 16 (as indicated above, Yakopcic in view of Chi and Deisher teaches the NN layer of claim 16, see above citations to Yakopcic, Chi and Deisher regarding the limitations of claim 16).

Regarding claim 23, as discussed above, Yakopcic in view of Chi and Deisher teaches the network of claim 22.
Examiner’s Note: claim 23, as drafted, depends from claim 22 (which, as noted above, depends from independent claim 16). If applicant intended for claim 23 to be an independent claim, the examiner suggests that one way to do so is to amend the last portion of claim 23 to explicitly recite the limitations of claim 22 and its base claim 16 instead of the current recitation of an “integrated circuit, chip set, on-chip memory, or cache comprising the network of claim 22.”
Yakopcic further discloses an integrated circuit, chip set, on-chip memory, or cache (see, e.g., column 6, lines 57-61, “the analog neuromorphic processing device 100 has significant computational efficiency while maintaining the size of the analog neuromorphic processing device 100 to a chip that may easily be positioned on a circuit board.” and column 7, lines 55-57, “The scaling of the resistive memories into additional neurons may be done within the analog neuromorphic processing device 100 such as within a single chip. However, the analog neuromorphic processing device 100 may also be scaled with other analog neuromorphic circuits contained in other chips” [i.e., an integrated circuit or chip set]) comprising the network of claim 22 (as indicated above, Yakopcic in view of Chi and Deisher teaches the network of claim 22 and the NN layer of claim 16, see above citations to Yakopcic, Chi and Deisher regarding the limitations of claims 22 and 16).

With respect to independent claim 24, Yakopcic discloses the invention as claimed including an artificial intelligence (AI) processing system (see, e.g., column 2, lines 57-58, “The present invention also provides an analog neuromorphic system”, column 17, lines 65-68, “the analog neuromorphic circuit 400 may be incorporated into digital signal processing applications” and column 23, lines 49-53, “the analog neuromorphic circuit 400 may be incorporated into analog neuromorphic configurations to execute popular neural network algorithms” [i.e., a neuromorphic/neural network/AI system]) comprising:
a central processing unit (CPU) (see, e.g., column 2, line 67, column 4, lines 60-62, “A controller is configured”, “instructions stored on a machine-readable medium, which may be read and executed by one or more processors.” and column 17, lines 65-68, “digital signal processing applications” [i.e., a digital/conventional processor/CPU to execute instructions]); and
an AI processor (see, e.g., FIG. 1 depicting neuromorphic processing device 100 [i.e., an AI processor] that is electrically and communicatively coupled to other components such as a processor/CPU via lines 140 and 180 and column 6, lines 33-34, “an analog neuromorphic processing device 100” [i.e., neuromorphic processing device 100/AI processor]) …, the AI processor to perform analog in-memory computations based on (1) digital neural network (NN) weighting factors provided by the CPU and (2) input data provided by the CPU (see, e.g., column 6, lines 10-14, 35-40 and 50-54, “simultaneous execution of addition and multiplication operations in an analog circuit” [i.e., perform analog computations], “The analog neuromorphic processing device 100 includes a plurality of input voltages 140(a-n) that are applied to a plurality of respective inputs of the analog neuromorphic processing device 100 and the analog neuromorphic processing device 100 then generates a plurality of output signals 180”, column 17, lines 49-58, “resistive memories are also of nano-scale sizes that enable a significant amount of resistive memories to be configured within the analog neuromorphic processing device 100” [i.e., the AI processor/neuromorphic processing device 100 performs analog in-memory computations based on input data] and column 25, lines 57-59, “The analog neuromorphic circuit 400 is capable of executing dot product operations in numerous applications such as but not limited to neural applications, image recognition, image processing, digital signal processing … the analog neuromorphic circuit 400 may be incorporated into image processing applications where the vector represents an image and the matrix includes a set of weighted values” [i.e., and based on digital NN weighting factors], “feature maps after being generated may then be stored in a digital storage layer” [i.e., provided by the CPU]),
wherein the AI processor comprises a NN layer (see, e.g., column 10, lines 17-25, “neuromorphic circuit 200 may also be scaled to include additional layers of neurons” and column 25, lines 35-38, “neuromorphic circuit 1000 includes … resistive memories 410(a-n)” [i.e., AI processor includes an NN layer of neurons and neuromorphic circuit includes a processor and memory circuitry/resistive memories 410a-n]), the NN layer comprising:
a first memory circuit configured to store a subset of the digital weighting factors (see, e.g., column 17, lines 56-57, “vector represents an image and the matrix includes a set of weighted values that are to be applied to the image” [i.e., subset of weighted values/weighting factors to be applied to an image], column 23, lines 54-56, “In executing each of the neural network algorithms, the weights of each of the resistive memories 410(a-n) may be determined” [i.e., subset of weights for each memory circuit 410a-n] and column 25, lines 57-59, “pixel feature maps after being generated may then be stored in a digital storage layer as the output of the first convolution layer.” [i.e., a first memory circuit stores the subset of weights]), the first memory circuit comprising … random access memory (see, e.g., column 4, line 64-column 5, line 1, “a machine-readable medium may include … random access memory (RAM)” [i.e., memory circuit including random access memory/RAM]);
a second memory circuit configured to store the input data (see, e.g., column 18, lines 26-33 and column 25, lines 35-38 and 57-58: "image is a two-dimensional image depicted by the image matrix" [i.e., the image data is data associated with the NN layer], "neuromorphic circuit 1000 includes … resistive memories 410(a-n) … a digital storage layer" [i.e., resistive memories 410a-n include a second memory circuit for storing the input data]), the second memory circuit comprising … random access memory (see, e.g., column 4, line 62-column 5, line 1, “A machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine … a machine-readable medium may include … random access memory (RAM); … flash memory devices;” [i.e., another, second memory circuit/device includes random access memory/RAM]); and
a … processor … to calculate a sequence of analog dot products, each of the analog dot products calculated between one of the first sequence of analog vectors corresponding to a column of the first memory circuit and one of the second sequence of analog vectors corresponding to a column of the second memory circuit (see, e.g., column 17, lines 26-52, “The dot-product operation with a vector, such as the example vector in Equation 1, and a matrix, such as the example matrix in Equation 2, may then be executed incorporating the analog neuromorphic circuit 400. … where the dot product operation is executed with the dot-product operation values 470(a-n) generated as output values of the dot product operation. … conversion of the resistance values for the resistive memories 410(a-n) to represent to represent the non-binary values included in a matrix, the analog neuromorphic circuit 400 is able to execute dot product operations … The analog neuromorphic circuit 400 is capable of executing dot product operations in numerous applications such as but not limited to neural applications” [i.e., the dot product is calculated between analog column and row vectors of two matrices corresponding to columns of first and second memory circuits in resistive memories 410(a-n), calculate a sequence of analog dot products] and column 20, lines 28-35, “The output configuration 500 includes the first op-amp configuration 520 and the second op-amp configuration 530 that may be positioned at the output of each column of the analog neuromorphic circuit 400 to both scale the output voltage signal 510 to a value on the non-linear smooth function 610 between "0" and "1" and does so by incorporating a neuron function such as an activation function and/or a thresholding function.” [i.e., a processor/op-amp configuration of the analog neuromorphic circuit 400 to perform analog calculations]).
Although Yakopcic substantially discloses the claimed invention, Yakopcic is not relied on to explicitly disclose an AI processor coupled to the CPU and
a cross bit line processor (CBLP) to calculate a sequence of analog dot products, each of the analog dot products calculated between one of the first sequence of analog vectors … and one of the second sequence of analog vectors.
In the same field, analogous art Chi teaches an AI processor coupled to the CPU (see, e.g., FIG. 3 – showing PRIME architecture with an AI processor coupled to the CPU and pages 32, “when PRIME is accelerating NN computation, CPU can still access the memory and work in parallel” and 34, “When LRN layers are applied PRIME requires the help of CPU for LRN computation” [i.e., an AI processor in the PRIME architecture works in parallel with the CPU and is communicatively coupled to the CPU]) and
a cross bit line processor (CBLP) to calculate a sequence of analog dot products, each of the analog dot products calculated between one of the first sequence of analog vectors … and one of the second sequence of analog vectors (as indicated above, “a cross bit line processor (CBLP)”, under the BRI, in light of the specification, is any processor, functional unit, capacitor, circuitry or circuit that is capable of performing analog calculations) (see, e.g., pages 31, “in order to allow FF subarrays to switch bitlines between memory and computation modes, we attach a multiplexer to each bitline to control the switch … We enable an FF subarray to access any physical location in a Buffer subarray to accommodate the random memory access pattern in NN computation (e.g., in the connection of two convolutional layers).” [i.e., bitline processing using vectors corresponding to first and second memory circuits] and 34, “To implement synapse composing, the high-bit and low-bit parts of the synaptic weights are stored in adjacent bitlines of the corresponding crossbar array … (as shown in Figure 4 A ); the output currents are accumulated at the bitlines. … the input vector is the voltages [i.e., sequences of vectors for inputs] … we execute the dot products of {ai} and six sets of weights … in ReRAM cells, and execute the dot product of the inputs and the weights to obtain the mean value of n inputs.” [i.e., a cross-bitline unit/circuit performs analog calculations and calculates a series of analog dot products between sequences of vectors]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yakopcic to incorporate the teachings of Chi to provide “a novel PIM [Processing-in-memory] architecture, called PRIME, to accelerate NN [neural network] applications in ReRAM based main memory” where the architecture for “processing in ReRAM-based main memory, PRIME … directly leverages ReRAM cells to perform computation without the need for extra PUs.” [processing units] (See, e.g., Chi, Abstract and page 30, section III). Doing so would have allowed Yakopcic to use Chi’s architecture to achieve “significant performance improvement and energy saving. Our experimental results show that, compared with a state-of-the-art neural processing unit design, PRIME improves the performance” because the architecture “efficiently accelerates NN computation by leveraging ReRAM’s computation capability and the PIM architecture”, as suggested by Chi (See, e.g., Chi, Abstract and page 30, section III).
Yakopcic in view of Chi substantially teaches the claimed invention.
However, Yakopcic in view of Chi is not relied on to teach a first memory circuit comprising static random access memory (SRAM); and
the second memory circuit comprising static random access memory (SRAM).
In the same field, analogous art Deisher teaches a first memory circuit comprising static random access memory (SRAM) (see, e.g., paragraphs 47, “The NN system 200 may have at least one processor 250 which … may send data to, and receive data from, a volatile memory 248 which may be on-board, on-die or on-chip relative to the SoC, and may be RAM such as DRAM or SRAM” [i.e., neural network/NN system with a first memory chip/circuit including SRAM] and 281, “Memory 1412 may be implemented as a volatile memory device such as … Static RAM (SRAM).” [i.e., a first memory chip/circuit comprising SRAM]); and 
the second memory circuit comprising static random access memory (SRAM) (see, e.g., paragraphs 47, “The NN system 200 may have at least one processor 250 which … may send data to, and receive data from, a volatile memory 248 which may be on-board, on-die or on-chip relative to the SoC, and may be RAM such as DRAM or SRAM … processor 250 may retrieve or transmit data to other external (off-die or off-chip) volatile memory (such as cache and/or RAM) or … memory 248 or another memory … processor 250 may retrieve or transmit data to other external (off-die or off-chip) volatile memory”, 48, “NN buffers 256 could be, or at least partially be, held external to the SoC 200 on volatile … memory forming memory 248” [i.e., neural network/NN system with other volatile memory/second memory chip/circuit including SRAM] and 281, “Memory 1412 may be implemented as a volatile memory device such as … Static RAM (SRAM).” [i.e., second memory chip/circuit comprising SRAM]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Deisher with Yakopcic in view of Chi in order to provide an “NN system 200 [that] may be a system on a chip (SoC) that has an NN Accelerator (NNA)” and includes a “processor 250 [that] may process instructions and may send data to, and receive data from, a volatile memory 248 which may be on-board, on-die or on-chip relative to the SoC, and may be RAM such as DRAM or SRAM and so forth. The processor 250 may control data flow with the memory” and “The processor 250 may retrieve or transmit data to other external (off-die or off-chip) volatile memory (such as cache and/or RAM) or non-volatile memory whether as memory 248 or another memory” for storing “the layer data within a layer as arranged in the memory” and storing “an input array” and a “weight matrix” (See, e.g., Deisher, paragraphs 46-47, 59 and 62). Doing so would have allowed Yakopcic in view of Chi to use Deisher’s NN system and NN accelerator components to achieve a “substantial reduction in the use of memory transactions and bandwidth to upload the same weight matrix multiple times for different groups”, as suggested by Deisher (See, e.g., Deisher, paragraphs 46-47 and 62).

Regarding claim 25, as discussed above, Yakopcic in view of Chi and Deisher teaches the system of claim 24.
Examiner’s Note: claim 25, as drafted, depends from claim 24. If applicant intended for claim 25 to be an independent claim, the examiner suggests that one way to do so is to amend the last portion of claim 25 to explicitly recite the limitations of claim 24 instead of the current recitation of an “integrated circuit or chip set comprising the system of claim 24”.
Yakopcic further discloses an integrated circuit or chip set (see, e.g., column 6, lines 57-61, “the analog neuromorphic processing device 100 has significant computational efficiency while maintaining the size of the analog neuromorphic processing device 100 to a chip that may easily be positioned on a circuit board.” and column 7, lines 55-57, “The scaling of the resistive memories into additional neurons may be done within the analog neuromorphic processing device 100 such as within a single chip. However, the analog neuromorphic processing device 100 may also be scaled with other analog neuromorphic circuits contained in other chips” [i.e., an integrated circuit or chip set]) comprising the system of claim 24 (as indicated above, Yakopcic in view of Chi and Deisher teaches the system of claim 24, see above citations to Yakopcic, Chi and Deisher regarding the limitations of claim 24).

Conclusion
The prior art made of record, listed on form PTO-892, and not relied upon, is considered pertinent to applicant's disclosure. 
The examiner requests, in response to this office action, support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.
When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RANDY K BALDWIN whose telephone number is (571) 270-5222. The examiner can normally be reached Mon - Fri, 9:00-6:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/R.K.B./ Examiner, Art Unit 2125

/KAMRAN AFSHAR/ Supervisory Patent Examiner, Art Unit 2125