DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 6/1/2022 has been entered.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to the claims and remarks filed on 6/1/2022. Claims 1-20 are pending and have been examined. 

Response to Amendment
Claims 1-2 and 10-12 were amended by applicant, and no claims were cancelled or added in the amendment.
Claim 1 is no longer being interpreted under 35 U.S.C. 112(f) in view of the 6/1/2022 amendment to this claim.

Response to Arguments
Applicant's arguments filed 6/1/2022 with respect to the previous interpretation of claim 1 under 35 U.S.C. 112(f) have been fully considered and are persuasive.
Applicant's arguments filed 6/1/2022 with respect to the rejections of claims 1-20 under 35 U.S.C. 103 have been carefully and fully considered, but are not persuasive.
Regarding amended independent claims 1 and 11 and the primary Yao reference, applicant alleges, and the examiner does not acquiesce, that “Yao describes two different techniques that are used while performing training operations for a neural network using resistive random access memory (RRAM) cells in an RRAM array” and “Yao does not describe or suggest, however, the switching operation of the amended independent claims.” (applicant’s remarks, page 11, paraphrasing Yao without citing any portions of Yao, and characterizing and paraphrasing limitations added to claims 1 and 11).
With continued reference to amended claims 1 and 11, applicant then asserts that “Yao does not describe or suggest ‘the one or more initial iterations being performed by the processor using neural network data acquired from the memory’ and ‘when one or more error values are less than a threshold, switch from using the processor to using the analog circuit element functional block for performing remaining iterations of the neural network training operation, the switching including using the neural network data from the memory to configure the analog circuit element functional block to perform the remaining iterations,’ etc. such as in independent claim 1, or the language of independent claim 11.” Id.
Applicant then characterizes Yao in stating “Yao compares backpropagation update values for RRAM cells to respective thresholds and only updates the RRAM cells when backpropagation update values are greater than the threshold, which helps to avoid making smaller-magnitude updates to RRAM cells. Because RRAM cells wear with each update, preventing these smaller-magnitude updates to the RRAM cells prolongs the service life and increases the reliability of the RRAM array.” and “Yao describes how RRAM cells effected by hard errors such as ‘stuck at 0’ are mapped (i.e., locations for RRAM cells experiencing errors in the RRAM array can be identified). Yao then uses the mapping of RRAM cells to rearrange neural network data so that RRAM cells can be used (i.e., so that the fixed/erroneous output from a given RRAM cell can still be used), despite the RRAM cells experiencing the errors” (applicant’s remarks, page 13). 
Applicant further asserts that “Yao describes the two techniques for using RRAM arrays for training operations for neural networks, but does not describe or suggest the switching operation of the amended independent claims. More specifically, Yao does not describe or suggest ‘the one or more initial iterations being performed by the processor using neural network data acquired from the memory’ and ‘when one or more error values are less than a threshold, switch from using the processor to using the analog circuit element functional block for performing remaining iterations of the neural network training operation, the switching including using the neural network data from the memory to configure the analog circuit element functional block to perform the remaining iterations,’ such as in amended independent claim 1, or the language of amended independent claim 11.” (applicant’s remarks, page 14). 
With reference to the secondary Gokmen reference applied to claims 1 and 11, applicant states, and the examiner does not concede, that “Gokmen describes operations for determining when the training of a convolutional neural network is completed. Gokmen, however, does not describe or suggest the switching operation of independent claims 1 and 11. Gokmen therefore does not remedy the deficiency of Yao” before concluding that “Yao describes two separate techniques (i.e., RRAM cell update blocking and RRAM array hard error mapping) for using an RRAM array for processing operations for a neural network and Gokmen generally describes using a memristor array for computations for a convolutional neural network, but neither Yao nor Gokmen separately describes or suggests independent claims 1 and 11.” (applicant’s remarks, pages 15-16, characterizing and paraphrasing limitations added to claims 1 and 11).
Accordingly, applicant apparently argues that the “one or more initial iterations being performed by the processor using neural network data acquired from the memory” and “when one or more error values are less than a threshold, switch from using the processor to using the analog circuit element functional block for performing remaining iterations of the neural network training operation, the switching including using the neural network data from the memory to configure the analog circuit element functional block to perform the remaining iterations” limitations added to claims 1 and 11, using respective similar language, are not disclosed or taught in the portions of the Yao and Gokmen references applied to claims 1 and 11 in the previous Office Action. The examiner respectfully disagrees and points applicant to the below discussion of Yao and Gokmen.
With regard to the limitation “one or more initial iterations being performed by the processor using neural network data acquired from the memory” recited in amended claims 1 and 11, the examiner points to paragraphs 92, 94, 103, 107-108 and 112 of Yao, which disclose that “the comparison module includes the preset threshold … the processor sets preset thresholds of all layers of the neural network” [i.e., initial iterations use a threshold set by the processor], “approximate neural network training is obtained. In a next iteration process, larger backpropagation error values <B1, B2, ... , Bm> … are generated in the approximate neural network training”, “execution may be completed using a general purpose processor. The hard error distribution map Q obtained by the error test module may be written to a dedicated area of a peripheral circuit of the RRAM, and sent to the general purpose processor together with the updated weight area of the neural network for rearrangement.” [i.e., the initial neural network training iterations being performed by the processor using neural network data], “errors … detected by changing resistance of the RRAM and comparing the changed resistance with an ideal change result. … before error detection, all current resistance of the RRAM needs to be read and recorded. … a block error detection method may be used … An original RRAM array is first divided into several mutually exclusive submatrices, and error detection is performed”, “FIG. 8 shows an example error detection circuit with an array size of 5x5. (The array is an error detection subarray, and may be a part of the crossbar array” [i.e., detecting/monitoring error values] and “The neural network training apparatus 900 is applied to a RRAM, and the apparatus includes a processor 910 configured to input neuron input values <ri1, ri2, ... , rin> of an rth layer of a neural network to the RRAM, and perform calculation for the neuron input values <ri1, ri2, ... , rin> based on filters in the RRAM, to obtain neuron output values <ro1, ro2, ... , rom> of the rth layer of the neural network, where n is a positive integer greater than 0, and m is a positive integer greater than 0” [i.e., processor 910 of training apparatus 900 performs the one or more initial iterations – greater than 0 iterations, using neural network data/neuron values acquired/read from the RRAM/memory].
Regarding the newly added limitation “when one or more error values are less than a threshold, switch from using the processor to using the analog circuit element functional block for performing remaining iterations of the neural network training operation” recited in independent claims 1 and 11, using similar respective language, the examiner points to paragraphs 94 and 122 of Yao, which explicitly disclose that “When the backpropagation update value C exceeds the preset threshold, the value may be updated to the RRAM crossbar array in a next iteration” [i.e., error values are compared to the preset threshold before next iteration], “In a next iteration process, larger backpropagation error values <B1, B2, ... , Bm> … are generated in the approximate neural network training … When the backpropagation update value C exceeds the preset threshold, the value may be updated to the RRAM crossbar array in a next iteration” [i.e., when error values are compared to the preset threshold, switch to using the RRAM array/analog circuit block for the next iteration process/remaining iterations], “the error test may be performed after a particular quantity of iterations, because the hard errors of the RRAM may occur constantly in the neural network training. After a period of training [i.e., after initial iterations], a phased error test is performed on the RRAM, a corresponding hard error distribution map of the RRAM is output, and then the processor 910 performs data rearrangement for the neural network based on the hard error” [i.e., based on the error threshold, switching from using processor 910 to using the RRAM array/analog circuit block for performing next, remaining iterations of the neural network training operation].
With regard to the new limitation “the switching including using the neural network data from the memory to configure the analog circuit element functional block to perform the remaining iterations” added to claims 1 and 11, the examiner points to paragraphs 94 and 98 of Yao, which explicitly disclose that “In a next iteration process, larger backpropagation error values <B1, B2, ... , Bm> … are generated in the approximate neural network training … When the backpropagation update value C exceeds the preset threshold, the value may be updated to the RRAM crossbar array in a next iteration [i.e., error values are compared to the preset threshold]. … backpropagation update values are accumulated for a plurality of times … for a training process of the neural network.” [i.e., use the accumulated update data/values from the RRAM/memory] and “after … the backward calculation module, the comparison module … complete one or more iterations of neural network training, the error test module may perform an error test on the RRAM” [i.e., configure module/functional block to perform next, remaining iterations of the neural network training operation].
With continued reference to the new “when one or more error values are less than a threshold” and “perform the remaining iterations; and … performing … the remaining iterations” limitations recited in claims 1 and 11, the examiner further points to paragraphs 59 and 73 of Gokmen, which disclose “compar[ing] the network's calculated values for the output nodes to these ‘correct’ values, and to calculate an error term for each node … These error terms are then used to adjust the weights in the hidden layers so that in the next iteration the output values will be closer to the ‘correct’ values.”, “the training can be deemed completed if the CNN identifies the inputs according to the expected outputs with a predetermined error threshold. If the training is not yet completed, another iteration, or training epoch is performed using the modified convolutional kernels from the most recent iteration.” [i.e., when error values for expected outputs are below the predetermined error threshold, training is not completed and remaining iterations of the CNN/neural network training operation are then performed].
As detailed below, the combination of Yao and Gokmen (i.e., Yao in view of Gokmen) teaches the other limitations of amended independent claims 1 and 11, and dependent claims 2, 4-10, 12 and 14-20. As further discussed below, the combination of Yao, Gokmen and Yu (i.e., Yao in view of Gokmen, and further in view of Yu) teaches the limitations of dependent claims 3 and 13. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-2, 4-12 and 14-20 are rejected under 35 U.S.C. 103 as being unpatentable over Yao et al. (U.S. Patent Application Pub. No. 2020/0117997 A1, hereinafter “Yao”) in view of Gokmen (U.S. Patent Application Pub. No. 2018/0075338 A1, hereinafter “Gokmen”). Yao was filed as a national stage application of PCT application no. PCT/CN2018/091033 filed on June 13, 2018, and claims foreign priority to Chinese application CN 201710459806.0, which was filed on June 16, 2017, and both of these dates are before the effective filing date of this application, i.e., November 14, 2018. Therefore, Yao constitutes prior art under 35 U.S.C. 102(a)(2). Gokmen was published on March 15, 2018, and this date is before the effective filing date of this application, i.e., November 14, 2018. Therefore, Gokmen constitutes prior art under 35 U.S.C. 102(a)(1). The examiner further notes that Gokmen was filed on April 6, 2017 as a continuation of U.S. Patent Application No. 15/262,606, filed on Sep. 12, 2016, and both of these dates are also before the effective filing date of this application, i.e., November 14, 2018.
With respect to claim 1, Yao discloses the invention as claimed including a system that performs training operations for a neural network (see, e.g., FIG. 9 - depicting a neural network training apparatus 900/system and paragraphs 11 and 112, “neural network training method and apparatus in order to prolong service life of an RRAM that performs neural network training”, “FIG. 9 is a schematic structural diagram of a neural network training apparatus 900” [i.e., an apparatus/system that performs neural network training]), comprising:
a processor (see, e.g., FIG. 9 – depicting “Processor 910” of “Neural network training apparatus 900” and paragraph 112, “The neural network training apparatus 900 is applied to a RRAM, and the apparatus includes a processor 910 configured to input neuron input values” [i.e., a processor 910 for training apparatus 900]); 
a memory (see, e.g., paragraphs 8, 112 and 129, “resistive random access memory (RRAM) device”, “The neural network training apparatus 900 is applied to a RRAM, and the apparatus includes a processor 910 configured to input neuron input values”, “The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM)” [i.e., a RRAM device, a memory]);
an analog circuit element functional block that includes an array of analog circuit elements (see, e.g., paragraphs 8-9 and 60-61, “resistive random access memory (RRAM) device … the RRAM is a non-volatile memory (NVM) … a crossbar array structure is built using the RRAM”, “using an analog circuit form of the RRAM … an RRAM-based analog circuit computation process … the crossbar array structure built using the RRAM memory cell” [i.e., an array of RRAM devices/cells, which are analog circuit elements], “a crossbar array structure is built using the RRAM crossbar array structure using an RRAM cell quite adapts to a matrix vector multiplication operation of the neural network”, “The crossbar array structure is a structure with crossed rows and columns. As shown in FIG. 1, an NVM is disposed on each intersection node (the intersection node is referred to as an NVM node below) for data storage and computing. …obtain a calculation result based on analog calculation, therefore the crossbar array is quite suitable for processing an operation” [i.e., crossbar array structure is an analog circuit element functional block for functions/operations of the neural network]); and
controller circuits, wherein the controller circuits are configured (see, e.g., FIG. 8 – depicting an error detection circuit and paragraph 108, “FIG. 8 shows an example error detection circuit” [i.e., a configured controller/error detection circuit]) to:
monitor error values (see, e.g., FIG. 8 – depicting an error detection circuit and paragraph 108, “error detection circuit with an array size of 5x5. (The array is an error detection subarray … the error detection module may successively perform error detection” [i.e., controller/error detection circuit configured to monitor/detect error values]) computed using an output from each of one or more initial iterations of a neural network training operation (see, e.g., paragraphs 85 and 107-108, “the error test may be performed after a particular quantity of iterations, because the hard errors of the RRAM may occur constantly in the neural network training.”, “FIG. 8 shows … the error detection module. … errors … detected by changing resistance of the RRAM and comparing the changed resistance with an ideal change result.” [i.e., detect/monitor errors computed as output after initial quantity of iterations of a neural network training operation]), the one or more initial iterations being performed by the processor using neural network data acquired from the memory (see, e.g., paragraphs 92, 94, 103, 107-108 and 112, “the comparison module includes the preset threshold … the processor sets preset thresholds of all layers of the neural network” [i.e., initial iterations use a threshold set by the processor], “approximate neural network training is obtained. In a next iteration process, larger backpropagation error values <B1, B2, ... , Bm> … are generated in the approximate neural network training”, “execution may be completed using a general purpose processor. The hard error distribution map Q obtained by the error test module may be written to a dedicated area of a peripheral circuit of the RRAM, and sent to the general purpose processor together with the updated weight area of the neural network for rearrangement.” [i.e., the initial neural network training iterations being performed by the processor using neural network data], “errors … detected by changing resistance of the RRAM and comparing the changed resistance with an ideal change result. … before error detection, all current resistance of the RRAM needs to be read and recorded. … a block error detection method may be used … An original RRAM array is first divided into several mutually exclusive submatrices, and error detection is performed”, “FIG. 8 shows an example error detection circuit with an array size of 5x5. (The array is an error detection subarray, and may be a part of the crossbar array” [i.e., detecting/monitoring error values], “The neural network training apparatus 900 is applied to a RRAM, and the apparatus includes a processor 910 configured to input neuron input values <ri1, ri2, ... , rin> of an rth layer of a neural network to the RRAM, and perform calculation for the neuron input values <ri1, ri2, ... , rin> based on filters in the RRAM, to obtain neuron output values <ro1, ro2, ... , rom> of the rth layer of the neural network, where n is a positive integer greater than 0, and m is a positive integer greater than 0” [i.e., processor 910 of training apparatus 900 performs the one or more initial iterations – greater than 0 iterations, using neural network data/neuron values acquired/read from the RRAM/memory]);
when one or more error values are … a threshold (see, e.g., paragraph 94, “When the backpropagation update value C exceeds the preset threshold, the value may be updated to the RRAM crossbar array in a next iteration” [i.e., error values are compared to the preset threshold before next iteration]), switch from using the processor to using the analog circuit element functional block for performing remaining iterations of the neural network training operation (see, e.g., paragraphs 94 and 122, “In a next iteration process, larger backpropagation error values <B1, B2, ... , Bm> … are generated in the approximate neural network training … When the backpropagation update value C exceeds the preset threshold, the value may be updated to the RRAM crossbar array in a next iteration” [i.e., when error values are compared to the preset threshold, switch to using the RRAM array/analog circuit block for the next iteration process/remaining iterations], “the error test may be performed after a particular quantity of iterations, because the hard errors of the RRAM may occur constantly in the neural network training. After a period of training [i.e., after initial iterations], a phased error test is performed on the RRAM, a corresponding hard error distribution map of the RRAM is output, and then the processor 910 performs data rearrangement for the neural network based on the hard error” [i.e., based on the error threshold, switch from using processor 910 to using the RRAM array/analog circuit block for performing next, remaining iterations of the neural network training operation]), the switching including using the neural network data from the memory to configure the analog circuit element functional block to perform the remaining iterations (see, e.g., paragraphs 94 and 98, “the value may be updated to the RRAM crossbar array in a next iteration. … backpropagation update values are accumulated for a plurality of times … for a training process of the neural network.” [i.e., the switching includes using the accumulated update data/values from the RRAM/memory], “after … the backward calculation module, the comparison module … complete one or more iterations of neural network training, the error test module may perform an error test on the RRAM” [i.e., switch from using processor 910 to using RRAM array/analog circuit block for performing next, remaining iterations of the neural network training operation]); and
cause the analog circuit element functional block to perform the remaining iterations (see, e.g., paragraphs 94, 98 and 122, “When the backpropagation update value C exceeds the preset threshold, the value may be updated to the RRAM crossbar array in a next iteration.”, “after … the backward calculation module, the comparison module, and the update module complete one or more iterations of neural network training … The error test may be performed before the neural network training, that is, performed in advance when an entire training process (or a specific operation) has not been loaded into the crossbar array, or may be performed after a particular quantity of iterations”, “the error test may be performed after a particular quantity of iterations, because the hard errors of the RRAM may occur constantly in the neural network training. After a period of training [i.e., after the one or more initial iterations], a phased error test is performed on the RRAM, a corresponding hard error distribution map of the RRAM is output, and then the processor 910 performs data rearrangement for the neural network based on the hard error” [i.e., switch from processor 910 and cause module/functional block to perform next, remaining iterations of the entire training process/neural network training operation after a particular quantity of initial iterations]).
Although Yao substantially discloses the claimed invention, Yao is not relied on to explicitly disclose when one or more error values are less than a threshold, … perform the remaining iterations; and … perform the remaining iterations.
In the same field, analogous art Gokmen teaches when one or more error values are less than a threshold, … perform the remaining iterations; and … perform the remaining iterations (see, e.g., paragraphs 59 and 73, “compare the network's calculated values for the output nodes to these ‘correct’ values, and to calculate an error term for each node … These error terms are then used to adjust the weights in the hidden layers so that in the next iteration the output values will be closer to the ‘correct’ values.”, “the training can be deemed completed if the CNN identifies the inputs according to the expected outputs with a predetermined error threshold. If the training is not yet completed, another iteration, or training epoch is performed using the modified convolutional kernels from the most recent iteration” [i.e., when error values for expected outputs are below the predetermined error threshold, training is not completed and then perform the remaining iterations of the CNN/neural network training operation]).
Yao and Gokmen are analogous art because they are both directed to using arrays of resistive devices for training neural networks (see, e.g., Yao, Abstract and paragraphs 11-12 which disclose “A neural network training method [that] includes inputting neuron input values of a neural network to the RRAM” and “an RRAM that performs neural network training … a neural network training method is provided, and the method is applied to a RRAM”, and Gokmen, Abstract and paragraph 6, which disclose “using resistive processing unit (RPU) array” and “training a convolution layer of a convolutional neural network (CNN) using resistive processing unit (RPU) arrays”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yao to incorporate the teachings of Gokmen to provide technical solutions “for implementing a convolutional neural network (CNN) using resistive processing unit (RPU) array” and “solutions for accelerating training of convolutional neural networks” that “include using RPUs, such as those configured in an RPU array for training convolutional neural networks” and “performing update pass computations via the RPU array by transmitting voltage pulses corresponding to the input data of the convolution layer and the error of the output maps to the RPU array.” (See, e.g., Gokmen, Abstract and paragraph 43). Doing so would have allowed Yao to use Gokmen’s RPU array and technical solutions to “facilitate gain in efficiencies of deep learning techniques that use convolutional neural networks” and to use “Deep learning [which] inherently leverages the availability of massive training datasets (that are enhanced with the use of Big Data) and compute power”, as suggested by Gokmen (See, e.g., Gokmen, Abstract, paragraph 44). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.

With respect to independent claim 11, Yao discloses the invention as claimed including a method for performing training operations for a neural network in a system (see, e.g., FIG. 9 - depicting a neural network training apparatus 900/system and paragraphs 11 and 112, “neural network training method and apparatus in order to prolong service life of an RRAM that performs neural network training”, “FIG. 9 is a schematic structural diagram of a neural network training apparatus 900” [i.e., a method for performing neural network training operations in an apparatus/system]) that includes a processor (see, e.g., FIG. 9 – depicting “Processor 910” of “Neural network training apparatus 900” and paragraph 112, “The neural network training apparatus 900 is applied to a RRAM, and the apparatus includes a processor 910 configured to input neuron input values” [i.e., a processor 910 for training apparatus 900]), a memory (see, e.g., paragraphs 8, 112 and 129, “resistive random access memory (RRAM) device”, “The neural network training apparatus 900 is applied to a RRAM, and the apparatus includes a processor 910 configured to input neuron input values”, “The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM)” [i.e., a RRAM device, a memory]), and an analog circuit element functional block with an array of analog circuit elements (see, e.g., paragraphs 8-9 and 60-61, “resistive random access memory (RRAM) device … the RRAM is a non-volatile memory (NVM) … a crossbar array structure is built using the RRAM”, “using an analog circuit form of the RRAM … an RRAM-based analog circuit computation process … the crossbar array structure built using the RRAM memory cell” [i.e., an array of RRAM devices/cells, which are analog circuit elements], “a crossbar array structure is built using the RRAM crossbar array structure using an RRAM cell quite adapts to a matrix vector multiplication operation of the neural network”, “The crossbar array structure is a structure with crossed rows and columns. As shown in FIG. 1, an NVM is disposed on each intersection node (the intersection node is referred to as an NVM node below) for data storage and computing. …obtain a calculation result based on analog calculation, therefore the crossbar array is quite suitable for processing an operation” [i.e., crossbar array structure is an analog circuit element functional block for functions/operations of the neural network]), the method comprising:
monitoring error values computed using an output from each of one or more initial iterations of a neural network training operation (see, e.g., paragraphs 85 and 107, “the error test may be performed after a particular quantity of iterations, because the hard errors of the RRAM may occur constantly in the neural network training.”, “FIG. 8 shows … the error detection module. … errors … detected by changing resistance of the RRAM and comparing the changed resistance with an ideal change result.” [i.e., detecting/monitoring errors computed as output after initial quantity of iterations of a neural network training operation]), the one or more initial iterations being performed by the processor using neural network data acquired from the memory (see, e.g., paragraphs 92, 94, 103, 107-108 and 112, “the comparison module includes the preset threshold … the processor sets preset thresholds of all layers of the neural network” [i.e., initial iterations use a threshold set by the processor], “approximate neural network training is obtained. In a next iteration process, larger backpropagation error values <B1, B2, ... , Bm> … are generated in the approximate neural network training”, “execution may be completed using a general purpose processor. The hard error distribution map Q obtained by the error test module may be written to a dedicated area of a peripheral circuit of the RRAM, and sent to the general purpose processor together with the updated weight area of the neural network for rearrangement.” [i.e., the initial neural network training iterations being performed by the processor using neural network data], “errors … detected by changing resistance of the RRAM and comparing the changed resistance with an ideal change result. … before error detection, all current resistance of the RRAM needs to be read and recorded. … a block error detection method may be used … An original RRAM array is first divided into several mutually exclusive submatrices, and error detection is performed”, “FIG. 8 shows an example error detection circuit with an array size of 5x5. (The array is an error detection subarray, and may be a part of the crossbar array” [i.e., detecting/monitoring error values], “The neural network training apparatus 900 is applied to a RRAM, and the apparatus includes a processor 910 configured to input neuron input values <ri1, ri2, ... , rin> of an rth layer of a neural network to the RRAM, and perform calculation for the neuron input values <ri1, ri2, ... , rin> based on filters in the RRAM, to obtain neuron output values <ro1, ro2, ... , rom> of the rth layer of the neural network, where n is a positive integer greater than 0, and m is a positive integer greater than 0” [i.e., processor 910 of training apparatus 900 performs the one or more initial iterations – greater than 0 iterations, using neural network data/neuron values acquired/read from the RRAM/memory]);
when one or more error values are … a threshold (see, e.g., paragraph 94, “When the backpropagation update value C exceeds the preset threshold, the value may be updated to the RRAM crossbar array in a next iteration.” [i.e., error values are compared to the preset threshold before next iteration]), switching from using the processor to using the analog circuit element functional block for performing remaining iterations of the neural network training operation (see, e.g., paragraphs 94 and 122, “In a next iteration process, larger backpropagation error values <B1, B2, ... , Bm> … are generated in the approximate neural network training … When the backpropagation update value C exceeds the preset threshold, the value may be updated to the RRAM crossbar array in a next iteration.” [i.e., when error values are compared to the preset threshold, switch to using the RRAM array/analog circuit block for the next iteration process/remaining iterations], “the error test may be performed after a particular quantity of iterations, because the hard errors of the RRAM may occur constantly in the neural network training. After a period of training [i.e., after initial iterations], a phased error test is performed on the RRAM, a corresponding hard error distribution map of the RRAM is output, and then the processor 910 performs data rearrangement for the neural network based on the hard error” [i.e., based on the error threshold, switching from using processor 910 to using the RRAM array/analog circuit block for performing next, remaining iterations of the neural network training operation]), the switching including using the neural network data from the memory to configure the analog circuit element functional block to perform the remaining iterations (see, e.g., paragraphs 94 and 98, “the value may be updated to the RRAM crossbar array in a next iteration. … backpropagation update values are accumulated for a plurality of times … for a training process of the neural network.” [i.e., the switching includes using the accumulated update data/values from the RRAM/memory], “after … the backward calculation module, the comparison module … complete one or more iterations of neural network training, the error test module may perform an error test on the RRAM” [i.e., switch from using processor 910 to using RRAM array/analog circuit block for performing next, remaining iterations of the neural network training operation]); and
performing, in the analog circuit element functional block, the remaining iterations (see, e.g., paragraphs 94, 98 and 122, “When the backpropagation update value C exceeds the preset threshold, the value may be updated to the RRAM crossbar array in a next iteration.”, “after … the backward calculation module, the comparison module, and the update module complete one or more iterations of neural network training … The error test may be performed before the neural network training, that is, performed in advance when an entire training process (or a specific operation) has not been loaded into the crossbar array, or may be performed after a particular quantity of iterations”, “the error test may be performed after a particular quantity of iterations, because the hard errors of the RRAM may occur constantly in the neural network training.” [i.e., performing, in the module/functional block, remaining iterations of the neural network training operation after a particular quantity of initial iterations]).
Although Yao substantially discloses the claimed invention, Yao is not relied on to explicitly disclose when one or more error values are less than a threshold, … perform the remaining iterations; and … performing … the remaining iterations.
In the same field, analogous art Gokmen teaches when one or more error values are less than a threshold, … perform remaining iterations; and … performing … the remaining iterations (see, e.g., paragraphs 59 and 73, “compare the network's calculated values for the output nodes to these ‘correct’ values, and to calculate an error term for each node … These error terms are then used to adjust the weights in the hidden layers so that in the next iteration the output values will be closer to the ‘correct’ values.”, “the training can be deemed completed if the CNN identifies the inputs according to the expected outputs with a predetermined error threshold. If the training is not yet completed, another iteration, or training epoch is performed using the modified convolutional kernels from the most recent iteration.” [i.e., when error values for expected outputs are below the predetermined error threshold, training is not completed and then performing remaining iterations of the CNN/neural network training operation]).
Yao and Gokmen are analogous art because they are both directed to using arrays of resistive devices for training neural networks (see, e.g., Yao, Abstract and paragraphs 11-12 which disclose “A neural network training method [that] includes inputting neuron input values of a neural network to the RRAM” and “an RRAM that performs neural network training … a neural network training method is provided, and the method is applied to a RRAM”, and Gokmen, Abstract and paragraph 6, which disclose “using resistive processing unit (RPU) array” and “training a convolution layer of a convolutional neural network (CNN) using resistive processing unit (RPU) arrays”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yao to incorporate the teachings of Gokmen to provide technical solutions “for implementing a convolutional neural network (CNN) using resistive processing unit (RPU) array” and “solutions for accelerating training of convolutional neural networks” that “include using RPUs, such as those configured in an RPU array for training convolutional neural networks” and “performing update pass computations via the RPU array by transmitting voltage pulses corresponding to the input data of the convolution layer and the error of the output maps to the RPU array.” (See, e.g., Gokmen, Abstract and paragraph 43). Doing so would have allowed Yao to use Gokmen’s RPU array and technical solutions to “facilitate gain in efficiencies of deep learning techniques that use convolutional neural networks” and to use “Deep learning [which] inherently leverages the availability of massive training datasets (that are enhanced with the use of Big Data) and compute power”, as suggested by Gokmen (See, e.g., Gokmen, Abstract, paragraph 44). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.
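For illustration only, the switching limitation of claims 1 and 11 as mapped above (initial iterations on the processor, error values monitored against a threshold, remaining iterations on the analog block configured from memory data) can be sketched in Python. The function name, the backend labels, and the sample error values are hypothetical; none of them come from Yao, Gokmen, or the application.

```python
def schedule_iterations(errors, threshold, memory_weights):
    """Hypothetical sketch: decide which backend runs each training iteration.

    errors[i] is the error value computed from the output of iteration i.
    memory_weights stands in for the neural network data held in memory.
    """
    backend = "processor"
    analog_conductances = None  # analog block is not configured yet
    schedule = []
    for err in errors:
        schedule.append(backend)
        # Monitor the error from this iteration; once it is below the
        # threshold, configure the analog block from the memory data and
        # run every remaining iteration there.
        if backend == "processor" and err < threshold:
            analog_conductances = list(memory_weights)
            backend = "analog"
    return schedule, analog_conductances


schedule, conductances = schedule_iterations(
    [0.9, 0.5, 0.2, 0.15, 0.1], threshold=0.3, memory_weights=[0.4, -0.1]
)
print(schedule)
```

In this sketch the switch is one-way: once the analog block is configured, control does not return to the processor, which is one possible reading of "remaining iterations".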

Regarding claims 2 and 12, as discussed above, Yao in view of Gokmen teaches the system of claim 1 and the method of claim 11.
Yao further discloses wherein the processor is configured to perform each of the one or more initial iterations (see, e.g., paragraph 112, “The neural network training apparatus 900 is applied to a RRAM, and the apparatus includes a processor 910 configured to input neuron input values <ri1, ri2, . . . , rin> of an rth layer of a neural network to the RRAM, and perform calculation for the neuron input values <ri1, ri2, ... , rin> based on filters in the RRAM, to obtain neuron output values <ro 1, ro2, ... , rom> of the rth layer of the neural network, where n is a positive integer greater than 0, and m is a positive integer greater than 0” [i.e., processor 910 of training apparatus 900 performs each of the one or more initial iterations – greater than 0 iterations]) by:
processing a corresponding instance of input data in the neural network to compute an output from the neural network, the processing including using neural network data acquired from the memory (see, e.g., paragraph 112, “The neural network training apparatus 900 is applied to a RRAM, and the apparatus includes a processor 910 configured to input neuron input values <ri1, ri2, . . . , rin> of an rth layer of a neural network to the RRAM, and perform calculation for the neuron input values <ri1, ri2, ... , rin> based on filters in the RRAM [i.e., processing including using neural network data/filters acquired from/in the RRAM/memory], to obtain neuron output values <ro1, ro2, ... , rom> of the rth layer of the neural network” [i.e., process corresponding instance of input values ri/data in the neural network to perform calculations/compute output/values ro by using neural network data from the RRAM/memory]); 
determining an error value based on the output … associated with the corresponding instance of input data (see, e.g., paragraph 70, “Perform calculation based on kernel values of the RRAM, the neuron input values <ri1, ri2, ... , rin> of the rth layer of the neural network, the neuron output values <ro1, ro2, ... , rom> of the rth layer of the neural network, and backpropagation error values <B1, B2, ... , Bm> of the rth layer of the neural network,” [i.e., calculating/determining backpropagation error values B based on the output values ro associated with the corresponding input values/data ri]); and
backpropagating the error value through the neural network and making associated updates to some or all of the neural network data (see, e.g., paragraph 70, “obtain backpropagation update values <C1, C2, ... , Cm> of the rth layer of the neural network, where the kernel values of the RRAM are matrix values of the filters in the RRAM, and the backpropagation error values <B1, B2, ... , Bm> of the rth layer of the neural network are obtained based on the neuron output values <ro1, ro2, ... , rom> of the rth layer of the neural network and neuron reference output values <rt1, rt2, ... , rtm> of the rth layer of the neural network.” [i.e., backpropagating the error values B through layers of the neural network and updating some of the neural network data/kernel values/matrix values of filters in the RRAM]), the making the updates including storing updated neural network data in the memory (see, e.g., paragraphs 73 and 88, “When the backpropagation update values <C1, C2, ... , Cm> of the rth layer of the neural network are greater than the preset threshold, update the filters in the RRAM based on the backpropagation update values <C1, C2, ... , Cm> of the rth layer of the neural network.” [i.e., store updated filters in the RRAM/memory based on backpropagation update values C]).
Although Yao substantially discloses the claimed invention, Yao is not relied on to explicitly disclose determining an error value based on the output and an expected output associated with the corresponding instance of input data.
In the same field, analogous art Gokmen teaches determining an error value based on the output and an expected output associated with the corresponding instance of input data (see, e.g., paragraphs 58-59, “ANN 300 receives inputs x1, x2, x3 directly from a source”, “ANN model 300 processes data records one at a time, and it ‘learns’ by comparing an initially arbitrary classification of the record with the known actual classification of the record. Using a training methodology knows as ‘backpropagation’ (i.e., ‘backward propagation of errors’) … compare the network's calculated values for the output nodes to these ‘correct’ values, and to calculate an error term for each node” [i.e., determining/calculating an error term/value based on the output nodes and expected output/‘correct’ value associated with the corresponding input data record/instance]).
Yao and Gokmen are analogous art because they are both directed to using arrays of resistive devices for training neural networks (see, e.g., Yao, Abstract and paragraphs 11-12 which disclose “A neural network training method [that] includes inputting neuron input values of a neural network to the RRAM” and “an RRAM that performs neural network training … a neural network training method is provided, and the method is applied to a RRAM”, and Gokmen, Abstract and paragraph 6, which disclose “using resistive processing unit (RPU) array” and “training a convolution layer of a convolutional neural network (CNN) using resistive processing unit (RPU) arrays”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yao to incorporate the teachings of Gokmen to provide technical solutions “for implementing a convolutional neural network (CNN) using resistive processing unit (RPU) array” and “solutions for accelerating training of convolutional neural networks” that “include using RPUs, such as those configured in an RPU array for training convolutional neural networks” and “performing update pass computations via the RPU array by transmitting voltage pulses corresponding to the input data of the convolution layer and the error of the output maps to the RPU array.” (See, e.g., Gokmen, Abstract and paragraph 43). Doing so would have allowed Yao to use Gokmen’s RPU array and technical solutions to “facilitate gain in efficiencies of deep learning techniques that use convolutional neural networks” and to use “Deep learning [which] inherently leverages the availability of massive training datasets (that are enhanced with the use of Big Data) and compute power”, as suggested by Gokmen (See, e.g., Gokmen, Abstract, paragraph 44). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.

Regarding claims 4 and 14, as discussed above, Yao in view of Gokmen teaches the system of claim 1 and the method of claim 11.
Yao further discloses wherein using the neural network data from the memory to configure the analog circuit element functional block to perform the remaining iterations comprises:
setting, for programmable elements in the array of analog circuit elements in the analog circuit element functional block, conductances based at least in part on a value of respective neural network data (see, e.g., paragraphs 61, 108 and 121, “analog signals V1 to Vn are input to n rows of the crossbar array, respectively. A conductance value of an NVM node in each column of the crossbar array represents a size of a weight stored in the NVM node” [i.e., conductances/conductance values of NVM circuit elements are based on weight values of respective neural network data], “When error detection is performed … , a minimum bias is first written for all RRAM cells in the subarray … when the error detection is performed on the stuck-at-0 error, a minimum bias in reducing resistance (or increasing conductance) needs to be written. When the error detection is performed on the stuck-at-1 error, a minimum bias in increasing resistance (or reducing conductance) needs to be written.”, “the error test may be a program or a test circuit, and carried on the RRAM for implementation.” [i.e., setting/writing, for programmable RRAM cells/elements in the array of analog circuits, conductances based on respective, detected error values in the neural network data]).

Regarding claims 5 and 15, as discussed above, Yao in view of Gokmen teaches the system of claim 4 and the method of claim 14.
Yao further discloses wherein performing each remaining iteration in the analog circuit element functional block includes:
processing a corresponding instance of input data in the neural network to compute an output from the neural network, the processing including computing internal values for nodes in the neural network using respective electrical currents from the programmable elements in the array of analog circuit elements (see, e.g., paragraphs 61 and 112, “analog signals V1 to Vn are input to n rows of the crossbar array, respectively. A conductance value of an NVM node in each column of the crossbar array represents a size of a weight stored in the NVM node. Therefore, after the analog signals V1 to Vn are applied to NVM nodes corresponding to each column, a current value that is output by each NVM node represents a product of a weight stored in the NVM node and a data element represented by an analog signal received by the NVM node” [i.e., using respective electrical currents from NVM elements in the array], “input neuron input values <ri1, ri2, . . . , rin> of an rth layer of a neural network to the RRAM, and perform calculation for the neuron input values <ri1, ri2, ... , rin> based on filters in the RRAM [i.e., processing including calculating/computing internal values – weights and filters in the NVM/RRAM nodes], to obtain neuron output values <ro1, ro2, ... , rom> of the rth layer of the neural network” [i.e., process corresponding instance of input values ri/data in the neural network to perform calculations/compute output/values ro]), the electrical currents from each programmable element in the array of analog circuit elements being proportional to a conductance of that element (see, e.g., paragraphs 61 and 108, “A conductance value of an NVM node in each column of the crossbar array represents a size of a weight stored in the NVM node. Therefore, after the analog signals V1 to Vn are applied to NVM nodes corresponding to each column, a current value that is output by each NVM node represents a product of a weight stored in the NVM node and a data element represented by an analog signal received by the NVM node”, “a magnitude of the bias is determined by resistance precision of the RRAM device. … when the error detection is performed on the stuck-at-0 error, a minimum bias in reducing resistance (or increasing conductance) needs to be written. When the error detection is performed on the stuck-at-1 error, a minimum bias in increasing resistance (or reducing conductance) needs to be written.” [i.e., electrical currents from each NVM/RRAM device/node/element in the array are proportional to a conductance of that NVM/RRAM device/node/element]);
determining an error value based on the output … associated with the corresponding instance of input data (see, e.g., paragraphs 70 and 108, “Perform calculation based on kernel values of the RRAM, the neuron input values <ri1, ri2, ... , rin> of the rth layer of the neural network, the neuron output values <ro1, ro2, ... , rom> of the rth layer of the neural network, and backpropagation error values <B1, B2, ... , Bm> of the rth layer of the neural network”, “Then, an error detection voltage is applied to an input interface … of the error detection subarray, and a calculation result after the bias is written is obtained on an output interface” [i.e., calculating/determining backpropagation error values B/results based on the output values ro associated with the corresponding input values/data ri from the input interface]); and
backpropagating the error value through the neural network (see, e.g., paragraph 70, “obtain backpropagation update values <C1, C2, ... , Cm> of the rth layer of the neural network, where the kernel values of the RRAM are matrix values of the filters in the RRAM, and the backpropagation error values <B1, B2, ... , Bm> of the rth layer of the neural network are obtained based on the neuron output values <ro1, ro2, ... , rom> of the rth layer of the neural network and neuron reference output values <rt1, rt2, ... , rtm> of the rth layer of the neural network.” [i.e., backpropagating the error values B through layers of the neural network]) and making associated updates to conductances of the programmable elements in the array of analog circuit elements based at least in part on the error value (see, e.g., paragraphs 61, 108 and 121, “A conductance value of an NVM node in each column of the crossbar array represents a size of a weight stored in the NVM node”, “When error detection is performed … , a minimum bias is first written for all RRAM cells in the subarray … when the error detection is performed on the stuck-at-0 error, a minimum bias in reducing resistance (or increasing conductance) needs to be written. When the error detection is performed on the stuck-at-1 error, a minimum bias in increasing resistance (or reducing conductance) needs to be written.”, “the error test may be a program or a test circuit, and carried on the RRAM” [i.e., updating/writing, for programmable RRAM cells/elements in the array of analog circuits, conductances based on respective, detected error values]).
Although Yao substantially discloses the claimed invention, Yao is not relied on to explicitly disclose determining an error value based on the output and an expected output associated with the corresponding instance of input data.
In the same field, analogous art Gokmen teaches determining an error value based on the output and an expected output associated with the corresponding instance of input data (see, e.g., paragraphs 58-59, “ANN 300 receives inputs x1, x2, x3 directly from a source”, “ANN model 300 processes data records one at a time, and it ‘learns’ by comparing an initially arbitrary classification of the record with the known actual classification of the record. Using a training methodology knows as ‘backpropagation’ (i.e., ‘backward propagation of errors’) … compare the network's calculated values for the output nodes to these ‘correct’ values, and to calculate an error term for each node” [i.e., determining/calculating an error term/value based on the output nodes and expected output/‘correct’ value associated with the corresponding input data record/instance]).
Yao and Gokmen are analogous art because they are both directed to using arrays of resistive devices for training neural networks (see, e.g., Yao, Abstract and paragraphs 11-12 which disclose “A neural network training method [that] includes inputting neuron input values of a neural network to the RRAM” and “an RRAM that performs neural network training … a neural network training method is provided, and the method is applied to a RRAM”, and Gokmen, Abstract and paragraph 6, which disclose “using resistive processing unit (RPU) array” and “training a convolution layer of a convolutional neural network (CNN) using resistive processing unit (RPU) arrays”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yao to incorporate the teachings of Gokmen to provide technical solutions “for implementing a convolutional neural network (CNN) using resistive processing unit (RPU) array” and “solutions for accelerating training of convolutional neural networks” that “include using RPUs, such as those configured in an RPU array for training convolutional neural networks” and “performing update pass computations via the RPU array by transmitting voltage pulses corresponding to the input data of the convolution layer and the error of the output maps to the RPU array.” (See, e.g., Gokmen, Abstract and paragraph 43). Doing so would have allowed Yao to use Gokmen’s RPU array and technical solutions to “facilitate gain in efficiencies of deep learning techniques that use convolutional neural networks” and to use “Deep learning [which] inherently leverages the availability of massive training datasets (that are enhanced with the use of Big Data) and compute power”, as suggested by Gokmen (See, e.g., Gokmen, Abstract, paragraph 44). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.
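As an illustrative aid only, the crossbar relationship quoted from Yao paragraph 61 in the rejection of claims 5 and 15 (each node outputs a current equal to the product of its stored weight, represented as a conductance, and the applied analog signal) can be sketched in Python. The function name and sample values are hypothetical, and the summation of node currents along each column is an assumption based on ordinary circuit behavior (Kirchhoff's current law) rather than on the quoted passages.

```python
def crossbar_column_currents(voltages, conductances):
    """Hypothetical sketch of an analog crossbar multiply-accumulate.

    voltages[i]        : analog signal applied to row i
    conductances[i][j] : conductance (stored weight) of the node at row i, column j

    Each node contributes voltages[i] * conductances[i][j], so its current is
    proportional to its conductance. Column totals are assumed to be the sum
    of the node currents in that column.
    """
    n_rows = len(voltages)
    n_cols = len(conductances[0])
    return [
        sum(voltages[i] * conductances[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]


currents = crossbar_column_currents([1.0, 2.0], [[0.5, 0.25], [1.0, 0.5]])
print(currents)  # [2.5, 1.25]
```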

Regarding claims 6 and 16, as discussed above, Yao in view of Gokmen teaches the system of claim 5 and the method of claim 15.
Although Yao substantially discloses the claimed invention, Yao is not relied on to explicitly disclose wherein performing the remaining iterations comprises performing individual remaining iterations until a stopping threshold is reached in a magnitude of updates to conductances or error values.
In the same field, analogous art Gokmen teaches wherein performing the remaining iterations comprises performing individual remaining iterations until a stopping threshold is reached in a magnitude of updates to conductances or error values (see, e.g., paragraphs 73, 89 and 96, “modified convolutional kernels 420 after being adjusted can be used for further training of the CNN, unless the training is deemed completed … the training can be deemed completed if the CNN identifies the inputs according to the expected outputs with a predetermined error threshold. If the training is not yet completed, another iteration … is performed using the modified convolutional kernels from the most recent iteration” [i.e., performing iterations until a stopping threshold is reached in magnitudes of updates to error values, the predetermined error threshold], “For weight updates, … the conductance values stored in the relevant RPU [resistive processing unit] devices all update in parallel. … weight updates are performed locally at each RPU 820 of array 800 using the RPU device itself”, “RPU 820A causes an incremental conductance change that is equivalent to a weight change” [i.e., incremental changes/updates to conductances for RPU devices in array based on weight updates from training until training iterations stop because the stopping error threshold is reached]).
Yao and Gokmen are analogous art because they are both directed to using arrays of resistive devices for training neural networks (see, e.g., Yao, Abstract and paragraphs 11-12 which disclose “A neural network training method [that] includes inputting neuron input values of a neural network to the RRAM” and “an RRAM that performs neural network training … a neural network training method is provided, and the method is applied to a RRAM”, and Gokmen, Abstract and paragraph 6, which disclose “using resistive processing unit (RPU) array” and “training a convolution layer of a convolutional neural network (CNN) using resistive processing unit (RPU) arrays”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yao to incorporate the teachings of Gokmen to provide technical solutions “for implementing a convolutional neural network (CNN) using resistive processing unit (RPU) array” and “solutions for accelerating training of convolutional neural networks” that “include using RPUs, such as those configured in an RPU array for training convolutional neural networks” and “performing update pass computations via the RPU array by transmitting voltage pulses corresponding to the input data of the convolution layer and the error of the output maps to the RPU array.” (See, e.g., Gokmen, Abstract and paragraph 43). Doing so would have allowed Yao to use Gokmen’s RPU array and technical solutions to “facilitate gain in efficiencies of deep learning techniques that use convolutional neural networks” and to use “Deep learning [which] inherently leverages the availability of massive training datasets (that are enhanced with the use of Big Data) and compute power”, as suggested by Gokmen (See, e.g., Gokmen, Abstract, paragraph 44). 

Regarding claims 7 and 17, as discussed above, Yao in view of Gokmen teaches the system of claim 1 and the method of claim 11.
Although Yao substantially discloses the claimed invention, Yao is not relied on to explicitly disclose wherein when the neural network has been trained using the training operations, use the neural network to perform one or more specified tasks for unknown instances of input data.
In the same field, analogous art Gokmen teaches wherein when the neural network has been trained using the training operations, use the neural network to perform one or more specified tasks for unknown instances of input data (see, e.g., paragraphs 3 and 62, “Neural networks can be used to estimate or approximate systems and functions that depend on a large number of inputs and are generally unknown.” [i.e., perform estimation or approximation tasks for unknown instances of inputs/input data], “learning process in the ANN context can be viewed as the problem of updating the crosspoint device connection weights so that a network can efficiently perform a specific task. The crosspoint devices typically learn the necessary connection weights from available training patterns.” [i.e., when the ANN/neural network has been trained, use the ANN/neural network to perform a specific task]).
Yao and Gokmen are analogous art because they are both directed to using arrays of resistive devices for training neural networks (see, e.g., Yao, Abstract and paragraphs 11-12 which disclose “A neural network training method [that] includes inputting neuron input values of a neural network to the RRAM” and “an RRAM that performs neural network training … a neural network training method is provided, and the method is applied to a RRAM”, and Gokmen, Abstract and paragraph 6, which disclose “using resistive processing unit (RPU) array” and “training a convolution layer of a convolutional neural network (CNN) using resistive processing unit (RPU) arrays”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yao to incorporate the teachings of Gokmen to provide technical solutions “for implementing a convolutional neural network (CNN) using resistive processing unit (RPU) array” and “solutions for accelerating training of convolutional neural networks” that “include using RPUs, such as those configured in an RPU array for training convolutional neural networks” and “performing update pass computations via the RPU array by transmitting voltage pulses corresponding to the input data of the convolution layer and the error of the output maps to the RPU array.” (See, e.g., Gokmen, Abstract and paragraph 43). Doing so would have allowed Yao to use Gokmen’s RPU array and technical solutions to “facilitate gain in efficiencies of deep learning techniques that use convolutional neural networks” and to use “Deep learning [which] inherently leverages the availability of massive training datasets (that are enhanced with the use of Big Data) and compute power”, as suggested by Gokmen (See, e.g., Gokmen, Abstract, paragraph 44). 

Regarding claims 8 and 18, as discussed above, Yao in view of Gokmen teaches the system of claim 1 and the method of claim 11.
Although Yao substantially discloses the claimed invention, Yao is not relied on to explicitly disclose wherein the neural network data comprises values representing weights associated with directed edges between nodes in the neural network.
In the same field, analogous art Gokmen teaches wherein the neural network data comprises values representing weights associated with directed edges between nodes in the neural network (see, e.g., FIG. 3 – depicting an artificial neural network (ANN) 300 with nodes 302, 308 and 316 and directed edges m1-m20 between the nodes and paragraphs 57-59 and 62, “FIG. 3 depicts a simplified ANN model 300 organized as a weighted directional graph, wherein the artificial neurons are nodes (e.g., 302, 308, 316), and wherein weighted directed edges (e.g., m1 to m20) connect the nodes.”, “Each hidden layer node 308, 310, 312, 314 receives its inputs from all input layer nodes 302, 304, 306 according to the connection strengths associated with the relevant connection pathways.”, “error terms are then used to adjust the weights in the hidden layers” [i.e., the ANN/neural network data includes values representing weights associated with directed edges m1-m20 between nodes in the ANN/neural network]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yao to incorporate the teachings of Gokmen to provide technical solutions “for implementing a convolutional neural network (CNN) using resistive processing unit (RPU) array” and “solutions for accelerating training of convolutional neural networks” that “include using RPUs, such as those configured in an RPU array for training convolutional neural networks” and “performing update pass computations via the RPU array by transmitting voltage pulses corresponding to the input data of the convolution layer and the error of the output maps to the RPU array.” (See, e.g., Gokmen, Abstract and paragraph 43). Doing so would have allowed Yao to use Gokmen’s RPU array and technical solutions to “facilitate gain in efficiencies of deep learning techniques that use convolutional neural networks” and to use “Deep learning [which] inherently leverages the availability of massive training datasets (that are enhanced with the use of Big Data) and compute power”, as suggested by Gokmen (See, e.g., Gokmen, Abstract, paragraph 44). 
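For illustration only, the kind of neural network data mapped above (values representing weights associated with directed edges between nodes) can be sketched as follows. This is a minimal sketch and not Gokmen's implementation; the node labels, the weight values and the helper name `node_input` are hypothetical and do not correspond to reference numerals 302-316 or edges m1-m20 of Gokmen's FIG. 3.

```python
# Hypothetical neural network data: each directed edge (source, destination)
# maps to a weight value. Labels and values are illustrative assumptions only.
edge_weights = {
    ("in1", "h1"): 0.5,
    ("in2", "h1"): 0.25,
    ("in1", "h2"): 1.0,
}


def node_input(node, activations, weights):
    """Sum the activations arriving at `node`, each scaled by its edge weight."""
    return sum(
        weight * activations[src]
        for (src, dst), weight in weights.items()
        if dst == node
    )


# Example activations at the source nodes (assumed values).
activations = {"in1": 2.0, "in2": 4.0}
h1_input = node_input("h1", activations, edge_weights)
h2_input = node_input("h2", activations, edge_weights)
```

Keying the weights by (source, destination) pairs keeps the direction of each edge explicit, which is the property the claim language recites.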

Regarding claims 9 and 19, as discussed above, Yao in view of Gokmen teaches the system of claim 1 and the method of claim 11.
Yao further discloses wherein the threshold is set to a given value based at least in part on an estimated wear on the analog circuit elements in the array of analog circuit elements from performing the remaining operations (see, e.g., paragraphs 36, 38, 74, 77, 80, 90, 95 and 114, “an update operation in neural network training is determined by setting the preset threshold, and the update operation is performed only when the update value is greater than the preset threshold. … this solution can greatly reduce write/erase operations brought to the RRAM by a large quantity of update operations in the neural network training such that service life of the RRAM is prolonged.”, “a backpropagation update value of each layer of the neural network is compared with the static threshold, and many write/erase operations performed when update values are below the static threshold are avoided, thereby prolonging the service life of the RRAM. … This can make update operations more purposeful, and further ensures the service life of the RRAM.” [i.e., the threshold is set to a value based at least in part on reducing estimated wear on the RRAM analog circuit elements in the array from performing remaining update and write/erase operations]).
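For illustration only, the threshold-gated update described in the quoted Yao passages (an update is written only when the update value exceeds a preset threshold, so that fewer write/erase operations reach the RRAM) can be sketched as follows. This is a minimal sketch under stated assumptions and not Yao's implementation: the function name `apply_thresholded_updates` is hypothetical, the numeric values are made up, and comparing the magnitude of the update value against the threshold is an assumption about how signed update values are handled.

```python
def apply_thresholded_updates(weights, updates, threshold):
    """Apply each update only if its magnitude exceeds the preset threshold.

    Returns the updated weights and the number of write operations performed.
    Using abs() for signed updates is an assumption of this sketch.
    """
    new_weights = []
    writes = 0
    for weight, update in zip(weights, updates):
        if abs(update) > threshold:
            new_weights.append(weight + update)  # write reaches the cell
            writes += 1
        else:
            new_weights.append(weight)  # update skipped, no write/erase
    return new_weights, writes


# Assumed example values: three cells, one update below the threshold.
result_weights, write_count = apply_thresholded_updates(
    [0.5, -0.25, 1.0], [0.25, -0.0625, 0.5], threshold=0.125
)
```

In this example two of the three updates are written and one is skipped, which is the reduction in write/erase operations that the quoted passages associate with prolonged service life.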

Regarding claims 10 and 20, as discussed above, Yao in view of Gokmen teaches the system of claim 1 and the method of claim 11.
Yao further discloses a lower-power mode while the analog circuit element functional block performs the remaining iterations (see, e.g., paragraph 64, “In an embodiment, neural network parameters that have been trained on another device, including strengths (weights) of neurons and synapses, are loaded into the RRAM crossbar array [i.e., crossbar array is the analog circuit element functional block for training functions/operations of the neural network]. In this method, application of the neural network in an inference process is increased mainly by virtue of features of low power consumption … of the RRAM. If the features of low power consumption … of the RRAM are further applied to a neural network training process, an operation process and result are affected” [i.e., a low-power mode while the analog circuit element/RRAM of the array performs remaining/further neural network training operations/iterations]). 
Although Yao substantially discloses the claimed invention, Yao is not relied on to explicitly disclose wherein the controller circuits transition the memory to a lower-power mode while the analog circuit element functional block performs the remaining iterations.
In the same field, analogous art Gokmen teaches wherein the controller circuits transition the memory to a lower-power mode while the analog circuit element functional block performs the remaining iterations (see, e.g., paragraphs 76, 78 and 110, “In order to limit power consumption, the crosspoint devices of ANN chip architectures … utilize offline learning techniques, wherein the approximation of the target function does not change once the initial training phase has been resolved [i.e., after the initial training iterations]. Offline learning allows the crosspoint devices of crossbar-type ANN architectures to be simplified such that they draw very little power. … simplifying the crosspoint devices of ANN architectures to prioritize power-saving for offline learning techniques … Providing crosspoint devices that keep power consumption within an acceptable range”, “RPU can be implemented as two-terminal resistive crosspoint devices, … the described RPU device can be implemented with resistive random access memory (RRAM)” [i.e., transition the RPU RRAM crosspoint memory devices of the crossbar array to a lower-power mode while the ANN chip/circuit element performs the remaining training iterations], “Referring now to FIG. 19, a node/neuron control system 1900 is shown. The neuron control system 1900 includes a hardware processor 1902 and memory 1904. Training data 1906 for a CNN is stored in the memory 1906 and is used to train weights of the CNN. A neuron interface 1908 controls neurons on the CNN, determining whether the neurons are in feed forward mode, back propagation mode, or weight update mode.” [i.e., controller circuits/control system 1900 including processor 1902 and memory 1904 – circuits/circuitry]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yao to incorporate the teachings of Gokmen to provide technical solutions “for implementing a convolutional neural network (CNN) using resistive processing unit (RPU) array” and “solutions for accelerating training of convolutional neural networks” that “include using RPUs, such as those configured in an RPU array for training convolutional neural networks” and “performing update pass computations via the RPU array by transmitting voltage pulses corresponding to the input data of the convolution layer and the error of the output maps to the RPU array.” (See, e.g., Gokmen, Abstract and paragraph 43). Doing so would have allowed Yao to use Gokmen’s RPU array and technical solutions to “facilitate gain in efficiencies of deep learning techniques that use convolutional neural networks” and to use “Deep learning [which] inherently leverages the availability of massive training datasets (that are enhanced with the use of Big Data) and compute power”, as suggested by Gokmen (See, e.g., Gokmen, Abstract, paragraph 44). 

Claims 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Yao in view of Gokmen as applied to claims 2 and 12 above, and further in view of non-patent literature Yu, Shimeng ("Neuro-Inspired Computing With Emerging Nonvolatile Memory." Proceedings of the IEEE 106.2 (February 2018): 260-285, hereinafter “Yu”).
Regarding claims 3 and 13, as discussed above, Yao in view of Gokmen teaches the system of claim 2 and the method of claim 12.
Yao further discloses wherein, when processing instances of input data in the neural network during the initial iterations, the processor uses a … precision for operands and results of operations (see, e.g., paragraphs 68, 82 and 108, “the performing the calculation for the neuron input values based on filters in the RRAM is performing a multiply-accumulate operation on a neuron input value vector [i.e., processing instances of input data in the neural network during the initial iterations] and the kernel vector, and converting, by an ADC in the RRAM, a result obtained through the multiply-accumulate operation, to obtain the neuron output values <ro1, ro2, ... , rom> of the rth layer of the neural network.”, “error distribution map of the RRAM is obtained by performing the error test on the RRAM, … This reduces impact of the hard errors in the RRAM on training precision of the neural network”, “When error detection is performed … , a minimum bias is first written for all RRAM cells in the subarray on which the error detection is to be performed, where a magnitude of the bias is determined by resistance precision of the RRAM device.” [i.e., the processing/performing the calculation uses a precision for multiply-accumulate operands and results of operations]).
Although Yao in view of Gokmen substantially teaches the claimed invention, Yao in view of Gokmen is not relied on to teach wherein, when processing instances of input data in the neural network during the initial iterations, the processor uses a specified precision for operands and results of operations.
In the same field, analogous art Yu teaches wherein, when processing instances of input data in the neural network during the initial iterations, the processor uses a specified precision for operands and results of operations (see, e.g., pages 267, 269, 279 and 281, “the algorithm allows tuning the conductance with 1% precision (which is equivalent to ~ 8b) to any desired value within devices”, “The resistive crossbar array architecture has been proposed for implementing the weighted sum (or matrix–vector multiplication, dot-product operation) …The weighted sum operation is performed … : read voltages are applied to all the rows, and then the read voltages are multiplied by the conductance of the synaptic devices at each crosspoint, resulting in a weighted sum” [i.e., sum and multiplication operands and operation results], “analog synaptic devices … with the extracted realistic device parameters such as precision (the number of bits)”, “deep learning models are trained in the GPU environment using 32-b floating point, in order to satisfy the precision required by backpropagation … and low-precision (fixed-point) training with stochastic rounding of last few bits … the matrix–vector multiplication essentially becomes the bitwise xnor operation.” [i.e., using specified bits of precision, 8 or 32-bit, a specified precision for operands and results of training and backpropagation operations]).
Yao, Gokmen and Yu are analogous art because they are each directed to using arrays of resistive devices for training neural networks (see, e.g., Yao, Abstract and paragraphs 11-12 which disclose “A neural network training method [that] includes inputting neuron input values of a neural network to the RRAM” and “an RRAM that performs neural network training … a neural network training method is provided, and the method is applied to a RRAM”, Gokmen, Abstract and paragraph 6, which disclose “using resistive processing unit (RPU) array” and “training a convolution layer of a convolutional neural network (CNN) using resistive processing unit (RPU) arrays” and Yu, Abstract and page 264, which disclose “we introduce the crossbar array architecture to accelerate the weighted sum and weight update operations that are commonly used in the neuro-inspired machine learning algorithms” and “a resistive crossbar array with eNVMs [emerging non-volatile memory devices] can do parallel programming and weighted sum for further speedup, potentially enabling online training.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yao in view of Gokmen to incorporate the teachings of Yu to provide a “crossbar array architecture to accelerate the weighted sum and weight update operations that are commonly used in the neuro-inspired machine learning algorithms” and “a device-circuit-algorithm codesign methodology to evaluate the impact of nonideal device effects on the system-level performance (e.g., learning accuracy)” that uses “emerging nonvolatile memory (eNVM) devices … for on-chip weight storage with higher density … and fast parallel analog computing” (See, e.g., Yu, Abstract and pages 260-261). Doing so would have allowed Yao in view of Gokmen to use Yu’s architecture and co-design methodology to achieve “customization of the learning algorithms for efficient hardware implementation” by “using eNVM-based devices for energy-efficient computing”, as suggested by Yu (See, e.g., Yu, Abstract and pages 260-261). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.
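For illustration only, a weighted sum computed at a specified precision, in the sense of the quoted Yu passages (read voltages multiplied by conductances whose precision is expressed as a number of bits), can be sketched as follows. This is a minimal sketch and not Yu's method: the helper names `quantize` and `weighted_sum`, the normalized conductance range of 0.0 to 1.0 and the example values are all assumptions.

```python
def quantize(value, bits, lo=0.0, hi=1.0):
    """Round `value` to the nearest of 2**bits evenly spaced levels in [lo, hi]."""
    levels = (1 << bits) - 1
    clamped = min(max(value, lo), hi)
    step = (hi - lo) / levels
    return lo + round((clamped - lo) / step) * step


def weighted_sum(voltages, conductances, bits):
    """Multiply each read voltage by its quantized conductance and accumulate."""
    return sum(v * quantize(g, bits) for v, g in zip(voltages, conductances))


# Assumed example: the same inputs evaluated at 1-bit and 8-bit precision.
coarse = weighted_sum([1.0, 2.0], [0.3, 0.7], bits=1)
fine = weighted_sum([1.0, 2.0], [0.3, 0.7], bits=8)
```

The 8-bit result stays close to the unquantized value of 1.7, while the 1-bit result collapses each conductance to 0 or 1, which shows how the specified number of bits bounds the accuracy of operands and results.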

Conclusion
The prior art made of record, listed on form PTO-892, and not relied upon, is considered pertinent to applicant's disclosure. 
The examiner requests, in response to this office action, support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.
When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RANDY K BALDWIN whose telephone number is (571)270-5222. The examiner can normally be reached on Mon - Fri 9:00-6:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/R.K.B./Examiner, Art Unit 2125

/KAMRAN AFSHAR/Supervisory Patent Examiner, Art Unit 2125