DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
This action is in response to the amendment and remarks filed 6/28/2022.
The amendment filed 6/28/2022 has been entered. 
In the amendment, claims 1, 14 and 20 were amended by applicant, and no claims were added or canceled. Claims 21-28 were previously added, and claims 3, 6, 7, 11, 14, 15, 17 and 19 were previously cancelled in an amendment filed September 13, 2021. As such, claims 1-2, 4-5, 8-10, 12-13, 16, 18 and 20-28 are pending and have been examined.
The previous objections to claims 8-10, 12-13, 16, 18, 20 and 23-28 are withdrawn in view of the 6/28/2022 amendments to the claims.

Response to Arguments
Applicant's arguments filed 6/28/2022 with respect to the objections to claims 8-10, 12-13, 16, 18, 20 and 23-28 have been fully considered and are persuasive.
Applicant's arguments filed 6/28/2022 with respect to the rejections of claims 1-2, 4-5, 8-10, 12-13, 16, 18 and 20-28 under 35 U.S.C. 103 have been fully considered but are not persuasive. Applicant’s amendments have necessitated the rejections under 35 U.S.C. 103 discussed below.
With reference to amended claim 1 (and characterizing and paraphrasing the claim) applicant states “Independent claim 1 recites a method for training an artificial neural network using a high-power processor unit and transferring the trained artificial neural network in the higher power training system to a processing system implemented on a lower-power neuromorphic hardware system used to process input data via the trained artificial neural network.” (applicant’s remarks, page 9 – characterizing and paraphrasing the claim language with reference to embodiments disclosed in paragraphs 23-24 of the specification instead of quoting claim limitations). 
With apparent reference to the new limitation added to each of independent claims 1, 8 and 16, e.g., “wherein selection of the activation functions from the sequence of activation functions enables adaptation of deep neural networks to target reduced precision neuromorphic hardware”, applicant refers to the specification, but not the claim language, in stating “As explained in the specification, this hybrid system (training on a first system and operating on a second, neuromorphic system) has the advantage of alleviating the difficulty of implementing neural networks on reduced precision neuromorphic hardware because the training is platform agnostic, in contrast to prior art training methods that are platform dependent. See, e.g., Spec. ¶¶ 23-24. In particular, the training process may utilize a higher performance platform while adapting the training process to target the intended reduced precision neuromorphic platform on which the trained neural network is intended to operate (e.g., a spiking neural network operating on a neuromorphic hardware system).” Id.
With apparent reference to the combination of Soltiz, Soltiz 2012, Yoo and Davies previously applied to reject claim 1, applicant again characterizes and paraphrases the claim limitations by alleging that “Nothing in Soltiz 2012, alone or in the proposed combination, teaches two distinct systems-a training system used to train a spiking neural network and a processing system to which the trained neural network is transferred.” (applicant’s remarks, page 10). 
With continued reference to amended claim 1, applicant again characterizes and paraphrases the claim limitations by asserting “that neither of the cited references describe or suggest a combined hybrid system wherein the artificial neural network is trained on a first system and then the trained network is transferred to and implemented on a second system. Furthermore, the independent claims are amended to further recite that there is a performance difference between the two system-specifically reciting ‘wherein the processing system is a lower power system relative to the training system.’” (applicant’s remarks, page 11). 
Accordingly, applicant apparently argues that “a processing system comprising a spiking network implemented on neuromorphic hardware, wherein the processing system is a lower power system relative to the training system” and the newly-added limitation “wherein selection of the activation functions from the sequence of activation functions enables adaptation of deep neural networks to target reduced precision neuromorphic hardware” recited in independent claims 1, 8 and 16 are not disclosed or taught in the portions of the Soltiz, Soltiz 2012, Yoo and Davies references applied to independent claims 1, 8 and 16 in the previous Office Action. The examiner respectfully disagrees and points applicant to the below discussion of Soltiz, Soltiz 2012 and Davies.

As a preliminary matter, the new limitation “wherein selection of the activation functions from the sequence of activation functions enables adaptation of deep neural networks to target reduced precision neuromorphic hardware” appears to be intended use language with no patentable weight. Aside from this recitation, the plural “deep neural networks” and “reduced precision neuromorphic hardware”, let alone “adaptation of deep neural networks to target reduced precision neuromorphic hardware”, are not recited elsewhere in any of independent claims 1, 8 or 16, or in their respective dependent claims. Further, there is no positive recitation elsewhere in independent claims 1, 8 and 16, or in their dependent claims, of any step or limitation for “adapting” the “deep neural networks” or “targeting” the “reduced precision neuromorphic hardware”, much less enabling the “adaptation of deep neural networks to target reduced precision neuromorphic hardware”.
Even assuming arguendo that the above-noted wherein clause has patentable weight, which the examiner does not concede, applicant’s specification does not define what is meant by “adaptation of deep neural networks” in the newly-added limitation “wherein selection of the activation functions from the sequence of activation functions enables adaptation of deep neural networks”. The specification merely repeats the claim language in paragraph 23, which states in part “embodiments enable in-training adaptation of deep neural networks … by selectively modifying the activation functions of the artificial neuron in the neural network during training”. Therefore, “selection of the activation functions from the sequence of activation functions enables adaptation of deep neural networks”, under the BRI, in light of the specification, is selected activation functions that can be used for training or adapting deep neural networks.
Additionally, regarding the new limitation “target reduced precision neuromorphic hardware”, besides repeating the claim language in paragraph 23, which states, inter alia, “embodiments enable in-training adaptation of deep neural networks to target reduced precision neuromorphic hardware”, applicant’s specification does not define what is meant by “target[ing] reduced precision neuromorphic hardware”, much less “adaptation of deep neural networks to target reduced precision neuromorphic hardware”. Paragraph 11 of applicant’s specification discloses “Neuromorphic hardware implementing an architecture of spiking neurons or otherwise binary artificial neurons may be referred to as reduced-precision neuromorphic hardware.” Therefore, “target reduced precision neuromorphic hardware”, under the BRI, in light of the specification, is using selected activation functions in a spiking neural network that can be implemented with neuromorphic hardware, such as a neuromorphic processor.
Regarding the limitation “wherein selection of the activation functions from the sequence of activation functions enables adaptation of deep neural networks to target reduced precision neuromorphic hardware” added to claims 1, 8 and 16, the examiner points to pages 34 and 53-54 of Soltiz 2012, which explicitly disclose that “both the activation function and input weights are constantly trained towards a target function” [i.e., the target, threshold activation function], “In this work, it is proven that the scalability and efficiency of hardware-based neuromorphic systems can be improved drastically by adding complexity to neural logic block (NLB) designs” [i.e., neuromorphic hardware] and “neuromorphic systems have been used for a wide variety of applications, in fields such as pattern recognition, control systems, and signal processing. By improving scalability and efficiency in hardware NLB designs, this work opens the door to create hardware based neural networks for these applications” [i.e., selection of the activation functions from the sequence of functions enables creation and adaptation of hardware-based neural networks implemented on target neuromorphic systems/neuromorphic hardware].
With continued reference to the limitation “wherein selection of the activation functions from the sequence of activation functions enables adaptation of deep neural networks to target reduced precision neuromorphic hardware” added to claims 1, 8 and 16, the examiner further points to paragraphs 55-56 of Yoo, which explicitly disclose that “the neural network may include a plurality of hidden layers (such as seen, for example, in FIG. 5). A neural network including a plurality of hidden layers may be referred to as a deep neural network. Training the deep neural network may be referred to as deep learning … output of a hidden node belonging to the second hidden layer may be connected to hidden nodes belonging to the third hidden layer.” and “the training apparatus … input outputs of previous hidden nodes included in a previous hidden layer into each hidden layer … by applying the connection weights to the outputs of the previous hidden nodes and activation functions” [i.e., selection of the activation functions enables training and adapting of deep neural networks].
With continued reference to the above-noted wherein clause, as noted above, to “target reduced precision neuromorphic hardware”, under the BRI, in light of the specification, is using selected activation functions in a spiking neural network that can be implemented with neuromorphic hardware, such as a neuromorphic processor. Regarding “target reduced precision neuromorphic hardware”, the examiner points to paragraphs 32-34 and 75 of Davies, which explicitly disclose that “a neuromorphic processor may be architected … examples and techniques below provide architectures to achieve … a neuromorphic processor. As used herein, references to ‘neural network’ … refer to a ‘spiking neural network’ … references herein to a ‘neuron’ are meant to refer to an artificial neuron in a spiking neural network”, “In an example of a spiking neural network, activation functions occur via spike trains” and “cruder neuroscience models use fewer resources and lower power. The neuromorphic architecture herein supports a very wide spectrum of such choices.” [i.e., selected activation functions used on neuromorphic hardware/processor implementing a spiking neural network - reduced-precision neuromorphic hardware].
Regarding applicant’s assertions vis-à-vis “a processing system comprising a spiking network implemented on neuromorphic hardware, wherein the processing system is a lower power system relative to the training system” recited in amended claims 1, 8 and 16, the examiner respectfully disagrees and points applicant to the discussion of Davies below.
With respect to “a processing system comprising a spiking network implemented on neuromorphic hardware”, the examiner points to paragraphs 32-34 of Davies, which explicitly disclose that “a neuromorphic processor may be architected. It is, however, desirable to create an efficient and fast neuromorphic processor” and “The examples and techniques below provide architectures to achieve just such a neuromorphic processor. As used herein, references to ‘neural network’ for at least some examples is specifically meant to refer to a ‘spiking neural network’; thus, many references herein to a ‘neuron’ are meant to refer to an artificial neuron in a spiking neural network” [i.e., a processing system including a spiking neural network implemented on neuromorphic hardware].
Regarding “wherein the processing system is a lower power system relative to the training system” recited in amended claims 1, 8 and 16, the examiner further points to paragraphs 43, 86, 71 and 75 of Davies, which explicitly disclose that “Very large-scale integration (VLSI) design technology, on the other hand, delivers much higher speed and more reliable circuits at the cost of … higher power.”, “algorithmic features such as … dynamic learning may add considerably more state per synapse.” [i.e., a relatively higher power system such as a learning/training system] and “the use of the SYNAPSE_MAP 312 indirection allows … the spike payload to be smaller, thereby saving overall area and power” [i.e., processing system with a spiking network saves power], “cruder neuroscience models use fewer resources and lower power. The neuromorphic architecture herein supports … such choices.” [i.e., the processing system with the spiking network implemented on neuromorphic hardware supports/is a lower power system relative to the training system]. The examiner notes that this reading of the spiking network of Davies being hosted on a lower power system is consistent with applicant’s own arguments and specification which state, inter alia, “the intended reduced precision neuromorphic platform on which the trained neural network is intended to operate (e.g., a spiking neural network operating on a neuromorphic hardware system).” [i.e., the reduced precision neuromorphic platform is a relatively lower powered system] and “Neuromorphic hardware implementing an architecture of spiking neurons or otherwise binary artificial neurons may be referred to as reduced-precision neuromorphic hardware.” (see, applicant’s remarks, page 9 and paragraph 11).
Applicant then largely repeats previously rebutted arguments and assertions regarding other previously recited and unamended claim limitations (see applicant’s remarks, pages 12-13) before concluding “that the art of record, considered individually or in combination fails to teach or suggest every feature of independent claim 1 and similar recitations of independent claims 8 and 16.” (applicant’s remarks, page 13). For at least the reasons discussed in the previous office action, these arguments are unpersuasive. Also, as detailed in the rejections below, contrary to applicant’s assertions, the combination of Soltiz, Soltiz 2012, Yoo and Davies teaches every feature of independent claims 1, 8 and 16. 
With regard to amended dependent claims 4, 12 and 26, applicant states “Applicant agrees with the Examiner's finding regarding the definition of ‘piecewise’ in this mathematical context.” before asserting “However, respectfully, the alleged BRI definition of the phrase "piecewise differentiable" ignores the proper meaning of ‘differentiable’ in this mathematical context” and alleging that “nothing in the applied reference teaches or suggests the claimed ‘sequence’ of activation functions used in adapting the activation function, as required by the independent claims, nor do the applied references teach or suggest that each of the sequence of activation functions is ‘piecewise differentiable,’ as required by dependent claims 4, 12, and 26.” (applicant’s remarks, pages 14-16). The examiner respectfully disagrees with applicant’s allegations regarding these dependent claims. 
Regarding claims 4, 12 and 26, as discussed in the previous office action and below, paragraphs 23 and 29 of applicant’s specification disclose “a sequence of piecewise differentiable activation functions for each artificial neuron in the artificial neural network” and “activation functions in sequence of activation functions 128 may be piecewise differentiable 130.” As previously noted, these are the sole mentions of any “piecewise differentiable” functions in applicant’s specification. The plain meaning of piecewise is denoting that a function has a specified property, as smoothness or continuity, on each of a finite number of pieces into which its domain is divided. See https://www.dictionary.com/browse/piecewise. Further, the plain meaning of “differentiable” is capable of being differentiated. See https://www.dictionary.com/browse/differentiable. Therefore, “activation functions” that “are piecewise differentiable”, under the BRI, are any piecewise activation functions that can be differentiated from each other.
Regarding the limitation “wherein each of the activation functions in the sequence of activation functions is piecewise differentiable” recited in amended claims 4, 12 and 26, the examiner disagrees with applicant’s allegations and points applicant to the discussion of Soltiz 2012 below. 
With reference to the above-noted piecewise differentiable limitation of claims 4, 12 and 26, the examiner points to Table 3.1 of Soltiz 2012, which depicts two NLBs/neurons that each have a differentiable “Piecewise adaptive activation function.” The examiner further points to pages 20, 27 and 33 of Soltiz 2012, which explicitly disclose that “the activation function is modeled as a piecewise continuous function comprised of the interpolation between m points”, “An adaptive activation function has previously been modeled as a piecewise continuous function, represented as the interpolation of m points, each of which has a floating point value ranging from 0.0 to 1.0.” and “the activation function is represented as a piecewise function consisting of m ranges, each of which can be trained to a value ranging from 0 to Vdd.” [i.e., each of the adaptive/modifiable activation functions is a piecewise function that can be differentiated from other such functions].
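For illustration only, the passages of Soltiz 2012 quoted above (an activation function modeled as a piecewise continuous function interpolating m points, each valued from 0.0 to 1.0) can be sketched as follows. The sketch assumes linear interpolation, evenly spaced control points, and an input normalized to the range 0.0 to 1.0; none of these choices is stated in the quoted passages, and the function name `piecewise_activation` is hypothetical rather than taken from the reference.

```python
def piecewise_activation(points, x):
    """Evaluate a piecewise-linear activation built from m control points.

    points: list of m >= 2 floats, each in [0.0, 1.0] (the trainable values).
    x:      input, clamped to [0.0, 1.0] (normalization is an assumption).

    Assumption: the m points sit at evenly spaced inputs 0, 1/(m-1), ..., 1,
    and neighbouring points are joined by straight line segments.
    """
    m = len(points)
    if m < 2:
        raise ValueError("need at least two control points")
    x = min(max(x, 0.0), 1.0)      # clamp the input to the modeled range
    pos = x * (m - 1)              # position measured in segment units
    i = min(int(pos), m - 2)       # index of the segment containing x
    frac = pos - i                 # fractional distance along that segment
    return points[i] + frac * (points[i + 1] - points[i])


# Three control points give a "bump" shaped activation.
bump = [0.0, 1.0, 0.0]
print(piecewise_activation(bump, 0.25))
```

Changing the entries of `points` changes the shape of the function, which is the sense in which the quoted passages describe the activation function as adaptive.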
As discussed in detail below, the previously-applied combination of the Soltiz, Soltiz 2012, Yoo and Davies references teaches all of the features of amended independent claims 1, 8 and 16, and their respective dependent claims, including amended claims 4, 12 and 26.
Applicant's amendment necessitated the rejections under 35 U.S.C. 103 discussed below.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-2, 4-5, 8-10, 12-13, 16, 18 and 20-28 are rejected under 35 U.S.C. 103 as being unpatentable over non-patent literature Soltiz, Michael, et al. ("Memristor-based neural logic blocks for nonlinearly separable functions." IEEE Transactions on computers 62.8 (2013): 1597-1606, hereinafter “Soltiz”) in view of non-patent literature Soltiz ("Hardware neuromorphic learning systems utilizing memristive devices." (2012): i-65, hereinafter “Soltiz 2012”) and Yoo et al. (U.S. Patent Application Pub. No. 2018/0068218 A1, hereinafter “Yoo”) and further in view of Davies (U.S. Patent Application Pub. No. 2018/0174033 A1, hereinafter “Davies”).
With regard to claim 1, Soltiz discloses the invention as claimed including a method for training and implementing an artificial neural network (see, e.g., Abstract and pages 1597 and 1600-1601, “Neural logic blocks (NLBs) enable the realization of biologically inspired reconfigurable hardware. Networks of NLBs can be trained … These designs are implemented … designs enable any logic function to be implemented in a single-layer NLB network … This work … proposing two hardware implementations of perceptron-based NLBs” [i.e., a network of NLBs is an artificial neural network of artificial neurons that is trained and implemented], “the adaptive activation function provides several benefits. First of all, by training both the shape of the activation function and the input weights in parallel, this method results in very fast training convergence times”, “the ability of a logic block to learn nonlinearly separable functions is crucial to the scalability of a neural network system” [i.e., a method for training an artificial neural network and implementing it in a neural network system/hardware]) comprising artificial neurons and weights (see, e.g., pages 1597-1600, “hardware emulation of synaptic plasticity between large networks of neurons” [i.e., the neural network comprises artificial neurons/NLBs with synaptic connections between them], “By iteratively performing this process and looping through all input sets, the weights will be trained … the activation function is trained in parallel with the synaptic weights.”, “the weight of a synapse can be modified to strengthen or weaken a connection”, “by training both the shape of the activation function and the input weights in parallel, this method results in very fast training convergence” [i.e., training a neural network that includes neurons and weights]), the method comprising:
training the artificial neural network on a training system (see, e.g., pages 1600-1601, “training hardware to provide the system with the ability to adjust input weights. In this work, we use the local/global stochastic gradient descent training circuitry … a global training circuit compares the output to an expected output, determines when training is necessary and the direction of training” [i.e., training on a training circuit/circuitry/system], “by training both the shape of the activation function and the input weights in parallel, this method results in very fast training convergence times” [i.e., training the artificial neural network], “the ability of a logic block to learn nonlinearly separable functions is crucial to the scalability of a neural network system” [i.e., training the artificial neural network on a neural network training system]), the training comprising:
providing training data as an input having a known output to the artificial neural network (see, e.g., pages 1599-1600, “provides training inputs to the Weighting and Input Select component” [i.e., providing training data as an input to the neural network], “a global training circuit compares the output to an expected output … the input weights are trained to match the expected output … the input weights are given two training cycles to attempt to match the expected output”, “an adjustable digital value must be associated with each input” [i.e., the training data has an expected/known output]); … 
selectively modifying activation functions of the artificial neurons in the artificial neural network (see, e.g., Abstract and pages 1597 and 1599, Sect. 3.2, “we propose two NLB designs-robust adaptive NLB (RANLB) and multithreshold NLB (MTNLB) … allowing the effective activation function to be adapted during the training … modify the activation function of individual neurons during the learning process [13], [14]. This work leverages that observation, proposing two hardware implementations of perceptron-based NLBs”, “hardware implementation of a neural logic block with an adaptive activation function” [i.e., selectively adapting/modifying activation functions of neurons in the neural network]) by selecting the activation functions from a sequence of activation functions for each of the artificial neurons in the artificial neural network (see, e.g., pages 1597-1599 and 1601, “modify the activation function of individual neurons during the learning process … This work leverages that observation, proposing two hardware implementations of perceptron-based NLBs, which are capable of learning all logic functions”, “our contributions are as follows: A perceptron-based NLB design with an adaptive activation function. 
A perceptron-based NLB with a static activation function and multiple activation thresholds … An adaptive activation function comprised of m points can implement any function with (m – 1) decision boundaries.” [i.e., selectively modifying/adapting activation functions for each of the artificial neurons/NLBs in the network], “Each design is capable of adapting its effective activation function during training to learn both linearly separable and nonlinearly separable functions”, “all logic functions’ ideal activation functions can be realized by limiting the input current range on a single, static activation function curve.” [i.e., modifying/adapting the activation functions by selecting the activation functions from a separable sequence of effective, ideal activation functions on the activation function curve]).
Although Soltiz substantially discloses the claimed invention and discloses “the NLBs use a perceptron model with a threshold activation function”, “Using multiple thresholds, … a second layer of trainable memristors is used to adjust the output associated with each range.”, “provide an adaptive activation function, … use of this multithreshold activation function scheme has proven to give a neural logic block the ability to learn nonlinearly separable functions” [i.e., adapting/modifying activation functions using thresholds] (see, e.g., pages 1598 and 1600-1601), it is not relied on to explicitly disclose comparing the output of the neural network to the known output to determine an error value; 
using the error value to update the weights in the artificial neural network; and …
modifying activation functions of the artificial neurons in the artificial neural network … until the activation functions for the artificial neurons are threshold activation functions, wherein selection of the activation functions from the sequence of activation functions enables adaptation of … neural networks to target … neuromorphic hardware; and 
transferring the artificial neural network after training to a processing system comprising a … network implemented on neuromorphic hardware.
In the same field, analogous art Soltiz 2012 teaches comparing the output of the neural network to the known output to determine an error value (see, e.g., FIG. 1.3 showing comparison of actual output Y to the known/expected output Yexp [i.e., to determine a difference between Yexp and Y, an error value] and Algorithm 1 where output error value E = Yexp - Y, and pages 4 and 21, “Using an error-based training mechanism, it is fairly straightforward to train an individual NLB … a simple error minimization algorithm that can be applied to train a single NLB [17]. During this training process, the NLB is given all possible combination of inputs … coupled with the expected output, Yexp. If the actual output, Y, is different from the expected output”, “Algorithm 1 Training an NLB … while Output error, E > 0 do E ← Yexp – Y” [i.e., comparing expected/known output Yexp to the actual output of the neural network, Y, to determine error value E]); 
using the error value to update the weights in the artificial neural network (see, e.g., FIG. 1.3 showing that weights W are updated based on actual output Y not matching expected output Yexp [i.e., using an error value] and Algorithm 1 where error value E is used to update weights wi, and pages 4-5 and 21, “Using an error-based training mechanism, it is fairly straightforward to train an individual NLB to implement different logic functions. Fig. 1.3 outlines the stochastic gradient descent process, a simple error minimization algorithm that can be applied to train a single NLB [17]. During this training process, the NLB is given all possible combination of inputs … coupled with the expected output, Yexp. If the actual output, Y, is different from the expected output, Yexp, the synaptic weights corresponding to the high inputs are adjusted. This process is repeated until all input combinations produce the correct output.”, “Algorithm 1 Training an NLB with an adaptive activation function …

[Image: media_image1.png]

[i.e., using the error value E to update/adjust the weights wi in the network]); 
selectively modifying activation functions of the artificial neurons in the artificial neural network ... until the activation functions for the artificial neurons are threshold activation functions (paragraphs 23 and 29 of applicant’s specification state “target threshold activation functions” are “activation functions for the artificial neurons in the artificial neural network [that] are gradually adapted during the training process” and another example “activation function 124 of artificial neuron 118 may be selectively modified until target threshold activation function 126 is determined as activation function 124 for artificial neuron 118”. Therefore, “threshold activation functions”, under the broadest reasonable interpretation (BRI), in light of the specification, are any activation functions that are modified or adapted (e.g., adaptive activation functions) based on a target or threshold value to be threshold activation functions) (see, e.g., pages 4, 15 and 34, “the functionality of a single neuron with synapses at each input is modeled in a neural logic block (NLB). These NLBs are interconnected in large networks”, “a perceptron-based NLB that utilizes … an adaptive activation function is proposed” [i.e., an adaptive/modifiable activation function of artificial neurons/NLBs in the artificial neural network], “both the activation function and input weights are constantly trained towards a target function during training [i.e., a target threshold activation function]. 
While the activation function is modified less frequently in an RANLB, this design still contains this feature that decreases training time.” [i.e., modifying activation functions until they are target threshold activation functions]), wherein selection of the activation functions from the sequence of activation functions enables adaptation of … neural networks to target … neuromorphic hardware (see, e.g., pages 34 and 53-54, “both the activation function and input weights are constantly trained towards a target function” [i.e., the target, threshold activation function], “In this work, it is proven that the scalability and efficiency of hardware-based neuromorphic systems can be improved drastically by adding complexity to neural logic block (NLB) designs” [i.e., neuromorphic hardware], “neuromorphic systems have been used for a wide variety of applications, in fields such as pattern recognition, control systems, and signal processing. By improving scalability and efficiency in hardware NLB designs, this work opens the door to create hardware based neural networks for these applications” [i.e., selection of the activation functions from the sequence of functions enables creation and adaptation of hardware-based neural networks implemented on target neuromorphic systems/neuromorphic hardware]); and
transferring the artificial neural network after training to a processing system comprising a … network implemented on neuromorphic hardware (see, e.g., pages 4 and 53-54, “In biologically-inspired neuromorphic systems, the functionality of a single neuron with synapses at each input is modeled in a neural logic block (NLB). These NLBs are interconnected in large networks to implement the desired functionality for a specific computing application. Using an error-based training mechanism, it is fairly straightforward to train an individual NLB to implement different logic functions. … During this training process, the NLB is given all possible combination of inputs” [i.e., trained artificial neural network of trained NLBs], “In this work, it is proven that the scalability and efficiency of hardware-based neuromorphic systems can be improved drastically by adding complexity to neural logic block (NLB) designs” [i.e., a processing/neuromorphic system implemented on neuromorphic hardware], “neuromorphic systems have been used for a wide variety of applications, in fields such as pattern recognition, control systems, and signal processing. By improving scalability and efficiency in hardware NLB designs, this work opens the door to create hardware based neural networks for these applications [i.e., the processing system including a network implemented on hardware]. Because the training of NLBs is highly parallel, a very significant speedup would be expected in hardware implementations.” [i.e., transfer the trained network of NLBs to hardware implementations of neuromorphic systems, including the processing system that performs signal processing]).
Soltiz and Soltiz 2012 are analogous art because they are both related to hardware implementations of neuromorphic learning systems that utilize adaptive activation functions, threshold activation functions, and memristive devices such as memristors/memristor synapses (see, e.g., Soltiz, Abstract and pages 1597-1598 and Soltiz 2012, Abstract and pages iii, 19 and 23).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz to incorporate the teachings of Soltiz 2012 to provide novel designs for neural logic blocks (NLBs) where “Each proposed NLB is capable of learning both linearly separable and nonlinearly separable functions in a single layer” (See, e.g., Soltiz 2012, Abstract and pages iii and 23). Doing so would have allowed Soltiz to use Soltiz 2012’s NLB designs to “overcome the limitations of previous hardware NLBs” by using NLB designs that “are capable of rapidly learning any function in a single layer” and “simplify large-scale neuromorphic systems to improve scalability drastically, but also improve overall energy, delay, and training time by reducing the number of blocks” as suggested by Soltiz 2012 (See, e.g., Soltiz 2012, Abstract and pages iii and 23).
Although Soltiz in view of Soltiz 2012 substantially teaches the claimed invention, Soltiz in view of Soltiz 2012 is not relied on to teach propagating the input through the artificial neural network until an output of the artificial neural network is produced,
wherein selection of the activation functions from the sequence of activation functions enables adaptation of deep neural networks.
In the same field, analogous art Yoo teaches propagating the input through the artificial neural network until an output of the artificial neural network is produced (see, e.g., paragraphs 52 and 57, “neural network 100 includes an input layer 110, a hidden (or intermediate) layer 120, and an output layer 130. The input layer 110 receives an input to be used to perform training or recognition and transmits the input to the hidden layer 120 [i.e., propagating/transmitting the input through the neural network]. The output layer 130 generates an output of the neural network 100 based on signals (or indicia) received from the hidden layer 120.”, “inputting a training input of training data and a corresponding training output into the neural network 100, and updating connection weights of edges so that output data corresponding to the training output of the training data may be output” [i.e., until an output of the neural network is produced]),
wherein selection of the activation functions from the sequence of activation functions enables adaptation of deep neural networks (aside from repeating the claim language in paragraph 23, which states in part “embodiments enable in-training adaptation of deep neural networks … by selectively modifying the activation functions of the artificial neuron in the neural network during training”, applicant’s specification does not define what is meant by “adaptation of deep neural networks”. Therefore, “selection of the activation functions from the sequence of activation functions enables adaptation of deep neural networks”, under the BRI, in light of the specification, is selected activation functions that can be used for training or adapting deep neural networks) (see, e.g., paragraphs 55-56, “the neural network may include a plurality of hidden layers (such as seen, for example, in FIG. 5). A neural network including a plurality of hidden layers may be referred to as a deep neural network. Training the deep neural network may be referred to as deep learning … output of a hidden node belonging to the second hidden layer may be connected to hidden nodes belonging to the third hidden layer.”, “the training apparatus … input outputs of previous hidden nodes included in a previous hidden layer into each hidden layer … by applying the connection weights to the outputs of the previous hidden nodes and activation functions” [i.e., selection of the activation functions enables training and adapting of deep neural networks]).
Alternatively, Yoo also teaches providing training data as an input having a known output to the artificial neural network (see, e.g., paragraphs 52, 57 and 59, “the neural network 100 includes an input layer 110 … a training input of training data received from the input layer 110”, “inputting a training input of training data” [i.e., providing training data as an input to the neural network], “the objective function is a loss function to be used by the neural network 100 to calculate a loss between an actual output value and a value expected to be output with respect to a training input of training data.” [i.e., providing training data as training input having an expected/known output]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz in view of Soltiz 2012 to incorporate the teachings of Yoo to provide “a training method for neural network recognition, the training method including obtaining a first neural network that includes a plurality of layers and a second neural network that includes a layer connected to the first neural network” and to provide a “method and apparatus for performing complex recognition … based on a neural network 100, and a method and apparatus for training the neural network” where the first “neural network 100 may also be referred to as an artificial neural network” and the first “neural network may include a plurality of hidden layers … a deep neural network”. (See, e.g., Yoo, paragraphs 15, 47, 50 and 55). Doing so would have allowed Soltiz in view of Soltiz 2012 to utilize Yoo’s “recognition apparatus [which] is able to recognize multiple tasks corresponding to purposes using the first neural network … and the second neural network” in order to perform “a first task, for example, face recognition, corresponding to the first purpose, thereby improving a recognition rate of the first task”, as suggested by Yoo (See, e.g., Yoo, paragraph 75). 
Although Soltiz in view of Soltiz 2012 and Yoo substantially teaches the claimed invention, Soltiz in view of Soltiz 2012 and Yoo is not relied on to teach target reduced precision neuromorphic hardware; and …
a processing system comprising a spiking network implemented on neuromorphic hardware, wherein the processing system is a lower power system relative to the training system.
In the same field, analogous art Davies teaches target reduced precision neuromorphic hardware (aside from repeating the claim language in paragraph 23, which states in part “embodiments enable in-training adaptation of deep neural networks to target reduced precision neuromorphic hardware”, applicant’s specification does not define what is meant by “target[ing] reduced precision neuromorphic hardware”, much less “adaptation of deep neural networks to target reduced precision neuromorphic hardware”. Paragraph 11 of applicant’s specification discloses “Neuromorphic hardware implementing an architecture of spiking neurons or otherwise binary artificial neurons may be referred to as reduced-precision neuromorphic hardware.” Therefore, “target reduced precision neuromorphic hardware”, under the BRI, in light of the specification, is using selected activation functions in a spiking neural network that can be implemented with neuromorphic hardware, such as a neuromorphic processor) (see, e.g., paragraphs 32-34, “a neuromorphic processor may be architected … examples and techniques below provide architectures to achieve … a neuromorphic processor. As used herein, references to ‘neural network’ … refer to a ‘spiking neural network’ … references herein to a ‘neuron’ are meant to refer to an artificial neuron in a spiking neural network”, “In an example of a spiking neural network, activation functions occur via spike trains”, “cruder neuroscience models use fewer resources and lower power. The neuromorphic architecture herein supports a very wide spectrum of such choices.” [i.e., selected activation functions used on neuromorphic hardware/processor implementing a spiking neural network - reduced-precision neuromorphic hardware]); and …
a processing system comprising a spiking network implemented on neuromorphic hardware (see, e.g., paragraphs 32-34, “a neuromorphic processor may be architected. It is, however, desirable to create an efficient and fast neuromorphic processor … The examples and techniques below provide architectures to achieve just such a neuromorphic processor. As used herein, references to ‘neural network’ for at least some examples is specifically meant to refer to a ‘spiking neural network’; thus, many references herein to a ‘neuron’ are meant to refer to an artificial neuron in a spiking neural network” [i.e., a processing system including a spiking neural network implemented on neuromorphic hardware]), wherein the processing system is a lower power system relative to the training system (see, e.g., paragraphs 43, 86, 71 and 75, “Very large-scale integration (VLSI) design technology, on the other hand, delivers much higher speed and more reliable circuits at the cost of … higher power.”, “algorithmic features such as … dynamic learning may add considerably more state per synapse.” [i.e., a relatively higher power system such as a learning/training system], “the use of the SYNAPSE_MAP 312 indirection allows … the spike payload to be smaller, thereby saving overall area and power” [i.e., processing system with a spiking network saves power], “cruder neuroscience models use fewer resources and lower power. The neuromorphic architecture herein supports … such choices.” [i.e., the processing system with the spiking network implemented on neuromorphic hardware supports/is a lower power system relative to the training system]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz in view of Soltiz 2012 and Yoo to incorporate the teachings of Davies to provide “architectures to achieve … an efficient and fast neuromorphic processor that borrows from the biological model”. (See, e.g., Davies, paragraph 32). Doing so would have allowed Soltiz in view of Soltiz 2012 and Yoo “to implement neural information processing algorithms in the most efficient manner possible using present day design technology,” and to employ the architectures and neuromorphic processor of Davies to provide “a maximally efficient neuromorphic design efficiently [that] supports a range of precisions depending on the problem”, as suggested by Davies. (See, e.g., Davies, paragraphs 47 and 54). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.

Regarding claim 2, as discussed above, Soltiz in view of Soltiz 2012, Yoo and Davies teaches the method of claim 1.
Although Soltiz substantially discloses the claimed invention, it is not relied on to explicitly disclose wherein training the artificial neural network comprises training a … neural network for implementation on neuromorphic hardware.
In the same field, analogous art Soltiz 2012 teaches wherein training the artificial neural network comprises training a … neural network for implementation on neuromorphic hardware (see, e.g., pages 4, 9, 23, 30 and 53, “In … neuromorphic systems, the functionality of a single neuron with synapses at each input is modeled in a neural logic block (NLB). These NLBs are interconnected in large networks [i.e., a neural network] … During this training process, the NLB is given all possible combination of inputs”, “the complexity of training logic is also likely to be increased when considering a full neuromorphic system.”, “The ability of memristors to accurately emulate biological synapses with a single passive device makes them very appealing for the hardware implementation of neuromorphic systems. … NLBs not only simplify large-scale neuromorphic systems to improve scalability drastically, but also improve overall energy, delay, and training time”, “In some large neural networks … , the ability of a logic block to learn nonlinearly separable functions is crucial to the scalability of a neuromorphic system”, “In this work, it is proven that the scalability and efficiency of hardware-based neuromorphic systems can be improved drastically by adding complexity to neural logic block (NLB) designs.” [i.e., training the neural network of NLBs/neurons for implementation on neuromorphic hardware]).
Soltiz and Soltiz 2012 are analogous art because they are both related to hardware implementations of neuromorphic learning systems that utilize adaptive activation functions, threshold activation functions, and memristive devices such as memristors/memristor synapses (see, e.g., Soltiz, Abstract and pages 1597-1598 and Soltiz 2012, Abstract and pages iii, 19 and 23).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz to incorporate the teachings of Soltiz 2012 to provide novel designs for neural logic blocks (NLBs) where “Each proposed NLB is capable of learning both linearly separable and nonlinearly separable functions in a single layer” (See, e.g., Soltiz 2012, Abstract and pages iii and 23). Doing so would have allowed Soltiz to use Soltiz 2012’s NLB designs to “overcome the limitations of previous hardware NLBs” by using NLB designs that “are capable of rapidly learning any function in a single layer” and “simplify large-scale neuromorphic systems to improve scalability drastically, but also improve overall energy, delay, and training time by reducing the number of blocks” as suggested by Soltiz 2012 (See, e.g., Soltiz 2012, Abstract and pages iii and 23).
Although Soltiz in view of Soltiz 2012 substantially teaches the claimed invention, Soltiz in view of Soltiz 2012 is not relied on to teach wherein training the artificial neural network comprises training a deep neural network for implementation on neuromorphic hardware. 
In the same field, analogous art Yoo teaches wherein training the artificial neural network comprises training a deep neural network for implementation on neuromorphic hardware (see, e.g., paragraphs 3 and 55-56, “a neuromorphic processor modelling a number of synapse-connected neurons, that models characteristics of biological nerve cells” [i.e., implementation on neuromorphic hardware], “A neural network including a plurality of hidden layers may be referred to as a deep neural network. Training the deep neural network may be referred to as deep learning. … a result of the activation functions needs to exceed a threshold of a current hidden node. In this example, a node maintains a deactivated state without firing (or sending) a signal to a next node until a predetermined threshold strength of activation is reached through input vectors.” [i.e., training a deep neural network]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz in view of Soltiz 2012 to incorporate the teachings of Yoo to provide “a training method for neural network recognition, the training method including obtaining a first neural network that includes a plurality of layers and a second neural network that includes a layer connected to the first neural network” and to provide a “method and apparatus for performing complex recognition … based on a neural network 100, and a method and apparatus for training the neural network” where the first “neural network 100 may also be referred to as an artificial neural network” and the first “neural network may include a plurality of hidden layers … a deep neural network”. (See, e.g., Yoo, paragraphs 15, 47, 50 and 55). Doing so would have allowed Soltiz in view of Soltiz 2012 to use Yoo’s “recognition apparatus [that] is able to recognize multiple tasks corresponding to purposes using the first neural network … and the second neural network” in order to perform “a first task, for example, face recognition, corresponding to the first purpose, thereby improving a recognition rate of the first task”, as suggested by Yoo. (See, e.g., Yoo, paragraph 75). 

Regarding claim 4, as discussed above, Soltiz in view of Soltiz 2012, Yoo and Davies teaches the method of claim 1.
Although Soltiz substantially discloses the claimed invention, it is not relied on to explicitly disclose wherein each of the activation functions in the sequence of activation functions is piecewise differentiable.
In the same field, analogous art Soltiz 2012 teaches wherein each of the activation functions in the sequence of activation functions is piecewise differentiable (paragraphs 23 and 29 of applicant’s specification disclose “a sequence of piecewise differentiable activation functions for each artificial neuron in the artificial neural network” and “activation functions in sequence of activation functions 128 may be piecewise differentiable 130.” These are the sole mentions of any “piecewise differentiable” functions in applicant’s specification. The plain meaning of piecewise is denoting that a function has a specified property, as smoothness or continuity, on each of a finite number of pieces into which its domain is divided. See https://www.dictionary.com/browse/piecewise. Further, the plain meaning of “differentiable” is capable of being differentiated. See https://www.dictionary.com/browse/differentiable. Therefore, “activation functions” that “are piecewise differentiable”, under the BRI, in light of the specification, are any piecewise activation functions that can be differentiated from each other) (see, e.g., Table 3.1 showing two NLBs/neurons that each have a differentiable “Piecewise adaptive activation function.” and pages 20, 27 and 33, “the activation function is modeled as a piecewise continuous function comprised of the interpolation between m points”, “An adaptive activation function has previously been modeled as a piecewise continuous function, represented as the interpolation of m points, each of which has a floating point value ranging from 0.0 to 1.0.”, “the activation function is represented as a piecewise function consisting of m ranges, each of which can be trained to a value ranging from 0 to Vdd.” [i.e., each of the adaptive/modifiable activation functions is a piecewise function that can be differentiated from other such functions]).
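For illustration only, the quoted description of an activation function “modeled as a piecewise continuous function comprised of the interpolation between m points” can be sketched as shown below. The sketch is not taken from Soltiz 2012. The function name, the evenly spaced breakpoints on the range 0.0 to 1.0, the use of linear interpolation, and the sample point values are all assumptions made for this example.

```python
def piecewise_activation(x, points):
    """Linearly interpolate between m trainable points spaced evenly on [0, 1].

    `points` holds the m output values (each expected in 0.0..1.0). Even
    spacing and linear interpolation are assumptions of this sketch, not
    details confirmed by the reference.
    """
    m = len(points)
    if m < 2:
        raise ValueError("need at least two points")
    x = min(max(x, 0.0), 1.0)   # clamp the input to the modeled range
    pos = x * (m - 1)           # position measured in segment units
    i = min(int(pos), m - 2)    # index of the segment's left point
    frac = pos - i              # fractional distance into the segment
    return points[i] + frac * (points[i + 1] - points[i])


# Hypothetical point values: low at both ends, high in the middle.
example_points = [0.0, 1.0, 0.0]
```

With these hypothetical points, an input at the midpoint of the range maps to the middle point's value, and retraining any one point changes the output only on the segments adjacent to that point.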
Soltiz and Soltiz 2012 are analogous art because they are both related to hardware implementations of neuromorphic learning systems that utilize adaptive activation functions, threshold activation functions, and memristive devices such as memristors/memristor synapses (see, e.g., Soltiz, Abstract and pages 1597-1598 and Soltiz 2012, Abstract and pages iii, 19 and 23).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz to incorporate the teachings of Soltiz 2012 to provide novel designs for neural logic blocks (NLBs) where “Each proposed NLB is capable of learning both linearly separable and nonlinearly separable functions in a single layer” (See, e.g., Soltiz 2012, Abstract and pages iii and 23). Doing so would have allowed Soltiz to use Soltiz 2012’s NLB designs to “overcome the limitations of previous hardware NLBs” by using NLB designs that “are capable of rapidly learning any function in a single layer” and “simplify large-scale neuromorphic systems to improve scalability drastically, but also improve overall energy, delay, and training time by reducing the number of blocks” as suggested by Soltiz 2012 (See, e.g., Soltiz 2012, Abstract and pages iii and 23).

Regarding claim 5, as discussed above, Soltiz in view of Soltiz 2012, Yoo and Davies teaches the method of claim 1.
Soltiz further discloses wherein selectively modifying the activation functions comprises selectively modifying first activation functions of first artificial neurons in a first layer of the artificial neural network (see, e.g., Abstract and pages 1597-1599 and 1601, Sect. 3.2, “we propose two NLB designs … allowing the effective activation function to be adapted during the training … modify the activation function of individual neurons during the learning process [13], [14]. This work leverages that observation, proposing two hardware implementations of perceptron-based NLBs”, “Each design is capable of adapting its effective activation function … in a single layer.” [i.e., adapting/modifying first activation functions for first NLBs/neurons in a single, first layer], “our contributions are as follows: A perceptron-based NLB design with an adaptive activation function. A perceptron-based NLB with a static activation function and multiple activation thresholds … the NLBs use a perceptron model with a threshold activation function … we proposed a hardware NLB that integrates an adaptive activation function into a perceptron with memristive synapses.”, “hardware implementation of a neural logic block with an adaptive activation function”, “The use of this multithreshold activation function scheme has proven to give a neural logic block [NLB] the ability to learn nonlinearly separable functions in a single layer” [i.e., selectively adapting/modifying activation functions of NLBs/neurons in a single, first layer of the neural network]) and then selectively modifying second activation functions of second artificial neurons in a second layer of the artificial neural network only after all of the first activation functions for the first artificial neurons in the first layer are … activation functions (see, e.g., Table 2, showing “Reconfigurable Logic Block [neuron] implementations” where “11 blocks [including second neurons] connected in 3 layers are required to learn the worst-case 4-input XOR function” [i.e., 3 layers, including a first and second layer of the neural network] and pages 1597-1600 and 1602, “Non-linearly separable functions … must be implemented in multiple layers of NLBs” [i.e., including a first and second layer of the neural network with first and second neurons/NLBs], “Since the NLBs use a perceptron model with a threshold activation function, multilayer logic is required to implement nonlinearly separable functions”, “we proposed a hardware NLB that integrates an adaptive activation function into a perceptron with memristive synapses. Using multiple thresholds, the input current is broken up into ranges. Then, a second layer of trainable memristors is used to adjust the output associated with each range.”, “As we show in [20], the adaptive activation function can be represented using an additional layer of memristors” [i.e., then modify separable activation functions for second NLBs/memristors/neurons in a second layer of the network after the activation functions of the first NLBs/neurons in the first layer are activation functions]).
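For illustration only, the contrast drawn in the quoted passages between a single-threshold perceptron, which requires multilayer logic for nonlinearly separable functions, and a multithreshold activation function, which can learn such functions in a single layer, can be sketched for a 2-input XOR as shown below. This is not the circuit of Soltiz. The function names, the unit weights, and the threshold values 0.5 and 1.5 are assumptions chosen for this example.

```python
from itertools import product


def single_threshold_neuron(inputs, weights, threshold):
    """Classic threshold perceptron: fire when the weighted sum reaches threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def multithreshold_neuron(inputs, weights, low, high):
    """Fire only when the weighted sum falls in the range [low, high).

    Two thresholds split the summed input into ranges; the values used
    below are hypothetical and chosen only to realize XOR.
    """
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if low <= total < high else 0


# One multithreshold neuron reproduces the 2-input XOR truth table.
xor_table = {
    bits: multithreshold_neuron(bits, (1.0, 1.0), low=0.5, high=1.5)
    for bits in product((0, 1), repeat=2)
}
```

In this sketch the single-threshold neuron can only separate inputs whose weighted sum is above a cutoff from those below it, whereas the two-threshold neuron selects the middle range of the sum and thereby yields XOR without a second layer.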
Although Soltiz substantially discloses the claimed invention, it is not relied on to explicitly disclose that the activation functions for the first artificial neurons in the first layer are threshold activation functions.
In the same field, analogous art Soltiz 2012 teaches the activation functions for the first artificial neurons in the first layer are threshold activation functions (as indicated above, “threshold activation functions”, under the BRI, are any activation functions that are modified or adapted (e.g., adaptive activation functions) based on a target or threshold value to be threshold activation functions) (see, e.g., pages 4, 8, 15, 23 and 34, “the functionality of a single neuron with synapses at each input is modeled in a neural logic block (NLB). These NLBs are interconnected in large networks”, “To overcome the limitations on the learnable set of functions for a single NLB, perceptron-based systems generally implement non-linearly separable functions in multiple layers.” [i.e., including a first layer], “hardware implementations of perceptron-based NLBs, which are capable of learning all logic functions in a single layer [i.e., first NLBs/neurons in the first layer]. … a perceptron-based NLB that utilizes a second layer of memristors to represent an adaptive activation function is proposed.” [i.e., an adaptive/modifiable activation function of artificial neurons/NLBs in layers of the artificial neural network], “Each proposed NLB is capable of learning both linearly separable and nonlinearly separable functions in a single layer” [i.e., the first layer], “both the activation function and input weights are constantly trained towards a target function during training [i.e., a target, threshold activation function]. While the activation function is modified less frequently in an RANLB, this design still contains this feature that decreases training time.”, “algorithm is capable of learning any logic function rapidly in a single layer.” [i.e., the activation functions for the first NLBs/neurons in the single, first layer are modified to be threshold activation functions]). 
Soltiz and Soltiz 2012 are analogous art because they are both related to hardware implementations of neuromorphic learning systems that utilize adaptive activation functions, threshold activation functions, and memristive devices such as memristors/memristor synapses (see, e.g., Soltiz, Abstract and pages 1597-1598 and Soltiz 2012, Abstract and pages iii, 19 and 23).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz to incorporate the teachings of Soltiz 2012 to provide novel designs for neural logic blocks (NLBs) where “Each proposed NLB is capable of learning both linearly separable and nonlinearly separable functions in a single layer” (See, e.g., Soltiz 2012, Abstract and pages iii and 23). Doing so would have allowed Soltiz to use Soltiz 2012’s NLB designs to “overcome the limitations of previous hardware NLBs” by using NLB designs that “are capable of rapidly learning any function in a single layer” and “simplify large-scale neuromorphic systems to improve scalability drastically, but also improve overall energy, delay, and training time by reducing the number of blocks” as suggested by Soltiz 2012 (See, e.g., Soltiz 2012, Abstract and pages iii and 23).

With respect to independent claim 8, Soltiz discloses the invention as claimed including a training system for training an artificial neural network (see, e.g., Abstract and pages 1597-1598 and 1600-1601, “Neural logic blocks (NLBs) enable the realization of biologically inspired reconfigurable hardware [i.e., artificial neurons in a hardware platform]. Networks of NLBs can be trained” [i.e., a network of NLBs is an artificial neural network of artificial neurons that is trained], “During the training process, the system is given input sets and the expected output vector.”, “training hardware to provide the system with the ability to adjust input weights” [i.e., a training system], “the adaptive activation function provides several benefits. First of all, by training both the shape of the activation function and the input weights in parallel, this method results in very fast training convergence times”, “the ability of a logic block to learn nonlinearly separable functions is crucial to the scalability of a neural network system” [i.e., for training an artificial neural network]) comprising artificial neurons and weights (see, e.g., pages 1597-1600, “hardware emulation of synaptic plasticity between large networks of neurons” [i.e., the neural network comprises artificial neurons/NLBs with synaptic connections between them], “By iteratively performing this process and looping through all input sets, the weights will be trained … the activation function is trained in parallel with the synaptic weights.”, “the weight of a synapse can be modified to strengthen or weaken a connection”, “by training both the shape of the activation function and the input weights in parallel, this method results in very fast training convergence” [i.e., training a neural network that includes neurons and weights]), the training system … configured to:
receive training data as an input having a known output to the artificial neural network (see, e.g., pages 1598-1600, “This work utilizes a stochastic gradient descent-based training mechanism, implemented using local and global training circuitry. During the training process, the system is given input sets and the expected output vector.” [i.e., receive training data as an input having an expected/known output at training circuitry/system], “provides training inputs to the Weighting and Input Select component” [i.e., training data is received as an input at the neural network], “a global training circuit compares the output to an expected output … the input weights are trained to match the expected output … the input weights are given two training cycles to attempt to match the expected output”, “an adjustable digital value must be associated with each input” [i.e., the training data has an expected/known output]); …
selectively modify activation functions of the artificial neurons in the artificial neural network (see, e.g., Abstract and pages 1597 and 1599, Sect. 3.2, “we propose two NLB designs-robust adaptive NLB (RANLB) and multithreshold NLB (MTNLB) … allowing the effective activation function to be adapted during the training … modify the activation function of individual neurons during the learning process [13], [14]. This work leverages that observation, proposing two hardware implementations of perceptron-based NLBs”, “hardware implementation of a neural logic block with an adaptive activation function” [i.e., selectively adapting/modifying activation functions of neurons in the neural network]) by selecting the activation functions from a sequence of activation functions for each of the artificial neurons in the artificial neural network (see, e.g., pages 1597-1599 and 1601, “modify the activation function of individual neurons during the learning process … This work leverages that observation, proposing two hardware implementations of perceptron-based NLBs, which are capable of learning all logic functions”, “our contributions are as follows: A perceptron-based NLB design with an adaptive activation function. 
A perceptron-based NLB with a static activation function and multiple activation thresholds … An adaptive activation function comprised of m points can implement any function with (m – 1) decision boundaries.” [i.e., selectively modifying/adapting activation functions for each of the artificial neurons/NLBs in the network], “Each design is capable of adapting its effective activation function during training to learn both linearly separable and nonlinearly separable functions”, “all logic functions’ ideal activation functions can be realized by limiting the input current range on a single, static activation function curve.” [i.e., modifying/adapting the activation functions by selecting the activation functions from a separable sequence of effective, ideal activation functions on the activation function curve]).
Although Soltiz substantially discloses the claimed invention, it is not relied on to explicitly disclose
compare the output of the neural network to the known output to determine an error value, 
use the error value to update the weights in the artificial neural network, …
modify activation functions of artificial neurons in the artificial neural network until the activation functions for the artificial neurons are threshold activation functions, wherein selection of the activation functions from the sequence of activation functions enables adaptation of … neural networks to target … neuromorphic hardware; and
transfer the artificial neural network after training to a processing system. 
In the same field, analogous art Soltiz 2012 teaches compare the output of the neural network to the known output to determine an error value (see, e.g., FIG. 1.3 showing comparison of actual output Y to the known/expected output Yexp [i.e., to determine a difference between Yexp and Y, an error value] and Algorithm 1 where output error value E = Yexp - Y, and pages 4 and 21, “Using an error-based training mechanism, it is fairly straightforward to train an individual NLB … a simple error minimization algorithm that can be applied to train a single NLB [17]. During this training process, the NLB is given all possible combination of inputs … coupled with the expected output, Yexp. If the actual output, Y, is different from the expected output”, “Algorithm 1 Training an NLB … while Output error, E > 0 do E ← Yexp – Y” [i.e., compare expected/known output Yexp to the actual output of the neural network, Y, to determine error value E]); 
use the error value to update the weights in the artificial neural network (see, e.g., FIG. 1.3 showing that weights W are updated based on actual output Y not matching expected output Yexp [i.e., using an error value] and Algorithm 1 where error value E is used to update weights wi, and pages 4-5 and 21, “Using an error-based training mechanism, it is fairly straightforward to train an individual NLB to implement different logic functions. Fig. 1.3 outlines the stochastic gradient descent process, a simple error minimization algorithm that can be applied to train a single NLB [17]. During this training process, the NLB is given all possible combination of inputs … coupled with the expected output, Yexp. If the actual output, Y, is different from the expected output, Yexp, the synaptic weights corresponding to the high inputs are adjusted. This process is repeated until all input combinations produce the correct output.”, “Algorithm 1 Training an NLB with an adaptive activation function …

[Image: media_image1.png (greyscale)]

[i.e., use the error value E to update/adjust the weights wi in the network]); … 
modify activation functions of artificial neurons in the artificial neural network until the activation functions for the artificial neurons are threshold activation functions (as indicated above, “threshold activation functions”, under the BRI, in light of the specification, are any activation functions that are modified or adapted (e.g., adaptive activation functions) based on a target or threshold value to be threshold activation functions) (see, e.g., pages 4, 15 and 34, “the functionality of a single neuron with synapses at each input is modeled in a neural logic block (NLB). These NLBs are interconnected in large networks”, “a perceptron-based NLB that utilizes … an adaptive activation function is proposed” [i.e., an adaptive/modifiable activation function of artificial neurons/NLBs in the artificial neural network], “both the activation function and input weights are constantly trained towards a target function during training [i.e., a target threshold activation function]. While the activation function is modified less frequently in an RANLB, this design still contains this feature that decreases training time.” [i.e., modifying activation functions until they are target threshold activation functions]), wherein selection of the activation functions from the sequence of activation functions enables adaptation of … neural networks to target … neuromorphic hardware (see, e.g., pages 34 and 53-54, “both the activation function and input weights are constantly trained towards a target function” [i.e., the target, threshold activation function], “In this work, it is proven that the scalability and efficiency of hardware-based neuromorphic systems can be improved drastically by adding complexity to neural logic block (NLB) designs” [i.e., neuromorphic hardware], “neuromorphic systems have been used for a wide variety of applications, in fields such as pattern recognition, control systems, and signal processing. 
By improving scalability and efficiency in hardware NLB designs, this work opens the door to create hardware based neural networks for these applications” [i.e., selection of the activation functions from the sequence of functions enables creation and adaptation of hardware-based neural networks implemented on target neuromorphic systems/neuromorphic hardware]); and 
 transfer the artificial neural network after training to a processing system (see, e.g., pages 4 and 54, “In biologically-inspired neuromorphic systems, the functionality of a single neuron with synapses at each input is modeled in a neural logic block (NLB). These NLBs are interconnected in large networks to implement the desired functionality for a specific computing application. Using an error-based training mechanism, it is fairly straightforward to train an individual NLB to implement different logic functions. … During this training process, the NLB is given all possible combination of inputs” [i.e., artificial neural network of trained NLBs], “neuromorphic systems have been used for a wide variety of applications, in fields such as pattern recognition, control systems, and signal processing. By improving scalability and efficiency in hardware NLB designs, this work opens the door to create hardware based neural networks for these applications [i.e., a processing system]. Because the training of NLBs is highly parallel, a very significant speedup would be expected in hardware implementations.” [i.e., transfer the neural network of NLBs to a processing system that performs signal processing]). 
Soltiz and Soltiz 2012 are analogous art because they are both related to hardware implementations of neuromorphic learning systems that utilize adaptive activation functions, threshold activation functions, and memristive devices such as memristors/memristor synapses (see, e.g., Soltiz, Abstract and pages 1597-1598 and Soltiz 2012, Abstract and pages iii, 19 and 23).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz to incorporate the teachings of Soltiz 2012 to provide novel designs for neural logic blocks (NLBs) where “Each proposed NLB is capable of learning both linearly separable and nonlinearly separable functions in a single layer” (See, e.g., Soltiz 2012, Abstract and pages iii and 23). Doing so would have allowed Soltiz to use Soltiz 2012’s NLB designs to “overcome the limitations of previous hardware NLBs” by using NLB designs that “are capable of rapidly learning any function in a single layer” and “simplify large-scale neuromorphic systems to improve scalability drastically, but also improve overall energy, delay, and training time by reducing the number of blocks” as suggested by Soltiz 2012 (See, e.g., Soltiz 2012, Abstract and pages iii and 23).
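For illustration only, the error-based training quoted above from Soltiz 2012 (compare actual output Y to expected output Yexp, compute E = Yexp - Y, and adjust the weights corresponding to the high inputs until all input combinations produce the correct output) can be sketched as follows. This is a minimal sketch and not a reproduction of Algorithm 1 of Soltiz 2012: the function names, the integer step size, the bias term, and the fixed threshold of 0 are assumptions introduced here, and the adaptive activation function of the reference is not modeled.

```python
# Illustrative sketch only; names, step size, and bias term are assumptions,
# not taken from Soltiz 2012.

def threshold_activation(net, threshold=0):
    """Fixed threshold (step) activation: output 1 when net exceeds threshold."""
    return 1 if net > threshold else 0


def predict(weights, bias, inputs):
    """Weighted sum of the inputs followed by the threshold activation."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return threshold_activation(net)


def train(samples, n_inputs, step=1, max_epochs=100):
    """Repeat over all input combinations until every output matches Yexp."""
    weights = [0] * n_inputs
    bias = 0
    for _ in range(max_epochs):
        mismatches = 0
        for inputs, y_exp in samples:
            y = predict(weights, bias, inputs)
            error = y_exp - y  # E = Yexp - Y
            if error != 0:
                mismatches += 1
                # Adjust only the weights corresponding to the high inputs.
                for i, x in enumerate(inputs):
                    if x == 1:
                        weights[i] += step * error
                bias += step * error
        if mismatches == 0:
            break
    return weights, bias


# All input combinations of a 2-input AND function with expected outputs.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(AND_SAMPLES, n_inputs=2)
```

The AND function is linearly separable, so a single unit with a fixed threshold suffices in this sketch; the references address nonlinearly separable functions through adaptive or multithreshold activation functions, which are outside the scope of this example.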
Although Soltiz in view of Soltiz 2012 substantially teaches the claimed invention, Soltiz in view of Soltiz 2012 is not relied on to teach propagate the input through the artificial neural network until an output of the artificial neural network is produced,
wherein selection of the activation functions from the sequence of activation functions enables adaptation of deep neural networks.
In the same field, analogous art Yoo teaches propagate the input through the artificial neural network until an output of the artificial neural network is produced (see, e.g., paragraphs 52 and 57, “neural network 100 includes an input layer 110, a hidden (or intermediate) layer 120, and an output layer 130. The input layer 110 receives an input to be used to perform training or recognition and transmits the input to the hidden layer 120 [i.e., propagate/transmit the input through the neural network]. The output layer 130 generates an output of the neural network 100 based on signals (or indicia) received from the hidden layer 120.”, “inputting a training input of training data and a corresponding training output into the neural network 100, and updating connection weights of edges so that output data corresponding to the training output of the training data may be output” [i.e., until an output of the neural network is produced]),
wherein selection of the activation functions from the sequence of activation functions enables adaptation of deep neural networks (as indicated above, “selection of the activation functions from the sequence of activation functions enables adaptation of deep neural networks”, under the BRI, in light of the specification, is selected activation functions that can be used for training or adapting deep neural networks) (see, e.g., paragraphs 55-56, “the neural network may include a plurality of hidden layers (such as seen, for example, in FIG. 5). A neural network including a plurality of hidden layers may be referred to as a deep neural network. Training the deep neural network may be referred to as deep learning … output of a hidden node belonging to the second hidden layer may be connected to hidden nodes belonging to the third hidden layer.”, “the training apparatus … input outputs of previous hidden nodes included in a previous hidden layer into each hidden layer … by applying the connection weights to the outputs of the previous hidden nodes and activation functions” [i.e., selection of the activation functions enables training and adapting of deep neural networks]). 
Alternatively, Yoo also teaches provide training data as an input having a known output to the artificial neural network (see, e.g., paragraphs 52, 57 and 59, “the neural network 100 includes an input layer 110 … a training input of training data received from the input layer 110”, “inputting a training input of training data” [i.e., providing training data as an input to the neural network], “the objective function is a loss function to be used by the neural network 100 to calculate a loss between an actual output value and a value expected to be output with respect to a training input of training data.” [i.e., provide training data as training input having an expected/known output]).
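For illustration only, the propagation described in the cited paragraphs of Yoo (an input layer passing a training input through a plurality of hidden layers to an output layer, with a loss calculated between the actual output and the expected output) can be sketched as follows. This is a minimal sketch under stated assumptions: the layer sizes, the weight values, the ReLU activation, and the squared-error loss are hypothetical choices made here and are not taken from Yoo.

```python
# Illustrative sketch only; weights, ReLU activation, and squared-error loss
# are assumptions, not taken from Yoo.

def relu(value):
    """Rectified linear activation, used here as an assumed activation."""
    return max(0.0, value)


def forward(layers, inputs):
    """Propagate the input through each layer until an output is produced."""
    activations = list(inputs)
    for weights, biases in layers:
        activations = [
            relu(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


def squared_error_loss(actual, expected):
    """Loss between the actual output and the expected (known) output."""
    return sum((a - e) ** 2 for a, e in zip(actual, expected))


# Two hidden layers followed by an output layer (hypothetical values).
LAYERS = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),   # first hidden layer
    ([[1.0, 1.0], [1.0, -1.0]], [0.0, 0.0]),  # second hidden layer
    ([[1.0, 1.0]], [0.0]),                    # output layer
]

output = forward(LAYERS, [1.0, 2.0])
loss = squared_error_loss(output, [3.0])
```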
Although Soltiz in view of Soltiz 2012 and Yoo substantially teaches the claimed invention, Soltiz in view of Soltiz 2012 and Yoo is not relied on to teach a training system comprising a processor unit, 
target reduced precision neuromorphic hardware; and …

a processing system that is a lower power system relative to the training system.
In the same field, analogous art Davies teaches a training system comprising a processor unit configured to (Paragraph 32 of Applicant’s specification states “training system 206 may train artificial neural network 202 using a relatively high-power processor unit 214, such as a graphical processor unit 216.” Therefore, “a processor unit”, under the BRI, is any processing unit such as a central processing unit or a graphical processing unit (i.e., a CPU or GPU)) (see, e.g., paragraphs 34, 37, 190 and 208, “In an example of a spiking neural network, activation functions occur via spike trains”, “Each neuron may be characterized by an activation threshold. A spike message received by a neuron contributes to the activation of the neuron. … in response to the spike message, those destination neurons update their activation levels” [i.e., a training system for a neural network that selectively modifies activation levels/thresholds/functions of neurons in the neural network], “Machine (e.g., computer system) 26000 may include a neuromorphic processor 110, 300, a hardware processor 26002 (e.g., a central processing unit (CPU), a graphics processing unit (GPU),” [i.e., a processor unit/CPU/GPU configured to perform operations of system 26000], “the neuron's present activation state level and that is configured to be updated by the processor … wherein if an updated present activation state level exceeds a threshold activation level value, the processor is configured to generate an output spike event” [i.e., processor is configured to modify activation state levels]),
target reduced precision neuromorphic hardware (as indicated above, “target reduced precision neuromorphic hardware”, under the BRI, in light of the specification, is using selected activation functions in a spiking neural network that can be implemented with neuromorphic hardware, such as a neuromorphic processor) (see, e.g., paragraphs 32-34 and 75, “a neuromorphic processor may be architected … examples and techniques below provide architectures to achieve … a neuromorphic processor. As used herein, references to ‘neural network’ … refer to a ‘spiking neural network’ … references herein to a ‘neuron’ are meant to refer to an artificial neuron in a spiking neural network”, “In an example of a spiking neural network, activation functions occur via spike trains”, “cruder neuroscience models use fewer resources and lower power. The neuromorphic architecture herein supports a very wide spectrum of such choices.” [i.e., selected activation functions used on neuromorphic hardware/processor implementing a spiking neural network - reduced-precision neuromorphic hardware]); and …
a processing system that is a lower power system relative to the training system (see, e.g., paragraphs 43, 86, 71 and 75, “Very large-scale integration (VLSI) design technology, on the other hand, delivers much higher speed and more reliable circuits at the cost of … higher power.”, “algorithmic features such as … dynamic learning may add considerably more state per synapse.” [i.e., a relatively higher power system such as a learning/training system], “the use of the SYNAPSE_MAP 312 indirection allows … the spike payload to be smaller, thereby saving overall area and power” [i.e., processing system with a spiking network saves power], “cruder neuroscience models use fewer resources and lower power. The neuromorphic architecture herein supports … such choices.” [i.e., the processing system with the spiking network implemented on neuromorphic hardware supports/is a lower power system relative to the training system]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz in view of Soltiz 2012 and Yoo to incorporate the teachings of Davies to provide “architectures to achieve … an efficient and fast neuromorphic processor that borrows from the biological model”. (See, e.g., Davies, paragraph 32). Doing so would have allowed Soltiz in view of Soltiz 2012 and Yoo “to implement neural information processing algorithms in the most efficient manner possible using present day design technology,” and to employ the architectures and neuromorphic processor of Davies to provide “a maximally efficient neuromorphic design efficiently [that] supports a range of precisions depending on the problem”, as suggested by Davies. (See, e.g., Davies, paragraphs 47 and 54). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.

Regarding claim 9, as discussed above, Soltiz in view of Soltiz 2012, Yoo and Davies teaches the system of claim 8.
Although Soltiz in view of Soltiz 2012 substantially teaches the claimed invention, Soltiz in view of Soltiz 2012 is not relied on to teach wherein the processor unit is a graphical processor unit.
In the same field, analogous art Davies teaches wherein the processor unit is a graphical processor unit (see, e.g., paragraph 190, “Machine (e.g., computer system) 26000 may include a neuromorphic processor 110, 300, a hardware processor 26002 (e.g., a central processing unit (CPU), a graphics processing unit (GPU),” [i.e., the processor unit is a GPU]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz in view of Soltiz 2012 to incorporate the teachings of Davies to provide “architectures to achieve … an efficient and fast neuromorphic processor that borrows from the biological model”. (See, e.g., Davies, paragraph 32). Doing so would have allowed Soltiz in view of Soltiz 2012 “to implement neural information processing algorithms in the most efficient manner possible using present day design technology,” and to employ the architectures and neuromorphic processor of Davies to provide “a maximally efficient neuromorphic design efficiently [that] supports a range of precisions depending on the problem”, as suggested by Davies. (See, e.g., Davies, paragraphs 47 and 54). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.

Regarding claim 10, as discussed above, Soltiz in view of Soltiz 2012, Yoo and Davies teaches the system of claim 8.
Although Soltiz substantially discloses the claimed invention, it is not relied on to explicitly disclose wherein the processor unit is configured to train a … neural network for implementation on neuromorphic hardware.
In the same field, analogous art Soltiz 2012 teaches wherein the processor unit is configured to train a … neural network for implementation on neuromorphic hardware (see, e.g., pages 4, 9, 23, 30 and 53, “In … neuromorphic systems, the functionality of a single neuron with synapses at each input is modeled in a neural logic block (NLB). These NLBs are interconnected in large networks [i.e., a neural network] … During this training process, the NLB is given all possible combination of inputs”, “the complexity of training logic is also likely to be increased when considering a full neuromorphic system.”, “The ability of memristors to accurately emulate biological synapses with a single passive device makes them very appealing for the hardware implementation of neuromorphic systems. … NLBs not only simplify large-scale neuromorphic systems to improve scalability drastically, but also improve overall energy, delay, and training time”, “In some large neural networks … , the ability of a logic block to learn nonlinearly separable functions is crucial to the scalability of a neuromorphic system”, “In this work, it is proven that the scalability and efficiency of hardware-based neuromorphic systems can be improved drastically by adding complexity to neural logic block (NLB) designs.” [i.e., training the neural network of NLBs/neurons for implementation on neuromorphic hardware]).
Soltiz and Soltiz 2012 are analogous art because they are both related to hardware implementations of neuromorphic learning systems that utilize adaptive activation functions, threshold activation functions, and memristive devices such as memristors/memristor synapses (see, e.g., Soltiz, Abstract and pages 1597-1598 and Soltiz 2012, Abstract and pages iii, 19 and 23).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz to incorporate the teachings of Soltiz 2012 to provide novel designs for neural logic blocks (NLBs) where “Each proposed NLB is capable of learning both linearly separable and nonlinearly separable functions in a single layer” (See, e.g., Soltiz 2012, Abstract and pages iii and 23). Doing so would have allowed Soltiz to use Soltiz 2012’s NLB designs to “overcome the limitations of previous hardware NLBs” by using NLB designs that “are capable of rapidly learning any function in a single layer” and “simplify large-scale neuromorphic systems to improve scalability drastically, but also improve overall energy, delay, and training time by reducing the number of blocks” as suggested by Soltiz 2012 (See, e.g., Soltiz 2012, Abstract and pages iii and 23).
Although Soltiz in view of Soltiz 2012 and Davies substantially teaches the claimed invention, Soltiz in view of Soltiz 2012 and Davies is not relied on to teach wherein the processor unit is configured to train a deep neural network for implementation on neuromorphic hardware. 
In the same field, analogous art Yoo teaches wherein the processor unit is configured to train a deep neural network for implementation on neuromorphic hardware (see, e.g., paragraphs 3 and 55-56, “a neuromorphic processor modelling a number of synapse-connected neurons, that models characteristics of biological nerve cells” [i.e., implementation on neuromorphic hardware], “A neural network including a plurality of hidden layers may be referred to as a deep neural network. Training the deep neural network may be referred to as deep learning. … a result of the activation functions needs to exceed a threshold of a current hidden node. In this example, a node maintains a deactivated state without firing (or sending) a signal to a next node until a predetermined threshold strength of activation is reached through input vectors.” [i.e., training a deep neural network]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz in view of Soltiz 2012 and Davies to incorporate the teachings of Yoo to provide “a training method for neural network recognition, the training method including obtaining a first neural network that includes a plurality of layers and a second neural network that includes a layer connected to the first neural network” and to provide a “method and apparatus for performing complex recognition … based on a neural network 100, and a method and apparatus for training the neural network” where the first “neural network 100 may also be referred to as an artificial neural network” and the first “neural network may include a plurality of hidden layers … a deep neural network”. (See, e.g., Yoo, paragraphs 15, 47, 50 and 55). Doing so would have allowed Soltiz in view of Soltiz 2012 and Davies to use Yoo’s “recognition apparatus [that] is able to recognize multiple tasks corresponding to purposes using the first neural network … and the second neural network” in order to perform “a first task, for example, face recognition, corresponding to the first purpose, thereby improving a recognition rate of the first task”, as suggested by Yoo. (See, e.g., Yoo, paragraph 75). 

Regarding claim 12, as discussed above, Soltiz in view of Soltiz 2012, Yoo and Davies teaches the system of claim 8.
Although Soltiz substantially discloses the claimed invention, it is not relied on to explicitly disclose wherein each of the activation functions in the sequence of activation functions is piecewise differentiable.
In the same field, analogous art Soltiz 2012 teaches wherein each of the activation functions in the sequence of activation functions is piecewise differentiable (as indicated above, “activation functions” that “are piecewise differentiable”, under the BRI, in light of the specification, are any piecewise activation functions that can be differentiated from each other) (see, e.g., Table 3.1 showing two NLBs/neurons that each have a differentiable “Piecewise adaptive activation function.” and pages 20, 27 and 33, “the activation function is modeled as a piecewise continuous function comprised of the interpolation between m points”, “An adaptive activation function has previously been modeled as a piecewise continuous function, represented as the interpolation of m points, each of which has a floating point value ranging from 0.0 to 1.0.”, “the activation function is represented as a piecewise function consisting of m ranges, each of which can be trained to a value ranging from 0 to Vdd.” [i.e., each of the adaptive/modifiable activation functions is a piecewise function that can be differentiated from other such functions]).
Soltiz and Soltiz 2012 are analogous art because they are both related to hardware implementations of neuromorphic learning systems that utilize adaptive activation functions, threshold activation functions, and memristive devices such as memristors/memristor synapses (see, e.g., Soltiz, Abstract and pages 1597-1598 and Soltiz 2012, Abstract and pages iii, 19 and 23).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz to incorporate the teachings of Soltiz 2012 to provide novel designs for neural logic blocks (NLBs) where “Each proposed NLB is capable of learning both linearly separable and nonlinearly separable functions in a single layer” (See, e.g., Soltiz 2012, Abstract and pages iii and 23). Doing so would have allowed Soltiz to use Soltiz 2012’s NLB designs to “overcome the limitations of previous hardware NLBs” by using NLB designs that “are capable of rapidly learning any function in a single layer” and “simplify large-scale neuromorphic systems to improve scalability drastically, but also improve overall energy, delay, and training time by reducing the number of blocks” as suggested by Soltiz 2012 (See, e.g., Soltiz 2012, Abstract and pages iii and 23).
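For illustration only, an activation function of the kind quoted above from Soltiz 2012 (“a piecewise continuous function, represented as the interpolation of m points, each of which has a floating point value ranging from 0.0 to 1.0”) can be sketched as follows. This is a minimal sketch under stated assumptions: linear interpolation, equal spacing of the m points over a normalized input range of 0.0 to 1.0, and the function and variable names are hypothetical choices made here and are not taken from the reference.

```python
# Illustrative sketch only; linear interpolation, equal spacing, and the
# normalized input range are assumptions, not taken from Soltiz 2012.

def piecewise_activation(points, x):
    """Evaluate a piecewise-linear activation defined by m point values.

    points: m values in [0.0, 1.0], assumed equally spaced over x in [0, 1].
    x: normalized input; values outside [0, 1] are clamped.
    """
    m = len(points)
    if m < 2:
        raise ValueError("at least two points are required")
    x = min(max(x, 0.0), 1.0)
    position = x * (m - 1)            # location measured in segment units
    index = min(int(position), m - 2)  # segment containing x
    fraction = position - index        # offset within that segment
    return points[index] + fraction * (points[index + 1] - points[index])


# Three points give two linear segments (m - 1 = 2).
POINTS = [0.0, 1.0, 0.0]
midpoint_value = piecewise_activation(POINTS, 0.25)
```

With the three point values above, the function rises linearly to 1.0 at the middle point and falls back to 0.0; each linear segment is differentiable in its interior, and changing the point values changes the shape of the function.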

Regarding claim 13, as discussed above, Soltiz in view of Soltiz 2012, Yoo and Davies teaches the system of claim 8.
Soltiz further discloses wherein the processor unit is configured to selectively modify first activation functions of first artificial neurons in a first layer of the artificial neural network (see, e.g., Abstract and pages 1597-1599 and 1601, Sect. 3.2, “we propose two NLB designs … allowing the effective activation function to be adapted during the training … modify the activation function of individual neurons during the learning process [13], [14]. This work leverages that observation, proposing two hardware implementations of perceptron-based NLBs”, “Each design is capable of adapting its effective activation function … in a single layer.” [i.e., adapt/modify first activation functions for first NLBs/neurons in a single, first layer], “our contributions are as follows: A perceptron-based NLB design with an adaptive activation function. A perceptron-based NLB with a static activation function and multiple activation thresholds … the NLBs use a perceptron model with a threshold activation function … we proposed a hardware NLB that integrates an adaptive activation function into a perceptron with memristive synapses.”, “hardware implementation of a neural logic block with an adaptive activation function”, “The use of this multithreshold activation function scheme has proven to give a neural logic block [NLB] the ability to learn nonlinearly separable functions in a single layer” [i.e., selectively adapt/modify activation functions of NLBs/neurons in a single, first layer of the neural network]) and then selectively modifying second activation functions of second artificial neurons in a second layer of the artificial neural network only after all of the first activation functions for the first artificial neurons in the first layer are … activation functions (see, e.g., Table 2 – showing “Reconfigurable Logic Block [neuron] implementations” where “11 blocks [including second neurons] connected in 3 layers are required to learn the worst-case 4-input XOR function” 
[i.e., 3 layers – including a first and second layer of the neural network] and pages 1597-1600 and 1602, “Non-linearly separable functions … must be implemented in multiple layers of NLBs” [i.e., including a first and second layer of the neural network with first and second neurons/NLBs], “Since the NLBs use a perceptron model with a threshold activation function, multilayer logic is required to implement nonlinearly separable functions”, “we proposed a hardware NLB that integrates an adaptive activation function into a perceptron with memristive synapses. Using multiple thresholds, the input current is broken up into ranges. Then, a second layer of trainable memristors is used to adjust the output associated with each range.”, “As we show in [20], the adaptive activation function can be represented using an additional layer of memristors” [i.e., then modify separable activation functions for second NLBs/memristors/neurons in a second layer of the network after the first NLBs/neurons in the first layer are activation functions]).
Although Soltiz substantially discloses the claimed invention, it is not relied on to explicitly disclose that the activation functions for the artificial neurons in the first layer are threshold activation functions.
In the same field, analogous art Soltiz 2012 teaches the activation functions for the artificial neurons in the first layer are threshold activation functions (as indicated above, “threshold activation functions”, under the BRI, are any activation functions that are modified or adapted (e.g., adaptive activation functions) based on a target or threshold value to be threshold activation functions) (see, e.g., pages 4, 8, 15, 23 and 34, “the functionality of a single neuron with synapses at each input is modeled in a neural logic block (NLB). These NLBs are interconnected in large networks”, “To overcome the limitations on the learnable set of functions for a single NLB, perceptron-based systems generally implement non-linearly separable functions in multiple layers.” [i.e., including a first layer], “hardware implementations of perceptron-based NLBs, which are capable of learning all logic functions in a single layer [i.e., first NLBs/neurons in the first layer]. … a perceptron-based NLB that utilizes a second layer of memristors to represent an adaptive activation function is proposed.” [i.e., an adaptive/modifiable activation function of artificial neurons/NLBs in layers of the artificial neural network], “Each proposed NLB is capable of learning both linearly separable and nonlinearly separable functions in a single layer” [i.e., the first layer], “both the activation function and input weights are constantly trained towards a target function during training [i.e., a target, threshold activation function]. While the activation function is modified less frequently in an RANLB, this design still contains this feature that decreases training time.”, “algorithm is capable of learning any logic function rapidly in a single layer.” [i.e., the activation functions for the first NLBs/neurons in the single, first layer are modified to be threshold activation functions]). 
Soltiz and Soltiz 2012 are analogous art because they are both related to hardware implementations of neuromorphic learning systems that utilize adaptive activation functions, threshold activation functions, and memristive devices such as memristors/memristor synapses (see, e.g., Soltiz, Abstract and pages 1597-1598 and Soltiz 2012, Abstract and pages iii, 19 and 23).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz to incorporate the teachings of Soltiz 2012 to provide novel designs for neural logic blocks (NLBs) where “Each proposed NLB is capable of learning both linearly separable and nonlinearly separable functions in a single layer” (See, e.g., Soltiz 2012, Abstract and pages iii and 23). Doing so would have allowed Soltiz to use Soltiz 2012’s NLB designs to “overcome the limitations of previous hardware NLBs” by using NLB designs that “are capable of rapidly learning any function in a single layer” and “simplify large-scale neuromorphic systems to improve scalability drastically, but also improve overall energy, delay, and training time by reducing the number of blocks” as suggested by Soltiz 2012 (See, e.g., Soltiz 2012, Abstract and pages iii and 23).

With respect to independent claim 16, Soltiz discloses the invention as claimed including an artificial neural network … training … system (see, e.g., pages 1597-1598 and 1600-1601, “Neural logic blocks (NLBs) enable the realization of biologically inspired reconfigurable hardware [i.e., artificial neurons in a hardware platform/system]. Networks of NLBs can be trained” [i.e., a network of NLBs is an artificial neural network of artificial neurons that is trained], “During the training process, the system is given input sets and the expected output vector.”, “training hardware to provide the system with the ability to adjust input weights” [i.e., a training system], “the adaptive activation function provides several benefits. First of all, by training both the shape of the activation function and the input weights in parallel, this method results in very fast training convergence times”, “the ability of a logic block to learn nonlinearly separable functions is crucial to the scalability of a neural network system” [i.e., an artificial neural network training system]), comprising:
a training system configured to train an artificial neural network (see, e.g., Abstract and pages 1597-1598 and 1600-1601, “Neural logic blocks (NLBs) enable the realization of biologically inspired reconfigurable hardware [i.e., artificial neurons on a hardware platform/system]. Networks of NLBs can be trained” [i.e., a network of NLBs is an artificial neural network of trainable artificial neurons running on a system], “During the training process, the system is given input sets and the expected output vector.”, “training hardware to provide the system with the ability to adjust input weights” [i.e., a training system], “the adaptive activation function provides several benefits. First of all, by training both the shape of the activation function and the input weights in parallel, this method results in very fast training convergence times”, “the ability of a logic block to learn nonlinearly separable functions is crucial to the scalability of a neural network system” [i.e., configured to train an artificial neural network]) comprising artificial neurons and weights into a trained artificial neural network (see, e.g., pages 1597-1600, “hardware emulation of synaptic plasticity between large networks of neurons” [i.e., the neural network comprises artificial neurons/NLBs with synaptic connections between them], “By iteratively performing this process and looping through all input sets, the weights will be trained … the activation function is trained in parallel with the synaptic weights.”, “the weight of a synapse can be modified to strengthen or weaken a connection”, “by training both the shape of the activation function and the input weights in parallel, this method results in very fast training convergence” [i.e., training a neural network that includes neurons and weights into a trained network]),
provide training data as an input having a known output to the artificial neural network (see, e.g., pages 1599-1600, “provides training inputs to the Weighting and Input Select component” [i.e., providing training data as an input to the neural network], “a global training circuit compares the output to an expected output … the input weights are trained to match the expected output … the input weights are given two training cycles to attempt to match the expected output”, “an adjustable digital value must be associated with each input” [i.e., the training data has an expected/known output]); … and
selectively modify activation functions of the artificial neurons in the artificial neural network (see, e.g., Abstract and pages 1597 and 1599, Sect. 3.2, “we propose two NLB designs-robust adaptive NLB (RANLB) and multithreshold NLB (MTNLB) … allowing the effective activation function to be adapted during the training … modify the activation function of individual neurons during the learning process [13], [14]. This work leverages that observation, proposing two hardware implementations of perceptron-based NLBs”, “hardware implementation of a neural logic block with an adaptive activation function” [i.e., selectively adapting/modifying activation functions of neurons in the neural network]) by selecting the activation functions from a sequence of activation functions for each of the artificial neurons in the artificial neural network (see, e.g., pages 1597-1599 and 1601, “modify the activation function of individual neurons during the learning process … This work leverages that observation, proposing two hardware implementations of perceptron-based NLBs, which are capable of learning all logic functions”, “our contributions are as follows: A perceptron-based NLB design with an adaptive activation function. 
A perceptron-based NLB with a static activation function and multiple activation thresholds … An adaptive activation function comprised of m points can implement any function with (m – 1) decision boundaries.” [i.e., selectively modifying/adapting activation functions for each of the artificial neurons/NLBs in the network], “Each design is capable of adapting its effective activation function during training to learn both linearly separable and nonlinearly separable functions”, “all logic functions’ ideal activation functions can be realized by limiting the input current range on a single, static activation function curve.” [i.e., modifying/adapting the activation functions by selecting the activation functions from a separable sequence of effective, ideal activation functions on the activation function curve]).
Although Soltiz substantially discloses the claimed invention, it is not relied on to explicitly disclose an artificial neural network hybrid training and processing system, … 
compare the output of the neural network to the known output to determine an error value, 
use the error value to update the weights in the artificial neural network, and …
modify activation functions of the artificial neurons in the artificial neural network … until the activation functions for the artificial neurons are threshold activation functions, wherein selection of the activation functions from the sequence of activation functions enables adaptation of … neural networks to target … neuromorphic hardware; and
transfer the trained artificial neural network after training to a processing system comprising a spiking network implemented on neuromorphic hardware; and the processing system comprising the neuromorphic hardware, wherein the processing system is configured to use the artificial neural network to process input data, wherein the processing system is lower power relative to the training system.
In the same field, analogous art Soltiz 2012 teaches an artificial neural network hybrid training and processing system (see, e.g., pages 4 and 54, “In biologically-inspired neuromorphic systems, the functionality of a single neuron with synapses at each input is modeled in a neural logic block (NLB). These NLBs are interconnected in large networks to implement the desired functionality for a specific computing application. Using an error-based training mechanism, … During this training process, the NLB is given all possible combination of inputs” [i.e., a neuromorphic system performs input/data processing for a specific computing application and training], “neuromorphic systems have been used for a wide variety of applications, in fields such as pattern recognition, control systems, and signal processing. By improving scalability and efficiency in hardware NLB designs, this work opens the door to create hardware based neural networks for these applications [i.e., including processing applications]. Because the training of NLBs is highly parallel, a very significant speedup would be expected in hardware implementations.” [i.e., hardware implementations of neuromorphic systems include hybrid systems that perform training and processing – input signal processing]) ... 
compare the output of the neural network to the known output to determine an error value (see, e.g., FIG. 1.3 showing comparison of actual output Y to the known/expected output Yexp [i.e., to determine a difference between Yexp and Y, an error value] and Algorithm 1 where output error value E = Yexp - Y, and pages 4 and 21, “Using an error-based training mechanism, it is fairly straightforward to train an individual NLB … a simple error minimization algorithm that can be applied to train a single NLB [17]. During this training process, the NLB is given all possible combination of inputs … coupled with the expected output, Yexp. If the actual output, Y, is different from the expected output”, “Algorithm 1 Training an NLB … while Output error, E > 0 do E ← Yexp – Y” [i.e., compare expected/known output Yexp to the actual output of the neural network, Y, to determine error value E]); 
use the error value to update the weights in the artificial neural network (see, e.g., FIG. 1.3 showing that weights W are updated based on actual output Y not matching expected output Yexp [i.e., using an error value] and Algorithm 1 where error value E is used to update weights wi, and pages 4-5 and 21, “Using an error-based training mechanism, it is fairly straightforward to train an individual NLB to implement different logic functions. Fig. 1.3 outlines the stochastic gradient descent process, a simple error minimization algorithm that can be applied to train a single NLB [17]. During this training process, the NLB is given all possible combination of inputs … coupled with the expected output, Yexp. If the actual output, Y, is different from the expected output, Yexp, the synaptic weights corresponding to the high inputs are adjusted. This process is repeated until all input combinations produce the correct output.”, “Algorithm 1 Training an NLB with an adaptive activation function …

[Image omitted: media_image1.png]”

[i.e., use the error value E to update/adjust the weights wi in the network]); modify activation functions of the artificial neurons in the artificial neural network … until the activation functions for the artificial neurons are threshold activation functions (as indicated above, “threshold activation functions”, under the BRI, are any activation functions that are modified or adapted (e.g., adaptive activation functions) based on a target or threshold value to be threshold activation functions) (see, e.g., pages 4, 15 and 34, “the functionality of a single neuron with synapses at each input is modeled in a neural logic block (NLB). These NLBs are interconnected in large networks”, “a perceptron-based NLB that utilizes … an adaptive activation function is proposed” [i.e., an adaptive/modifiable activation function of artificial neurons/NLBs in the artificial neural network], “both the activation function and input weights are constantly trained towards a target function during training [i.e., a target threshold activation function]. 
While the activation function is modified less frequently in an RANLB, this design still contains this feature that decreases training time.” [i.e., modifying activation functions until they are target threshold activation functions]), wherein selection of the activation functions from the sequence of activation functions enables adaptation of … neural networks to target … neuromorphic hardware (see, e.g., pages 34 and 53-54, “both the activation function and input weights are constantly trained towards a target function” [i.e., the target, threshold activation function], “In this work, it is proven that the scalability and efficiency of hardware-based neuromorphic systems can be improved drastically by adding complexity to neural logic block (NLB) designs” [i.e., neuromorphic hardware], “neuromorphic systems have been used for a wide variety of applications, in fields such as pattern recognition, control systems, and signal processing. By improving scalability and efficiency in hardware NLB designs, this work opens the door to create hardware based neural networks for these applications” [i.e., selection of the activation functions from the sequence of functions enables creation and adaptation of hardware-based neural networks implemented on target neuromorphic systems/neuromorphic hardware]); and
transfer the trained artificial neural network after training to a processing system comprising … [a] network implemented on neuromorphic hardware; and the processing system comprising the neuromorphic hardware, wherein the processing system is configured to use the artificial neural network to process input data (see, e.g., pages 4, 14-15 and 53-54, “In biologically-inspired neuromorphic systems, the functionality of a single neuron with synapses at each input is modeled in a neural logic block (NLB). These NLBs are interconnected in large networks to implement the desired functionality for a specific computing application. Using an error-based training mechanism, it is fairly straightforward to train an individual NLB to implement different logic functions. … During this training process, the NLB is given all possible combination of inputs” [i.e., trained artificial neural network of trained NLBs after training process], “To truly see the benefits of hardware implementations of neuromorphic learning systems, it is critical to develop an NLB with both efficient synapse implementations and a neuron implementation that is capable of learning”, “In this work, it is proven that the scalability and efficiency of hardware-based neuromorphic systems can be improved drastically by adding complexity to neural logic block (NLB) designs” [i.e., a processing/neuromorphic system implemented on neuromorphic hardware and including the neuromorphic hardware], “neuromorphic systems have been used for a wide variety of applications, in fields such as pattern recognition, control systems, and signal processing. By improving scalability and efficiency in hardware NLB designs, this work opens the door to create hardware based neural networks for these applications [i.e., the processing system including a network implemented on hardware]. 
Because the training of NLBs is highly parallel, a very significant speedup would be expected in hardware implementations.” [i.e., transferring the trained network of NLBs to hardware implementations of neuromorphic systems that include the processing system configured to use the network to perform signal processing by processing input signals/data]).
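For illustration only, the error-based weight update quoted above from Soltiz 2012 (E = Yexp - Y, with the weights of the high inputs adjusted until all input combinations produce the correct output) can be sketched as follows. This is a simplified sketch and is not a reproduction of Algorithm 1 of Soltiz 2012: the function names, the fixed threshold of 1.0, the learning rate of 0.25 and the epoch limit are assumptions introduced here, and the activation function is held fixed rather than trained in parallel with the weights.

```python
from itertools import product


def threshold_output(weights, inputs, threshold=1.0):
    """Return 1 if the weighted input sum reaches the (assumed) fixed threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def train_block(target, n_inputs=2, rate=0.25, threshold=1.0, max_epochs=100):
    """Train one block on every binary input combination (hypothetical helper)."""
    weights = [0.0] * n_inputs
    patterns = list(product((0, 1), repeat=n_inputs))
    for _ in range(max_epochs):
        any_error = False
        for inputs in patterns:
            y_exp = target(inputs)                      # expected output, Yexp
            y = threshold_output(weights, inputs, threshold)  # actual output, Y
            error = y_exp - y                           # E = Yexp - Y
            if error != 0:
                any_error = True
                # Only weights on high inputs (x == 1) change, since x == 0 contributes nothing.
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
        if not any_error:
            return weights                              # every combination is now correct
    raise RuntimeError("did not converge within max_epochs")


# Usage: learn a two-input AND, a linearly separable function.
and_weights = train_block(lambda bits: int(all(bits)))
```

A fixed-threshold block of this kind can only learn linearly separable functions; the adaptive activation function discussed in the cited references is what is said to remove that limitation, and it is not modeled here.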
Soltiz and Soltiz 2012 are analogous art because they are both related to hardware implementations of neuromorphic learning systems that utilize adaptive activation functions, threshold activation functions, and memristive devices such as memristors/memristor synapses (see, e.g., Soltiz, Abstract and pages 1597-1598 and Soltiz 2012, Abstract and pages iii, 19 and 23).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz to incorporate the teachings of Soltiz 2012 to provide novel designs for neural logic blocks (NLBs) where “Each proposed NLB is capable of learning both linearly separable and nonlinearly separable functions in a single layer” (See, e.g., Soltiz 2012, Abstract and pages iii and 23). Doing so would have allowed Soltiz to use Soltiz 2012’s NLB designs to “overcome the limitations of previous hardware NLBs” by using NLB designs that “are capable of rapidly learning any function in a single layer” and “simplify large-scale neuromorphic systems to improve scalability drastically, but also improve overall energy, delay, and training time by reducing the number of blocks” as suggested by Soltiz 2012 (See, e.g., Soltiz 2012, Abstract and pages iii and 23).
Although Soltiz in view of Soltiz 2012 substantially teaches the claimed invention, Soltiz in view of Soltiz 2012 is not relied on to teach propagate the input through the artificial neural network until an output of the artificial neural network is produced,
wherein selection of the activation functions from the sequence of activation functions enables adaptation of deep neural networks.
In the same field, analogous art Yoo teaches propagate the input through the artificial neural network until an output of the artificial neural network is produced (see, e.g., paragraphs 52 and 57, “neural network 100 includes an input layer 110, a hidden (or intermediate) layer 120, and an output layer 130. The input layer 110 receives an input to be used to perform training or recognition and transmits the input to the hidden layer 120 [i.e., propagate/transmit the input through the neural network]. The output layer 130 generates an output of the neural network 100 based on signals (or indicia) received from the hidden layer 120.”, “inputting a training input of training data and a corresponding training output into the neural network 100, and updating connection weights of edges so that output data corresponding to the training output of the training data may be output” [i.e., until an output of the neural network is produced]),
wherein selection of the activation functions from the sequence of activation functions enables adaptation of deep neural networks (as indicated above, “selection of the activation functions from the sequence of activation functions enables adaptation of deep neural networks”, under the BRI, in light of the specification, is selected activation functions that can be used for training or adapting deep neural networks) (see, e.g., paragraphs 55-56, “the neural network may include a plurality of hidden layers (such as seen, for example, in FIG. 5). A neural network including a plurality of hidden layers may be referred to as a deep neural network. Training the deep neural network may be referred to as deep learning … output of a hidden node belonging to the second hidden layer may be connected to hidden nodes belonging to the third hidden layer.”, “the training apparatus … input outputs of previous hidden nodes included in a previous hidden layer into each hidden layer … by applying the connection weights to the outputs of the previous hidden nodes and activation functions” [i.e., selection of the activation functions enables training and adapting of deep neural networks]). 
Alternatively, Yoo also teaches provide training data as an input having a known output to the artificial neural network (see, e.g., paragraphs 52, 57 and 59, “the neural network 100 includes an input layer 110 … a training input of training data received from the input layer 110”, “inputting a training input of training data” [i.e., providing training data as an input to the neural network], “the objective function is a loss function to be used by the neural network 100 to calculate a loss between an actual output value and a value expected to be output with respect to a training input of training data.” [i.e., provide training data as training input having an expected/known output]) and transfer the trained artificial neural network to neuromorphic hardware; and the neuromorphic hardware configured to use the artificial neural network to process input data (see, e.g., paragraphs 3 and 55-56, “a neuromorphic processor modelling a number of synapse-connected neurons, that models characteristics of biological nerve cells” [i.e., implementation on neuromorphic hardware], “A neural network including a plurality of hidden layers may be referred to as a deep neural network. Training the deep neural network may be referred to as deep learning. … a result of the activation functions needs to exceed a threshold of a current hidden node. In this example, a node maintains a deactivated state without firing (or sending) a signal to a next node until a predetermined threshold strength of activation is reached through input vectors.” [i.e., training a deep neural network on neuromorphic hardware]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz in view of Soltiz 2012 to incorporate the teachings of Yoo to provide “a training method for neural network recognition, the training method including obtaining a first neural network that includes a plurality of layers and a second neural network that includes a layer connected to the first neural network” and to provide a “method and apparatus for performing complex recognition … based on a neural network 100, and a method and apparatus for training the neural network” where the first “neural network 100 may also be referred to as an artificial neural network” and the first “neural network may include a plurality of hidden layers … a deep neural network”. (See, e.g., Yoo, paragraphs 15, 47, 50 and 55). Doing so would have allowed Soltiz in view of Soltiz 2012 to utilize Yoo’s “recognition apparatus [which] is able to recognize multiple tasks corresponding to purposes using the first neural network … and the second neural network” in order to perform “a first task, for example, face recognition, corresponding to the first purpose, thereby improving a recognition rate of the first task”, as suggested by Yoo (See, e.g., Yoo, paragraph 75). 
Although Soltiz in view of Soltiz 2012 and Yoo substantially teaches the claimed invention, Soltiz in view of Soltiz 2012 and Yoo is not relied on to teach target reduced precision neuromorphic hardware; and …
a processing system comprising a spiking network implemented on neuromorphic hardware; and
the processing system comprising the neuromorphic hardware, … wherein the processing system is a lower power system relative to the training system.
In the same field, analogous art Davies teaches target reduced precision neuromorphic hardware (as indicated above, “target reduced precision neuromorphic hardware”, under the BRI, in light of the specification, is using selected activation functions in a spiking neural network that can be implemented with neuromorphic hardware, such as a neuromorphic processor) (see, e.g., paragraphs 32-34 and 75, “a neuromorphic processor may be architected … examples and techniques below provide architectures to achieve … a neuromorphic processor. As used herein, references to ‘neural network’ … refer to a ‘spiking neural network’ … references herein to a ‘neuron’ are meant to refer to an artificial neuron in a spiking neural network”, “In an example of a spiking neural network, activation functions occur via spike trains”, “cruder neuroscience models use fewer resources and lower power. The neuromorphic architecture herein supports a very wide spectrum of such choices.” [i.e., selected activation functions used on neuromorphic hardware/processor implementing a spiking neural network - reduced-precision neuromorphic hardware]); and …
a processing system comprising a spiking network implemented on neuromorphic hardware; and the processing system comprising the neuromorphic hardware (see, e.g., paragraphs 32-34, “a neuromorphic processor may be architected. It is, however, desirable to create an efficient and fast neuromorphic processor … The examples and techniques below provide architectures to achieve just such a neuromorphic processor. As used herein, references to ‘neural network’ for at least some examples is specifically meant to refer to a ‘spiking neural network’; thus, many references herein to a ‘neuron’ are meant to refer to an artificial neuron in a spiking neural network” [i.e., a processing system including a spiking neural network implemented on neuromorphic hardware/processor and including the neuromorphic hardware/processor]), … wherein the processing system is a lower power system relative to the training system (see, e.g., paragraphs 43, 86, 71 and 75, “Very large-scale integration (VLSI) design technology, on the other hand, delivers much higher speed and more reliable circuits at the cost of … higher power.”, “algorithmic features such as … dynamic learning may add considerably more state per synapse.” [i.e., a relatively higher power system such as a learning/training system], “the use of the SYNAPSE_MAP 312 indirection allows … the spike payload to be smaller, thereby saving overall area and power” [i.e., processing system with a spiking network saves power], “cruder neuroscience models use fewer resources and lower power. The neuromorphic architecture herein supports … such choices.” [i.e., the processing system with the spiking network implemented on neuromorphic hardware supports/is a lower power system relative to the training system]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz in view of Soltiz 2012 and Yoo to incorporate the teachings of Davies to provide “architectures to achieve … an efficient and fast neuromorphic processor that borrows from the biological model”. (See, e.g., Davies, paragraph 32). Doing so would have allowed Soltiz in view of Soltiz 2012 and Yoo “to implement neural information processing algorithms in the most efficient manner possible using present day design technology,” and to employ the architectures and neuromorphic processor of Davies to provide “a maximally efficient neuromorphic design efficiently [that] supports a range of precisions depending on the problem”, as suggested by Davies. (See, e.g., Davies, paragraphs 47 and 54). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.

Regarding claim 20, as discussed above, Soltiz in view of Soltiz 2012, Yoo and Davies teaches the system of claim 16.
Soltiz further discloses wherein the neuromorphic hardware is configured to use the trained artificial network to process image data to determine a classification of an image defined by the image data (see, e.g., pages 1597 and 1604-1605, Section 4.2, “Large networks of NLBs are able to learn complex functions and are applicable to problems in image recognition” [i.e., using the trained neural network], “Neuromorphic systems are commonly considered for OCR applications because their learning capability is appealing for developing a model of each individual character based on the classification of previous images … The OCR system is able to definitively classify all of the test images”, “each NLB could analyze a random set of pixels in the image … This OCR system is presented as a proof-of-concept to show the benefits of the proposed NLBs in a common application domain.” [i.e., the neuromorphic processing system is configured to use the trained neural network of NLBs to process image data to determine a classification of an image as a certain character defined by the image data]). 

Regarding claims 21 and 25, as discussed above, Soltiz in view of Soltiz 2012, Yoo and Davies teaches the method of claim 1 and the system of claim 16.
Soltiz further discloses wherein the weights comprise:
weights for the artificial neurons;
weights for connections between the artificial neurons; or
weights for the artificial neurons and weights for the connections between the artificial neurons (see, e.g., pages 1598-1599, “the activation function is trained in parallel with the synaptic weights”, “a memristor can be thought of as a synapse with an input of V, an output of I, and a weight of 1/M … the weight of a synapse can be modified to strengthen or weaken a connection” [i.e., the weights include weights for connections between NLBs/artificial neurons]).

Regarding claim 26, as discussed above, Soltiz in view of Soltiz 2012, Yoo and Davies teaches the system of claim 16.
Although Soltiz substantially discloses the claimed invention, it is not relied on to explicitly disclose wherein each of the activation functions in the sequence of activation functions is piecewise differentiable.
In the same field, analogous art Soltiz 2012 teaches wherein each of the activation functions in the sequence of activation functions is piecewise differentiable (as indicated above, “activation functions” that “are piecewise differentiable”, under the BRI, in light of the specification, are any piecewise activation functions that can be differentiated from each other) (see, e.g., Table 3.1 showing two NLBs/neurons that each have a differentiable “Piecewise adaptive activation function.” and pages 20, 27 and 33, “the activation function is modeled as a piecewise continuous function comprised of the interpolation between m points”, “An adaptive activation function has previously been modeled as a piecewise continuous function, represented as the interpolation of m points, each of which has a floating point value ranging from 0.0 to 1.0.”, “the activation function is represented as a piecewise function consisting of m ranges, each of which can be trained to a value ranging from 0 to Vdd.” [i.e., each of the adaptive/modifiable activation functions are piecewise functions that can be differentiated from other such functions]).
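For illustration only, the quoted description of an activation function “modeled as a piecewise continuous function comprised of the interpolation between m points”, each with a value from 0.0 to 1.0, can be sketched as follows. The sketch assumes linear interpolation, evenly spaced points and an input normalized to the range 0.0 to 1.0; none of these details is stated in the quoted passages, and the function name is hypothetical.

```python
def piecewise_activation(points, x):
    """Interpolate linearly between m evenly spaced control points (assumed layout)."""
    m = len(points)
    if m < 2:
        raise ValueError("at least two points are needed to interpolate")
    x = min(max(x, 0.0), 1.0)          # clamp the input to the assumed normalized range
    position = x * (m - 1)             # fractional index into the list of points
    i = min(int(position), m - 2)      # left end of the segment containing x
    frac = position - i                # distance travelled along that segment
    return points[i] + frac * (points[i + 1] - points[i])


# Usage: three points describing a function that rises and then falls.
peak = [0.0, 1.0, 0.0]
midpoint_value = piecewise_activation(peak, 0.5)
```

Retraining such a function amounts to changing the values stored in `points`, which is one way to read the statement that each of the m ranges “can be trained”.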
Soltiz and Soltiz 2012 are analogous art because they are both related to hardware implementations of neuromorphic learning systems that utilize adaptive activation functions, threshold activation functions, and memristive devices such as memristors/memristor synapses (see, e.g., Soltiz, Abstract and pages 1597-1598 and Soltiz 2012, Abstract and pages iii, 19 and 23).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz to incorporate the teachings of Soltiz 2012 to provide novel designs for neural logic blocks (NLBs) where “Each proposed NLB is capable of learning both linearly separable and nonlinearly separable functions in a single layer” (See, e.g., Soltiz 2012, Abstract and pages iii and 23). Doing so would have allowed Soltiz to use Soltiz 2012’s NLB designs to “overcome the limitations of previous hardware NLBs” by using NLB designs that “are capable of rapidly learning any function in a single layer” and “simplify large-scale neuromorphic systems to improve scalability drastically, but also improve overall energy, delay, and training time by reducing the number of blocks” as suggested by Soltiz 2012 (See, e.g., Soltiz 2012, Abstract and pages iii and 23).

Regarding claim 18, as discussed above, Soltiz in view of Soltiz 2012, Yoo and Davies teaches the system of claim 16.
Although Soltiz in view of Soltiz 2012 and Yoo substantially teaches the claimed invention, Soltiz in view of Soltiz 2012 and Yoo is not relied on to teach wherein the processor unit is a graphical processor unit.
In the same field, analogous art Davies teaches wherein the processor unit is a graphical processor unit (see, e.g., paragraph 190, “Machine (e.g., computer system) 26000 may include a neuromorphic processor 110, 300, a hardware processor 26002 (e.g., a central processing unit (CPU), a graphics processing unit (GPU),” [i.e., the processor unit is a GPU]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz in view of Soltiz 2012 and Yoo to incorporate the teachings of Davies to provide “architectures to achieve … an efficient and fast neuromorphic processor that borrows from the biological model”. (See, e.g., Davies, paragraph 32). Doing so would have allowed Soltiz in view of Soltiz 2012 and Yoo “to implement neural information processing algorithms in the most efficient manner possible using present day design technology,” and to employ the architectures and neuromorphic processor of Davies to provide “a maximally efficient neuromorphic design efficiently [that] supports a range of precisions depending on the problem”, as suggested by Davies. (See, e.g., Davies, paragraphs 47 and 54). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.

Regarding claim 23, as discussed above, Soltiz in view of Soltiz 2012, Yoo and Davies teaches the system of claim 8.
Soltiz further discloses wherein the weights comprise:
weights for the artificial neurons;
weights for connections between the artificial neurons; or
weights for the artificial neurons and weights for the connections between the artificial neurons (see, e.g., pages 1598-1599, “the activation function is trained in parallel with the synaptic weights”, “a memristor can be thought of as a synapse with an input of V, an output of I, and a weight of 1/M … the weight of a synapse can be modified to strengthen or weaken a connection” [i.e., the weights include weights for connections between NLBs/artificial neurons]).
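For illustration only, the quoted characterization of a memristor as a synapse (input V, output I, weight 1/M) can be sketched as below. The function names and the summing of currents at a neuron input are hypothetical and are not taken from Soltiz; the sketch assumes a simple Ohm's-law relation I = V/M.

```python
def synapse_current(voltage, memristance):
    """Return the output current I for input voltage V, using weight 1/M.

    Assumption: a simple Ohm's-law relation I = V * (1/M); the names here
    are illustrative and do not come from the cited reference.
    """
    if memristance <= 0:
        raise ValueError("memristance must be positive")
    weight = 1.0 / memristance  # lower memristance gives a stronger connection
    return voltage * weight


def neuron_input(voltages, memristances):
    """Sum the currents contributed by each synapse (hypothetical helper)."""
    return sum(synapse_current(v, m) for v, m in zip(voltages, memristances))
```

Under these assumptions, lowering M from 4.0 to 2.0 raises the weight from 0.25 to 0.5, which corresponds to strengthening the connection as described in the quoted passage.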

Regarding claims 22, 24 and 27, as discussed above, Soltiz in view of Soltiz 2012, Yoo and Davies teaches the method of claim 1, the system of claim 8, and the system of claim 16.
Although Soltiz in view of Soltiz 2012 and Yoo substantially teaches the claimed invention, Soltiz in view of Soltiz 2012 and Yoo is not relied on to teach wherein training the artificial neural network comprises training the artificial neural network for implementation on a spiking network on neuromorphic hardware.
In the same field, analogous art Davies teaches wherein training the artificial neural network comprises training the artificial neural network for implementation on a spiking network on neuromorphic hardware (see, e.g., paragraphs 34, 37, 190 and 208, “In an example of a spiking neural network, activation functions occur via spike trains”, “Each neuron may be characterized by an activation threshold. A spike message received by a neuron contributes to the activation of the neuron. … in response to the spike message, those destination neurons update their activation levels” [i.e., training system for a neural network implementation on a spiking network], “Machine (e.g., computer system) 26000 may include a neuromorphic processor 110, 300” [i.e., system 26000 implemented on neuromorphic hardware], “the neuron's present activation state level and that is configured to be updated by the processor … wherein if an updated present activation state level exceeds a threshold activation level value, the processor is configured to generate an output spike event” [i.e., implementation on a spiking network on neuromorphic hardware]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz in view of Soltiz 2012 and Yoo to incorporate the teachings of Davies to provide “architectures to achieve … an efficient and fast neuromorphic processor that borrows from the biological model”. (See, e.g., Davies, paragraph 32). Doing so would have allowed Soltiz in view of Soltiz 2012 and Yoo “to implement neural information processing algorithms in the most efficient manner possible using present day design technology,” and to employ the architectures and neuromorphic processor of Davies to provide “a maximally efficient neuromorphic design efficiently [that] supports a range of precisions depending on the problem”, as suggested by Davies. (See, e.g., Davies, paragraphs 47 and 54). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.
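For illustration only, the threshold behavior quoted from Davies (a spike message contributes to a neuron's activation level, and an output spike event is generated when the updated level exceeds a threshold) can be sketched as below. The class name, the numeric contributions, and the reset of the activation level after a spike are assumptions made for the sketch and are not taken from Davies.

```python
class ThresholdNeuron:
    """Minimal sketch of a neuron with an activation threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.level = 0.0  # present activation state level

    def receive(self, contribution):
        """Apply one spike message; return True if an output spike is generated."""
        self.level += contribution
        if self.level > self.threshold:
            # Assumption: the level is reset after an output spike event.
            self.level = 0.0
            return True
        return False


neuron = ThresholdNeuron(threshold=1.0)
first = neuron.receive(0.6)   # level 0.6, below threshold
second = neuron.receive(0.6)  # level 1.2, exceeds threshold
```

In this sketch the first spike message does not generate an output spike and the second does, because the accumulated level only then exceeds the threshold.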

Regarding claim 28, as discussed above, Soltiz in view of Soltiz 2012, Yoo and Davies teaches the system of claim 16.
Although Soltiz in view of Soltiz 2012 substantially teaches the claimed invention, Soltiz in view of Soltiz 2012 is not relied on to teach wherein the artificial neural network comprises an input layer, an output layer, and a number of hidden layers between the input layer and the output layer.
In the same field, analogous art Yoo teaches wherein the artificial neural network comprises an input layer, an output layer, and a number of hidden layers between the input layer and the output layer (see, e.g., paragraphs 52 and 55, “neural network 100 includes an input layer 110, a hidden (or intermediate) layer 120, and an output layer 130. The input layer 110 receives an input to be used to perform training or recognition and transmits the input to the hidden layer 120”, “the neural network may include a plurality of hidden layers (such as seen, for example, in FIG. 5). A neural network including a plurality of hidden layers may be referred to as a deep neural network.” [i.e., the neural network includes an input layer, hidden layers and an output layer]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Soltiz in view of Soltiz 2012 to incorporate the teachings of Yoo to provide “a training method for neural network recognition, the training method including obtaining a first neural network that includes a plurality of layers and a second neural network that includes a layer connected to the first neural network” and to provide a “method and apparatus for performing complex recognition … based on a neural network 100, and a method and apparatus for training the neural network” where the first “neural network 100 may also be referred to as an artificial neural network” and the first “neural network may include a plurality of hidden layers … a deep neural network”. (See, e.g., Yoo, paragraphs 15, 47, 50 and 55). Doing so would have allowed Soltiz in view of Soltiz 2012 to utilize Yoo’s “recognition apparatus [which] is able to recognize multiple tasks corresponding to purposes using the first neural network … and the second neural network” in order to perform “a first task, for example, face recognition, corresponding to the first purpose, thereby improving a recognition rate of the first task”, as suggested by Yoo (See, e.g., Yoo, paragraph 75). 

Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
The prior art made of record, listed on form PTO-892, and not relied upon, is considered pertinent to applicant's disclosure.
The examiner requests, in response to this office action, support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.
When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RANDY K BALDWIN whose telephone number is (571)270-5222. The examiner can normally be reached on Mon - Fri 9:00-6:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/R.K.B./Examiner, Art Unit 2125 

/KAMRAN AFSHAR/Supervisory Patent Examiner, Art Unit 2125