DETAILED ACTION
This action is in response to the claims filed 01 July 2019 for application 16/459,596 filed 01 July 2019.
Claims 1-20 are pending.
Claims 1-20 are rejected.
	
	
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claim 4 is objected to because of the following informalities:
Claim 4, line 3: "the second class of probability" should read "the second class of probability distribution".
Appropriate correction is required.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claim 20 is rejected under 35 U.S.C. 101, because the claimed invention is directed to non-statutory subject matter.
Specifically, the claim as a whole does not fall within any statutory category and thus is nonstatutory. In particular, claim 20 is directed to a computer-readable storage that amounts to signals per se. While the specification does discuss the "computer-readable storage," it discusses the storage in exemplary terms that do not specifically exclude signals per se and can therefore cover both transitory and non-transitory signals. In accordance with MPEP 2106.03: Non-limiting examples of claims that are not directed to any of the statutory categories include transitory forms of signal transmission (often referred to as "signals per se"), such as a propagating electrical or electromagnetic signal or carrier wave. Even when a product has a physical or tangible form, it may not fall within a statutory category. For instance, a transitory signal, while physical and real, does not possess concrete structure that would qualify as a device or part under the definition of a machine, is not a tangible article or commodity under the definition of a manufacture (even though it is man-made and physical in that it exists in the real world and has tangible causes and effects), and is not composed of matter such that it would qualify as a composition of matter. In re Nuijten, 500 F.3d at 1356-1357, 84 USPQ2d at 1501-03. As such, a transitory, propagating signal does not fall within any statutory category. Mentor Graphics Corp. v. EVE-USA, Inc., 851 F.3d 1275, 1294, 112 USPQ2d 1120, 1133 (Fed. Cir. 2017); Nuijten, 500 F.3d at 1356-1357, 84 USPQ2d at 1501-03.

Claims 1-11 and 14-20 are rejected under 35 U.S.C. 101 because the claims are directed to an abstract idea, and because the claim elements, whether considered individually or in combination, do not amount to significantly more than the abstract idea. See Alice Corporation Pty. Ltd. v. CLS Bank International et al., 573 U.S. 208 (2014).

Regarding claim 1, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 1 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer-implemented method of operating a neural network.
The limitation of at each of the input nodes, weighting the respective one of the plurality of input elements received by that input node by applying an instance of a first class of probability distribution to that input element, thereby generating a respective set of output parameters describing an output probability distribution, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses weighting an input value.
The limitation of said propagating comprising, at each of one or more nodes of at least one of the hidden layers, combining the sets of input parameters and weighting the combination by applying an instance of a second class of probability distribution to that combined set of input parameters, thereby generating a respective set of output parameters describing an output probability distribution for outputting to a next layer of the network, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses combining and weighting values. As part of this mathematical concept, the claim limitation provides additional information regarding the probability distributions. This merely provides more descriptive information about the data.
If a claim limitation, under its broadest reasonable interpretation, covers performance of mathematical concepts, then it falls within the "Mathematical Concepts" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) – computer-implemented. These additional elements are recited at a high level of generality (i.e., as generic computer components performing generic computer functions of executing instructions on the computers) such that they amount to no more than mere instructions to apply the exception using generic computer components (MPEP 2106.05(b)). Additionally, the claim recites additional element(s) – neural network comprising a plurality of layers, the plurality of layers comprising: i) an input layer comprising a plurality of input nodes each configured to receive a respective one of a plurality of input elements, ii) one or more hidden layers each comprising a plurality of hidden nodes, each hidden node configured to receive sets of input parameters where each set describes an input probability distribution from one of the nodes in a previous layer of the network, and to output a set of output parameters describing an output probability distribution to a next layer of the network, and iii) an output layer comprising one or more output nodes each configured to output a respective output element, wherein the one or more hidden layers connect the input layer to the output layer. These additional elements are recited at a high level of generality such that they amount to no more than indicating a field of use or technological environment in which to apply the judicial exception (MPEP 2106.05(h)).
Additionally, the claim recites from each of the input nodes, outputting the respective set of output parameters as input parameters to one or more nodes in a next, hidden layer of the network, and thereby propagating the respective set of output parameters through the one or more hidden layers to the output layer, which is simply propagating data from one layer to the next layer of a network recited at a high level of generality. This is nothing more than insignificant extra-solution activity (MPEP 2106.05(g)). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea, and, therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional elements of computer-implemented, a neural network and propagating data through the network amount to no more than mere instructions to apply the exception using generic computer components (MPEP 2106.05(b)), indicating a field of use or technological environment in which to apply the judicial exception (MPEP 2106.05(h)) and insignificant extra-solution activity (MPEP 2106.05(g)), wherein the insignificant extra-solution activity is the well-understood, routine, and conventional activity of transmitting data (MPEP 2106.05(d)). Mere instructions to apply an exception using generic computer instructions, a field of use or technological environment in which to apply the judicial exception and insignificant extra-solution activity do not provide an inventive concept, and, therefore, the claim is not patent eligible.
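For illustration only, the weighting and combining operations identified above under Step 2A Prong One can be sketched as ordinary arithmetic on distribution parameters. The sketch below is not taken from the application: it assumes each "set of parameters" is a (mean, variance) pair, that weights and inputs are independent, and that combining means summing. The names Params, weight_input, combine and weight_combination are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Params:
    """Hypothetical parameter set describing a distribution by its first two moments."""
    mean: float
    var: float


def weight_input(x: float, weight: Params) -> Params:
    """Input node: scale a known input element x by a random weight.

    For a constant x and random weight W: E[xW] = x*E[W], Var[xW] = x^2 * Var[W].
    """
    return Params(mean=x * weight.mean, var=x * x * weight.var)


def combine(inputs: Sequence[Params]) -> Params:
    """Hidden node, step 1: combine incoming parameter sets.

    Assumes the incoming distributions are independent, so means and
    variances simply add.
    """
    return Params(mean=sum(p.mean for p in inputs),
                  var=sum(p.var for p in inputs))


def weight_combination(s: Params, weight: Params) -> Params:
    """Hidden node, step 2: weight the combination by an independent random weight.

    For independent S and W:
      E[SW]   = E[S]E[W]
      Var[SW] = Var[S]Var[W] + Var[S]E[W]^2 + Var[W]E[S]^2
    """
    mean = s.mean * weight.mean
    var = s.var * weight.var + s.var * weight.mean ** 2 + weight.var * s.mean ** 2
    return Params(mean=mean, var=var)


# Two input nodes feeding one hidden node (all numbers are made up).
a = weight_input(2.0, Params(mean=0.5, var=0.25))
b = weight_input(1.0, Params(mean=3.0, var=2.0))
hidden_out = weight_combination(combine([a, b]), Params(mean=0.5, var=1.0))
```

Under these assumptions every step reduces to multiplication and addition of parameter values, which is the sense in which the limitations are characterized above as combining and weighting values.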

Regarding claim 2, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 2 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer-implemented method of operating a neural network.
The limitation of said propagating comprises, at each node of the at least one hidden layer, combining the sets of input parameters and weighting the combination by applying an instance of the second class of probability distribution to that combination of input parameters, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses combining and weighting values.
If a claim limitation, under its broadest reasonable interpretation, covers performance of mathematical concepts, then it falls within the "Mathematical Concepts" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. The claim does not recite any additional elements which integrate the abstract idea into a practical application and, therefore, does not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the claim does not recite any additional elements which provide an inventive concept, and, therefore, the claim is not patent eligible.

Regarding claim 3, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 3 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer-implemented method of operating a neural network.
The limitation of said propagating comprises, at least one node of some or all of the hidden layers, combining the sets of input parameters and weighting the combination by applying an instance of the second class of probability distribution to that combination of input parameters, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses combining and weighting values.
If a claim limitation, under its broadest reasonable interpretation, covers performance of mathematical concepts, then it falls within the "Mathematical Concepts" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. The claim does not recite any additional elements which integrate the abstract idea into a practical application and, therefore, does not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the claim does not recite any additional elements which provide an inventive concept, and, therefore, the claim is not patent eligible.

Regarding claim 4, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 4 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer-implemented method of operating a neural network.
The limitation of wherein the first class of probability distribution is only applied by the nodes of the input layer, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses weighting values.
The limitation of wherein each node of the plurality of hidden layers applies an instance of the second class of probability, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses weighting values.
If a claim limitation, under its broadest reasonable interpretation, covers performance of mathematical concepts, then it falls within the "Mathematical Concepts" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. The claim does not recite any additional elements which integrate the abstract idea into a practical application and, therefore, does not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the claim does not recite any additional elements which provide an inventive concept, and, therefore, the claim is not patent eligible.

Regarding claim 5, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 5 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer-implemented method of operating a neural network.
The limitation of wherein each node that applies the first class of probability distribution applies a same form of the first class of probability distribution, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses weighting values.
If a claim limitation, under its broadest reasonable interpretation, covers performance of mathematical concepts, then it falls within the "Mathematical Concepts" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. The claim does not recite any additional elements which integrate the abstract idea into a practical application and, therefore, does not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the claim does not recite any additional elements which provide an inventive concept, and, therefore, the claim is not patent eligible.

Regarding claim 6, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 6 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer-implemented method of operating a neural network.
The limitation of wherein some or each of the nodes that apply the first class of probability distribution applies a different form of the first class of probability distribution, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses weighting values.
If a claim limitation, under its broadest reasonable interpretation, covers performance of mathematical concepts, then it falls within the "Mathematical Concepts" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. The claim does not recite any additional elements which integrate the abstract idea into a practical application and, therefore, does not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the claim does not recite any additional elements which provide an inventive concept, and, therefore, the claim is not patent eligible.

Regarding claim 7, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 7 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer-implemented method of operating a neural network.
The limitation of wherein each node that applies the second class of probability distribution applies the same form of the second class of probability distribution, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses weighting values.
If a claim limitation, under its broadest reasonable interpretation, covers performance of mathematical concepts, then it falls within the "Mathematical Concepts" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. The claim does not recite any additional elements which integrate the abstract idea into a practical application and, therefore, does not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the claim does not recite any additional elements which provide an inventive concept, and, therefore, the claim is not patent eligible.

Regarding claim 8, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 8 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer-implemented method of operating a neural network.
The limitation of wherein said instance of the first class of probability distribution is parametrized by at least a centre point at zero, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses weighting values.
The limitation of wherein a probability density of that instance of the first class of probability distribution tends to infinity at the centre point, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses weighting values.
If a claim limitation, under its broadest reasonable interpretation, covers performance of mathematical concepts, then it falls within the "Mathematical Concepts" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. The claim does not recite any additional elements which integrate the abstract idea into a practical application and, therefore, does not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the claim does not recite any additional elements which provide an inventive concept, and, therefore, the claim is not patent eligible.
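For illustration only, the property recited in claim 8 (a probability density that tends to infinity at a centre point of zero) is a purely mathematical property of a function. The sketch below uses a toy density, f(x) = 1 / (4 * sqrt(|x - c|)) for 0 < |x - c| <= 1, which integrates to 1 and diverges at c. This toy density is an assumption chosen for simplicity; it is not the horseshoe distribution or any distribution taken from the application, and the name toy_density is hypothetical.

```python
import math


def toy_density(x: float, centre: float = 0.0) -> float:
    """Toy density that diverges at `centre` (assumed example, not from the application).

    f(x) = 1 / (4 * sqrt(|x - centre|)) on 0 < |x - centre| <= 1, zero elsewhere.
    The integral over [centre - 1, centre + 1] equals 1.
    """
    d = abs(x - centre)
    if d == 0.0:
        return math.inf  # the density is unbounded at the centre point
    if d > 1.0:
        return 0.0
    return 1.0 / (4.0 * math.sqrt(d))


# The density grows without bound as x approaches the centre point at zero.
values = [toy_density(x) for x in (1e-2, 1e-4, 1e-6)]
```

Evaluating the function at points ever closer to zero yields ever larger values, which is a calculation and nothing more.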

Regarding claim 9, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 9 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer-implemented method of operating a neural network.
The limitation of wherein the first class of probability distribution comprises one or more of the following forms of distribution, each instance of the first class taking one of these forms: a horseshoe probability distribution, a spike-and-slab probability distribution, a Laplace distribution, and a t-distribution, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses weighting values.
If a claim limitation, under its broadest reasonable interpretation, covers performance of mathematical concepts, then it falls within the "Mathematical Concepts" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. The claim does not recite any additional elements which integrate the abstract idea into a practical application and, therefore, does not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the claim does not recite any additional elements which provide an inventive concept, and, therefore, the claim is not patent eligible.

Regarding claim 10, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 10 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer-implemented method of operating a neural network.
The limitation of wherein the second class of probability distribution comprises one or more of the following forms of distribution, each instance of the second class taking one of these forms: a Gaussian distribution, and a uniform distribution, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses weighting values.
If a claim limitation, under its broadest reasonable interpretation, covers performance of mathematical concepts, then it falls within the "Mathematical Concepts" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. The claim does not recite any additional elements which integrate the abstract idea into a practical application and, therefore, does not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the claim does not recite any additional elements which provide an inventive concept, and, therefore, the claim is not patent eligible.

Regarding claim 11, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 11 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer-implemented method of operating a neural network.
The limitation of wherein each form of the first and/or second classes of probability distributions are parameterized by a respective set of parameters, and wherein the respective set of parameters comprise a centre point and/or a width of the probability distribution, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses weighting values.
If a claim limitation, under its broadest reasonable interpretation, covers performance of mathematical concepts, then it falls within the "Mathematical Concepts" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. The claim does not recite any additional elements which integrate the abstract idea into a practical application and, therefore, does not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the claim does not recite any additional elements which provide an inventive concept, and, therefore, the claim is not patent eligible.

Regarding claim 14, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 14 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer-implemented method of operating a neural network.
The limitation of one or more cycles of said weighting at each of the input nodes and said propagating, wherein the neural network is trained to predict, after the one or more cycles, one or more predicted output elements based on the plurality of input elements, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim elements precludes the step from practically being performed in the mind. For example, "predict" in the context of this claim encompasses simply generating a result.
The limitation of at each of the output nodes, outputting a respective predicted output element, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim elements precludes the step from practically being performed in the mind. For example, "outputting" in the context of this claim encompasses simply presenting data.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the "Mental Processes" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. In particular, the claim recites at each of the input nodes, receiving the respective one of the plurality of input elements, wherein each input element corresponds to a different input element of a prediction dataset, which is simply receiving data recited at a high level of generality. This is nothing more than insignificant extra-solution activity (MPEP 2106.05(g)). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea, and, therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of receiving data amounts to no more than insignificant extra-solution activity (MPEP 2106.05(g)), wherein the insignificant extra-solution activity is the well-understood, routine, and conventional activity of receiving data (MPEP 2106.05(d)). Insignificant extra-solution activity does not provide an inventive concept, and, therefore, the claim is not patent eligible.

Regarding claim 15, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 15 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer-implemented method of operating a neural network.
The limitation of wherein said outputting of the predicted output elements comprises outputting the predicted output elements to a user, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim elements precludes the step from practically being performed in the mind. For example, "outputting" in the context of this claim encompasses simply visually presenting data.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the "Mental Processes" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. The claim does not recite any additional elements which integrate the abstract idea into a practical application and, therefore, does not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the claim does not recite any additional elements which provide an inventive concept, and, therefore, the claim is not patent eligible.

Regarding claim 16, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 16 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer-implemented method of operating a neural network.
The limitation of outputting to a user the respective sets of output parameters generated by one or more of the input nodes, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim elements precludes the step from practically being performed in the mind. For example, "outputting" in the context of this claim encompasses simply visually presenting data.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the "Mental Processes" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. The claim does not recite any additional elements which integrate the abstract idea into a practical application and, therefore, does not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the claim does not recite any additional elements which provide an inventive concept, and, therefore, the claim is not patent eligible.

Regarding claim 17, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 17 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer-implemented method of operating a neural network.
The limitation of wherein at least one output parameter of each set of output parameters generated by one or more of the input nodes is a centre point of the probability distribution, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses calculating values.
The limitation of outputting of those output parameters comprises outputting, for each set of output parameters, either a zero value or a non-zero value for the centre point of that probability distribution, wherein a zero value is output if the centre point is less than a threshold value, and wherein a non-zero value is output if the centre point is more than the threshold value, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses calculating values.
If a claim limitation, under its broadest reasonable interpretation, covers performance of mathematical concepts, then it falls within the "Mathematical Concepts" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. The claim does not recite any additional elements which integrate the abstract idea into a practical application and, therefore, does not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the claim does not recite any additional elements which provide an inventive concept, and, therefore, the claim is not patent eligible.
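
For illustration only, the thresholding recited in claim 17 can be sketched as a simple calculation on values. The function name and the sample numbers below are hypothetical and are not taken from the application or the cited references. The claim language does not state what is output when the centre point equals the threshold; the sketch assumes the value is kept in that case.

```python
def threshold_centre_points(centres, threshold):
    """Output 0.0 for centre points below the threshold, else the centre point.

    Hypothetical helper: the comparison is on the centre point itself, as the
    claim is worded. Comparing magnitudes instead would be a variant.
    """
    outputs = []
    for centre in centres:
        if centre < threshold:
            outputs.append(0.0)      # below the threshold: zero value is output
        else:
            outputs.append(centre)   # at or above (assumed): non-zero value kept
    return outputs


print(threshold_centre_points([0.02, 0.7, -0.3, 0.1], 0.1))
```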

Regarding claim 18, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 18 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer-implemented method of operating a neural network.
The limitation of wherein at least one output parameter of each set of output parameters generated by one or more of the input nodes is a centre point of the probability distribution, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses calculating values.
The limitation of wherein at least one output parameter of each set of output parameters is a width of the probability distribution, and said outputting of the respective sets of output parameters comprises outputting, for each set of output parameters, the width of that probability distribution, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses calculating values.
If a claim limitation, under its broadest reasonable interpretation, covers performance of mathematical concepts, then it falls within the "Mathematical Concepts" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. The claim does not recite any additional elements which integrate the abstract idea into a practical application and, therefore, does not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the claim does not recite any additional elements which provide an inventive concept, and, therefore, the claim is not patent eligible.

Regarding claim 19, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 19 is directed to a computing apparatus, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computing apparatus to perform operations of operating a neural network.
The limitation of at each of the input nodes, weighting the respective one of the plurality of input elements received by that input node by applying an instance of a first class of probability distribution to that input element, thereby generating a respective set of output parameters describing an output probability distribution, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses weighting an input value.
The limitation of said propagating comprising, at each of one or more nodes of at least one of the hidden layers, combining the sets of input parameters and weighting the combination by applying an instance of a second class of probability distribution to that combined set of input parameters, thereby generating a respective set of output parameters describing an output probability distribution for outputting to a next layer of the network, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses combining and weighting values. As part of this mathematical concept, the claim limitation provides additional information regarding the probability distributions. This merely provides more descriptive information about the data.
If a claim limitation, under its broadest reasonable interpretation, covers performance of mathematical concepts, then it falls within the "Mathematical Concepts" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) – computing apparatus, one or more processors, storage storing code. These additional elements are recited at a high-level of generality (i.e., as generic computer components performing generic computer functions of executing instructions on the computers) such that it amounts to no more than mere instructions to apply the exception using generic computer components (MPEP 2106.05(b)). Additionally, the claim recites additional element(s) – neural network comprising a plurality of layers, the plurality of layers comprising: i) an input layer comprising a plurality of input nodes each configured to receive a respective one of a plurality of input elements, ii) one or more hidden layers each comprising a plurality of hidden nodes, each hidden node configured to receive sets of input parameters where each set describes an input probability distribution from one of the nodes in a previous layer of the network, and to output a set of output parameters describing an output probability distribution to a next layer of the network, and iii) an output layer comprising one or more output nodes each configured to output a respective output element, wherein the one or more hidden layers connect the input layer to the output layer. These additional elements are recited at a high-level of generality such that it amounts to no more than indicating a field of use or technological environment in which to apply the judicial exception (MPEP 2106.05(h)).
Additionally, the claim recites from each of the input nodes, outputting the respective set of output parameters as input parameters to one or more nodes in a next, hidden layer of the network, and thereby propagating the respective set of output parameters through the one or more hidden layers to the output layer, which is simply propagating data from one layer to the next layer of a network recited at a high level of generality. This is nothing more than insignificant extra-solution activity (MPEP 2106.05(g)). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea, and, therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional elements of a computing apparatus, one or more processors, a storage storing code, a neural network and propagating data through the network amount to no more than mere instructions to apply the exception using generic computer components (MPEP 2106.05(b)), indicating a field of use or technological environment in which to apply the judicial exception (MPEP 2106.05(h)) and insignificant extra-solution activity (MPEP 2106.05(g)), wherein the insignificant extra-solution activity is the well-understood, routine, and conventional activity of transmitting data (MPEP 2106.05(d)). Mere instructions to apply an exception using generic computer instructions, a field of use or technological environment in which to apply the judicial exception and insignificant extra-solution activity do not provide an inventive concept, and, therefore, the claim is not patent eligible.

Regarding claim 20, the claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 20 is directed to a computer program, which is directed to an article of manufacture, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a computer program to perform operations of operating a neural network.
The limitation of at each of the input nodes, weighting the respective one of the plurality of input elements received by that input node by applying an instance of a first class of probability distribution to that input element, thereby generating a respective set of output parameters describing an output probability distribution, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses weighting an input value.
The limitation of said propagating comprising, at each of one or more nodes of at least one of the hidden layers, combining the sets of input parameters and weighting the combination by applying an instance of a second class of probability distribution to that combined set of input parameters, thereby generating a respective set of output parameters describing an output probability distribution for outputting to a next layer of the network, as drafted, is a process that, under its broadest reasonable interpretation, covers a mathematical concept. This claim encompasses combining and weighting values. As part of this mathematical concept, the claim limitation provides additional information regarding the probability distributions. This merely provides more descriptive information about the data.
If a claim limitation, under its broadest reasonable interpretation, covers performance of mathematical concepts, then it falls within the "Mathematical Concepts" grouping. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: With respect to the abstract idea, the judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) – computer program, computer-readable storage, one or more processors. These additional elements are recited at a high-level of generality (i.e., as generic computer components performing generic computer functions of executing instructions on the computers) such that it amounts to no more than mere instructions to apply the exception using generic computer components (MPEP 2106.05(b)). Additionally, the claim recites additional element(s) – neural network comprises a plurality of layers, the plurality of layers comprising: i) an input layer comprising a plurality of input nodes each configured to receive a respective one of a plurality of input elements, ii) one or more hidden layers each comprising a plurality of hidden nodes, each hidden node configured to receive sets of input parameters where each set describes an input probability distribution from one of the nodes in a previous layer of the network, and to output a set of output parameters describing an output probability distribution to a next layer of the network, and iii) an output layer comprising one or more output nodes each configured to output a respective output element, wherein the one or more hidden layers connect the input layer to the output layer. These additional elements are recited at a high-level of generality such that it amounts to no more than indicating a field of use or technological environment in which to apply the judicial exception (MPEP 2106.05(h)). 
Additionally, the claim recites from each of the input nodes, outputting the respective set of output parameters as input parameters to one or more nodes in a next, hidden layer of the network, and thereby propagating the respective set of output parameters through the one or more hidden layers to the output layer, which is simply propagating data from one layer to the next layer of a network recited at a high level of generality. This is nothing more than insignificant extra-solution activity (MPEP 2106.05(g)). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea, and, therefore, the claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional elements of a computer program, a computer-readable storage, one or more processors, a neural network and propagating data through the network amount to no more than mere instructions to apply the exception using generic computer components (MPEP 2106.05(b)), indicating a field of use or technological environment in which to apply the judicial exception (MPEP 2106.05(h)) and insignificant extra-solution activity (MPEP 2106.05(g)), wherein the insignificant extra-solution activity is the well-understood, routine, and conventional activity of transmitting data (MPEP 2106.05(d)). Mere instructions to apply an exception using generic computer instructions, a field of use or technological environment in which to apply the judicial exception and insignificant extra-solution activity do not provide an inventive concept, and, therefore, the claim is not patent eligible.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-3, 5, 7-11, 14-20 are rejected under 35 U.S.C. 103 as being unpatentable over Shridhar et al. (A Comprehensive Guide to Bayesian Convolutional Neural Network with Variational Inference, hereinafter referred to as "Shridhar") in view of Ghosh et al. (Structured Variational Learning of Bayesian Neural Networks with Horseshoe Prior, hereinafter referred to as “Ghosh”).

Figure 1: Bayesian Neural Network (Shridhar, Figure 1) with layer and class of distribution labels overlaid. This figure was taken from the Shridhar reference (p. 3, Figure 1) and is referenced in the claim rejections below as “Figure 1.”

Regarding claim 1, Shridhar teaches a computer-implemented (Shridhar, section 5.1.2 – teaches python source code [computer implemented]) method of operating a neural network (Figure 1 – teaches neural network; see also Shridhar, section 4 – convolutional neural network), wherein the neural network comprises a plurality of layers (Figure 1 – input layer, hidden layers, output layer), the plurality of layers comprising: i) an input layer (Figure 1 – input layer) comprising a plurality of input nodes (Figure 1 – input layer with input nodes 1, 2, … d) each configured to receive a respective one of a plurality of input elements (Figure 1 – input layer nodes input values x1, x2, … xd), ii) one or more hidden layers (Figure 1 – two hidden layers) each comprising a plurality of hidden nodes (Figure 1 – two hidden layers with hidden nodes 1,1 … 1,n and 2,1 … 2,n), each hidden node configured to receive sets of input parameters where each set describes an input probability distribution from one of the nodes in a previous layer of the network (Figure 1 – nodes of first hidden layer receiving parameters describing a distribution from nodes of input layer and nodes of second hidden layer receiving parameters describing a distribution from nodes of first hidden layer), and to output a set of output parameters describing an output probability distribution to a next layer of the network (Figure 1 – nodes of second hidden layer outputting parameters describing a distribution to nodes of output layer and nodes of first hidden layer outputting parameters describing a distribution to nodes of second hidden layer), and iii) an output layer (Figure 1 – output layer) comprising one or more output nodes (Figure 1 – output layer with output nodes 1, 2, … c) each configured to output a respective output element (Figure 1 – output layer nodes output values y1, y2, … yc), wherein the one or more hidden layers connect the input layer to the output layer (Figure 1 – two hidden layers connect input layer to output layer); and wherein the method comprises:
at each of the input nodes, weighting the respective one of the plurality of input elements received by that input node by applying an instance of a first class of probability distribution to that input element, thereby generating a respective set of output parameters describing an output probability distribution (Figure 1 – teaches weighting each of the inputs with a first class of probability distribution to generate a set of output parameters describing an output distribution used as input to the first hidden layer); and 
from each of the input nodes, outputting the respective set of output parameters as input parameters to one or more nodes in a next, hidden layer of the network, and thereby propagating the respective set of output parameters through the one or more hidden layers to the output layer (Figure 1 – teaches weighting each of the inputs with a first class of probability distribution to generate a set of output parameters describing an output distribution used as input to the first hidden layer and repeating the process for each hidden layer to propagate parameters to the output layer); 
said propagating comprising, at each of one or more nodes of at least one of the hidden layers, combining the sets of input parameters (Figure 1 – teaches that each node of the first hidden layer receives input parameters from each of the nodes of the input layer and combines the input parameters) and weighting the combination by applying an instance of a second class of probability distribution to that combined set of input parameters (Figure 1 – teaches the combined output from each of the nodes of the first hidden layer is weighted with a second class of probability distribution), thereby generating a respective set of output parameters describing an output probability distribution for outputting to a next layer of the network (Figure 1 – teaches the combined output from each of the nodes of the first hidden layer is weighted with a second class of probability distribution to generate output parameters for the first hidden layer which is outputted to the second hidden layer), and wherein the first class of probability distribution is more sparsity inducing than the second class of probability distribution (Shridhar, section 4 – teaches using Gaussian distributions).
While Shridhar teaches weighting using Gaussian distributions, Shridhar does not explicitly teach wherein the first class of probability distribution is more sparsity inducing than the second class of probability distribution.
Ghosh teaches wherein the first class of probability distribution is more sparsity inducing than the second class of probability distribution (Ghosh, section 3 - teaches using horseshoe distributions [sparsity inducing] for the first layer [first class] but that a sparsity inducing distribution is not appropriate for the output layer [second class] and therefore uses Gaussian).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Shridhar with the teachings of Ghosh in order to improve compactness of the model while maintaining predictive performance in the field of Bayesian neural networks (Ghosh, Abstract – “Bayesian Neural Networks (BNNs) have recently received increasing attention for their ability to provide well-calibrated posterior uncertainties. However, model selection—even choosing the number of nodes—remains an open question. Recent work has proposed the use of a horseshoe prior over node pre-activations of a Bayesian neural network, which effectively turns off nodes that do not help explain the data. In this work, we propose several modeling and inference advances that consistently improve the compactness of the model learned while maintaining predictive performance, especially in smaller-sample settings including reinforcement learning.”).
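
For illustration only, the "combining the sets of input parameters and weighting the combination" language mapped above can be sketched as propagating a (mean, variance) pair through one node. This sketch is not taken from Shridhar, Ghosh, or the application: it assumes independent inputs and weights and uses ordinary moment matching, and the function name and sample values are hypothetical.

```python
def propagate_node(input_params, weight_params):
    """Combine (mean, variance) inputs through weights given as (mean, variance).

    Assumes every input and weight is independent, so the moments of each
    product w * x follow from E[w], Var[w], E[x], Var[x] and simply add up.
    """
    out_mean = 0.0
    out_var = 0.0
    for (m_x, v_x), (m_w, v_w) in zip(input_params, weight_params):
        out_mean += m_w * m_x
        # Var[w * x] for independent w and x
        out_var += v_w * v_x + v_w * m_x ** 2 + m_w ** 2 * v_x
    return out_mean, out_var


# Two deterministic inputs (variance 0) feeding one hidden node.
inputs = [(1.0, 0.0), (2.0, 0.0)]
weights = [(0.5, 0.25), (-1.0, 0.0)]
print(propagate_node(inputs, weights))
```

The returned pair plays the role of a "set of output parameters describing an output probability distribution" that would be passed to the next layer; a different parameterization (for example, standard deviation instead of variance) would work equally well.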

Regarding claim 2, Shridhar in view of Ghosh teaches all of the limitations of the method of claim 1 as noted above. Shridhar further teaches wherein said propagating comprises, at each node of the at least one hidden layer, combining the sets of input parameters (Figure 1 – teaches that each node of the first hidden layer receives input parameters from each of the nodes of the input layer and combines the input parameters) and weighting the combination by applying an instance of the second class of probability distribution to that combination of input parameters (Figure 1 – teaches the combined output from each of the nodes of the first hidden layer is weighted with a second class of probability distribution).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Shridhar and Ghosh for the same reasons as disclosed in claim 1 above.

Regarding claim 3, Shridhar in view of Ghosh teaches all of the limitations of the method of claim 2 as noted above. Shridhar further teaches wherein the one or more hidden layers comprises a plurality of hidden layers (Figure 1 – teaches two hidden layers), and wherein said propagating comprises, at least one node of some or all of the hidden layers, combining the sets of input parameters (Figure 1 – teaches that each node of the first hidden layer receives input parameters from each of the nodes of the input layer and combines the input parameters) and weighting the combination by applying an instance of the second class of probability distribution to that combination of input parameters (Figure 1 – teaches the combined output from each of the nodes of the first hidden layer is weighted with a second class of probability distribution).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Shridhar and Ghosh for the same reasons as disclosed in claim 2 above.

Regarding claim 5, Shridhar in view of Ghosh teaches all of the limitations of the method of claim 1 as noted above. Shridhar further teaches wherein each node that applies the first class of probability distribution applies a same form of the first class of probability distribution (Shridhar, section 4 – teaches using all Gaussian distributions).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Shridhar and Ghosh for the same reasons as disclosed in claim 1 above.

Regarding claim 7, Shridhar in view of Ghosh teaches all of the limitations of the method of claim 1 as noted above. Shridhar further teaches wherein each node that applies the second class of probability distribution applies the same form of the second class of probability distribution (Shridhar, section 4 – teaches using all Gaussian distributions).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Shridhar and Ghosh for the same reasons as disclosed in claim 1 above.

Regarding claim 8, Shridhar in view of Ghosh teaches all of the limitations of the method of claim 1 as noted above. Ghosh further teaches wherein said instance of the first class of probability distribution is parametrized by at least a centre point at zero, and wherein a probability density of that instance of the first class of probability distribution tends to infinity at the centre point (Ghosh, section 3 – teaches using a horseshoe distribution with flat heavy tails while maintaining an infinitely tall spike at zero).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Shridhar and Ghosh in order to use horseshoe distributions to improve compactness of the model while maintaining predictive performance (Ghosh, Abstract).
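
For context on the "infinitely tall spike at zero" relied on above, the standard horseshoe prior from the general statistics literature can be written as follows. This is a sketch in assumed notation (weight w, local scale λ, global scale τ), not a quotation of Ghosh or of the application, and the bounds shown are the known ones for the case τ = 1.

```latex
% Horseshoe prior as a scale mixture of Gaussians (assumed notation)
w \mid \lambda, \tau \sim \mathcal{N}\!\left(0,\ \lambda^{2}\tau^{2}\right),
\qquad
\lambda \sim \mathrm{C}^{+}(0, 1)

% Known bounds on the marginal density for \tau = 1, with K = 1/\sqrt{2\pi^{3}}
\frac{K}{2}\,\log\!\left(1 + \frac{4}{w^{2}}\right)
\;<\; p(w) \;<\;
K\,\log\!\left(1 + \frac{2}{w^{2}}\right)

% The lower bound diverges as w -> 0, so the density is unbounded at the centre point
\lim_{w \to 0} p(w) = \infty
```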

Regarding claim 9, Shridhar in view of Ghosh teaches all of the limitations of the method of claim 1 as noted above. Ghosh further teaches wherein the first class of probability distribution comprises one or more of the following forms of distribution, each instance of the first class taking one of these forms: 
- a horseshoe probability distribution (Ghosh, section 3 – teaches using horseshoe distributions for the first class probability distribution), 
- a spike-and-slab probability distribution (Ghosh, section 1 – teaches that horseshoe distribution is a continuous relaxation of spike-and-slab distribution), 
- a Laplace distribution, and 
- a t-distribution.
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Shridhar and Ghosh in order to use horseshoe distributions to improve compactness of the model while maintaining predictive performance (Ghosh, Abstract).

Regarding claim 10, Shridhar in view of Ghosh teaches all of the limitations of the method of claim 1 as noted above. Shridhar further teaches wherein the second class of probability distribution comprises one or more of the following forms of distribution, each instance of the second class taking one of these forms: 
- a Gaussian distribution (Shridhar, section 4 – teaches using all Gaussian distributions), and 
- a uniform distribution.
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Shridhar and Ghosh for the same reasons as disclosed in claim 1 above.

Regarding claim 11, Shridhar in view of Ghosh teaches all of the limitations of the method of claim 1 as noted above. Shridhar further teaches wherein each form of the first and/or second classes of probability distributions are parameterized by a respective set of parameters, and wherein the respective set of parameters comprise a centre point and/or a width of the probability distribution (Figure 1 – teaches that each distribution is parameterized by a mean [center point] and a standard deviation [width]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Shridhar and Ghosh for the same reasons as disclosed in claim 1 above.

Regarding claim 14, Shridhar in view of Ghosh teaches all of the limitations of the method of claim 1 as noted above. Shridhar further teaches wherein said operating comprises at least operating the neural network in a prediction phase (Shridhar, section 4.2 – teaches classifying unknown data points), the method comprising: 
at each of the input nodes, receiving the respective one of the plurality of input elements, wherein each input element corresponds to a different input element of a prediction dataset (Shridhar, section 4.2 – teaches classifying each unknown data example into a predicted class; Shridhar, section 5.2 – teaches datasets used in case studies have a training set of data points [training phase] and a test set of data points [prediction phase]); 
one or more cycles of said weighting at each of the input nodes and said propagating, wherein the neural network is trained to predict, after the one or more cycles, one or more predicted output elements based on the plurality of input elements (Shridhar, section 4.2 – teaches given an unknown data example, predicting the class of the data element; see also Shridhar, sections 2.4, 3.1 – teaches training); and 
at each of the output nodes, outputting a respective predicted output element (Shridhar, section 4.2 – teaches predicting the appropriate class).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Shridhar and Ghosh for the same reasons as disclosed in claim 1 above.

Regarding claim 15, Shridhar in view of Ghosh teaches all of the limitations of the method of claim 14 as noted above. Shridhar further teaches wherein said outputting of the predicted output elements comprises outputting the predicted output elements to a user (Shridhar, sections 5.3, 5.5 – teaches the results of the case studies are presented to the user because the results are compared to other methods).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teaching of Shridhar and Ghosh for the same reasons as disclosed in claim 14 above.

Regarding claim 16, Shridhar in view of Ghosh teaches all of the limitations of the method of claim 1 as noted above. Shridhar further teaches outputting to a user the respective sets of output parameters generated by one or more of the input nodes (Shridhar, section 5.1.9 – teaches that L1 norm is applied to the weights of all layers and weights that are zero or below a threshold are pruned [Using the weights to perform a secondary function in order to prune the weights means that the weights, including those of the input layer, are outputted to the user for performance of the secondary function.]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teaching of Shridhar and Ghosh for the same reasons as disclosed in claim 1 above.

Regarding claim 17, Shridhar in view of Ghosh teaches all of the limitations of the method of claim 16 as noted above. Shridhar further teaches wherein at least one output parameter of each set of output parameters generated by one or more of the input nodes is a centre point of the probability distribution (Shridhar, Figure 1 – teaches that the output parameters of the input nodes include the mean [center point] and standard deviation [width] of the distribution), and said outputting of those output parameters comprises outputting, for each set of output parameters, either a zero value or a non-zero value for the centre point of that probability distribution (Shridhar, section 5.1.9 – teaches the output parameters can be a zero or non-zero value for the mean), wherein a zero value is output if the centre point is less than a threshold value (Shridhar, section 5.1.9 – teaches zero values and values below a threshold are pruned [This means that non-zero values are treated like zero values and both output a zero value when pruned]), and wherein a non-zero value is output if the centre point is more than the threshold value (Shridhar, section 5.1.9 – teaches non-zero values above a threshold are not pruned and are propagated to the next layer).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teaching of Shridhar and Ghosh for the same reasons as disclosed in claim 16 above.
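For illustration only, the thresholding of the centre point discussed for claim 17 may be sketched as follows. This is a minimal sketch and not a reproduction of Shridhar's source code: the function name, the example values, the comparison on the magnitude of the mean, and the handling of a mean exactly equal to the threshold are all assumptions made for the example.

```python
def report_centre_points(means, threshold):
    """Return the value reported for each centre point (mean).

    Assumption: a mean whose magnitude is below `threshold` is reported
    as zero (treated as pruned); any other mean is reported unchanged.
    """
    return [0.0 if abs(m) < threshold else m for m in means]


if __name__ == "__main__":
    # Hypothetical means of four input-node weight distributions.
    print(report_centre_points([0.5, -0.01, 0.0, -0.8], threshold=0.1))
```

Under these assumptions, a small non-zero mean and a mean that is already zero are both reported as zero, which is the reading applied to section 5.1.9 above.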

Regarding claim 18, Shridhar in view of Ghosh teaches all of the limitations of the method of claim 16 as noted above. Shridhar further teaches wherein at least one output parameter of each set of output parameters is a width of the probability distribution, and said outputting of the respective sets of output parameters comprises outputting, for each set of output parameters, the width of that probability distribution (Shridhar, Figure 1 – teaches that the output parameters of the input nodes include the mean [center point] and standard deviation [width] of the distribution).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teaching of Shridhar and Ghosh for the same reasons as disclosed in claim 16 above.

Regarding claim 19, it is the computer apparatus embodiment of claim 1 with similar limitations to claim 1 and is rejected under the same reasoning found in claim 1. Shridhar further teaches one or more processors and storage storing code arranged to run on the one or more processors, wherein the code is configured so as when run to perform operations of operating a neural network (Shridhar, section 5.1.2 – teaches python source code [code] stored on GitHub [storage]; Shridhar, sections 5.2-5.5 – teaches case studies performed using the source code [Performing case studies using source code necessarily requires processors performing code stored in storage]) ...
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teaching of Shridhar and Ghosh for the same reasons as disclosed in claim 1 above.

Regarding claim 20, it is the computer program embodiment of claim 1 with similar limitations to claim 1 and is rejected under the same reasoning found in claim 1. Shridhar further teaches a computer program embodied on computer-readable storage and configured so as when run on one or more processors to perform operations of operating a neural network (Shridhar, section 5.1.2 – teaches python source code [code] stored on GitHub [storage]; Shridhar, sections 5.2-5.5 – teaches case studies performed using the source code [Performing case studies using source code necessarily requires processors performing code stored in storage]) ...
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teaching of Shridhar and Ghosh for the same reasons as disclosed in claim 1 above.

Claims 4 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over Shridhar in view of Ghosh and further in view of Jylänki et al. (Expectation Propagation for Neural Networks with Sparsity-Promoting Priors, hereinafter referred to as “Jylanki”).

Regarding claim 4, Shridhar in view of Ghosh teaches all of the limitations of the method of claim 3 as noted above. While Shridhar in view of Ghosh teaches one distribution on the input layer and another distribution on the output layer, Shridhar in view of Ghosh does not explicitly teach wherein the first class of probability distribution is only applied by the nodes of the input layer, and wherein each node of the plurality of hidden layers applies an instance of the second class of probability.
Jylanki teaches wherein the first class of probability distribution is only applied by the nodes of the input layer, and wherein each node of the plurality of hidden layers applies an instance of the second class of probability (Jylanki, section 2.1 – teaches input weighting distributions of Laplace or Gaussian and that the grouping of weights for the input layer can be chosen freely and may also include other distributions and that to reduce the effects of irrelevant features, sparsity promoting distributions are used on the input layer weights [Because the grouping of distributions can be chosen freely, it would be obvious to a person skilled in the art that the input layer can have a first distribution and the remaining layers can have a different distribution.]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Shridhar in view of Ghosh with the teachings of Jylanki in order to reduce irrelevant features in the field of Bayesian neural networks (Jylanki, section 2.1 – “To reduce the effects of irrelevant features, sparsity-promoting priors ... with hierarchical scale parameters ... are placed on the input layer weights...”).
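For illustration only, the arrangement addressed in claim 4, in which one class of distribution is applied to the input layer weights and a different class to the remaining layers, may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Jylanki, Shridhar, or Ghosh: the choice of a zero-mean Laplace distribution for the first class and a zero-mean Gaussian for the second, the function names, and the layer sizes are all hypothetical.

```python
import random


def sample_laplace(rng, scale):
    """Draw from a zero-mean Laplace distribution.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def sample_layer_weights(layer_sizes, rng, scale=1.0):
    """Sample one weight matrix per pair of adjacent layers.

    Assumption: weights leaving the input layer use the first class
    (Laplace, sparsity promoting); all later layers use the second
    class (Gaussian).
    """
    layers = []
    for index, (n_in, n_out) in enumerate(zip(layer_sizes, layer_sizes[1:])):
        if index == 0:
            prior = "laplace"
            matrix = [[sample_laplace(rng, scale) for _ in range(n_out)]
                      for _ in range(n_in)]
        else:
            prior = "gaussian"
            matrix = [[rng.gauss(0.0, scale) for _ in range(n_out)]
                      for _ in range(n_in)]
        layers.append((prior, matrix))
    return layers


if __name__ == "__main__":
    # Hypothetical network: 3 input nodes, 4 hidden nodes, 2 output nodes.
    for prior, matrix in sample_layer_weights([3, 4, 2], random.Random(0)):
        print(prior, len(matrix), "x", len(matrix[0]))
```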

Regarding claim 6, Shridhar in view of Ghosh teaches all of the limitations of the method of claim 1 as noted above. However, Shridhar in view of Ghosh does not explicitly teach wherein some or each of the nodes that apply the first class of probability distribution applies a different form of the first class of probability distribution.
Jylanki teaches wherein some or each of the nodes that apply the first class of probability distribution applies a different form of the first class of probability distribution (Jylanki, section 2.1 – teaches input weighting distributions of Laplace or Gaussian and that the grouping of weights for the input layer can be chosen freely and may also include other distributions).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Shridhar in view of Ghosh with the teachings of Jylanki in order to reduce irrelevant features in the field of Bayesian neural networks (Jylanki, section 2.1 – “To reduce the effects of irrelevant features, sparsity-promoting priors ... with hierarchical scale parameters ... are placed on the input layer weights...”).

Claims 12-13 are rejected under 35 U.S.C. 103 as being unpatentable over Shridhar in view of Ghosh and further in view of Zayats et al. (US 2020/0334599 A1 – Identifying Correlated Roles Using a System Driven by a Neural Network, hereinafter referred to as “Zayats”).

Regarding claim 12, Shridhar in view of Ghosh teaches all of the limitations of the method of claim 1 as noted above. Shridhar further teaches wherein said operating comprises at least operating the neural network in a training phase (Shridhar, sections 2.4, 3.1 – teaches training phase), the method comprising: 
at each of the input nodes, receiving the respective one of the plurality of input elements, wherein each input element corresponds to a different input element of a training dataset (Shridhar, sections 2.4, 5.1 – teaches training using backpropagation where the input element is propagated through the model, optimizing a cost function and backpropagating the gradients to adjust the weights; Shridhar, section 5.2 – teaches datasets used in case studies have a training set of data points [training phase] and a test set of data points [prediction phase]); 
receiving a set of known output elements, each known output elements corresponding to a different output element of the training dataset (Shridhar, sections 2.4, 5.1 – teaches backpropagation using supervised training and calculating the errors between the generated output and the known output; see also Shridhar, section 5.2 – teaches case studies using datasets known in the art to have known outputs for supervised training); and 
training the neural network to predict the set of known output elements based on the received input elements (Shridhar, sections 2.4, 3.1, 5.1 – teaches training phase using supervised training; see also, Shridhar, section 5.2), wherein said training comprises: 
an initial cycle of said weighting at each of the input nodes and said propagating, thereby outputting, by the one or more output nodes, an initial estimated set of output element (Shridhar, sections 2.4, 5.1 – teaches backpropagation using supervised training by propagating the input features through the model and calculating the errors between the generated output and the known output); and 
one or more further cycles of said weighting at each of the input nodes and said propagating, thereby outputting, by the one or more output nodes, an updated estimated set of output elements (Shridhar, sections 2.4, 5.1 – teaches iterative [one or more cycles] backpropagation using supervised training by propagating the input features through the model, weighting at each node and calculating the errors between the generated output and the known output), wherein for each further cycle, one or both of the weighting of the plurality of input elements and the weighting of the combined set of input parameters are adjusted to generate the updated estimated set of output elements (Shridhar, section 2.4 – teaches training using backpropagation where the input element is propagated through the model, optimizing a cost function and backpropagating the gradients to adjust the weights) …
While Shridhar in view of Ghosh teaches backpropagation training for the neural network, Shridhar in view of Ghosh does not explicitly teach using an error difference threshold as the training stop condition.
Zayats teaches one or more further cycles of said weighting at each of the input nodes and said propagating, thereby outputting, by the one or more output nodes, an updated estimated set of output elements (Zayats, ¶0044 – teaches iterative [one or more cycles] backpropagation using supervised training by propagating the input features through the model, weighting at each node and calculating the errors between the generated output and the known output), wherein for each further cycle, one or both of the weighting of the plurality of input elements and the weighting of the combined set of input parameters are adjusted to generate the updated estimated set of output elements (Zayats, ¶0044 – teaches training using backpropagation where the input element is propagated through the model and backpropagating the gradients to adjust the weights) until the updated estimated set of output elements differs from the set of known output elements by less than a threshold (Zayats, ¶0044 – teaches iteratively repeating the backpropagation until a stop condition is reached, wherein the stop condition is a threshold level of accuracy [between generated result and known result]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Shridhar in view of Ghosh with the teachings of Zayats in order to provide for an efficient and effective utilization of resources of devices in the field of training neural networks (Zayats, ¶0079 – “By mapping the organization-specific roles to the standardized roles, the role management is able to perform actions that improve efficiencies within the organization. In this way, the role management platform provides for an efficient and effective utilization of resources of devices associated with the organization (e.g., processing resources, network resources, memory resources, and/or the like)”).
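For illustration only, iterative training that repeats until the estimated outputs differ from the known outputs by less than a threshold may be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of Zayats or Shridhar: it uses a single-weight linear model in place of a neural network, mean squared error as the difference measure, plain gradient descent for the weight adjustment, and hypothetical names, data, and default values throughout.

```python
def train_until_threshold(inputs, targets, threshold=1e-4,
                          learning_rate=0.1, max_cycles=10_000):
    """Adjust one weight until the mean squared error drops below `threshold`.

    Returns (weight, cycles_run, last_error). `max_cycles` is an assumed
    safety limit so the loop ends even if the threshold is never reached.
    """
    weight = 0.0
    error = float("inf")
    for cycle in range(1, max_cycles + 1):
        # Forward pass: estimated outputs for the current weight.
        estimates = [weight * x for x in inputs]
        residuals = [e - t for e, t in zip(estimates, targets)]
        error = sum(r * r for r in residuals) / len(residuals)
        # Stop condition: estimates are close enough to the known outputs.
        if error < threshold:
            return weight, cycle, error
        # Gradient of the mean squared error with respect to the weight.
        gradient = 2.0 * sum(r * x for r, x in zip(residuals, inputs)) / len(inputs)
        weight -= learning_rate * gradient
    return weight, max_cycles, error


if __name__ == "__main__":
    # Hypothetical training data generated by the relation target = 2 * input.
    print(train_until_threshold([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

The first pass through the loop corresponds to the initial cycle and each later pass to a further cycle in which the weighting is adjusted.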

Regarding claim 13, Shridhar in view of Ghosh and further in view of Zayats teaches all of the limitations of the method of claim 12 as noted above. Shridhar further teaches after said weighting of the respective ones of the plurality of input elements, for any input element that results in the generation, at one of the input nodes, of a respective set of output parameters comprising one or more parameters less than a threshold value, preventing that input element from propagating through the one or more hidden layers to the output layer (Shridhar, section 5.1.9 – teaches that L1 norm is applied to the weights after the weighting of all layers, including the input layer, and weights that are zero or below a threshold are pruned [prevented from propagating]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teaching of Shridhar, Ghosh and Zayats for the same reasons as disclosed in claim 12 above.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:
Blundell et al. (Weight Uncertainty in Neural Networks) teaches Bayes by Backprop training method for neural networks having distributions for weights.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARSHALL WERNER whose telephone number is (469) 295-9143. The examiner can normally be reached on Monday – Thursday 7:30 AM – 4:30 PM ET.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at (571) 272-7796. The fax number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of unpublished applications may be obtained from the Private Patent Application Information Retrieval (Private PAIR) system. Information regarding the status of published applications may be obtained from the Patent Center. For more information about the PAIR system, see http://pair-direct.uspto.gov. To file and manage patent submissions in the Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about the Patent Center. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (USA or Canada) or 571-272-1000.

/MARSHALL L WERNER/               Examiner, Art Unit 2125