Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 6 and 18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Regarding claim 6, “wherein the magnitude of the syndrome is equal to a dot product of a neuron vector produced by the layer and a parity-check matrix minus a bias vector” is indefinite.  This definition contradicts the well-known mathematical definition of a magnitude.  A dot product returns a scalar which is a product of two magnitudes, and subtracting a bias vector from a scalar would produce a vector.  One of ordinary skill in the art would expect a magnitude to be calculated from a vector and therefore, if the vector were itself a magnitude, calculating a magnitude from a magnitude would be either redundant or undefined.  In the interest of further examination, and consistent with the instant specification, the bias vector is interpreted as being equal to zero.
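The distinction drawn above between a scalar, a vector, and a magnitude can be illustrated with a minimal sketch.  The sketch assumes one possible reading, in which the product with the parity-check matrix is a matrix-vector product and the magnitude is the Euclidean norm.  Neither assumption is taken from the claims or the specification, and all names and values are hypothetical.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors; the result is a scalar."""
    return sum(x * y for x, y in zip(u, v))

def syndrome(H, a, b):
    """Matrix-vector product of H and a, minus b; the result is a vector
    with one entry per row of H."""
    return [dot(row, a) - b_i for row, b_i in zip(H, b)]

def magnitude(s):
    """Euclidean norm of a vector; the result is a non-negative scalar."""
    return math.sqrt(dot(s, s))

# Hypothetical values chosen only for illustration.
H = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]   # parity-check matrix: 2 checks on 3 neurons
a = [3.0, 1.0, 2.0]     # neuron vector produced by a layer
b = [0.0, 0.0]          # bias vector, taken as zero per the interpretation above

s = syndrome(H, a, b)   # a vector: [4.0, 3.0]
print(s, magnitude(s))  # [4.0, 3.0] 5.0
```

Under this reading the syndrome is a vector and only its norm is a magnitude, which is why equating the magnitude directly with the product minus a bias vector is ambiguous.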

Regarding claim 6, one or more neurons comprising a plurality of neurons is indefinite.  One neuron comprising a plurality of neurons is contradictory.  One of ordinary skill in the art would expect a layer to comprise a plurality of neurons, but the expected behavior of a neuron containing a plurality of neurons is neither well-defined nor common in the art.  In the interest of further examination, the claim is interpreted such that it should read “a plurality of sub-networks included in a layer of the neural network wherein the sub-networks contain neurons, biases, or links.”

Regarding claim 18, penalizing two or more neurons or parameters in two or more different neural networks is indefinite.  The specification does not define this limitation, which could refer either to applying the constraints to two completely unrelated full neural networks or to applying the constraints to sub-networks of the same neural network; the two readings contradict each other.  For consistency with the interpretation of claim 6, “two or more different neural networks” is interpreted as synonymous with two or more different sub-networks of the full neural network.

Claim Rejections - 35 USC § 101
101 Rejection
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-24 are rejected under 35 USC § 101 because the claimed invention is directed to non-statutory subject matter.

Regarding Claim 1:  Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 1 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 1 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: 
obtaining, by one or more computing devices, data descriptive of a neural network (gathering data), evaluating, by the one or more computing devices, a loss function that is descriptive of a performance of the neural network with respect to a set of training examples (observation and evaluation), 
determining, by the one or more computing devices, a gradient of the loss function with respect to the one or more neurons, links, or biases of the neural network, wherein, for at least the one or more neurons, links, or biases, the loss function includes an additional loss term that penalizes non-adherence of the one or more neurons, links, or biases to one or more code constraints (mathematical calculations).  
Therefore, claim 1 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 1 recites additional elements “backpropagating, by the one or more computing devices, the loss function through the neural network to train the neural network, wherein backpropagating, by the one or more computing devices, the loss function through the neural network comprises, for one or more neurons, links, or biases of the neural network”, and “modifying, by the one or more computing devices, the one or more neurons, links, or biases of the neural network based at least in part on the gradient of the loss function with respect to the one or more neurons, links, or biases”. However, these additional features are generic functions recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 1 is directed to a judicial exception.
Step 2B Analysis:  Claim 1 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 1 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
The above analysis also applies to claim 24, which recites corresponding features.  Therefore, claims 1 and 24 recite an abstract idea which is a judicial exception.
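For reference, the limitations characterized above as evaluation and mathematical calculation (evaluating a loss function, determining its gradient with an additional code-constraint term, and modifying parameters based on that gradient) can be sketched as follows.  This is a minimal illustration under stated assumptions, not the applicant's implementation: the two-weight linear model, the single parity check, the finite-difference gradient, and all names and values are hypothetical.

```python
def task_loss(w, examples):
    """Mean squared error of a tiny linear model (hypothetical stand-in
    for performance on a set of training examples)."""
    total = 0.0
    for x, y in examples:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        total += (pred - y) ** 2
    return total / len(examples)

def code_penalty(w, H):
    """Additional loss term: sum of squared parity-check residuals.
    It is zero only when every row of H has zero product with w."""
    return sum(sum(h * wi for h, wi in zip(row, w)) ** 2 for row in H)

def loss(w, examples, H, lam):
    """Composite loss: task loss plus a weighted code-constraint penalty."""
    return task_loss(w, examples) + lam * code_penalty(w, H)

def gradient(f, w, eps=1e-6):
    """Central-difference estimate of the gradient of f at w."""
    grad = []
    for i in range(len(w)):
        up = w[:i] + [w[i] + eps] + w[i + 1:]
        down = w[:i] + [w[i] - eps] + w[i + 1:]
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

# Hypothetical data: two training examples and one parity check.
examples = [([1.0, 0.0], 1.0), ([0.0, 1.0], 1.0)]
H = [[1.0, -1.0]]        # single check: w[0] - w[1] should be 0
w = [0.5, -0.5]
lam, lr = 1.0, 0.1

before = code_penalty(w, H)
for _ in range(50):
    g = gradient(lambda v: loss(v, examples, H, lam), w)
    w = [wi - lr * gi for wi, gi in zip(w, g)]   # modify parameters
after = code_penalty(w, H)
print(before, after)     # the penalty shrinks as the weights adhere to the check
```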

Regarding Claim 2:  Claim 2 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 2 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 2 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: 
wherein the one or more code constraints comprise a set of equations on values produced by the one or more neurons, links, or biases, wherein the set of equations comprises linear equations, non-linear equations, or both linear and non-linear equations. (mathematical calculations).  
Therefore, claim 2 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 2 recites the additional elements introduced in claim 1.  However, these additional features are generic computer functions recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 2 is directed to a judicial exception.
Step 2B Analysis:  Claim 2 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 2 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 3:  Claim 3 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 3 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 3 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: 
wherein the set of equations comprise a set of parity check equations (mathematical calculations).  
Therefore, claim 3 recites an abstract idea which is a judicial exception.
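Step 2A Prong Two Analysis:  Claim 3 recites the additional elements introduced in claim 1.  However, these additional features are generic computer functions recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 3 is directed to a judicial exception.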
Step 2B Analysis:  Claim 3 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 3 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 4:  Claim 4 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 4 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 4 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  Therefore, claim 4 recites an abstract idea which is a judicial exception.
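Step 2A Prong Two Analysis:  Claim 4 recites the additional elements introduced in claim 1.  However, these additional features are generic computer functions recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Claim 4 also introduces additional elements “wherein the one or more code constraints comprise one or more error control code constraints, one or more modulation constraints, or one or more lattice code constraints”.  However, these additional features amount to no more than selection of a data type.  Therefore, claim 4 is directed to a judicial exception.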
Step 2B Analysis:  Claim 4 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 4 amount to no more than mere instructions to apply the judicial exception using a generic computer component.



Regarding Claim 5:  Claim 5 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 5 is directed to a method, which is directed to a process, one of the statutory categories.
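Step 2A Prong One Analysis:  Claim 5 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: 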
the additional loss term provides a penalty based at least in part on a magnitude of a syndrome, wherein the magnitude of the syndrome is based at least in part on a dot product of a neuron vector or link or bias vector produced by the layer and a parity-check matrix (mathematical calculations).  
Therefore, claim 5 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 5 recites the additional elements introduced in claim 1 as well as the following elements: the one or more neurons, links, or biases comprise a plurality of neurons included in a layer of the neural network.  However, these additional features are generic computer functions recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 5 is directed to a judicial exception.
Step 2B Analysis:  Claim 5 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 5 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 6:  Claim 6 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 6 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 6 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: 
the additional loss term provides a penalty based at least in part on a magnitude of a syndrome computed for the layer, wherein the magnitude of the syndrome is equal to a dot product of a neuron vector produced by the layer and a parity-check matrix minus a bias vector (mathematical calculations).  
Therefore, claim 6 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 6 recites the additional elements introduced in claim 1 as well as the following elements: the one or more neurons, links, or biases comprise a plurality of neurons included in a layer of the neural network.  However, these additional features are generic computer functions recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 6 is directed to a judicial exception.
Step 2B Analysis:  Claim 6 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 6 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 7:  Claim 7 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 7 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 7 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: 
wherein the additional loss term comprises the square error of the one or more code constraints to a composite loss (mathematical calculations that can be performed using pen and paper).  
Therefore, claim 7 recites an abstract idea which is a judicial exception.
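Step 2A Prong Two Analysis:  Claim 7 recites the additional elements introduced in claim 1.  However, these additional features are generic computer functions recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 7 is directed to a judicial exception.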
Step 2B Analysis:  Claim 7 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 7 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 8:  Claim 8 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 8 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 8 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: 
wherein modifying, by the one or more computing devices, the one or more neurons, links, or biases of the neural network based at least in part on the gradient of the loss function comprises modifying, by the one or more computing devices, the one or more neurons, links, or biases of the neural network based at least in part on the gradient of the loss function to minimize a loss provided by the loss function with respect to one or more neurons in the layer but to maximize the loss with respect to a multiplier dual variable vector (mathematical calculations).  
Therefore, claim 8 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 8 recites the additional elements introduced in claim 1.  However, these additional features are generic computer functions recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 8 is directed to a judicial exception.
Step 2B Analysis:  Claim 8 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 8 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
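The limitation of claim 8 identified above (minimizing the loss with respect to neurons while maximizing it with respect to a multiplier dual variable vector) corresponds in form to gradient descent-ascent on a Lagrangian.  The following is a minimal sketch under stated assumptions: the quadratic task loss, the single constraint, the step size, and all names are hypothetical and are not drawn from the specification.

```python
def grad_task(a):
    """Gradient of a hypothetical task loss 0.5*((a[0]-2)**2 + a[1]**2)."""
    return [a[0] - 2.0, a[1]]

def residual(H, a):
    """Constraint residual: matrix-vector product of H and a (target is zero)."""
    return [sum(h * x for h, x in zip(row, a)) for row in H]

H = [[1.0, -1.0]]   # single constraint: a[0] - a[1] should be 0
a = [0.0, 0.0]      # neuron values (primal variables)
mu = [0.0]          # multiplier dual variable vector
lr = 0.1

# Lagrangian: L(a, mu) = task_loss(a) + mu . (H a)
for _ in range(500):
    r = residual(H, a)
    # Gradient of L with respect to a: task gradient plus H^T mu.
    g_a = [gt + sum(mu[k] * H[k][j] for k in range(len(H)))
           for j, gt in enumerate(grad_task(a))]
    a = [x - lr * g for x, g in zip(a, g_a)]     # descent step on the neurons
    mu = [m + lr * rk for m, rk in zip(mu, r)]   # ascent step on the multiplier

print(a, mu)   # approaches a = [1, 1] and mu = [1] for this toy problem
```

For this toy problem the constrained minimum is a[0] = a[1] = 1, which the iteration approaches while the multiplier settles near 1.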


Regarding Claim 9:  Claim 9 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 9 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 9 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: 
obtaining data descriptive of a neural network (gathering data), 
wherein training the neural network comprises injecting a gradient of an additional loss term into the neural network during backpropagation of a loss function through the neural network, wherein the additional loss term penalizes non-adherence of two or more neurons or parameters of the neural network to one or more code constraints (mathematical calculation).  
Therefore, claim 9 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 9 recites additional elements “one or more processors”, “one or more non-transitory computer-readable media that store instructions that, when executed by the one or more processors, cause the computing system to perform operations”.  However, these additional features are computer components recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Claim 9 also introduces additional elements “training the neural network based on a training dataset” which amounts to a generic computer function performed on generic computer components and is not enough to integrate the judicial exception into a practical application.  Therefore, claim 9 is directed to a judicial exception.
Step 2B Analysis:  Claim 9 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 9 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
Regarding Claim 10:  Claim 10 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 10 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 10 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  Therefore, claim 10 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 10 recites the additional elements introduced in claim 9.  However, these additional features are computer components recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Claim 10 also introduces additional elements “wherein the one or more code constraints comprise one or more error correcting codes, modulation codes, or lattice codes that constrain the two or more neurons or parameters of the neural network” which amounts to selection of a data type and is not enough to integrate the judicial exception into a practical application.  Therefore, claim 10 is directed to a judicial exception.
Step 2B Analysis:  Claim 10 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 10 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 11:  Claim 11 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 11 is directed to a system, which is directed to a machine, one of the statutory categories.
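Step 2A Prong One Analysis:  Claim 11 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: 
wherein the one or more code constraints comprise parity constraints applied to the two or more neurons or parameters of the neural network, wherein the parity constraints sum to zero or another real number (mathematical calculation).  
Therefore, claim 11 recites an abstract idea which is a judicial exception.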
Step 2A Prong Two Analysis:  Claim 11 recites additional elements introduced in claim 9. However, these additional features are computer components recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 11 is directed to a judicial exception.
Step 2B Analysis:  Claim 11 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 11 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
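As an illustration of the calculation identified in claim 11, a parity constraint that sums to zero or another real number can be checked with simple arithmetic.  The sketch below is hypothetical: representing each constraint as a set of indices with a target sum, and the particular values used, are assumptions made only for illustration.

```python
def parity_residuals(values, checks):
    """For each check, return (sum of the selected values) minus the target sum.
    A residual of zero means the check is satisfied."""
    return [sum(values[i] for i in idx) - target for idx, target in checks]

# Hypothetical neuron or parameter values.
values = [0.5, -0.5, 2.0, 1.0]

# Each check is (indices, target sum); targets may be zero or another real number.
checks = [
    ((0, 1), 0.0),       # values[0] + values[1] should equal 0
    ((2, 3), 3.0),       # values[2] + values[3] should equal 3
    ((0, 2, 3), 3.0),    # not satisfied: the sum is 3.5
]

res = parity_residuals(values, checks)
violated = [k for k, r in enumerate(res) if abs(r) > 1e-9]
print(res, violated)     # [0.0, 0.0, 0.5] [2]
```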


Regarding Claim 12:  Claim 12 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 12 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 12 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  Therefore, claim 12 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 12 recites additional elements introduced in claim 9. However, these additional features are computer components recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Claim 12 also introduces additional elements “wherein the one or more code constraints comprise one or more rateless code constraints” which amounts to selection of a data type and does not integrate the judicial exception into a practical application.  Therefore, claim 12 is directed to a judicial exception.
Step 2B Analysis:  Claim 12 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 12 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 13:  Claim 13 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 13 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 13 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  Therefore, claim 13 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 13 recites additional elements introduced in claim 9. However, these additional features are computer components recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Claim 13 also introduces additional elements “wherein the two or more neurons or parameters to which the one or more constraints are applied are located in a same single hidden layer of the neural network” which amounts to a generic computer function using a generic computing component and does not integrate the judicial exception into a practical application.  Therefore, claim 13 is directed to a judicial exception.
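Step 2B Analysis:  Claim 13 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 13 amount to no more than mere instructions to apply the judicial exception using a generic computer component.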

Regarding Claim 14:  Claim 14 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 14 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 14 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  Therefore, claim 14 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 14 recites additional elements introduced in claim 9. However, these additional features are computer components recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Claim 14 also introduces additional elements “wherein the two or more neurons or parameters to which the one or more constraints are applied are located in two or more different hidden layers of the neural network” which amounts to a generic function using a generic computing component and does not integrate the judicial exception into a practical application.  Therefore, claim 14 is directed to a judicial exception.
Step 2B Analysis:  Claim 14 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 14 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 15:  Claim 15 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 15 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 15 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: 
wherein injecting the gradient of the additional loss term during backpropagation comprises treating the additional loss term as a regularization term for the loss function or performing Lagrangian constrained optimization (mathematical calculation).  
Therefore, claim 15 recites an abstract idea which is a judicial exception.
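Step 2A Prong Two Analysis:  Claim 15 recites additional elements introduced in claim 9. However, these additional features are computer components recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 15 is directed to a judicial exception.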
Step 2B Analysis:  Claim 15 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 15 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 16:  Claim 16 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 16 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 16 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: 
wherein injecting the gradient of the additional loss term comprises: determining a syndrome based on a code parity check matrix associated with the one or more code constraints; and determining the gradient of the additional loss term based at least in part on the syndrome (mathematical calculation).  
Therefore, claim 16 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 16 recites additional elements introduced in claim 9. However, these additional features are computer components recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 16 is directed to a judicial exception.
Step 2B Analysis:  Claim 16 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 16 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
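
For illustration of the calculation characterized above for claim 16, the following is a minimal sketch of a syndrome computed from a parity check matrix and of a penalty gradient derived from that syndrome. The matrix H, the vector x, the real-valued relaxation, and the squared-norm penalty are all assumptions chosen for the example; they are not taken from the claims, the specification, or the cited references.

```python
# Illustrative sketch only: H, x, and the squared-norm penalty are assumptions.

def syndrome(H, x):
    """Return s = H x, one entry per parity-check row."""
    return [sum(h * v for h, v in zip(row, x)) for row in H]

def penalty(H, x):
    """Additional loss term: squared Euclidean norm of the syndrome."""
    return sum(s * s for s in syndrome(H, x))

def penalty_grad(H, x):
    """Gradient of ||H x||^2 with respect to x, i.e. 2 * H^T s."""
    s = syndrome(H, x)
    return [2.0 * sum(H[r][c] * s[r] for r in range(len(H)))
            for c in range(len(x))]

# Hypothetical real-valued checks: x0 == x1 and x1 == x2.
H = [[1.0, -1.0, 0.0],
     [0.0, 1.0, -1.0]]
x = [1.0, 0.5, 0.5]

print(syndrome(H, x))      # [0.5, 0.0]
print(penalty(H, x))       # 0.25
print(penalty_grad(H, x))  # [1.0, -1.0, 0.0]
```

In this sketch a zero syndrome means every check is satisfied, so the penalty and its gradient vanish exactly when the values adhere to the assumed constraints.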

Regarding Claim 17:  Claim 17 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 17 is directed to a system, which is directed to a machine, one of the statutory categories.
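Step 2A Prong One Analysis:  Claim 17 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  Therefore, claim 17 recites an abstract idea which is a judicial exception.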
Step 2A Prong Two Analysis:  Claim 17 recites additional elements introduced in claim 9. However, these additional features are computer components recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Claim 17 also introduces additional elements “wherein the one or more code constraints comprise one or more of: block code constraints; convolutional code constraints; cyclic code constraints; quasi-cyclic code constraints; lattice or real number valued code constraints; Hamming code constraints; Hadamard code constraints; BCH code constraints; LDPC code constraints; turbo code constraints; LDGM code constraints; HDPC code constraints; Reed-Solomon code constraints; Reed-Muller code constraints; CRC code constraints; Golay code constraints; or Polar code constraints.” which amounts to selection of a data type and does not integrate the judicial exception into a practical application.  Therefore, claim 17 is directed to a judicial exception.
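Step 2B Analysis:  Claim 17 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 17 amount to no more than mere instructions to apply the judicial exception using a generic computer component.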

Regarding Claim 18:  Claim 18 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 18 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 18 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  Therefore, claim 18 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 18 recites additional elements introduced in claim 9. However, these additional features are computer components recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Claim 18 also introduces additional elements “wherein the additional loss term penalizes non-adherence of two or more neurons or parameters respectively included in two or more different neural networks to the one or more code constraints.” which amounts to a generic function using a generic computing component and does not integrate the judicial exception into a practical application.  Therefore, claim 18 is directed to a judicial exception.
Step 2B Analysis:  Claim 18 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 18 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 19:  Claim 19 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 19 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 19 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  Therefore, claim 19 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 19 recites additional elements introduced in claim 9. However, these additional features are computer components recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Claim 19 also introduces additional elements “wherein the neural network comprises an input embedding layer that receives an input embedding, and wherein the two or more neurons or parameters to which the one or more code constraints are applied are included in the input embedding layer.” which amounts to a generic function using a generic computing component and does not integrate the judicial exception into a practical application.  Therefore, claim 19 is directed to a judicial exception.
Step 2B Analysis:  Claim 19 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 19 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 20:  Claim 20 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 20 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 20 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  Therefore, claim 20 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 20 recites additional elements introduced in claim 9. However, these additional features are computer components recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Claim 20 also introduces additional elements “wherein the operations further comprise puncturing one or more links of the neural network to break symmetry of the two or more neurons or parameters.” which amounts to a generic function using a generic computing component and does not integrate the judicial exception into a practical application.  Therefore, claim 20 is directed to a judicial exception.
Step 2B Analysis:  Claim 20 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 20 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 21:  Claim 21 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 21 is directed to a system, which is directed to a machine, one of the statutory categories.
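Step 2A Prong One Analysis:  Claim 21 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: 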
wherein training the neural network further comprises dropping out or shuffling training examples from the training dataset between components of the network or separate networks trained together (gathering and outputting data).  Therefore, claim 21 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 21 recites additional elements introduced in claim 9. However, these additional features are computer components recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Therefore, claim 21 is directed to a judicial exception.
Step 2B Analysis:  Claim 21 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 21 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 22:  Claim 22 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 22 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 22 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  Therefore, claim 22 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 22 recites additional elements introduced in claim 9. However, these additional features are computer components recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Claim 22 also introduces additional elements “wherein the two or more neurons or parameters of the neural network comprise two or more neurons of a hidden layer of the neural network, wherein the one or more code constraints are applied to the two or more neurons pre-activation or post-activation” which amounts to a generic function performed by a generic computer component and does not integrate the judicial exception into a practical application.  Therefore, claim 22 is directed to a judicial exception.
Step 2B Analysis:  Claim 22 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 22 amount to no more than mere instructions to apply the judicial exception using a generic computer component.

Regarding Claim 23:  Claim 23 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 23 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 23 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: 
retrieving the neural network from a storage device (gathering data).  
Therefore, claim 23 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 23 recites additional elements introduced in claim 9. However, these additional features are computer components recited at a high-level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Claim 23 also introduces additional elements “wherein the operations further comprise applying error correction to enforce one or more additional code constraints on the neural network when”, and “combining a plurality of versions of the neural network trained in parallel” which amounts to a generic function using a generic computing component and does not integrate the judicial exception into a practical application.  Therefore, claim 23 is directed to a judicial exception.
Step 2B Analysis:  Claim 23 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 23 amount to no more than mere instructions to apply the judicial exception using a generic computer component.


Therefore, when considering the elements separately and in combination, they do not add significantly more to the inventive concept. Accordingly, claims 1-24 are rejected under 35 U.S.C. § 101. 

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-2, 9, 13-15, 18, 20-22, and 24 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Teig (US 11017295 B1). 

Regarding claim 1, Teig teaches A computer-implemented method to train neural networks, the method comprising: ([Col. 1 l. 36] "Some embodiments of the invention provide a novel method for training a multi-layer node network that results in weights used by the nodes being assigned only a discrete set of values." ).
obtaining, by one or more computing devices, data descriptive of a neural network; ([Abstract] “The set of machine-readable media stores sets of instructions for applying a network of computation nodes to an input received by the device. The network of computation nodes includes multiple layers of nodes.” Obtaining data descriptive of a neural network interpreted as synonymous with applying a network of computation nodes to an input.).
evaluating, by the one or more computing devices, a loss function that is descriptive of a performance of the neural network with respect to a set of training examples; and ([Col. 9 l. 7-20] "In some embodiments, the error calculator 310 computes the error for each individual input as the network 330 generates its output. The error calculator 310 receives both the predicted output from the input generator 305 and the output of the network 330, and uses a loss function that quantifies the difference between the predicted output and the actual output for each input." Actual output interpreted as synonymous with set of training examples.).
backpropagating, by the one or more computing devices, the loss function through the neural network to train the neural network, wherein backpropagating, by the one or more computing devices, the loss function through the neural network comprises, for one or more neurons, links, or biases of the neural network: ([Col. 2 l. 22-35] "In typical training, the gradient of the loss function is back propagated through the network in a process that determines, for each weight, the rate of change of the loss function with respect to a change of the weight at the current value of the loss function.").
determining, by the one or more computing devices, a gradient of the loss function with respect to the one or more neurons, links, or biases of the neural network, wherein, for at least the one or more neurons, links, or biases, the loss function includes an additional loss term that penalizes non-adherence of the one or more neurons, links, or biases to one or more code constraints; and ([Col. 9 l. 23-25] "The process 400 (e.g., the error propagator 315) adds (at 420) a continuously-differentiable constraint term to the computed error. This constraint term penalizes (i.e., adds to the loss function computation) for weight values that do not belong to their set of discrete values" [Col. 10 l. 35] “Returning to the process 400, the weight modifier 325 adjusts (at 430) the weight values based on the relative rates of change and a training rate factor. That is, the error propagator 315 provides, for each weight value wik, the partial derivative of the loss function with respect to that wik. These partial derivatives are used to update the weight values by moving the weight values in the direction opposite the gradient (to attempt to reduce the loss function value)” Constraint term interpreted as synonymous with code constraint.).
modifying, by the one or more computing devices, the one or more neurons, links, or biases of the neural network based at least in part on the gradient of the loss function with respect to the one or more neurons, links, or biases. ([Col. 2 l. 33-29] "These gradients are used to update the weight values by moving the weight values in the direction opposite the gradient (to attempt to reduce the loss function value)"). 
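
The weight update described in the passages cited above for claim 1 (a continuously-differentiable penalty that is zero on a discrete set, added to the loss and followed by a step opposite the gradient) can be sketched as follows. The discrete set {-1, 0, 1}, the polynomial penalty h, and the names lam and lr are hypothetical choices made for the example; the sketch is not asserted to be the function used in Teig.

```python
# Illustrative sketch only: the set {-1, 0, 1} and the polynomial h are assumptions.

def h(w):
    """Penalty that is zero exactly when w is -1, 0, or 1."""
    return w ** 2 * (w ** 2 - 1.0) ** 2

def h_grad(w):
    """Derivative of h with respect to w (product rule)."""
    u = w ** 2 - 1.0
    return 2.0 * w * u ** 2 + 4.0 * w ** 3 * u

def sgd_step(w, task_grad, lam, lr):
    """Move w opposite the gradient of (task loss + lam * h)."""
    return w - lr * (task_grad + lam * h_grad(w))

# With no task gradient, a weight near 1 is pulled toward 1 by the penalty alone.
w = 0.9
for _ in range(200):
    w = sgd_step(w, task_grad=0.0, lam=1.0, lr=0.1)
print(w)
```

The penalty gradient is added to the task gradient before the step, which is one way a single update can reflect both the performance term and the constraint term.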

Regarding claim 2, Teig teaches The computer-implemented method of claim 1, wherein the one or more code constraints comprise a set of equations on values produced by the one or more neurons, links, or biases, wherein the set of equations comprises linear equations, non-linear equations, or both linear and non-linear equations. ([Col. 9 l. 21-35] "The process 400 (e.g., the error propagator 315) adds (at 420) a continuously-differentiable constraint term...this additional term is actually an amalgamation (e.g., a summation) of terms for each weight used in the multi-layer network. The additional term for a particular weight, in some embodiments, uses a function that evaluates to zero when the weight is one of the set of discrete values desired for that weight." Amalgamation interpreted as synonymous with set of equations.  Summation interpreted as linear equation.  Weights are interpreted as values produced by the one or more neurons.). 

Regarding claim 9, Teig teaches A computing system comprising: ([Col. 17 l. 9] "The electronic system 700 may be a computer").
one or more processors; ([Col. 17 l. 9] "Electronic system 700 includes a bus 705, processing unit(s)").
one or more non-transitory computer-readable media that store instructions that, when executed by the one or more processors, cause the computing system to perform operations, the operations comprising: ("Some embodiments include electronic components, such as microprocessors, storage and memory that store computer program instructions in a machine-readable or computer-readable medium (alternatively referred to as computer-readable storage media, machine-readable media, or machine-readable storage media)...The computer-readable media may store a computer program that is executable by at least one processing unit and includes sets of instructions for performing various operations." ).
obtaining data descriptive of a neural network; and ([Abstract] "The set of machine-readable media stores sets of instructions for applying a network of computation nodes to an input received by the device" Storing interpreted as synonymous with obtaining. ).
training the neural network based on a training dataset, wherein training the neural network comprises injecting a gradient of an additional loss term into the neural network during backpropagation ([Col. 2 l. 22] "In typical training, the gradient of the loss function is back propagated") of a loss function through the neural network, wherein the additional loss term penalizes non-adherence of two or more neurons or parameters of the neural network ([Col. 5 l. 15-33] "Specifically, during training, some embodiments add a continuously-differentiable term to the loss function for the multi-layer network that biases training of the weights toward a set of discrete values. Rather than simply training the network and then rounding the weights to the nearest discrete value in a pre-defined set, augmenting the loss function with the ...") to one or more code constraints ([Col. 9 l. 23-25] "The process 400 (e.g., the error propagator 315) adds (at 420) a continuously-differentiable constraint term to the computed error"). 

Regarding claim 13, Teig teaches The computing system of claim 9, wherein the two or more neurons or parameters to which the one or more constraints are applied are located in a same single hidden layer of the neural network. ([Col. 4 l. 22] "FIG. 5 illustrates a simple feed-forward neural network 500 with one hidden layer having two nodes, and a single output layer with one output node" Teig explicitly teaches that two or more neurons are located in a same single hidden layer of the neural network.  [Col. 9 l. 23-25] "The process 400 (e.g., the error propagator 315) adds (at 420) a continuously-differentiable constraint term to the computed error. This constraint term penalizes (i.e., adds to the loss function computation) for weight values that do not belong to their set of discrete values" Code constraint interpreted as synonymous with constraint term. [Col. 7 l. 16] "In this equation, wik are weight values associated with the inputs yk of the neuron i in layer l+1."  Teig teaches that the code constraints are applied to the weights which are used as inputs to the neural networks, which is interpreted as synonymous with applying constraints to the neurons located in a same single hidden layer.). 

Regarding claim 14, Teig teaches The computing system of claim 9, wherein the two or more neurons or parameters to which the one or more constraints are applied are located in two or more different hidden layers of the neural network. ([Col. 1 l. 39-52] "The multi-layer network of some embodiments includes a layer of one or more input nodes, a layer of one or more output nodes, and one or more layers of hidden (interior) nodes. Each node in the multi-layer network produces an output value based on one or more input values. Specifically, each hidden node and output node" [Col. 4 l. 22] "FIG. 5 illustrates a simple feed-forward neural network 500 with one hidden layer having two nodes, and a single output layer with one output node." FIG. 5 shows multiple nodes in a single layer; Teig also teaches that there may be one or more hidden layers. [Col. 9 l. 23-25] "The process 400 (e.g., the error propagator 315) adds (at 420) a continuously-differentiable constraint term to the computed error. This constraint term penalizes (i.e., adds to the loss function computation) for weight values that do not belong to their set of discrete values" Code constraint interpreted as synonymous with constraint term. [Col. 7 l. 16] "In this equation, wik are weight values associated with the inputs yk of the neuron i in layer l+1." Teig teaches that the code constraints are applied to the weights which are used as inputs to the neural networks, which is interpreted as synonymous with applying constraints to the neurons located in two or more different hidden layers.). 

Regarding claim 15, Teig teaches The computing system of claim 9, wherein injecting the gradient of the additional loss term during backpropagation comprises treating the additional loss term as a regularization term for the loss function or performing Lagrangian constrained optimization. ([Col. 9 l. 35] " The full term introduced as an addition to the loss function, in some embodiments, is this penalty function (whether the previous example or a different function) multiplied by a variable Lagrangian multiplier (i.e., making the additional function a Lagrangian function), λikh(wik)." Performing Lagrangian constrained optimization is interpreted as synonymous with minimizing a loss function with Lagrangian multipliers as constraints.). 
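
As a generic illustration of the Lagrangian constrained optimization referred to in the rejection of claim 15, the sketch below alternates a descent step on the parameters with an ascent step on a variable multiplier. The objective w0^2 + w1^2, the linear constraint w0 + w1 = 1, and the names dual_ascent, lam, and lr are assumptions made for the example; this toy constraint is not the penalty function of Teig and is not drawn from the claims.

```python
# Illustrative sketch only: objective, constraint, and step size are assumptions.

def dual_ascent(steps=500, lr=0.1):
    """Minimize w0^2 + w1^2 subject to w0 + w1 - 1 = 0.

    Lagrangian: L(w, lam) = w0^2 + w1^2 + lam * (w0 + w1 - 1).
    Parameters take a gradient descent step; the multiplier takes a
    gradient ascent step on the constraint violation.
    """
    w = [0.0, 0.0]
    lam = 0.0
    for _ in range(steps):
        violation = w[0] + w[1] - 1.0              # dL/dlam
        grad_w = [2.0 * wi + lam for wi in w]      # dL/dw
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        lam = lam + lr * violation
    return w, lam

w, lam = dual_ascent()
print(w, lam)  # approaches w = [0.5, 0.5], lam = -1.0
```

Treating the same term with a fixed multiplier would instead correspond to the regularization alternative recited in the claim; the variable multiplier is what distinguishes the constrained-optimization reading in this sketch.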

Regarding claim 18, Teig teaches The computing system of claim 9, wherein the additional loss term penalizes non-adherence of two or more neurons or parameters respectively included in two or more different neural networks to the one or more code constraints. ([Col. 9 l. 21-25] "The process 400 (e.g., the error propagator 315) adds (at 420) a continuously-differentiable constraint term to the computed error. This constraint term penalizes (i.e., adds to the loss function computation) for weight values that do not belong to their set of discrete values;" [Col. 16 l. 8-12] "The memory also stores multiple sets of sub-network parameters 680, including at least a set of weight values for an audio-processing network and a set of weight values for an image-processing network." See FIG. 6. Two or more networks within a single neural network interpreted as synonymous with sub-network.). 

Regarding claim 20, Teig teaches The computing system of claim 9, wherein the operations further comprise puncturing one or more links of the neural network to break symmetry of the two or more neurons or parameters. ([Col. 16 l. 8-19] "These multiple sets of weights may be used by the processing units 610 when 

Regarding claim 21, Teig teaches The computing system of claim 9, wherein training the neural network further comprises dropping out or shuffling training examples from the training dataset between components of the network or separate networks trained together. ([Col. 16 l. 8-19] "If a larger number of the weight values for each network are 0, this simplifies the processing for each sub-network, as many of the edges (and possibly entire nodes) will effectively drop out." Weights and edges interpreted as being representative of training examples from the training dataset.). 

Regarding claim 22, Teig teaches The computing system of claim 9, wherein the two or more neurons or parameters of the neural network comprise two or more neurons of a hidden layer of the neural network, wherein the one or more code constraints are applied to the two or more neurons pre-activation or post-activation. ([Col. 4 l. 22] "FIG. 5 illustrates a simple feed-forward neural network 500 with one hidden layer having two nodes, and a single output layer with one output node." See Eqn. I, J and [Col. 13 l. 25-50] for mathematical temporal relation between constraint and activation.  In equation J Teig teaches that the constraint term is formed 

Regarding claim 24, claim 24 effectively mirrors claim 1 with the exception that it is in regards to One or more non-transitory computer-readable media that store instructions that, when executed by one or more processors, cause the one or more processors to perform operations. Teig teaches ([Abstract] “Some embodiments provide a set of processing units and a set of machine-readable media. The set of machine-readable media stores sets of instructions for applying a network of computation nodes to an input received by the device”).  The rest of claim 24 is rejected under a similar interpretation to claim 1.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 3-6, 10, 11, 16, 17, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Teig in view of Agrawal (US 2021/0142158 A1). 

Regarding claim 3, Teig teaches The computer-implemented method of claim 2.  However, Teig does not explicitly teach, wherein the set of equations comprise a set of parity check equations.  

Agrawal who teaches a related art of training a neural network based on constraints teaches wherein the set of equations comprise a set of parity check equations. ([¶0018] "According to examples of the present disclosure, the loss function may comprise: [equation image media_image1.png, not reproduced] wherein: N is the number of bits in the transmitted codeword, I{f} is the indicator function, H is the parity check matrix of the code to which the transmitted codeword belongs"). 

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the methods for training neural networks in Teig and Agrawal. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Agrawal ([¶0267] “the teachings of these embodiments may improve the network performance and data accuracy, and thereby provide benefits such as better responsiveness and reduced user waiting time.”).  

Regarding claim 4, Teig teaches The computer-implemented method of claim 1.  However, Teig does not explicitly teach, wherein the one or more code constraints comprise one or more error control code constraints, one or more modulation constraints, or one or more lattice code constraints.  

Agrawal who teaches a related art of training a neural network based on constraints teaches wherein the one or more code constraints comprise one or more error control code constraints, one or more modulation constraints, or one or more lattice code constraints. ([¶0012] "According to a first aspect of the present disclosure, there is provided a method for training a Neural Network, NN, to recover a codeword of a Forward Error Correction" Forward Error Correction codeword which is a type of error control code is interpreted as synonymous with error control code constraint. Agrawal explicitly teaches that the error control code may be used as a code constraint for training the neural network [¶0012].). 

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the methods for training neural networks in Teig and Agrawal. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Agrawal ([¶0267] “the teachings of these embodiments may improve the network performance and data accuracy, and thereby provide benefits such as better responsiveness and reduced user waiting time.”).  

Regarding claim 5, Teig teaches The computer-implemented method of claim 1, wherein: the one or more neurons, links, or biases comprise a plurality of neurons included in a layer of the neural network; and ([Abstract] " The network of computation nodes includes multiple layers of nodes." Node interpreted as synonymous with neuron. ). However, Teig does not explicitly teach the additional loss term provides a penalty based at least in part on a magnitude of a syndrome, wherein the magnitude of the syndrome is based at least in part on a dot product of a neuron vector or link or bias vector produced by the layer and a parity-check matrix.  

Agrawal who teaches a related art of training a neural network based on constraints teaches the additional loss term provides a penalty based at least in part on a magnitude of a syndrome, wherein the magnitude of the syndrome is based at least in part on a dot product of a neuron vector or link or bias vector produced by the layer and a parity-check matrix. ([¶0055] "Examples of the proposed solutions provide robustness towards learning for reducing BLER by introducing penalty in case of decoding failure." [¶0107] "Syndrome check vector S is given by  [Eqn. 1.12] *Check ⇒ If S=0, then the codeword ŝ is returned as output by the decoder. Else, continue to next step."  [¶0141] "at step 130, optimising trainable parameters of the NN to minimise a loss function. As illustrated in FIG. 4, and performing a syndrome check on the generated intermediate output codeword at step 123." Syndrome equation shows that syndrome is calculated by dot product of the parity-matrix and neuron vector.  Agrawal 

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the methods for training neural networks in Teig and Agrawal. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Agrawal ([¶0267] “the teachings of these embodiments may improve the network performance and data accuracy, and thereby provide benefits such as better responsiveness and reduced user waiting time.”).  

Regarding claim 6, Teig teaches The computer-implemented method of claim 1, wherein: the one or more neurons, links, or biases comprise a plurality of neurons included in a layer of the neural network; and ([Abstract] " The network of computation nodes includes multiple layers of nodes." [Col. 16 l. 17-20] "If a larger number of the weight values for each network are 0, this simplifies the processing for each sub-network, as many of the edges (and possibly entire nodes) will effectively drop out." Node interpreted as synonymous with neuron. ). However, Teig does not explicitly teach the additional loss term provides a penalty based at least in part on a magnitude of a syndrome computed for the layer, wherein the magnitude of the syndrome is equal to a dot product of a neuron vector produced by the layer and a parity-check matrix minus a bias vector.  

Agrawal who teaches a related art of training a neural network based on constraints teaches the additional loss term provides a penalty based at least in part on a magnitude of a syndrome computed for the layer, wherein the magnitude of the syndrome is equal to a dot product of a neuron vector produced by the layer and a parity-check matrix minus a bias vector. ([¶0055] "Examples of the proposed solutions provide robustness towards learning for reducing BLER by introducing penalty in case of decoding failure." [¶0107] "Syndrome check vector S is given by  [Eqn. 1.12] *Check ⇒ If S=0, then the codeword ŝ is returned as output by the decoder. Else, continue to next step."  [¶0141] "at step 130, optimising trainable parameters of the NN to minimise a loss function. As illustrated in FIG. 4,...and performing a syndrome check on the generated intermediate output codeword at step 123." Syndrome equation shows that syndrome is calculated by dot product of the parity-matrix and neuron vector.  Agrawal explicitly teaches that the penalty is based on the syndrome vector being equal or non-equal to zero.  Checking that the vector is equal to zero is interpreted as checking that the magnitude is equal to zero. ). 
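
For illustration only, the interpretation relied on above (a syndrome vector is equal to zero exactly when its magnitude is zero) can be sketched in Python. The parity-check matrix below is one common form of the Hamming (7,4) matrix and is an assumption, as are the function names `syndrome` and `magnitude`; the sketch is not taken from Agrawal or from the instant specification.

```python
# Hypothetical sketch: syndrome of a candidate codeword and its magnitude.
# H is one standard parity-check matrix of the Hamming (7,4) code (assumed).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def syndrome(H, x):
    """Return H*x over GF(2): one dot product per parity-check row."""
    return [sum(h * v for h, v in zip(row, x)) % 2 for row in H]

def magnitude(s):
    """Euclidean norm of the syndrome vector."""
    return sum(v * v for v in s) ** 0.5

valid = [1, 1, 1, 0, 0, 0, 0]    # satisfies all three parity checks
flipped = [1, 1, 1, 0, 1, 0, 0]  # same word with the fifth bit in error

s_valid = syndrome(H, valid)      # all-zero vector, so magnitude is zero
s_flipped = syndrome(H, flipped)  # non-zero vector, so magnitude is positive
```

Under these assumptions, testing `s_valid == [0, 0, 0]` and testing `magnitude(s_valid) == 0` give the same answer, which is the equivalence the rejection relies on.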

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the methods for training neural networks in Teig and Agrawal. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Agrawal ([¶0267] “the teachings of these embodiments may improve the network performance and data accuracy, and 

Regarding claim 10, Teig teaches The computing system of claim 9. However, Teig does not explicitly teach, wherein the one or more code constraints comprise one or more error correcting codes, modulation codes, or lattice codes that constrain the two or more neurons or parameters of the neural network.  

Agrawal who teaches a related art of training a neural network based on constraints teaches The computing system of claim 9, wherein the one or more code constraints comprise one or more error correcting codes, modulation codes, or lattice codes that constrain the two or more neurons or parameters of the neural network. ([¶0012] "According to a first aspect of the present disclosure, there is provided a method for training a Neural Network, NN, to recover a codeword of a Forward Error Correction" [¶0018] See Eqn. "H is the parity check matrix of the code to which the transmitted codeword belongs,...According to examples of the present disclosure, the loss function may be a cross entropy multi-loss function calculated on the basis of all intermediate output representations at layers up to and including the layer at which the syndrome check is satisfied." Forward Error Correction codeword which is a type of error control code is interpreted as synonymous with error control code constraint. Agrawal explicitly teaches that the error control code may be used as a code constraint for training the neural network [¶0012].). 



Regarding claim 11, Teig teaches The computing system of claim 9, wherein the one or more code constraints comprise…constraints applied to the two or more neurons or parameters of the neural network, wherein the… constraints sum to zero or another real number. ([Col. 9 l. 21-32] "This constraint term penalizes (i.e., adds to the loss function computation)...In some embodiments, this additional term is actually an amalgamation (e.g., a summation) of terms for each weight used in the multi-layer network. The additional term for a particular weight, in some embodiments, uses a function that evaluates to zero when the weight is one of the set of discrete values desired for that weight."). However, Teig does not explicitly teach parity constraints.  
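
For illustration only, a penalty term of the kind described in the Teig passage quoted above (a per-weight function that evaluates to zero when the weight is one of a set of desired discrete values, summed over all weights) might look like the following Python sketch. The product-of-squares form, the allowed set {-1, 0, 1}, and the names `weight_penalty` and `constraint_term` are assumptions and are not Teig's disclosed function.

```python
# Hypothetical sketch of a discrete-value penalty summed over weights.
ALLOWED = (-1.0, 0.0, 1.0)  # assumed set of desired discrete weight values

def weight_penalty(w, allowed=ALLOWED):
    """Zero when w equals an allowed value, positive otherwise."""
    p = 1.0
    for v in allowed:
        p *= (w - v) ** 2
    return p

def constraint_term(weights, allowed=ALLOWED):
    """Summation of the per-weight penalties, added to the loss."""
    return sum(weight_penalty(w, allowed) for w in weights)

on_grid = constraint_term([-1.0, 0.0, 1.0])  # every weight is an allowed value
off_grid = constraint_term([0.5])            # weight lies between allowed values
```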

Agrawal who teaches a related art of training a neural network based on constraints teaches parity constraints ([¶0014] “According to examples of the present disclosure, performing a syndrome check may comprise multiplying a vector of the generated 

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the methods for training neural networks in Teig and Agrawal. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Agrawal ([¶0267] “the teachings of these embodiments may improve the network performance and data accuracy, and thereby provide benefits such as better responsiveness and reduced user waiting time.”).  

Regarding claim 16, Teig teaches The computing system of claim 9.  However, Teig does not explicitly teach wherein injecting the gradient of the additional loss term comprises: determining a syndrome based on a code parity check matrix associated with the one or more code constraints; 
and determining the gradient of the additional loss term based at least in part on the syndrome.  

Agrawal who teaches a related art of training a neural network based on constraints teaches wherein injecting the gradient of the additional loss term comprises: determining a syndrome based on a code parity check matrix associated with the one or more code constraints; ([¶0014] “According to examples of the present disclosure, performing a syndrome check may comprise multiplying a vector of the 
and determining the gradient of the additional loss term based at least in part on the syndrome. ([¶0016] "According to examples of the present disclosure, example optimisation methods for minimising the loss function may include stochastic gradient descent methods" Agrawal teaches that the loss function depends on the syndrome and that the loss function may be minimized by stochastic gradient descent.  Since gradient descent by definition minimizes a loss function (which in this case depends on the syndrome), the gradient used in the descent must necessarily depend on the syndrome.). 
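
For illustration only, the reasoning above (the gradient of a loss that depends on the syndrome must itself depend on the syndrome) can be seen in a small Python sketch. The squared-syndrome loss used here is a real-valued stand-in chosen for simplicity, not the loss function disclosed by Agrawal, and the matrix `H`, the vector `x`, and all function names are hypothetical.

```python
# Hypothetical sketch: gradient of a squared-syndrome loss term.
def matvec(M, x):
    """Matrix-vector product, one dot product per row."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def syndrome_loss(H, x):
    """Assumed loss: squared magnitude of the real-valued syndrome H*x."""
    s = matvec(H, x)
    return sum(v * v for v in s)

def syndrome_loss_grad(H, x):
    """Gradient with respect to x: 2 * H^T * s, where s = H*x."""
    s = matvec(H, x)
    return [2.0 * sum(H[i][j] * s[i] for i in range(len(H)))
            for j in range(len(x))]

H = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]           # assumed toy parity-check matrix
x = [0.5, -0.5, 0.25]           # assumed neuron vector; second check is violated

loss = syndrome_loss(H, x)
grad = syndrome_loss_grad(H, x)                        # non-zero where the syndrome is non-zero
grad_zero = syndrome_loss_grad(H, [1.0, -1.0, 1.0])    # zero syndrome gives zero gradient
```

In this sketch the gradient is computed directly from the syndrome `s`, so it vanishes when every check is satisfied and is non-zero otherwise.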

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the methods for training neural networks in Teig and Agrawal. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Agrawal ([¶0267] “the teachings of these embodiments may improve the network performance and data accuracy, and thereby provide benefits such as better responsiveness and reduced user waiting time.”).  

Regarding claim 17, Teig teaches The computing system of claim 9. However, Teig does not explicitly teach wherein the one or more code constraints comprise one or more of: Hamming code constraints;  

Agrawal who teaches a related art of training a neural network based on constraints teaches wherein the one or more code constraints comprise one or more of: Hamming code constraints; ([¶0058] "FIG. 2 illustrates the parity check matrix of a Hamming (7, 4) code and Tanner graph of the parity check matrix"). 

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the methods for training neural networks in Teig and Agrawal. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Agrawal ([¶0267] “the teachings of these embodiments may improve the network performance and data accuracy, and thereby provide benefits such as better responsiveness and reduced user waiting time.”).  

Regarding claim 23, Teig teaches The computing system of claim 9, and retrieving the neural network from a storage device. ([Col. 17 l. 16] "The system memory stores some of the instructions and data that the processor needs at runtime. In some embodiments, the invention's processes are stored in the system memory 725, the permanent storage device 735, and/or the read-only memory 730. From these various memory units, the processing unit(s) 710 retrieves instructions to execute and data to..."). However, Teig does not explicitly teach wherein the operations further comprise applying error correction to enforce one or more additional code constraints on the neural network when:  

Agrawal who teaches a related art of training a neural network based on constraints teaches ([¶0012] "According to a first aspect of the present disclosure, there is provided a method for training a Neural Network, NN, to recover a codeword of a Forward Error Correction" Forward Error Correction codeword interpreted as synonymous with error control code constraint.). 

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the methods for training neural networks in Teig and Agrawal. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Agrawal ([¶0267] “the teachings of these embodiments may improve the network performance and data accuracy, and thereby provide benefits such as better responsiveness and reduced user waiting time.”).  

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Teig in view of Kumar (US 2018/0343017 A1). 

Regarding claim 7, Teig teaches The computer-implemented method of claim 1. However, Teig does not explicitly teach wherein the additional loss term comprises the square error of the one or more code constraints to a composite loss.  

Kumar who teaches a related art of training a neural network based on code constraints teaches wherein the additional loss term comprises the square error of the one or more code constraints to a composite loss ([¶0065] "In an example, the loss function l includes a mean-squared error function that minimizes the average squared error between an output ƒ (x) and a target value y over all the example pairs (x, y)" ). 

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the loss functions in Teig and Kumar. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Kumar ([¶0065] “The neural network 400 also uses a loss function l (or, referred to also as a cost function c) to find an optimal solution. The optimal solution represents the situation where no solution has a loss less than the loss of the optimal solution.”).  

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Teig in view of Milenova (US 7490071 B2). 

Regarding claim 8, Teig teaches The computer-implemented method of claim 1, wherein modifying, by the one or more computing devices, the one or more neurons, links, or biases of the neural network based at least in part on the gradient of the loss function comprises modifying, by the one or more computing devices, ([Col. 2 l. 33-29] "These gradients are used to update the weight values by moving the weight values in the direction opposite the gradient (to attempt to reduce the loss function value)").
the one or more neurons, links, or biases of the neural network based at least in part on the gradient of the loss function to minimize a loss provided by the loss function with respect to one or more neurons in the layer ([Col 10 l. 50] "That is, the gradient component for a particular weight provides an amount to move (in the direction opposite to the gradient component, as the goal is to minimize the loss function)  that weight value relative to the other weight values, while the training rate specifies the distance of that move." ).
the loss with respect to a multiplier dual variable vector. ([Col. 9 l. 35] " The full term introduced as an addition to the loss function, in some embodiments, is this penalty function (whether the previous example or a different function) multiplied by a variable Lagrangian multiplier (i.e., making the additional function a Lagrangian function), λikh(wik)." multiplier dual variable vector is also described by lambda in Teig.). However, Teig does not explicitly teach to maximize the loss  

Milenova who teaches a related art of machine learning teaches to maximize the loss ([Col. 10 l. 27] "It can be shown that the maximal margin hyperplane has a margin 1/∥w∥2. That is, by minimizing the weights, one maximizes the margin...Problems of this 

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the loss optimization functions of Teig and Milenova. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Milenova ([Col. 9 l. 3-5] “The optimization problem is convex, implying that it has a unique optimum solution. In contrast with Neural Networks, SVM does not get caught in local minima.” Milenova teaches that SVM can be substituted for NN for classification problems, and similarly teaches the benefits of using the optimization method proposed.).  

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Teig in view of Kadhim (“EFFECTS OF DEGREE DISTRIBUTION IN RATELESS CODING”, 2014). 

Regarding claim 12, Teig teaches The computing system of claim 9.  However, Teig does not explicitly teach, wherein the one or more code constraints comprise one or more rateless code constraints.  

Kadhim who teaches a related art of generic constraints teaches The computing system of claim 9, wherein the one or more code constraints comprise one or more rateless code constraints. ([p.9] "Fountain codes, also known as rateless 

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the constraints in Teig with those in Kadhim. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Kadhim ([p. 11] “Low density equality-check code (LDPC) is basically error linear code, mostly utilizing a noisy transmission network to decrease the likelihood of waste of data.” [p. 48] “Rateless codes are the appropriate choice because they ensure reliable data transmission. The decoder of rateless code can reconstruct the source packet without depending on a channel coefficient or the manner of this channel. In this thesis, the two rateless code models are illustrated in a general form for Raptor codes, LT codes and LDPC codes. We tested LDPC code with the effective linear time encoding method” Kadhim teaches a method of determining rateless codes as a generic error correction method for transmitting network data.).  

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Teig in view of Li (US 2017/0109355 A1). 

Regarding claim 19, Teig teaches The computing system of claim 9. However, Teig does not explicitly teach, wherein the neural network comprises an input embedding layer that receives an input embedding, and wherein the two or more neurons or parameters to which the one or more code constraints are applied are included in the input embedding layer.  

Li who teaches a related method of training a neural network teaches wherein the neural network comprises an input embedding layer that receives an input embedding, and wherein the two or more neurons or parameters to which the one or more code constraints are applied are included in the input embedding layer ([¶0087] "In embodiments, for training the relation ranking model, the same setting (i.e., the two models do not share the word embedding in this embodiment) is used for the word embedding layer as in the subject labeling model... In embodiments, for the relation embedding, only 128d vectors are used. During training, each relation embedding is constrained to remain within the unit-ball [See eqn.]"). 

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the embedding layer in Li with the neural network of Teig. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Li ([¶0047] “At step 2442, the embedding layer 210 transforms the one or more words of the input query into one or more embeddings, where each embedding is a vector that represents the corresponding word.”).  

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: Lillicrap (US 2016/0162781 A1) and Sheiman (US 2021/0306005 A1).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720.  The examiner can normally be reached on M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang can be reached on (571)270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access 






/SB/Examiner, Art Unit 2124
/Kevin W Figueroa/Primary Examiner, Art Unit 2124