DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(4) because reference character “100” has been used to designate both the method in Figure 1 and the system in Figure 4A. In specification paragraph [0061], the reference character 100 designates the method in Figure 1. In specification paragraph [0102], lines 1 and 3, the reference character 100 designates the system in Figure 4A. The reference character 100 appears at the top of both Figures 1 and 4A.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Specification
The disclosure is objected to because of the following informalities: In paragraph [0091], line 3, “the method 100” should read “the method 200”. Appropriate correction is required.

Claim Interpretation
Claim 1 is a method claim that recites “determining, by the one or more computing devices, whether to adjust the weight associated with the edge based at least in part on the estimated utility of such edge.” This limitation is a contingent limitation and carries no patentable weight. Therefore, for the method claim, the prior art only needs to teach the determination of the estimated utility because it is essentially a calculation. See MPEP 2111.04(II); see also Ex parte Schulhauser. Examiner further notes that even though the broadest reasonable interpretation of the method claim merely requires the one condition, the prior art below reads on the structure for performing all of the functionality of the system and the medium claims.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

CLAIM 1
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:
for at least one edge of the plurality of edges: determining an estimated utility of the edge; and (e.g., evaluating an estimated utility of the edge)
determining whether to adjust the weight associated with the edge based at least in part on the estimated utility of such edge. (e.g., evaluating or considering a choice)
	These limitations are mathematical computations, and they are mental processes which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
one or more computing devices
obtaining data 
descriptive of a machine-learned neural network, the machine-learned neural network comprises a plurality of neurons respectively connected by a plurality of edges, and wherein a plurality of weights are respectively associated with the plurality of edges; and 
One or more computing devices and a machine-learned neural network are generally linking the abstract idea to the particular technological environment of machine learning as discussed in MPEP 2106.05(h). This section also states: “For instance, a data gathering step that is limited to… a particular type of data (such as power grid data or XML tags) could be considered to be both insignificant extra-solution activity and a field of use limitation.” Obtaining data is mere data-gathering which is an insignificant extra-solution activity. See MPEP 2106.05(g). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
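Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). One or more computing devices and a machine-learned neural network are generally linking the abstract idea to the particular technological environment of machine learning as discussed in MPEP 2106.05(h). Obtaining data is well-understood, routine, conventional activity of receiving data over a network, as discussed in MPEP § 2106.05(d), subsection II, example (i). The claim is not patent eligible.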

CLAIM 2 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations:
determining the estimated utility of the edge comprises determining the estimated utility of the edge based at least in part on a first-order approximation of a loss function at the weight associated with the edge. (e.g., evaluating an estimated utility of the edge)
This limitation is mathematical computation, and it is a mental process which can reasonably be performed in one’s mind with the aid of pencil and paper. The broadest reasonable interpretation of “determining an estimated utility” includes determining or evaluating a utility based on value. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
one or more computing devices
One or more computing devices are generally linking the abstract idea to the particular technological environment of machine learning as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
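Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). One or more computing devices are generally linking the abstract idea to the particular technological environment of machine learning as discussed in MPEP 2106.05(h). The claim is not patent eligible.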

CLAIM 3 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations:
determining the estimated utility of the edge comprises determining a first derivative of a loss function with respect to a logit… but not determining any higher-order derivatives of the loss function. 
This limitation is a mathematical computation of calculating a first derivative. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
one or more computing devices
a receiving neuron at the weight associated with the edge
One or more computing devices and a receiving neuron are generally linking the abstract idea to the particular technological environment of machine learning, as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). One or more computing devices and a receiving neuron are generally linking the abstract idea to the particular technological environment of machine learning, as discussed in MPEP 2106.05(h). The claim is not patent eligible.

CLAIM 4 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations:
determining the estimated utility of the edge comprises determining a sum over one or more training examples included in a training dataset of a proposed change in the weight multiplied by an output of a transmitting neuron multiplied by a first derivative of a loss function with respect to a logit of a receiving neuron at the weight and training example. 
This limitation is a mathematical computation of calculating a first derivative. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
one or more computing devices
One or more computing devices are generally linking the abstract idea to the particular technological environment of machine learning as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). One or more computing devices are generally linking the abstract idea to the particular technological environment of machine learning as discussed in MPEP 2106.05(h). The claim is not patent eligible.
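
For reference, the computation recited in claim 4 can be sketched as follows. This is a minimal illustration only and is not taken from the application: the function name `estimated_utility`, the argument names, and the sample values are hypothetical, and the sketch assumes the per-example transmitting-neuron outputs and the first derivatives of the loss with respect to the receiving neuron's logit have already been computed.

```python
def estimated_utility(delta_w, transmit_outputs, logit_grads):
    """Sum over training examples of:
    (proposed change in weight) * (transmitting neuron output)
    * (first derivative of the loss w.r.t. the receiving neuron's logit).
    No higher-order derivatives are used.
    """
    return sum(delta_w * a * g for a, g in zip(transmit_outputs, logit_grads))


# Hypothetical values for three training examples.
weight = 0.5
outputs = [1.0, 2.0, 4.0]      # transmitting neuron output per example
grads = [0.5, -0.25, 0.75]     # dL/d(logit) of receiving neuron per example

# Assumption: removing the edge corresponds to a proposed change of -weight.
utility = estimated_utility(-weight, outputs, grads)
print(utility)  # -1.5
```

Whether a larger or smaller value of this sum indicates a more useful edge is a convention the sketch does not fix.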

CLAIM 5 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations:
determining whether to adjust the weight associated with the edge based at least in part on the estimated utility of such edge comprises determining whether to prune the edge based at least in part on the estimated utility of the edge. (e.g., choosing whether or not to prune)
This limitation is a mathematical computation, and it is a mental process which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
one or more computing devices
One or more computing devices are generally linking the abstract idea to the particular technological environment of machine learning, as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). One or more computing devices are generally linking the abstract idea to the particular technological environment of machine learning, as discussed in MPEP 2106.05(h). The claim is not patent eligible.

CLAIM 6 incorporates the rejection of claim 5.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 5 are incorporated. The claim recites the following limitations:
pruning the edge; and 
after pruning the edge, supplementing the machine-learned neural network with at least one additional edge at a different location within the machine-learned neural network.
These limitations are mathematical computations, and they are mental processes which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
one or more computing devices
One or more computing devices are generally linking the abstract idea to the particular technological environment of machine learning, as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). One or more computing devices are generally linking the abstract idea to the particular technological environment of machine learning, as discussed in MPEP 2106.05(h). The claim is not patent eligible.

CLAIM 7 incorporates the rejection of claim 5.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 5 are incorporated. The claim recites the following limitations:
pruning the edge; and 
after pruning the edge, … prevents any other edges that connect to a same neuron as the edge from being modified in one or more pruning iterations.
	Pruning the edge is a mathematical computation. Pruning the edge and preventing other edges from being modified are mental processes which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
one or more computing devices
storing a data item
One or more computing devices are generally linking the abstract idea to the particular technological environment of machine learning, as discussed in MPEP 2106.05(h). Storing a data item is mere data-gathering which is an insignificant extra-solution activity. See MPEP 2106.05(g). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). One or more computing devices are generally linking the abstract idea to the particular technological environment of machine learning, as discussed in MPEP 2106.05(h). Storing a data item is well-understood, routine, conventional activity of storing information in memory, as discussed in MPEP § 2106.05(d), subsection II, example (iv). The claim is not patent eligible.

CLAIM 8 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations:
determining whether to adjust the weight associated with the edge based at least in part on the estimated utility of such edge comprises selecting one of two or more proposed quantization schemes based at least in part on the estimated utility of the edge. (e.g., selecting)
Selecting is a mental process which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
one or more computing devices
One or more computing devices are generally linking the abstract idea to the particular technological environment of machine learning, as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). One or more computing devices are generally linking the abstract idea to the particular technological environment of machine learning, as discussed in MPEP 2106.05(h). The claim is not patent eligible.

CLAIM 9 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations:
 adding a patch subnetwork 
Adding a patch subnetwork is a mathematical computation, and it is a mental process which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
one or more computing devices
the machine-learned neural network
the patch subnetwork is trained to predict an error associated with its input.
These additional elements are generally linking the abstract idea to the particular technological environment of machine learning, as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). These additional elements are generally linking the abstract idea to the particular technological environment of machine learning, as discussed in MPEP 2106.05(h). The claim is not patent eligible.

CLAIM 10
Step 1: The claim recites a system, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:
determining a respective estimated utility of each of the plurality of edges; (e.g., evaluating an estimated utility of the edge)
selecting one or more edges for deletion based at least in part on the respective estimated utility of each of the plurality of edges; and 
deleting the selected one or more edges.
	These limitations are mathematical computations, and they are mental processes which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
A computer system
one or more processors; 
one or more non-transitory computer-readable media
instructions
obtaining data 
descriptive of a machine-learned neural network, the machine-learned neural network comprises a plurality of neurons respectively connected by a plurality of edges;
A computer system, processors, non-transitory computer-readable media, instructions, and a machine-learned neural network are generally linking the abstract idea to the particular technological environment of machine learning as discussed in MPEP 2106.05(h). This section also states: “For instance, a data gathering step that is limited to… a particular type of data (such as power grid data or XML tags) could be considered to be both insignificant extra-solution activity and a field of use limitation.” Obtaining data is mere data-gathering which is an insignificant extra-solution activity. See MPEP 2106.05(g). 

Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). A computer system, processors, non-transitory computer-readable media, instructions, and a machine-learned neural network are generally linking the abstract idea to the particular technological environment of machine learning as discussed in MPEP 2106.05(h). Obtaining data is well-understood, routine, conventional activity of receiving data over a network, as discussed in MPEP § 2106.05(d), subsection II, example (i). The claim is not patent eligible.

CLAIMS 11-13 incorporate the rejection of claim 10. Claims 11, 12, and 13 are rejected for the same reasons as claims 2, 3 and 4, respectively.

CLAIM 14 incorporates the rejection of claim 10.
Step 1: The claim recites a system, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:
selecting one or more edges for deletion based at least in part on the respective estimated utility of each of the plurality of edges comprises selecting a predetermined number of the plurality of edges that have the lowest estimated utilities.
This limitation is a mathematical computation, and it is a mental process which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application because the claim recites no additional elements that impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). The claim is not patent eligible.

CLAIM 15 incorporates the rejection of claim 10.
Step 1: The claim recites a system, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:
selecting one or more edges for deletion based at least in part on the respective estimated utility of each of the plurality of edges comprises selecting a predetermined percentage of the plurality of edges that have the lowest estimated utilities.
This limitation is a mathematical computation, and it is a mental process which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application because the claim recites no additional elements that impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). The claim is not patent eligible.
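
For illustration, the selections recited in claims 14 and 15 (a predetermined number, or a predetermined percentage, of the edges having the lowest estimated utilities) can be sketched as below. The function names, edge identifiers, and utility values are hypothetical, and rounding the percentage down to a whole number of edges is an assumption of this sketch rather than something stated in the claims.

```python
def lowest_utility_edges(utilities, count):
    """Return the `count` edge identifiers with the smallest estimated utility."""
    ranked = sorted(utilities, key=utilities.get)  # ascending by utility
    return ranked[:count]


def lowest_utility_percentage(utilities, percentage):
    """Return the lowest-utility `percentage` of edges (rounded down)."""
    count = int(len(utilities) * percentage / 100)
    return lowest_utility_edges(utilities, count)


# Hypothetical estimated utilities keyed by edge identifier.
utilities = {"e1": 0.9, "e2": 0.1, "e3": 0.4, "e4": 0.05}

by_number = lowest_utility_edges(utilities, 2)
by_percentage = lowest_utility_percentage(utilities, 50)
print(by_number, by_percentage)
```

Both calls select the same two edges here because 50 percent of four edges is two edges.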

CLAIM 16 incorporates the rejection of claim 10.
Step 1: The claim recites a system, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:

This limitation is a mathematical computation, and it is a mental process which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
the machine-learned neural network.
A machine-learned neural network is generally linking the abstract idea to the particular technological environment of machine learning as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). A machine-learned neural network is generally linking the abstract idea to the particular technological environment of machine learning as discussed in MPEP 2106.05(h). The claim is not patent eligible.

CLAIM 17 incorporates the rejection of claim 16.
Step 1: The claim recites a system, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:
adding one or more new edges comprises adding a same number of new edges as was deleted from the machine-learned neural network.
This limitation is a mathematical computation, and it is a mental process which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
the machine-learned neural network.
A machine-learned neural network is generally linking the abstract idea to the particular technological environment of machine learning as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). A machine-learned neural network is generally linking the abstract idea to the particular technological environment of machine learning as discussed in MPEP 2106.05(h). The claim is not patent eligible.
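
For illustration, deleting edges and then adding the same number of new edges at different locations, as recited in claims 6 and 17, can be sketched as below. The function name `prune_and_regrow`, the representation of an edge as a pair of neuron indices, and the random choice of new locations are all assumptions of this sketch; the claims do not specify how the new locations are chosen.

```python
import random


def prune_and_regrow(edges, deleted, candidates, rng):
    """Remove `deleted` edges and add the same number of new edges elsewhere."""
    kept = edges - deleted
    # New locations exclude every edge that existed before pruning,
    # so no deleted edge is simply re-added in place.
    available = sorted(candidates - edges)
    added = set(rng.sample(available, len(deleted)))
    return kept | added


# Hypothetical network: edges are (transmitting neuron, receiving neuron) pairs.
edges = {(0, 1), (0, 2), (1, 3)}
deleted = {(0, 2)}
candidates = {(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)}

result = prune_and_regrow(edges, deleted, candidates, random.Random(0))
print(sorted(result))
```

The edge count before and after the operation is unchanged, which is the property recited in claim 17.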

CLAIM 18
Step 1: The claim recites a product, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:
determining a plurality of different proposed quantization schemes, each proposed quantization scheme including changes to the respective weight of one or more edges to be quantized under such scheme; (e.g., determining a list of quantization schemes)
estimating a change in loss for each of the plurality of different proposed quantization schemes, (e.g., evaluating a change)
determining an estimated change in utility of each edge to be quantized; (e.g., evaluating an estimated change)
selecting one of the proposed quantization schemes based at least in part on the estimated changes in loss; and (e.g., selecting)
applying the selected quantization scheme
changing the respective weight of the one or more edges to be quantized under such scheme.
	These limitations are mathematical computations, and they are mental processes which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
One or more non-transitory computer-readable media
instructions 
one or more processors, 
obtaining data descriptive of a machine-learned neural network
the machine-learned neural network comprises a plurality of neurons respectively connected by a plurality of edges, and wherein a plurality of weights are respectively associated with the plurality of edges; 
Non-transitory computer-readable media, instructions, processors, and the machine-learned neural network are generally linking the abstract idea to the particular technological environment of machine learning, as discussed in MPEP 2106.05(h). Processors are mere instructions to implement the abstract ideas on a computer, as discussed in MPEP 2106.05(f). Obtaining data descriptive of a machine-learned neural network is mere data-gathering which is an insignificant extra-solution activity, as discussed in MPEP 2106.05(g). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
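Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). Non-transitory computer-readable media, instructions, processors, and the machine-learned neural network are generally linking the abstract idea to the particular technological environment of machine learning, as discussed in MPEP 2106.05(h). Obtaining data descriptive of a machine-learned neural network is well-understood, routine, conventional activity of receiving data over a network, as discussed in MPEP § 2106.05(d), subsection II, example (i). The claim is not patent eligible.
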
CLAIM 19 incorporates the rejection of claim 18.
Step 1: The claim recites a product, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 18 are incorporated. The claim recites the following limitations:
determining the estimated change in utility of each edge to be quantized comprises determining the estimated change in utility of each edge to be quantized based at least in part on a first-order approximation of a loss function at the weight associated with the edge. (e.g., evaluating an estimated change in utility of each edge)
The limitation is a mathematical computation, and it is a mental process which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The claim recites no additional elements to integrate the judicial exceptions into a practical application. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). The claim is not patent eligible.
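
For illustration, the operations recited in claims 18 and 19 (estimating a change in loss for each proposed quantization scheme from a first-order approximation, selecting a scheme, and applying it) can be sketched as below. The function names, the scheme names "floor" and "ceil", and all numeric values are hypothetical. The sketch also assumes that the scheme with the smallest estimated change in loss is the one selected, which the claims do not require.

```python
def estimated_loss_change(scheme, grads):
    """First-order estimate: sum over quantized edges of
    (proposed weight change) * (first derivative of the loss w.r.t. the weight)."""
    return sum(delta * grads[edge] for edge, delta in scheme.items())


def select_scheme(schemes, grads):
    """Pick the scheme name with the smallest estimated change in loss."""
    return min(schemes, key=lambda name: estimated_loss_change(schemes[name], grads))


def apply_scheme(weights, scheme):
    """Change the weight of each edge quantized under the scheme."""
    return {edge: w + scheme.get(edge, 0.0) for edge, w in weights.items()}


# Hypothetical weights, first derivatives, and two proposed schemes.
weights = {"a": 0.75, "b": -0.5}
grads = {"a": 0.5, "b": -1.0}
schemes = {
    "floor": {"a": -0.25, "b": -0.5},  # a -> 0.5, b -> -1.0
    "ceil": {"a": 0.25, "b": 0.5},     # a -> 1.0, b -> 0.0
}

chosen = select_scheme(schemes, grads)
quantized = apply_scheme(weights, schemes[chosen])
print(chosen, quantized)
```

Here the estimated changes in loss are 0.375 for "floor" and -0.375 for "ceil", so the sketch selects "ceil".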

CLAIM 20 incorporates the rejection of claim 18.
Step 1: The claim recites a product, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 18 are incorporated. The claim recites the following limitations:
determining the estimated change in utility of each edge to be quantized comprises determining, for each edge to be quantized, a first derivative of a loss function with respect to a logit of a receiving neuron at the weight associated with the edge without determining any higher-order derivatives of the loss function. 
This limitation is a mathematical computation of calculating a first derivative. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The claim recites no additional elements to integrate the judicial exceptions into a practical application. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception(s). The claim is not patent eligible.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 18-19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Baum et al. (US 20180285736 A1).

	Regarding CLAIM 18, Baum teaches: One or more non-transitory computer-readable media that collectively store instructions that, when executed by one or more processors, cause the one or more processors to perform operations, (¶ [0051], first sentence)
the operations comprising: obtaining data descriptive of a machine-learned neural network, wherein the machine-learned neural network comprises a plurality of neurons respectively connected by a plurality of edges, and wherein a plurality of weights are respectively associated with the plurality of edges; (Obtaining data includes analyzing weights in at least one layer as taught by Baum claim 4 on p. 10. An ANN is taught by ¶ [0007] and ¶ [0074], figure 2.)
determining a plurality of different proposed quantization schemes, each proposed quantization scheme including changes to the respective weight of one or more edges to be quantized under such scheme; (Two quantization schemes are equations (2) and (4) in ¶ [0091]. Two other quantization schemes are taught in ¶ [0104], where scale-and-shift is taught in ¶ [0105] and dropping bits in ¶ [0112].)
estimating a change in loss for each of the plurality of different proposed quantization schemes, wherein estimating the change in loss for each proposed quantization scheme comprises determining an estimated change in utility of each edge to be quantized; (abstract, Estimating a change in loss is taught by ¶ [0091], equations (2) and (4), and by Claim 4, lines 4-6 on p. 10, “selecting…”)
selecting one of the proposed quantization schemes based at least in part on the estimated changes in loss; and (Baum p. 10, Claim 4, lines 4-6: “selecting…”)
applying the selected quantization scheme to the machine-learned neural network, wherein applying the selected quantization scheme comprises changing the respective weight of the one or more edges to be quantized under such scheme. (Baum p. 10, Claim 1, last limitation.)

	Regarding CLAIM 19, Baum teaches: The one or more non-transitory computer-readable media of claim 18, wherein determining the estimated change in utility of each edge to be quantized comprises determining the estimated change in utility of each edge to be quantized based at least in part on a first-order approximation of a loss function at the weight associated with the edge. (The BRI of an estimated change in utility includes computing a loss function based on gradient descent, as taught in the middle of ¶ [0011] (“Minimizing this cost… networks”).)
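
For illustration only, the selection recited in claims 18 and 19 can be sketched as below. This is a minimal sketch under stated assumptions and does not reproduce equations (2) and (4) of Baum or any other cited disclosure: the helper names `round_to_step`, `estimated_loss_change`, and `select_scheme` are hypothetical, the two uniform rounding schemes are stand-ins chosen for brevity, and the per-edge loss gradients are assumed to be already available.

```python
from typing import Callable, Dict, List


def round_to_step(step: float) -> Callable[[float], float]:
    """Build a hypothetical uniform quantizer that rounds a weight to a multiple of step."""
    return lambda w: round(w / step) * step


def estimated_loss_change(weights: List[float], grads: List[float],
                          quantize: Callable[[float], float]) -> float:
    """First-order estimate: sum over edges of dLoss/dw times (quantized w minus w)."""
    return sum(g * (quantize(w) - w) for w, g in zip(weights, grads))


def select_scheme(weights: List[float], grads: List[float],
                  schemes: Dict[str, Callable[[float], float]]) -> str:
    """Return the name of the scheme with the smallest estimated change in loss."""
    return min(schemes,
               key=lambda name: estimated_loss_change(weights, grads, schemes[name]))


# Hypothetical data: two edges, their loss gradients, and two proposed schemes.
weights = [0.30, -0.70]
grads = [1.0, 1.0]
schemes = {"coarse": round_to_step(0.5), "fine": round_to_step(0.25)}
chosen = select_scheme(weights, grads, schemes)
# Applying the selected scheme changes the weight of each quantized edge.
quantized = [schemes[chosen](w) for w in weights]
```

In this sketch the per-edge term `g * (quantize(w) - w)` plays the role of the estimated change in utility of an edge, and the per-scheme sum plays the role of the estimated change in loss.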

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
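This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.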

Claims 1-2, 5-8, 10-11, and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Bellec et al. (“Deep Rewiring: Training very sparse deep networks”) in view of Baum et al. (US 20180285736 A1).

	Regarding CLAIM 1, Bellec teaches: A computer-implemented method, comprising: 
obtaining data descriptive of a machine-learned neural network, (P. 2, ¶ 2, lines 5-7; P. 3, “The Deep R Algorithm”, end of first paragraph teaches data of network parameters θ and network weights w. P. 4, middle paragraph, line 6 teaches that a number of connections K are active during training.)
wherein a plurality of weights are respectively associated with the plurality of edges; and (P. 2, ¶ 2, lines 5-7; P. 3, “The Deep R Algorithm”, second paragraph, lines 1-3)
for at least one edge of the plurality of edges: 
determining an estimated utility of the edge; and (The BRI of this limitation includes determining the sign of a connection parameter because the sign controls whether the connection gets pruned. Taught by p. 2, second paragraph, lines 5-9; and by p. 3, “The Deep R Algorithm”, second paragraph, lines 1-3.)
determining whether to adjust the weight associated with the edge based at least in part on the estimated utility of such edge. (p. 3, “The Deep R Algorithm”, second paragraph, second sentence.)
	However, Bellec does not explicitly teach: obtaining, determining, and determining by one or more computing devices 
wherein the machine-learned neural network comprises a plurality of neurons respectively connected by a plurality of edges, and
	But Baum teaches: obtaining, determining, and determining by one or more computing devices (¶ [0059] teaches computing device 11)
wherein the machine-learned neural network comprises a plurality of neurons respectively connected by a plurality of edges, and (¶ [0074] teaches an ANN.)
	Baum is in the same field of endeavor as the claimed invention, namely machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used Baum’s computing device to perform Bellec’s experiments, and to have incorporated Baum’s neural network structure into Bellec’s system. A motivation for the combination is to improve the performance of neural networks. (Baum ¶ [0003])

Regarding CLAIM 2, the combination of Bellec and Baum teaches: The computer-implemented method of claim 1,
Bellec teaches: wherein determining the estimated utility of the edge comprises determining the estimated utility of the edge based at least in part on a first-order approximation of a loss function at the weight associated with the edge. (The BRI of this limitation includes determining an updated connection parameter $\theta_k$ in Algorithm 1, line 3. The error function in line 3 is further taught on p. 2, § 2, end of ¶ 1.)
	However, Bellec does not explicitly teach: determining, by the one or more computing devices
	But Baum teaches: determining, by the one or more computing devices (¶ [0059] teaches computing device 11)

Regarding CLAIM 5, the combination of Bellec and Baum teaches: The computer-implemented method of claim 1,
	Bellec teaches: wherein determining whether to adjust the weight associated with the edge based at least in part on the estimated utility of such edge comprises determining whether to prune the edge based at least in part on the estimated utility of the edge. (p. 3, “The Deep R Algorithm”, second paragraph, second sentence.)
	However, Bellec does not explicitly teach: determining, by the one or more computing devices
	But Baum teaches: determining, by the one or more computing devices (¶ [0059] teaches computing device 11)

Regarding CLAIM 6, the combination of Bellec and Baum teaches: The computer-implemented method of claim 5,
 Bellec teaches: further comprising: pruning the edge; and (P. 3, Algorithm 1, line 4; and p. 4, first full paragraph, second sentence.)
after pruning the edge, supplementing the machine-learned neural network with at least one additional edge at a different location within the machine-learned neural network. (P. 3, Algorithm 1, line 7; and p. 4, first full paragraph, third sentence)
	However, Bellec does not explicitly teach: pruning and supplementing, by the one or more computing devices
	But Baum teaches: pruning and supplementing, by the one or more computing devices (¶ [0059] teaches computing device 11)

Regarding CLAIM 7, the combination of Bellec and Baum teaches: The computer-implemented method of claim 5,
Bellec teaches: further comprising: pruning the edge; and (P. 3, Algorithm 1, line 4; and p. 4, first full paragraph, second sentence.)
after pruning the edge, … a data item that prevents any other edges that connect to a same neuron as the edge from being modified in one or more pruning iterations. (The broadest reasonable interpretation of this claim is that the last iteration of Algorithm 1 (P. 3) prevents any other edges from being modified.)
However, Bellec does not explicitly teach: pruning, by the one or more computing devices and storing, by the one or more computing devices
	But Baum teaches: pruning, by the one or more computing devices (¶ [0059] teaches computing device 11)
	storing, by the one or more computing devices (¶ [0059] teaches computing device 11 comprises main memory 24)

	Regarding CLAIM 8, the combination of Bellec and Baum teaches: The computer-implemented method of claim 1, 
However, Bellec does not explicitly teach: wherein determining, by the one or more computing devices, whether to adjust the weight associated with the edge based at least in part on the estimated utility of such edge comprises selecting, by the one or more computing devices, one of two or more proposed quantization schemes based at least in part on the estimated utility of the edge.
But Baum teaches: wherein determining, by the one or more computing devices, whether to adjust the weight associated with the edge based at least in part on the estimated utility of such edge comprises selecting, by the one or more computing devices, one of two or more proposed quantization schemes based at least in part on the estimated utility of the edge. (Two quantization schemes are equations (2) and (4) in ¶ [0091]. Selection is taught by p. 10, Claim 4, lines 4-6: “selecting…”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have selected one of Baum’s quantization schemes for quantizing Bellec’s weights, with a motivation to extract further performance improvements based on the specific data presented to the network. (Baum ¶ [0076], first and last sentences.)

	CLAIM 10 recites: A computer system, comprising: one or more processors; and one or more non-transitory computer-readable media that collectively store instructions that, when executed by the one or more processors, cause the computer system to perform operations comprising the method of claim 1. Baum teaches a computer system by the computing device 11 in ¶ [0059], processors in ¶ [0060], line 1, and main memory in ¶ [0059], last line. Baum teaches instructions in ¶ [0051], first sentence. Claim 10 is rejected under 35 U.S.C. 103 for the reasons set forth in the rejection of claim 1.


Regarding CLAIM 11, the combination of Bellec and Baum teaches: The computing system of claim 10, 
Bellec teaches: wherein determining the respective estimated utility of each of the plurality of edges comprises determining the respective estimated utility of each of the plurality of edges based at least in part on a first-order approximation of a loss function at a weight associated with the edge. (The BRI of this limitation includes determining an updated connection parameter $\theta_k$ in Algorithm 1, line 3. The error function in line 3 is further taught on p. 2, § 2, end of ¶ 1.)

Regarding CLAIM 16, the combination of Bellec and Baum teaches: The computing system of claim 10, 
Bellec teaches: wherein the operations further comprise adding one or more new edges to the machine-learned neural network. (P. 3, Algorithm 1, lines 6-9 and P. 4, middle paragraph, lines 3-7)

Regarding CLAIM 17, the combination of Bellec and Baum teaches: The computing system of claim 16, 
Bellec teaches: wherein adding one or more new edges to the machine-learned neural network comprises adding a same number of new edges to the machine-learned neural network as was deleted from the machine-learned neural network. (P. 3, Algorithm 1, lines 6-9 and P. 4, middle paragraph, lines 3-7)
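
For illustration only, deleting edges and then adding the same number of new edges at different locations, as recited in claims 6, 16, and 17, can be sketched as below. This is a minimal sketch under stated assumptions and is not a reproduction of Bellec's Algorithm 1: the function name `rewire`, the representation of an edge as a hashable identifier, and the uniform random choice of new locations are all hypothetical.

```python
import random
from typing import Hashable, Set


def rewire(active: Set[Hashable], candidates: Set[Hashable],
           to_delete: Set[Hashable], rng: random.Random) -> Set[Hashable]:
    """Delete edges, then add the same number of new edges at other locations."""
    deleted = active & to_delete
    remaining = active - deleted
    # New edges are drawn only from locations that were not active before this
    # step, so a deleted edge is never put straight back.
    free = sorted(candidates - active, key=repr)
    # rng.sample raises ValueError if fewer free locations exist than deleted edges.
    added = set(rng.sample(free, len(deleted)))
    return remaining | added


# Hypothetical network: edges are (transmitting neuron, receiving neuron) pairs.
active_edges = {(0, 1), (0, 2), (1, 2)}
all_locations = active_edges | {(2, 3), (3, 4)}
result = rewire(active_edges, all_locations, {(0, 1)}, random.Random(0))
```

Because one new edge is added for every deleted edge, the number of active edges is the same before and after the call.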

Claims 3 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Bellec et al. (“Deep Rewiring: Training very sparse deep networks”) in view of Baum et al. (US 20180285736 A1), and further in view of Sadowski (“Notes on Backpropagation”).

	Regarding CLAIM 3, the combination of Bellec and Baum teaches: The computer-implemented method of claim 1, 
Bellec teaches: wherein determining the estimated utility of the edge comprises determining a backpropagation algorithm (The BRI of this limitation includes determining the sign of a connection parameter (p. 2, ¶ 2, lines 5-9) and performing gradient descent using backpropagation (p. 2, Algo. 1, line 3 and caption).)
However, Bellec does not explicitly teach: determining, by the one or more computing devices, a first derivative of a loss function with respect to a logit of a receiving neuron at the weight associated with the edge, but not determining, by the one or more computing devices, any higher-order derivatives of the loss function.
But Baum teaches: determining, by the one or more computing devices (¶ [0059] teaches computing device 11)
However, neither Bellec nor Baum explicitly teaches: determining a first derivative of a loss function with respect to a logit of a receiving neuron at the weight associated with the edge, but not determining any higher-order derivatives of the loss function.
	But Sadowski teaches: determining a first derivative of a loss function with respect to a logit of a receiving neuron at the weight associated with the edge, but not determining any higher-order derivatives of the loss function. (P. 2, equation (11) teaches a first derivative of an error with respect to a logit $s_j$.)
	Sadowski is in the same field of endeavor as the claimed invention, namely, machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have performed backpropagation on Bellec’s neural network, with a motivation to reduce the error of the network. (Sadowski, Abstract). 

	CLAIM 12 recites the computing system of claim 10. Claim 12 is rejected under 35 U.S.C. 103 for the reasons set forth in the rejection of claim 3.

Claims 4 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Bellec et al. (“Deep Rewiring: Training very sparse deep networks”) in view of Baum et al. (US 20180285736 A1), and further in view of Ng (“Sparse autoencoder”).

Regarding CLAIM 4, the combination of Bellec and Baum teaches: The computer-implemented method of claim 1, 
Bellec teaches: determining the estimated utility of the edge comprises determining a backpropagation algorithm (The BRI of this limitation includes determining the sign of a connection parameter (p. 2, ¶ 2, lines 5-9) and performing gradient descent using backpropagation (p. 2, Algo. 1, line 3 and caption).)
However, Bellec does not explicitly teach: determining, by the one or more computing devices, a sum over one or more training examples included in a training dataset of a proposed change in the weight multiplied by an output of a transmitting neuron multiplied by a first derivative of a loss function with respect to a logit of a receiving neuron at the weight and training example.
	But Baum teaches: determining, by the one or more computing devices (¶ [0059] teaches computing device 11)
	However, neither Bellec nor Baum explicitly teaches: determining a sum over one or more training examples included in a training dataset of a proposed change in the weight multiplied by an output of a transmitting neuron multiplied by a first derivative of a loss function with respect to a logit of a receiving neuron at the weight and training example.
But Ng teaches: determining a sum over one or more training examples included in a training dataset of a proposed change in the weight multiplied by an output of a transmitting neuron multiplied by a first derivative of a loss function with respect to a logit of a receiving neuron at the weight and training example. (A sum over training examples is taught by equation $\frac{\partial}{\partial W_{ij}^{(l)}} J(W,b)$ on p. 7 at the bottom, where each pair (x,y) forms a training example. A proposed change in the weight is the learning rate $\alpha$ in the update formula for $W_{ij}^{(l)}$ on p. 7. An output of a transmitting neuron is $a_j^{(l)}$, and a first derivative of a loss function with respect to a logit of a receiving neuron is $\delta_i^{(n_l)}$ in step 2 on p. 8, where z is a logit, as discussed on p. 4 below equation (5).)
	Ng is in the same field of endeavor as the claimed invention, namely, machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have performed backpropagation on Bellec’s neural network, with a motivation to train the network. (Ng, top of p. 6)
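
For illustration only, the quantity recited in claim 4 can be sketched as below. This is a minimal sketch under stated assumptions, not an implementation taken from Bellec, Baum, or Ng: the function name `estimated_utility` and its arguments are hypothetical, and the per-example transmitting-neuron outputs and the first derivatives of the loss with respect to the receiving neuron's logit are assumed to have been computed already, for example by backpropagation.

```python
from typing import Sequence


def estimated_utility(delta_w: float,
                      transmit_outputs: Sequence[float],
                      logit_grads: Sequence[float]) -> float:
    """First-order estimate of the change in loss from changing one weight by delta_w.

    transmit_outputs[n] is the output of the transmitting neuron on example n.
    logit_grads[n] is dLoss/dlogit of the receiving neuron on example n.
    All names here are hypothetical and chosen only for illustration.
    """
    if len(transmit_outputs) != len(logit_grads):
        raise ValueError("one output and one gradient are needed per training example")
    # Sum over training examples of (proposed change) * (output) * (first derivative).
    return sum(delta_w * a * g for a, g in zip(transmit_outputs, logit_grads))


# Hypothetical edge with weight 0.5; pruning it is a proposed change of -0.5.
change = estimated_utility(-0.5, [1.0, 2.0], [0.1, -0.3])  # approximately 0.25
```

No second or higher-order derivative appears anywhere in the computation, which is also the distinction drawn in claim 3.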

	CLAIM 13 recites the computing system of claim 10. Claim 13 is rejected under 35 U.S.C. 103 for the reasons set forth in the rejection of claim 4.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Bellec et al. (“Deep Rewiring: Training very sparse deep networks”) in view of Baum et al. (US 20180285736 A1), and further in view of Kauschke et al. (“Batchwise Patching of Classifiers”).

	Regarding CLAIM 9, the combination of Bellec and Baum teaches: The computer-implemented method of claim 1, 
However, neither Bellec nor Baum explicitly teaches: further comprising: adding, by the one or more computing devices, a patch subnetwork to the machine-learned neural network, wherein the patch subnetwork is trained to predict an error associated with its input.
	But Kauschke teaches: further comprising: adding, by the one or more computing devices, a patch subnetwork to the machine learned neural network, wherein the patch subnetwork is trained to predict an error associated with its input. (P. 3375, § 2.2, steps (i) and (ii).)
	Kauschke is in the same field of endeavor as the claimed invention, namely, machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date (Kauschke, P. 3374, col. 2, end of second paragraph)

Claims 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Bellec et al. (“Deep Rewiring: Training very sparse deep networks”) in view of Baum et al. (US 20180285736 A1), and further in view of Wang et al. (US Patent 11,200,495 B2).

Regarding CLAIM 14, the combination of Bellec and Baum teaches: The computing system of claim 10, 
	However, neither Bellec nor Baum explicitly teaches: wherein selecting one or more edges for deletion based at least in part on the respective estimated utility of each of the plurality of edges comprises selecting a predetermined number of the plurality of edges that have the lowest estimated utilities.
But Wang teaches: wherein selecting one or more edges for deletion based at least in part on the respective estimated utility of each of the plurality of edges comprises selecting a predetermined number of the plurality of edges that have the lowest estimated utilities. (The BRI of a number includes a percentage. Wang teaches this limitation in C. 3, L. 63 to C. 4, L. 5)
	Wang is in the same field of endeavor as the claimed invention, namely network pruning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to have pruned a predetermined percentage of Bellec’s edges according to Wang’s method 100. A motivation for the combination is to clamp the weights of connections that are close to zero to zero. (C. 3, L. 64-65)

Regarding CLAIM 15, the combination of Bellec and Baum teaches: The computing system of claim 10, 
	However, neither Bellec nor Baum explicitly teaches: wherein selecting one or more edges for deletion based at least in part on the respective estimated utility of each of the plurality of edges comprises selecting a predetermined percentage of the plurality of edges that have the lowest estimated utilities.
But Wang teaches: wherein selecting one or more edges for deletion based at least in part on the respective estimated utility of each of the plurality of edges comprises selecting a predetermined percentage of the plurality of edges that have the lowest estimated utilities. (C. 3, L. 63 to C. 4, L. 5)
	Wang is in the same field of endeavor as the claimed invention, namely network pruning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to have pruned a predetermined percentage of Bellec’s edges according to Wang’s method 100. A motivation for the combination is to clamp the weights of connections that are close to zero to zero. (C. 3, L. 64-65)
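
For illustration only, the selections recited in claims 14 and 15 can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from Wang, Bellec, or Baum: the function name `select_edges_for_deletion`, the dictionary of per-edge utilities, and the choice to round a percentage down to a whole number of edges are all hypothetical.

```python
import math
from typing import Dict, Hashable, List, Optional


def select_edges_for_deletion(utilities: Dict[Hashable, float],
                              count: Optional[int] = None,
                              fraction: Optional[float] = None) -> List[Hashable]:
    """Return the edges with the lowest estimated utilities.

    Exactly one of count (a predetermined number) or fraction (a predetermined
    percentage expressed as a value from 0 to 1) must be given.
    """
    if (count is None) == (fraction is None):
        raise ValueError("give exactly one of count or fraction")
    if fraction is not None:
        # Assumption: a percentage is converted to a whole number by rounding down.
        count = math.floor(fraction * len(utilities))
    ranked = sorted(utilities, key=utilities.get)  # ascending by estimated utility
    return ranked[:count]


# Hypothetical utilities for four edges.
utilities = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.3}
by_number = select_edges_for_deletion(utilities, count=2)
by_percentage = select_edges_for_deletion(utilities, fraction=0.5)
```

With four edges, a predetermined number of 2 and a predetermined percentage of 50% select the same two edges, which is consistent with the interpretation above that a number includes a percentage.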

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Baum et al. (US 20180285736 A1) in view of Sadowski (“Notes on Backpropagation”).

	Regarding CLAIM 20, Baum teaches: The one or more non-transitory computer-readable media of claim 18, wherein determining the estimated change in utility of each edge to be quantized comprises determining, for each edge to be quantized, 
However, Baum does not explicitly teach: a first derivative of a loss function with respect to a logit of a receiving neuron at the weight associated with the edge without determining any higher-order derivatives of the loss function.
But Sadowski teaches: determining a first derivative of a loss function with respect to a logit of a receiving neuron at the weight associated with the edge without determining any higher-order derivatives of the loss function. (P. 2, equation (11) teaches a first derivative of an error with respect to a logit $s_j$.)
Sadowski is in the same field of endeavor as the claimed invention, namely, machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have performed backpropagation on Baum’s neural network, with a motivation to reduce the error of the network. (Sadowski, Abstract). 

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20210182683 A1 to Dai et al. teaches a method for first growing neural network connections and then pruning some of those connections (see Dai Fig. 2).
US 20170046614 A1 to Golovashkin et al. teaches a method of sparsifying edges and then restoring some of the same connections.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Asher H. Jablon whose telephone number is (571)270-7648. The examiner can normally be reached Monday - Friday, 9:00 am - 6:00 pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Al Kawsar, can be reached at (571)270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ASHER H. JABLON/Examiner, Art Unit 2127                                                                                                                                                                                                        

/ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2127