DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statements (IDSs) submitted on 4/9/2020 and 4/30/2021 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements have been considered by the examiner.

Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they do not include the following reference signs mentioned in the description: 

The drawings are additionally objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference characters not mentioned in the description: 
Reference characters 210 and 216 shown in Figure 2A are not found in the detailed description.
Reference character 224 shown in Figure 2B is not found in the detailed description.
Reference characters 880 (880(0), 880(1) and 880(D-1)) shown in Figure 8 are not found in the detailed description.
Appropriate correction is required.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the 

Specification
The disclosure is objected to because of the following informalities:
The description of FIG. 1 in paragraph 17 in applicant’s specification mentions reference characters 115 and 134 (see, e.g., references to “pruning engine 115” and “memory 134”), which are not shown in Figure 1. It appears that references to “pruning engine 115” and “memory 134” should read “pruning engine 114” and “memory 120”, as shown in FIG. 1 and indicated elsewhere in the specification (see, e.g., paragraphs 16, 22-23, 29, 37, 50-53 and 58-60). Also, the description of FIG. 7 in paragraphs 61-62, 66, 69, 71-72, 78-79, 92 and 95 mentions “computer system 700” and “system memory 704”, which are not shown in Figure 7. Additionally, the description of FIG. 8 in paragraphs 77-78 mentions “DRAM 820” and “DRAMs 820”, which are not shown in Figure 8. Appropriate correction is required.
The specification is also objected to because reference characters 210 and 216 shown in Figure 2A are not described in applicant’s specification (see, e.g., paragraphs 27-30 describing FIG. 2A). Further, reference character 224 shown in Figure 2B is not described in applicant’s specification (see, e.g., paragraph 31 describing FIG. 2B). Lastly, reference characters 880 shown in Figure 8 are not described in applicant’s specification (see, e.g., paragraphs 70-80 describing FIG. 8). Appropriate correction is required.

Claim Objections
Claims 9-16 are objected to because of the following informalities: 
Line 6 of independent claim 9 recites “a neural network”. However, applicant previously introduced “a neural network” in line 3 of this claim. As such, the subsequent recitation of “a neural network” apparently refers to the previously-introduced “neural network”. Thus, for the sake of consistency and clarity, the subsequent recitation should read “[[a]] the neural network”. For examination purposes, the subsequent recitation of “a neural network” is being interpreted as the previously-introduced “neural network”. Appropriate correction is required.
Also, claims 10-16, which each depend directly or indirectly from claim 9, are objected to based on their respective dependencies from claim 9. 

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 9-20 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
Lines 5-6 of independent claim 9 recite “each of a plurality of corresponding neurons from the plurality of network layers within a neural network”. However, applicant previously introduced “a plurality of corresponding neurons in a plurality of network layers within a neural network” in lines 3-4 of this claim. It is unclear whether the subsequently-recited “each of a plurality of corresponding neurons from the plurality of network layers within a neural network” should read “each of [[a]] the plurality of corresponding neurons from the plurality of network layers within [[a]] the neural network” or “each neuron of [[a]] the plurality of corresponding neurons from the plurality of network layers within [[a]] the neural network”. For examination purposes, “each of a plurality of corresponding neurons from the plurality of network layers within a neural network” is being interpreted as each neuron of the previously-introduced plurality of corresponding neurons from the previously-introduced plurality of network layers within the previously-introduced neural network. 
Independent claim 17 recites “wherein the plurality of computational logic units are to be programmed according to a neural network architecture” in lines 3-5. It is unclear when the recited “computational logic units” will be programmed – i.e., at some (undetermined and unknown) future time. That is, the claim does not clearly recite when the programming will occur, or specify what event(s) will trigger or cause the programming. Aside from merely repeating the claim language in paragraph 115, applicant’s specification does not provide any clarity on what is meant by “computational logic units are to be programmed”. For examination purposes, the “plurality of computational logic units are to be programmed” is being interpreted as any computational logic units that are programmed or are programmable. Appropriate correction is required.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. 
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and 
Claims 1-4, 6, 8-9, 11-13, 15 and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Molchanov et al. (U.S. Patent Application Pub. No. 2018/0114114 A1, hereinafter “Molchanov”) in view of non-patent literature Li, Hao, et al. ("Pruning filters for efficient convnets." arXiv preprint arXiv:1608.08710v3 (2017): 1-13, hereinafter “Li”).
With respect to claim 1, Molchanov discloses the invention as claimed including a computer-implemented method (see, e.g., paragraph 17, “a method for neural network pruning … the method 100 may also be performed by a program, custom circuitry, or by a combination of custom circuitry and a program. For example, the method 100 may be executed by a GPU (graphics processing unit), CPU (central processing unit), neural network, or any processor capable of implementing a neural network” [i.e., a computer-implemented method]) comprising:
identifying a plurality of corresponding neurons in a plurality of network layers within a neural network (see, e.g., paragraphs 20 and 29, “at least one neuron having a lowest importance is identified. … the at least one neuron corresponds to a feature map in a convolutional layer … the at least one neuron comprises a predetermined percentage of all of the neurons in the trained neural network.”, “Each layer also typically has less valuable neuron. Therefore, pruning should scale across all layers” [i.e., identify neurons in layers within a neural network]), wherein each neuron in the plurality of corresponding neurons is located at a … location within a different network layer included in the plurality of network layers (see, e.g., paragraphs 29, 45 and 49, “Each layer also typically has less valuable neuron. Therefore, pruning should scale across all layers”, “neurons corresponding to a single feature map is pruned during each iteration”, “trained neural networks may be iteratively pruned using either a first criterion or a second criterion that are each computed based on a first-order gradient of the cost function w.r.t. the layer parameter hi.” [i.e., each neuron of the corresponding neurons/corresponding to a single feature map, is located in a different layer in the network layers]); and
deactivating each of the plurality of corresponding neurons from the plurality of network layers based, at least in part, on a metric associated with the plurality of corresponding neurons (see, e.g., paragraphs 21, 27 and 42, “at least one neuron is removed from the trained neural network to produce a pruned neural network. … In one embodiment, greedy criteria-based pruning is interleaved with fine-tuning to iteratively remove neurons from the trained neural network”, “a criteria-based pruning technique is preferred, starting with a full set of the parameters W and pruning as a backward filter by iteratively identifying and removing at least one least important layer parameter”, “The pruning criteria are represented as importance values that are provided to a neuron removal unit 265. The neuron removal unit 265 indicates to the trained neural network 225 that one or more neurons should be removed from the trained neural network 225.” [i.e., deactivating/pruning/removing each of the neurons from the layers of the neural network based in part on a criteria/parameter/importance value/metric associated with the neurons]).
Although Molchanov substantially discloses the claimed invention, Molchanov is not relied on for explicitly disclosing wherein each neuron in the plurality of corresponding neurons is located at a matching location within a different network layer included in the plurality of network layers.
In the same field, analogous art Li teaches wherein each neuron in the plurality of corresponding neurons is located at a matching location within a different network layer included in the plurality of network layers (aside from repeating the claim language in paragraphs 99 and 115, and providing examples in paragraphs 32 and 34, applicant’s specification does not define “a matching location within a different network layer”. Paragraphs 32 and 34 of the instant specification state “corresponding neurons are located at the same location within the respective input layers. For example, assume layer 1 and layer 2 are input layers into an element-wise addition operation. In such an example, a first elementwise addition operation will be performed on neuron A at a first width, height, and depth within layer 1 and neuron B at the same width, height, and depth in layer 2” and “corresponding neurons are those neurons that are located at the same location (e.g., coordinates or index) within the respective input layers.” Therefore, “a matching location within a different network layer”, under its broadest reasonable interpretation (BRI) in light of the specification, is any corresponding location, coordinate or index, in another network layer) (see, e.g., FIGs. 3 and 4 – depicting “Pruning filters across consecutive layers” and “Pruning residual blocks with … filters to be pruned for the second layer … are determined by the pruning result” where neurons/feature maps in a same index/location in different layers are pruned and pages 3-5, sections 3.1 and 3.3, “The kernels in the next convolutional layer corresponding to the pruned feature maps are also removed.”, “pruning … the second layer of each residual block results in additional pruning of other layers. To prune filters across multiple layers to prune the second convolutional layer of the residual block, the corresponding projected feature maps must also be pruned.”, “Since the identical feature maps are more important …, the feature maps to be pruned should be determined by the pruning results of the shortcut layer. To determine which identity feature maps 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “method, computer readable medium, and system … for neural network pruning” of Molchanov (See, Molchanov, Abstract) to incorporate the teachings of Li to provide “an acceleration method for CNNs, where we prune filters from CNNs that are identified as having a small effect on the output accuracy.” (See, e.g., Li, Abstract, page 1). Doing so would have allowed Molchanov to use Li’s method that includes “removing whole filters in the network together with their connecting feature maps” so that “computation costs are reduced significantly” and “to reduce computation costs without incurring additional overheads”, as suggested by Li (See, e.g., Li, Abstract and pages 1 and 2). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.
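
For illustration only, the interpretation applied above (neurons at the same index in different layers being pruned together) can be sketched as follows. This is a minimal sketch under stated assumptions, not a reproduction of either reference: each filter is flattened to a list of weights, and the helper names `l1`, `select_indices` and `prune_at_indices` are hypothetical. Indices are selected from one layer by smallest sum of absolute weights, and the same indices are then removed from a second layer.

```python
from typing import List

def l1(filter_weights: List[float]) -> float:
    """Sum of absolute weights of one (flattened) filter."""
    return sum(abs(w) for w in filter_weights)

def select_indices(filters: List[List[float]], m: int) -> List[int]:
    """Indices of the m filters with the smallest sum of absolute weights."""
    order = sorted(range(len(filters)), key=lambda i: l1(filters[i]))
    return sorted(order[:m])

def prune_at_indices(filters: List[List[float]], indices: List[int]) -> List[List[float]]:
    """Drop the filters located at the given indices."""
    drop = set(indices)
    return [f for i, f in enumerate(filters) if i not in drop]

# Hypothetical weights: three filters in each of two layers.
shortcut_layer = [[0.9, -0.8], [0.01, 0.02], [0.5, 0.4]]
second_layer = [[0.3, 0.3], [0.7, -0.7], [0.2, 0.1]]

# The selection is made on one layer only ...
idx = select_indices(shortcut_layer, 1)
# ... and the same (matching) index is pruned in both layers.
shortcut_pruned = prune_at_indices(shortcut_layer, idx)
second_pruned = prune_at_indices(second_layer, idx)
```

In this sketch the filter at index 1 is removed from the second layer even though its own weights are the largest in that layer, because the index is fixed by the selection made on the other layer.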

With regard to independent claim 9, Molchanov discloses the invention as claimed including a computer-implemented method (see, e.g., paragraph 17, “a method for neural network pruning … the method 100 may also be performed by a program, custom circuitry, or by a combination of custom circuitry and a program. For example, the method 100 may be executed by a GPU (graphics processing unit), CPU (central processing unit), neural network, or any processor capable of implementing a neural network” [i.e., a computer-implemented method]) comprising:
identifying a plurality of corresponding neurons in a plurality of network layers within a neural network (see, e.g., paragraphs 20 and 29, “at least one neuron having a lowest importance is identified. … the at least one neuron corresponds to a feature map in a convolutional layer … the at least one neuron comprises a predetermined percentage of all of the neurons in the trained neural network.”, “Each layer also typically has less valuable neuron. Therefore, pruning should scale across all layers” [i.e., identify neurons in layers within a neural network]), wherein each of the plurality of corresponding neurons is associated with a … feature type (see, e.g., paragraphs 29, 45 and 49, “Each layer also typically has less valuable neuron. Therefore, pruning should scale across all layers”, “neurons corresponding to a single feature map is pruned during each iteration” [i.e., each of the neurons is associated with a feature type of a single feature map], “trained neural networks may be iteratively pruned using either a first criterion or a second criterion that are each computed based on a first-order gradient of the cost function w.r.t. the layer parameter hi.” [i.e., each neuron of the corresponding neurons/corresponding to a single feature map, is located in a different layer in the network layers]); and
deactivating each of a plurality of corresponding neurons from the plurality of network layers within a neural network based, at least in part, on a metric associated with the plurality of corresponding neurons (as indicated above, “each of a plurality of corresponding neurons from the plurality of network layers within a neural network” has been interpreted as each neuron of the previously-introduced plurality of corresponding neurons from the previously-introduced plurality of network layers within the previously-introduced neural network) (see, e.g., paragraphs 21, 27 and 42, “at least one neuron is removed from the trained neural network to produce a pruned neural network. … In one embodiment, greedy criteria-based 
Although Molchanov substantially discloses the claimed invention, Molchanov is not relied on for explicitly disclosing wherein each of the plurality of corresponding neurons is associated with a matching feature type.
In the same field, analogous art Li teaches wherein each of the plurality of corresponding neurons is associated with a matching feature type (see, e.g., FIGs. 3 and 4 – depicting “Pruning filters across consecutive layers” and “Pruning residual blocks with … filters to be pruned for the second layer … are determined by the pruning result” where nodes/neurons in a same kernel index/location in different layers are pruned and pages 3-5, sections 3.1 and 3.3, “Prune m filters with the smallest sum values and their corresponding feature maps. The kernels in the next convolutional layer corresponding to the pruned feature maps are also removed.”, “we observe that layers in the same stage (with the same feature map size) [i.e., feature maps with the same size – a matching feature type] have a similar sensitivity to pruning. … pruning the identity feature maps … of each residual block results in additional pruning of other layers. To prune filters across multiple layers to prune the second convolutional layer of the residual block, the 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “method, computer readable medium, and system … for neural network pruning” of Molchanov (See, Molchanov, Abstract) to incorporate the teachings of Li to provide “an acceleration method for CNNs, where we prune filters from CNNs that are identified as having a small effect on the output accuracy.” (See, e.g., Li, Abstract, page 1). Doing so would have allowed Molchanov to use Li’s method that includes “removing whole filters in the network together with their connecting feature maps” so that “computation costs are reduced significantly” and “to reduce computation costs without incurring additional overheads”, as suggested by Li (See, e.g., Li, Abstract and pages 1 and 2). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.

Regarding claims 2 and 11, as discussed above, Molchanov in view of Li teaches the methods of claims 1 and 9.
Molchanov further discloses computing the metric associated with the plurality of corresponding neurons based on one or more weights associated with each neuron in the plurality of corresponding neurons (see, e.g., paragraphs 18, 25 and 30, “In one embodiment, the layer input parameter is a weight … a layer input parameter for one layer in a neural network 
Alternatively, Li also teaches computing the metric associated with the plurality of corresponding neurons based on one or more weights associated with each neuron in the plurality of corresponding neurons (see, e.g., page 3, section 3.1, “We measure the relative importance of a filter in each layer by calculating the sum of its absolute weight” [i.e., computing the metric/measure relative importance based on one or more weights associated with each neuron]).
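
For illustration only, the weight-based measure quoted from Li above (the sum of a filter's absolute weights) can be sketched as follows. This is a minimal sketch under stated assumptions: a filter is represented as nested lists indexed [input channel][row][column], and the function names and sample weights are hypothetical rather than taken from Li.

```python
from typing import List

Kernel = List[List[float]]   # one 2-D kernel: [row][column]
Filter = List[Kernel]        # one filter: one kernel per input channel

def filter_abs_sums(filters: List[Filter]) -> List[float]:
    """Sum of absolute kernel weights for each filter in a layer."""
    return [
        sum(abs(w) for kernel in f for row in kernel for w in row)
        for f in filters
    ]

def smallest_filters(filters: List[Filter], m: int) -> List[int]:
    """Indices of the m filters with the smallest absolute-weight sums."""
    sums = filter_abs_sums(filters)
    order = sorted(range(len(sums)), key=lambda i: sums[i])
    return sorted(order[:m])

# Hypothetical layer with three single-channel 2x2 filters.
layer = [
    [[[0.5, -0.5], [0.25, 0.25]]],
    [[[0.1, 0.0], [0.0, -0.1]]],
    [[[1.0, -1.0], [1.0, 1.0]]],
]

sums = filter_abs_sums(layer)
to_prune = smallest_filters(layer, 1)
```

Here the second filter has the smallest sum of absolute weights and is therefore the one selected when a single filter is to be pruned.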

Regarding claims 3 and 12, as discussed above, Molchanov in view of Li teaches the methods of claims 2 and 11.
Molchanov further discloses wherein computing the metric comprises performing one or more equalization operations on the one or more weights associated with each neuron in the plurality of corresponding neurons to generate the metric (see, e.g., paragraphs 33-34, “A scale of the first criteria values varies with the depth, in terms of layers within the network. Therefore, a layer-wise ℓ2-normalization is computed to rescale the first criterion across the layers”, “Scaling a criterion across layers is very important for pruning. … Without normalization, a conventional weight magnitude criterion tends to rank feature maps from the first layers more important than last layers … After ℓ2 normalization, each layer has some feature maps that are highly important and others that are unimportant.” [i.e., performing equalization operations on the weights for normalizing and scaling of the criteria/metric]).
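
For illustration only, the layer-wise normalization quoted above (read here as an ℓ2 normalization) can be sketched as follows. This is a minimal sketch, assuming that each criterion value is divided by the ℓ2 norm of all criterion values in the same layer; the function name and sample values are hypothetical and are not taken from Molchanov.

```python
import math
from typing import Dict, List

def l2_normalize_layer(criteria: List[float]) -> List[float]:
    """Rescale one layer's criterion values by the layer's l2 norm."""
    norm = math.sqrt(sum(c * c for c in criteria))
    if norm == 0.0:
        # All-zero layer: nothing to rescale.
        return [0.0 for _ in criteria]
    return [c / norm for c in criteria]

# Hypothetical raw criterion values whose scale differs by layer.
raw: Dict[str, List[float]] = {
    "layer1": [3.0, 4.0],
    "layer2": [0.03, 0.04],
}

normalized = {name: l2_normalize_layer(vals) for name, vals in raw.items()}
```

After rescaling, both layers in this sketch carry values of comparable magnitude, so a ranking across layers is no longer dominated by the layer with the larger raw scale.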

Regarding claims 4 and 13, as discussed above, Molchanov in view of Li teaches the methods of claims 3 and 12.
Molchanov further discloses wherein performing the one or more equalization operations comprises applying an equalization operator to one or more weights assigned to a first neuron in the plurality of corresponding neurons and one or more weights assigned to a second neuron in the plurality of corresponding neurons (see, e.g., paragraphs 25, 30 and 33-34, “The neural network's parameters W={(w11,b11), (w12, b12), ... (wLCl, bLCl)} are optimized to minimize a cost value C(W). In one embodiment, a parameter (w,b)ϵW may represent an individual weight”, “Neurons (or feature maps) for a particular layer are represented as circles and connections between the neurons are each associated with a weight.” [i.e., one or more weights are assigned to first and second neurons in the plurality of neurons], “A scale of the first criteria values varies with the depth, in terms of layers within the network. Therefore, a layer-wise ℓ2-normalization is computed to rescale the first criterion across the layers: [equation image: media_image1.png]”, “Scaling a criterion across layers is very important for pruning. … Without normalization, a conventional weight magnitude criterion tends to rank feature maps from the first layers more important than last layers … After ℓ2 normalization, each 

Regarding claims 6 and 15, as discussed above, Molchanov in view of Li teaches the methods of claims 3 and 12.
Molchanov further discloses wherein performing the one or more equalization operations comprises:
determining that at least one neuron in the plurality of corresponding neurons is associated with an individual metric that is at or above a threshold (see, e.g., paragraph 20, “At step 130, at least one neuron having a lowest importance is identified. … the at least one neuron includes neurons having importances below a threshold value.” [i.e., determining that at least one neuron in the neurons besides the neuron having a lowest importance is associated with an individual importance metric that is greater than or equal to the threshold value]); and 
setting the metric associated with the plurality of corresponding neurons to the threshold (see, e.g., paragraphs 19 and 45 and claim 10, “a pruning criterion is computed for each layer parameter based on the first-order gradient corresponding to the layer parameter, where the pruning criterion indicates an importance of each neuron … and is associated with the layer” [i.e., setting/computing the pruning criterion/importance value associated with the neurons in the layer], “Pruning may be considered to be complete when a threshold number of neurons are removed. In one embodiment, neurons corresponding to a single feature map is pruned during each iteration, allowing fine-tuning and re-evaluation of the criterion”, “The neural network pruning system of claim 1, wherein the at least one neuron includes neurons having 

Regarding claim 8, as discussed above, Molchanov in view of Li teaches the method of claim 1.
Molchanov further discloses wherein each of the plurality of corresponding neurons produces a different input into a given computational component of the neural network (see, e.g., paragraphs 18, 24 and 41, “a layer input parameter for one layer in a neural network is output by a previous layer, so that a ‘layer parameter’ refers to either a layer input parameter or a layer output parameter.”, “The feature maps can either be the input to the neural network, z0, or the output from a convolutional layer z1”, “neural network 225 processes the input data 215 and generates prediction data 135 (i.e., output data).” [i.e., each neuron from a previous layer produces its output, which is a different input into a given computational component in a subsequent layer of the neural network]).

With regard to independent claim 17, Molchanov discloses the invention as claimed including a processor comprising:
a plurality of computational logic units to generate a plurality of results based on one or more inputs and one or more weight values (see, e.g., paragraphs 17-18, 40, 44 and 76, “for neural network pruning, … method 100 may be executed by a GPU (graphics processing unit), CPU (central processing unit), neural network, or any processor capable of implementing a neural network” [i.e., a processor], “the layer input parameter is a weight … a layer input parameter for one layer in a neural network is output by a previous layer, so that a ‘layer , wherein the plurality of computational logic units are to be programmed according to a neural network architecture comprising a plurality of network layers (as indicated above, the “plurality of computational logic units are to be programmed” has been interpreted as any computational logic units that are programmed or are programmable) (see, e.g., FIG. 1D – showing pruning of a neural network including Layers 1 and 2, and paragraphs 16-17 and 40, “CNNs are composed of a variety of layer types, runtime during prediction is dominated by the evaluation of convolutional layers.”, “performed by a program, custom circuitry, or by a combination of custom circuitry and a program … capable of implementing a neural network” [i.e., computational logic units of the processor are programmable/programmed], “the trained neural network 225 is a convolutional neural network” [i.e., the CNN/neural network architecture includes a plurality of network layers]), wherein each of the plurality of computational logic units corresponds to a different layer in the plurality of layers and is located at a … location within the corresponding layer (see, e.g., paragraphs 29, 44, 45 and 49, “Each layer also typically has less valuable neuron. Therefore, pruning should scale across all layers”, “GPU, i.” [i.e., each neuron of the corresponding neurons/corresponding to a single feature map, is located in a different layer in the network layers]), and 
wherein the plurality of computational logic units are deactivated based, at least in part, on a metric associated with the one or more weight values (see, e.g., paragraphs 21, 27 and 42, “at least one neuron is removed from the trained neural network to produce a pruned neural network. … In one embodiment, greedy criteria-based pruning is interleaved with fine-tuning to iteratively remove neurons from the trained neural network”, “a criteria-based pruning technique is preferred, starting with a full set of the parameters W and pruning as a backward filter by iteratively identifying and removing at least one least important layer parameter”, “The pruning criteria are represented as importance values that are provided to a neuron removal unit 265. The neuron removal unit 265 indicates to the trained neural network 225 that one or more neurons should be removed from the trained neural network 225.” [i.e., deactivating/pruning/removing each of the neurons from the layers of the neural network based in part on a criteria/parameter/importance value/metric associated with the neurons]).
Although Molchanov substantially discloses the claimed invention, Molchanov is not relied on for explicitly disclosing wherein each of the plurality of computational logic units corresponds to a different layer in the plurality of layers and is located at a matching location within the corresponding layer.
In the same field, analogous art Li teaches wherein each of the plurality of computational logic units corresponds to a different layer in the plurality of layers and is located at a matching location within the corresponding layer (as indicated above, “a matching location within a different network layer”, under the BRI in light of the specification, is any corresponding location, coordinate or index, in another network layer) (see, e.g., FIGs. 3 and 4 – depicting “Pruning filters across consecutive layers” and “Pruning residual blocks with … filters to be pruned for the second layer … are determined by the pruning result” where nodes/neurons in a same kernel index/location in different layers are pruned and pages 3-5, sections 3.1 and 3.3, “The kernels in the next convolutional layer corresponding to the pruned feature maps are also removed.”, “pruning … the second layer of each residual block results in additional pruning of other layers. To prune filters across multiple layers to prune the second convolutional layer of the residual block, the corresponding projected feature maps must also be pruned.”, “Since the identical feature maps are more important …, the feature maps to be pruned should be determined by the pruning results of the shortcut layer. To determine which identity feature maps are to be pruned, we use the same selection criterion based on the filters of the shortcut convolutional layers (with 1 x 1 kernels). The second layer of the residual block is pruned with the same filter index as selected by the pruning of the shortcut layer.” [i.e., computational logic units/neurons to be pruned are in a same index/location in different, consecutive layers]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “method, computer readable medium, and system … for neural network pruning” of Molchanov (See, Molchanov, Abstract) to incorporate the teachings of Li to provide “an acceleration method for CNNs, where we prune 

Regarding claim 18, as discussed above, Molchanov in view of Li teaches the processor of claim 17.
Molchanov further discloses wherein each of the plurality of corresponding neurons is associated with a … feature type (see, e.g., paragraphs 29, 45 and 49, “Each layer also typically has less valuable neuron. Therefore, pruning should scale across all layers”, “neurons corresponding to a single feature map is pruned during each iteration” [i.e., each of the neurons is associated with a feature type of a single feature map], “trained neural networks may be iteratively pruned using either a first criterion or a second criterion that are each computed based on a first-order gradient of the cost function w.r.t. the layer parameter hi.” [i.e., each neuron of the corresponding neurons/corresponding to a single feature map, is located in a different layer in the network layers]).
Although Molchanov substantially discloses the claimed invention, Molchanov is not relied on for explicitly disclosing wherein each of the plurality of corresponding neurons is associated with a matching feature type.
In the same field, analogous art Li teaches wherein each of the plurality of corresponding neurons is associated with a matching feature type (see, e.g., FIGs. 3 and 4 – depicting “Pruning filters across consecutive layers” and “Pruning residual blocks with … filters to be pruned for the second layer … are determined by the pruning result” where nodes/neurons in a same kernel index/location in different layers are pruned, and pages 3-5, sections 3.1 and 3.3, “Prune m filters with the smallest sum values and their corresponding feature maps. The kernels in the next convolutional layer corresponding to the pruned feature maps are also removed.”, “we observe that layers in the same stage (with the same feature map size) [i.e., feature maps with the same size – a matching feature type] have a similar sensitivity to pruning. … pruning the identity feature maps … of each residual block results in additional pruning of other layers. To prune filters across multiple layers to prune the second convolutional layer of the residual block, the corresponding projected feature maps must also be pruned.”, “Since the identical feature maps are more important …, the feature maps to be pruned should be determined by the pruning results of the shortcut layer. To determine which identity feature maps are to be pruned, we use the same selection criterion based on the filters” [i.e., neurons to be pruned have a matching feature type in a corresponding, identical feature map]).
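For illustration only, the cross-layer pruning quoted from Li above (“Prune m filters with the smallest sum values and their corresponding feature maps. The kernels in the next convolutional layer corresponding to the pruned feature maps are also removed.”) may be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Li, Molchanov or the application: the function names, the nested-list weight layout and the tie-breaking rule are hypothetical.

```python
def filter_abs_sums(layer):
    """Sum of absolute kernel weights per filter.

    Assumed layout: layer[f][c] is a flat list of kernel weights for
    output filter f and input channel c.
    """
    return [sum(abs(w) for kernel in filt for w in kernel) for filt in layer]


def prune_consecutive_layers(layer, next_layer, m):
    """Remove the m filters of `layer` with the smallest absolute weight sums,
    and remove the kernels of `next_layer` at the same indices, since those
    kernels consumed the feature maps that no longer exist."""
    sums = filter_abs_sums(layer)
    # Indices of the m weakest filters; ties go to the lower index (assumption).
    order = sorted(range(len(layer)), key=lambda f: (sums[f], f))
    pruned = set(order[:m])
    kept = [f for f in range(len(layer)) if f not in pruned]
    new_layer = [layer[f] for f in kept]
    # Each filter of the next layer loses the input-channel kernels at the pruned indices.
    new_next = [[filt[c] for c in kept] for filt in next_layer]
    return new_layer, new_next, sorted(pruned)
```

The same index list drives the removal in both layers, which is the sense in which the pruned elements sit at a matching index in consecutive layers.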
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “method, computer readable medium, and system … for neural network pruning” of Molchanov (See, Molchanov, Abstract) to incorporate the teachings of Li to provide “an acceleration method for CNNs, where we prune filters from CNNs that are identified as having a small effect on the output accuracy.” (See, e.g., Li, Abstract, page 1). Doing so would have allowed Molchanov to use Li’s method that includes “removing whole filters in the network together with their connecting feature maps” so that 

Regarding claim 19, as discussed above, Molchanov in view of Li teaches the processor of claim 17.
Molchanov further discloses wherein the metric is computed based on an equalization operation performed on the one or more weight values (see, e.g., paragraphs 33-34, “A scale of the first criteria values varies with the depth, in terms of layers within the network. Therefore, a layer-wise l2-normalization is computed to rescale the first criterion across the layers”, “Scaling a criterion across layers is very important for pruning. … Without normalization, a conventional weight magnitude criterion tends to rank feature maps from the first layers more important than last layers … After l2 normalization, each layer has some feature maps that are highly important and others that are unimportant.” [i.e., the criteria/metric is computed, normalized and scaled based on an equalization operation performed on the weights]).
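For illustration only, the layer-wise rescaling quoted from Molchanov above may be sketched as dividing each layer's criterion values by that layer's Euclidean (l2) norm. This is a minimal sketch under stated assumptions and is not code from Molchanov or the application: the function name, the dictionary layout and the handling of an all-zero layer are hypothetical.

```python
import math


def l2_normalize_per_layer(criteria):
    """Rescale raw criterion values so that layers are comparable.

    Assumed layout: `criteria` maps a layer name to the list of raw
    criterion values for that layer's feature maps.
    """
    normalized = {}
    for layer, values in criteria.items():
        norm = math.sqrt(sum(v * v for v in values))
        # An all-zero layer is left at zero to avoid dividing by zero (assumption).
        normalized[layer] = [v / norm if norm > 0 else 0.0 for v in values]
    return normalized
```

With raw values of [3.0, 4.0] in an early layer and [0.03, 0.04] in a late layer, both layers rescale to approximately [0.6, 0.8], so neither layer dominates a global ranking merely because of its depth.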

Claims 5 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Molchanov in view of Li as applied to claims 1 and 9 and further in view of Chen et al. (U.S. Patent Application Pub. No. 2021/0182077 A1, hereinafter “Chen”). Chen was filed as a national stage application of PCT application no. PCT/CN2018/105463 filed September 13, 2018, and this date is before the effective filing date of this application, i.e., November 21, 2018. Therefore, Chen constitutes prior art under 35 .
Regarding claims 5 and 14, as discussed above, Molchanov in view of Li teaches the methods of claims 3 and 12.
Although Molchanov in view of Li substantially teaches the claimed invention, Molchanov in view of Li is not relied on for teaching wherein the one or more equalization operations comprises at least one of an arithmetic mean operation, a geometric mean operation, a union operation, and an intersection operation.
In the same field, analogous art Chen teaches wherein the one or more equalization operations comprises at least one of an arithmetic mean operation, a geometric mean operation, a union operation, and an intersection operation (see, e.g., paragraphs 588 and 593, “pruning unit configured to perform a coarse-grained pruning operation on weights of a neural network”, “the amount of information of the M weights is an arithmetic mean of absolute values of the M weights, a geometric mean of the absolute values of the M weights, or a maximum value of the M weights” [i.e., the operations include an arithmetic mean or a geometric mean operation]).
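For illustration only, the two measures quoted from Chen above (an arithmetic mean and a geometric mean of the absolute values of the M weights) may be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Chen or the application: the function names are hypothetical, and the threshold rule in `prune_group` is an assumed pruning condition that is not quoted in this action.

```python
import math


def arithmetic_mean_abs(weights):
    """Arithmetic mean of the absolute values of the weights."""
    return sum(abs(w) for w in weights) / len(weights)


def geometric_mean_abs(weights):
    """Geometric mean of the absolute values of the weights."""
    # Any zero weight makes the product, and so the geometric mean, zero.
    if any(w == 0 for w in weights):
        return 0.0
    # Computed in log space to avoid overflow or underflow of the product.
    return math.exp(sum(math.log(abs(w)) for w in weights) / len(weights))


def prune_group(weights, threshold, measure=arithmetic_mean_abs):
    """Zero the whole group of M weights when its measure is below the
    threshold (assumed rule); otherwise return the weights unchanged."""
    if measure(weights) < threshold:
        return [0.0] * len(weights)
    return list(weights)
```

The geometric mean is more sensitive to a single near-zero weight than the arithmetic mean, so the two measures can rank the same group differently.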
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Chen with Molchanov in view of Li to provide “a processing device including: a coarse-grained pruning unit configured to perform a coarse-grained pruning operation on weights of a neural network to obtain pruned weights” where “The coarse-grained pruning unit is configured to: select M weights from the weights of the neural network” and “the amount of information of the M weights is an arithmetic mean of absolute values of the M weights, a geometric mean of the absolute values of the M weights, or a . 

Claims 7, 10, 16 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Molchanov in view of Li as applied to claims 1 and 9 and further in view of Theodorakopoulos et al. (U.S. Patent Application Pub. No. 2018/0137417 A1, hereinafter “Theodorakopoulos”).
Regarding claims 7, 16 and 20, as discussed above, Molchanov in view of Li teaches the methods of claims 1 and 9, and the processor of claim 17.
Although Molchanov substantially discloses the claimed invention, Molchanov is not relied on for explicitly disclosing wherein the neural network comprises a residual network, and wherein the plurality of network layers include a convolutional layer of the residual network.
In the same field, analogous art Li teaches wherein the neural network comprises a residual network, and wherein the plurality of network layers include a convolutional layer of the residual network (see, e.g., pages 6-7, sections 4 and 4.2, “We prune two types of networks: simple CNNs … and Residual networks (ResNet-56/110 on CIFAR-10 … a convolutional layer is pruned, the weights of the subsequent batch normalization layer are also removed”, “ResNets for CIFAR-10 have three stages of residual blocks … the shortcut layer 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “method, computer readable medium, and system … for neural network pruning” of Molchanov (See, Molchanov, Abstract) to incorporate the teachings of Li to provide “an acceleration method for CNNs, where we prune filters from CNNs that are identified as having a small effect on the output accuracy.” (See, e.g., Li, Abstract, page 1). Doing so would have allowed Molchanov to use Li’s method that includes “removing whole filters in the network together with their connecting feature maps” so that “computation costs are reduced significantly” and “to reduce computation costs without incurring additional overheads”, as suggested by Li (See, e.g., Li, Abstract and pages 1 and 2). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.
Although Molchanov in view of Li substantially teaches the claimed invention, Molchanov in view of Li is not relied on for teaching wherein the neural network comprises a residual network, and wherein the plurality of network layers include a convolutional layer of the residual network.
In the same field, analogous art Theodorakopoulos teaches wherein the neural network comprises a residual network, and wherein the plurality of network layers include a convolutional layer of the residual network (see, e.g., paragraphs 100-101, “in the event that a residual CNN architecture is employed. In residual CNNs [He], each convolutional layer is only responsible for, in effect, fine-tuning the output from a previous layer by just adding a learned ‘residual’ to the input.”, “Residual CNN architectures utilize layer bypass connections (68,69 in 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Theodorakopoulos with Molchanov in view of Li to provide a “Learning Kernel Activation Module (LKAM), serving the purpose of enforcing the utilization of less convolutional kernels by learning kernel activation rules and by actually controlling the engagement of various computing elements: The module activates/deactivates a sub-set of filtering kernels, groups of kernels, or groups of full connected neurons” (See, e.g., Theodorakopoulos, paragraph 18). Doing so would have allowed Molchanov in view of Li to use Theodorakopoulos’ Learning Kernel Activation Module (LKAM) so that a “CNN essentially learns how to reduce its initial size on-the-fly (e.g. for every input image or datum), through an optimization process which guides the network to learn which kernel need to be engaged for a specific input datum. This results in the selective engagement of a subset of computing elements for every specific input datum” to achieve “a reduction in the number of applied kernels in any layer” and a “reduction of the overall computational load” in order “to produce additional processing optimization”, as suggested by Theodorakopoulos (See, e.g., Theodorakopoulos, paragraphs 19-21). 

Regarding claim 10, as discussed above, Molchanov in view of Li teaches the method of claim 9.
Although Molchanov in view of Li substantially teaches the claimed invention, Molchanov in view of Li is not relied on for teaching wherein each of the plurality of corresponding neurons computationally determines a probability of a feature having the feature type being present in given input data.
In the same field, analogous art Theodorakopoulos teaches wherein each of the plurality of corresponding neurons computationally determines a probability of a feature having the feature type being present in given input data (see, e.g., paragraphs 41-42 and 96, “a number L of convolutional layers there may be any number of fully connected layers (42 in FIG. 1). These densely connected layers are identical to the layers in a standard fully connected multilayer neural network”, “The outputs of such a network is a vector of numbers, from which the probability that a specific input image belongs to the specific class (e.g. the face of a specific person) … the output layer (43 in FIG. 1) of the CNN … maps the network output vector to class probabilities. But the required type of output should be a single binary decision for the specific image (e.g. is it this specific person) …This is achieved through thresholding on class probabilities”, “every single neuron (e.g. a computing element usually corresponding to a linear function performed on its inputs), in a fully connected layer is linked to a number of neurons in the next layer.” [i.e., the neurons of layers of the neural network/CNN each computationally determine a probability of a specific class/feature type being present in a given input image]).
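For illustration only, the mapping quoted from Theodorakopoulos above (a network output vector mapped to class probabilities, followed by “thresholding on class probabilities” to reach a binary decision) may be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Theodorakopoulos or the application: the quoted passage does not name the mapping function, so the use of softmax, the function names and the default threshold of 0.5 are all assumptions.

```python
import math


def softmax(outputs):
    """Map a network output vector to probabilities that sum to one."""
    shift = max(outputs)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(x - shift) for x in outputs]
    total = sum(exps)
    return [e / total for e in exps]


def is_class_present(outputs, class_index, threshold=0.5):
    """Binary decision for one class: True when its probability meets the threshold."""
    return softmax(outputs)[class_index] >= threshold
```

For an output vector of [2.0, 0.0], the first class receives a probability of roughly 0.88 and passes the default threshold, while the second class does not.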
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Theodorakopoulos with Molchanov in view of Li to provide a “Learning Kernel Activation Module (LKAM), serving the purpose of enforcing the utilization of less convolutional kernels by learning kernel activation rules and by actually controlling the engagement of various computing elements: The module activates/deactivates a sub-set of filtering kernels, groups of kernels, or groups of full connected neurons” (See, e.g., Theodorakopoulos, paragraph 18). Doing so would have allowed Molchanov in view of Li to use 

Conclusion
The prior art made of record, listed on form PTO-892, and not relied upon, is considered pertinent to applicant's disclosure. 
The examiner requests, in response to this office action, support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.
When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RANDY K BALDWIN whose telephone number is (571)270-5222. The examiner can normally be reached on Mon - Fri 9:00-6:00.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached on 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/R.K.B./ Examiner, Art Unit 2125

/KAMRAN AFSHAR/ Supervisory Patent Examiner, Art Unit 2125