DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to the amendment filed 8/25/2022. In the amendment, claims 1-4, 6-13 and 15-20 were amended by applicant, claims 21-22 were added, and no claims were cancelled in the amendment. As such, claims 1-22 are pending.

Response to Amendment
The amendment filed on 8/25/2022 has been entered. However, as discussed below, claims 1-22 are rejected under 35 U.S.C. 112(a) because the amendments to independent claims 1, 9 and 17 filed 8/25/2022 introduce new subject matter which is not described in the specification.
The previous objections to the specification and drawings are withdrawn in view of the amendments to the specification and drawings. However, as documented below, an objection to the specification remains.
The previous objections to claims 9-16 are withdrawn in view of the amendments to independent claim 9.
The previous rejections of claims 9-20 under 35 U.S.C. 112(b) are withdrawn in view of the amendments to independent claims 9 and 17. However, as documented below, rejections of claims 1-22 under 35 U.S.C. 112(b) remain.

Response to Arguments
Applicant's arguments filed 8/25/2022 with respect to the previous objections to the specification and drawings have been fully considered and are persuasive. However, as documented below, an objection to the specification remains.
Applicant's arguments with respect to the objections to claims 9-16 have been fully considered and are persuasive.
Applicant's arguments with respect to the previous rejections of claims 9-20 under 35 U.S.C. 112(b) have been fully considered and are persuasive. However, as documented below, rejections of claims 1-22 under 35 U.S.C. 112(b) remain.
Applicant's arguments with respect to the rejections of claims 1-20 under 35 U.S.C. 103 have been fully considered but are not persuasive. In particular, as discussed in detail below, the previously-applied combination of references (i.e., Molchanov in view of Li) is applied to reject amended independent claims 1, 9 and 17 as well as amended dependent claims 2-4, 6, 8, 11-13, 15, 18-19 and newly-added claims 21-22. As also detailed below, the previously-applied combination of references (i.e., Molchanov in view of Li and further in view of Chen) is also applied to reject dependent claims 5 and 14. As further discussed below, the previously-applied combination of references (i.e., Molchanov in view of Li and further in view of Theodorakopoulos) is applied to reject amended dependent claims 7, 10, 16 and 20.
With reference to claim 1, applicant states “Applicant amends claim 1 to recite, in relevant part, ‘deactivating each of a plurality of neurons within corresponding locations of a plurality of layers within a neural network only if deactivating each of the plurality of neurons does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric.’” and “Cited portions of Molchanov disclose ‘deactivating/pruning/removing each of the neurons from the layers of the neural network based in part on a criteria/parameter/importance value/metric associated with the neurons.’” before asserting, which examiner does not concede, that “Molchanov is silent on the neurons being at corresponding locations of different layers as recited in claim 1. Furthermore, Applicant submits that Li does not cure the defficiency [sic – deficiency] of Molchanov” and “Li is silent on pruning a plurality of neurons at corresponding locations of different layers based on one metric, as taught in claim 1.” (applicant’s remarks, page 11, paraphrasing and characterizing claim language). 
The examiner disagrees and respectfully notes that contrary to applicant’s assertions above, claim 1 does not recite “the neurons being at corresponding locations of different layers as recited in claim 1” or “pruning a plurality of neurons at corresponding locations of different layers based on one metric, as taught in claim 1.” Id.
Regarding amended independent claims 9 and 17, applicant states “that claims 9 and 17 are allowable at least for reasons including some of those discussed above in connection with claim 1. For example, claim 9 recites ‘deactivating each of a plurality of neurons with a matching feature type from a plurality of layers within a neural network only if deactivating each of the plurality of neurons does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric.’” before alleging “that the proposed combination of Molchanov and Li does not teach such subject matter as recited in claims 9 and 17.” (applicant’s remarks, page 12). 
As a preliminary matter, and as discussed in the section 112(a) rejections below, the amendments to claims 1, 3, 9, 12, 17 and 19 introduce new matter which was not described in the specification. Regarding applicant’s arguments and proffered support for the amendments, applicant generally asserts “Support for all amended claims can be found in the specification, and no new matter has been added by these amendments” and states “Applicant amends claim 1 to recite, in relevant part, ‘deactivating each of a plurality of neurons within corresponding locations of a plurality of layers within a neural network only if deactivating each of the plurality of neurons does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric.’ Support can at least be found at FIGS. 3 and 4, and paragraphs [0039]-[0045].” (applicant’s remarks, pages 9 and 11).
The examiner disagrees and notes that the original specification, in the portions cited by the applicant, and in other portions is silent regarding any “performance metric” as recited in amended claims 1, 3, 9, 12, 17 and 19, much less “deactivating each of a plurality of neurons within corresponding locations of a plurality of layers within a neural network only if deactivating each of the plurality of neurons does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric” as recited in independent claims 1, 9 and 17. For example, the relied-upon FIGs. 3 and 4, which are reproduced below, and their corresponding descriptions in applicant’s specification, do not depict or mention, let alone describe, the “performance metric” limitations added to claims 1, 3, 9, 12, 17 and 19.
 
[Image: FIG. 3 of applicant's drawings (media_image1.png, greyscale)]

[Image: FIG. 4 of applicant's drawings (media_image2.png, greyscale)]

Secondly, as discussed in the section 112(b) rejections below, the newly-added limitation of “does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric” is a relative term which renders the claims indefinite. As discussed below, this relative term is being interpreted as having a negligible effect on a metric associated with a neural network (i.e., having low importance or a relatively small effect, change or impact).
Applicant’s amendments and new claims have necessitated the claim rejections under 35 U.S.C. 112(a), 112(b) and 103 discussed below.
Regarding applicant’s apparent argument that the claim limitations added, using respective similar language, to claims 1, 9 and 17, i.e., “deactivating each of a plurality of neurons within corresponding locations of a plurality of layers within a neural network only if deactivating each of the plurality of neurons does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric” and “deactivating each of a plurality of neurons with a matching feature type from a plurality of layers within a neural network only if deactivating each of the plurality of neurons does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric” are not disclosed or taught in the portions of Molchanov and Li cited in the previous office action, the examiner respectfully disagrees and points applicant to the discussion of Molchanov and Li below.
Regarding the new limitation “deactivating each of a plurality of neurons within corresponding locations of a plurality of layers within a neural network only if deactivating each of the plurality of neurons does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric” added, using respective similar language, to claims 1, 9 and 17, as indicated above and discussed in the section 112(b) rejections below, “does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric” is being interpreted as having a negligible effect on a metric associated with a neural network - i.e., having low importance or a relatively small effect, change or impact.
With continued reference to the above-noted deactivating limitation, the examiner points to paragraphs 20, 21, 27 and 42 of Molchanov, which explicitly disclose that “at least one neuron having a lowest importance is identified. … the at least one neuron … in a convolutional layer … the at least one neuron includes neurons having importances below a threshold value” [i.e., neurons in convolutional layers within a neural network with importances/metrics below/not beyond a first value/threshold], “at least one neuron is removed from the trained neural network to produce a pruned neural network. … criteria-based pruning … to iteratively remove neurons from the trained neural network”, “a criteria-based pruning technique … pruning … by iteratively identifying and removing at least one least important layer parameter” and “The pruning criteria are represented as importance values that are provided to a neuron removal unit 265. The neuron removal unit 265 indicates to the trained neural network 225 that one or more neurons should be removed from the trained neural network 225.” [i.e., deactivating/pruning/removing low importance neurons from layers of the neural network only if a criteria/parameter/importance value/performance metric associated with the network is below a threshold/first value – low importance neurons do not affect performance of the network beyond a first value/threshold].
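For illustration only, the threshold-based and lowest-importance selection described in the passages quoted above can be sketched as follows. This is a minimal sketch and not Molchanov's implementation: the function names, the neuron identifiers and the numeric importance values are hypothetical and chosen solely for the example.

```python
from typing import Dict, List


def prune_below_threshold(importances: Dict[str, float], threshold: float) -> List[str]:
    """Return the ids of neurons whose importance value is below the threshold."""
    return sorted(n for n, imp in importances.items() if imp < threshold)


def prune_least_important(importances: Dict[str, float], count: int = 1) -> List[str]:
    """Return the ids of the `count` neurons having the lowest importance values."""
    return sorted(importances, key=importances.get)[:count]


# Hypothetical importance values for neurons in two convolutional layers.
importances = {
    "conv1/n0": 0.90,
    "conv1/n1": 0.02,
    "conv2/n0": 0.40,
    "conv2/n1": 0.01,
}

# Neurons with importance below the (assumed) threshold are selected for removal.
removed = prune_below_threshold(importances, threshold=0.05)
```

In this sketch the importance value plays the role of the pruning criterion: a neuron is selected for removal only when its value falls below the threshold, or when it is among the least important neurons in an iteration.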
Regarding the limitation “deactivating each of a plurality of neurons within corresponding locations of a plurality of layers within a neural network” added to claim 1, paragraphs 32 and 34 of applicant’s specification state “corresponding neurons are located at the same location within the respective input layers.” and “corresponding neurons are those neurons that are located at the same location (e.g., coordinates or index) within the respective input layers.” Therefore, “neurons within corresponding locations of a plurality of layers within a neural network”, under its broadest reasonable interpretation (BRI) in light of the specification, is any corresponding location, coordinate or index, in network layers. 
With continued reference to the above-noted deactivating limitations, the examiner points to FIGs. 3 and 4 of Li, which depict “Pruning filters across consecutive layers” and “Pruning residual blocks with … filters to be pruned for the second layer … are determined by the pruning result” where neurons/feature maps in a same index/location in network layers are pruned. The examiner further points to pages 3-5, sections 3.1 and 3.3 of Li, which explicitly disclose that “The kernels in the next convolutional layer corresponding to the pruned feature maps are also removed.”, “pruning … the second layer of each residual block results in additional pruning of other layers. To prune filters across multiple layers to prune the second convolutional layer of the residual block, the corresponding projected feature maps must also be pruned.” and “Since the identical feature maps are more important …, the feature maps to be pruned should be determined by the pruning results of the shortcut layer. To determine which identity feature maps are to be pruned, we use the same selection criterion based on the filters of the shortcut convolutional layers (with 1 x 1 kernels). The second layer of the residual block is pruned with the same filter index as selected by the pruning of the shortcut layer.” [i.e., pruning/deactivating neurons in a same index/location in different, consecutive layers of the neural network].
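For illustration only, pruning at a same index across consecutive layers, as characterized above, can be sketched as follows. This is a minimal sketch under stated assumptions and not Li's implementation: the nested-list weight layout, the function names and the use of the sum of absolute kernel weights as the selection criterion are assumptions made for the example.

```python
from typing import List, Tuple

# Assumed layout: layer[f][c] is the kernel (a flat list of weights) that
# filter f applies to input channel c. Output feature map f of one layer is
# input channel f of the next layer.
Layer = List[List[List[float]]]


def filter_abs_sums(layer: Layer) -> List[float]:
    """Sum of absolute kernel weights for each filter in the layer."""
    return [sum(abs(w) for kernel in filt for w in kernel) for filt in layer]


def prune_consecutive(layer: Layer, next_layer: Layer, m: int) -> Tuple[List[int], Layer, Layer]:
    """Remove the m filters with the smallest sums, and remove the kernels at
    the same indices from every filter of the next layer."""
    sums = filter_abs_sums(layer)
    pruned = set(sorted(range(len(layer)), key=sums.__getitem__)[:m])
    kept_layer = [f for i, f in enumerate(layer) if i not in pruned]
    kept_next = [[k for c, k in enumerate(f) if c not in pruned] for f in next_layer]
    return sorted(pruned), kept_layer, kept_next


# Hypothetical weights: three filters in the first layer, two in the next.
layer = [[[1.0, -1.0]], [[0.1, 0.0]], [[0.5, 0.5]]]
next_layer = [[[1.0], [2.0], [3.0]], [[4.0], [5.0], [6.0]]]
pruned_idx, kept_layer, kept_next = prune_consecutive(layer, next_layer, m=1)
```

The index removed from the first layer is the same index at which kernels are removed from the next layer, which is the sense in which the pruned elements sit at a same index/location in consecutive layers.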
With reference to the limitation “deactivating each of a plurality of neurons with a matching feature type”, added, using respective similar language, to independent claims 9 and 17, the examiner points to FIGs. 3 and 4 of Li, which show “Pruning filters across consecutive layers” and “Pruning residual blocks with … filters to be pruned for the second layer … are determined by the pruning result” where nodes/neurons in a same kernel index/location in different layers are pruned. The examiner also points to pages 3-5, sections 3.1 and 3.3 of Li, which explicitly disclose “Prun[ing] m filters with the smallest sum values and their corresponding feature maps. The kernels in the next convolutional layer corresponding to the pruned feature maps are also removed.”, “we observe that layers in the same stage (with the same feature map size) [i.e., feature maps with the same size – a matching feature type] have a similar sensitivity to pruning. … pruning the identity feature maps … of each residual block results in additional pruning of other layers. To prune filters across multiple layers to prune the second convolutional layer of the residual block, the corresponding projected feature maps must also be pruned.” and “Since the identical feature maps are more important …, the feature maps to be pruned should be determined by the pruning results of the shortcut layer. To determine which identity feature maps are to be pruned, we use the same selection criterion based on the filters” [i.e., neurons to be pruned/deactivated have a matching feature type in a corresponding, identical feature map].
Further, as discussed in detail below, the combination of Molchanov and Li (i.e., Molchanov in view of Li) teaches the limitations of amended independent claims 1, 9 and 17 and dependent claims 2-4, 6, 8, 11-13, 15, 18-19 and 21-22.
As additionally detailed below the combination of Molchanov, Li and Chen (i.e., Molchanov in view of Li and further in view of Chen) teaches the limitations of dependent claims 5 and 14. 
As further discussed below, Molchanov in view of Li and further in view of Theodorakopoulos teaches the limitations of dependent claims 7, 10, 16 and 20. 

Specification
The disclosure is objected to because of the following informalities:
The specification is objected to as failing to provide proper antecedent basis for the claimed subject matter. See 37 CFR 1.75(d)(1) and MPEP § 608.01(o). Correction of the following is required: claims 1, 9 and 17 as amended do not appear to have support in the originally filed specification. There does not appear to be any discussion of “deactivating each of a plurality of neurons within corresponding locations of a plurality of layers within a neural network only if deactivating each of the plurality of neurons does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric” as recited, using respective similar language, in amended claims 1, 9 and 17. Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claims 1-22 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, at the time the application was filed, had possession of the claimed invention.
In particular, as noted above, the claim limitations “deactivating each of a plurality of neurons within corresponding locations of a plurality of layers within a neural network only if deactivating each of the plurality of neurons does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric” are recited, using respective similar language, in amended independent claims 1, 9 and 17 in the amendment filed on 8/25/2022. However, the written description of the current application fails to disclose the above-identified limitation.
The original specification does not describe the negative, exclusionary limitation of “deactivating each of a plurality of neurons within corresponding locations of a plurality of layers within a neural network only if deactivating each of the plurality of neurons does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric”. See MPEP § 2173.05(i): “Any negative limitation or exclusionary proviso must have basis in the original disclosure. If alternative elements are positively recited in the specification, they may be explicitly excluded in the claims. … The mere absence of a positive recitation is not basis for an exclusion. Any claim containing a negative limitation which does not have basis in the original disclosure should be rejected under 35 U.S.C. 112(a) or pre-AIA  35 U.S.C. 112, first paragraph, as failing to comply with the written description requirement.”
However, the specification is silent regarding “deactivating each of a plurality of neurons within corresponding locations of a plurality of layers within a neural network only if deactivating each of the plurality of neurons does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric.” Paragraphs 29 and 45-47 of applicant’s specification state “the pruning engine 114 selects neurons having a corresponding metric below a pruning threshold.”, “equalization engine 304 provides the equalized metrics associated with each set of corresponding neurons to the removal engine 306”, “the removal engine 306 prunes the input layers to the element-wise operation based on the equalized metrics associated with each set of corresponding neurons in the input layers. The removal engine 306 deactivates neurons from the input layers that have an equalized metric that is less than a threshold pruning weight.” and “the specific neurons that are deactivated by the removal engine 306 depends on the equalization operator applied by the equalization engine 304 when equalizing the metrics associated with the sets of corresponding neurons. … the metric associated with each neuron in a set of corresponding neurons must be below the threshold pruning weight for the set of corresponding neurons to be deactivated.” However, the original specification does not mention, let alone describe, any deactivation of “a plurality of neurons within corresponding locations of a plurality of layers within a neural network only if deactivating each of the plurality of neurons does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric” as recited, using respective similar language, in amended claims 1, 9 and 17. 
Indeed, the specification is silent regarding any “performance metric” or performance metrics associated with a neural network, much less deactivating neurons based on “a performance metric associated with the neural network” as recited in amended claims 1, 9 and 17. Accordingly, claims 1, 9 and 17 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement.
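For illustration only, the equalization and threshold-based deactivation quoted above from paragraphs 29 and 45-47 of applicant's specification can be sketched as follows. This is a minimal sketch and not applicant's implementation: the function names, the per-layer metric lists and the choice of `max` as the default equalization operator are assumptions made for the example.

```python
from typing import Callable, Iterable, List


def equalize(layer_metrics: List[List[float]],
             operator: Callable[[Iterable[float]], float] = max) -> List[float]:
    """Combine the metrics of corresponding neurons (same index in each input
    layer) into one equalized metric per index."""
    return [operator(values) for values in zip(*layer_metrics)]


def indices_to_deactivate(layer_metrics: List[List[float]],
                          threshold: float,
                          operator: Callable[[Iterable[float]], float] = max) -> List[int]:
    """Indices whose equalized metric is less than the threshold pruning weight."""
    equalized = equalize(layer_metrics, operator)
    return [i for i, metric in enumerate(equalized) if metric < threshold]


# Hypothetical per-neuron metrics for two input layers of an element-wise operation.
layer_a = [0.01, 0.30, 0.02]
layer_b = [0.03, 0.01, 0.20]

# With max, every neuron in a set must be below the threshold for the set to be deactivated.
with_max = indices_to_deactivate([layer_a, layer_b], threshold=0.05, operator=max)
# With min, a single low metric in the set is enough.
with_min = indices_to_deactivate([layer_a, layer_b], threshold=0.05, operator=min)
```

The sketch compares per-neuron metrics against a threshold pruning weight; it does not compute or consult any network-level performance metric.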
Amended claims 2, 11 and 19 each recite, using respective similar language, “computing the performance metric associated with the neural network based on one or more weights associated with each neuron in the plurality of neurons.” As discussed above, applicant’s specification is silent regarding any “performance metric”. Moreover, the specification does not mention, let alone describe any computation of a “performance metric”. Accordingly, claims 2, 11 and 19 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement.
Also, amended claims 3 and 12 both recite, using respective similar language, “wherein computing the performance metric comprises performing one or more equalization operations on the one or more weights associated with each neuron in the plurality of neurons to generate the performance metric.” As noted above, the specification is silent regarding any “performance metric”. Further, the specification does not mention, let alone describe any computation or generation of a “performance metric”. Accordingly, claims 3 and 12 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement.
Claims 3-6 and 12-15, which each depend directly or indirectly from claims 2 and 11, respectively, are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement under the same rationale as claims 2 and 11.
Additionally, claims 4-6 and 13-15, which each depend directly from claims 3 and 12, respectively, are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement under the same rationale as claims 3 and 12.
Also, claims 2-8 and 21-22, 10-16, and 18-20, which each depend directly or indirectly from independent claims 1, 9 and 17, respectively, are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement under the same rationale as claims 1, 9 and 17.
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 1-22 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
In amended independent claims 1, 9 and 17, the recitation of “deactivating each of a plurality of neurons within corresponding locations of a plurality of layers within a neural network only if deactivating each of the plurality of neurons does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric” is a relative term which renders the claims indefinite. 
The term “does not substantially affect a performance metric” is not defined by the claims, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. In particular, it is unclear what metrics are used for ascertaining the requisite degree of change, alteration, impact or effect on the metric in the term “does not substantially affect a performance metric” in the phrase “does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric”. Applicant’s specification mentions that “the pruning engine 114 selects neurons having a corresponding metric below a pruning threshold.”, “equalization engine 304 provides the equalized metrics associated with each set of corresponding neurons to the removal engine 306”, “the removal engine 306 prunes the input layers … based on the equalized metrics associated with each set of corresponding neurons in the input layers. The removal engine 306 deactivates neurons from the input layers that have an equalized metric that is less than a threshold pruning weight.” and “the specific neurons that are deactivated by the removal engine 306 depends on the equalization operator applied by the equalization engine 304 when equalizing the metrics associated with the sets of corresponding neurons. … the metric associated with each neuron in a set of corresponding neurons must be below the threshold pruning weight for the set of corresponding neurons to be deactivated” (see, e.g., paragraphs 29 and 45-47). However, applicant’s specification does not provide a standard for ascertaining the requisite degree of change, alteration, impact or effect in the term “does not substantially affect a performance metric” recited in claims 1, 9 and 17.
Therefore, one of ordinary skill in the art would not be able to ascertain what “does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric” would encompass. See MPEP § 2173.05(b). For the purposes of determining patent eligibility and comparison with the prior art, the examiner is interpreting the term “does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric” as having a negligible effect on a metric associated with a neural network (i.e., having low importance or a relatively small effect, change, impact or alteration on a metric associated with a neural network). Appropriate correction is required.
Amended claim 12 recites “the metric” in lines 3-4. However, claim 12 previously introduced “the performance metric” in lines 1-2. It is unclear whether the subsequent recitation of “the metric” refers to the previously introduced “the performance metric”, or to some other metric. For examination purposes, “the metric” is being interpreted as the previously introduced “the performance metric” (see, e.g., line 4 of claim 3). Appropriate correction is required.
Amended claims 6 and 15 recite “the metric associated with the plurality of neurons” in line 5 of both claims. However, claims 6 and 15 previously introduced “at least one neuron in the plurality of neurons is associated with an individual metric” (see lines 3-4 of both claims). Intervening claims 3 and 12, which claims 6 and 15 respectively depend upon, previously introduced “the performance metric” and “the metric” (see lines 1-2 of claims 3 and 12 and lines 3-4 of claim 12). It is unclear whether the subsequent recitations of “the metric” in claims 6 and 15 refer to the previously introduced “the performance metric”, “the metric”, the “individual metric”, or to some other metric. For examination purposes, “the metric associated with the plurality of neurons” is being interpreted as any metric associated with the plurality of neurons. Appropriate correction is required.
Claims 13-15, which each depend directly from claim 12, are rejected under 35 U.S.C. 112(b) as being indefinite under the same rationale as claim 12.
Also, claims 2-8 and 21-22, 10-16, and 18-20, which each depend directly or indirectly from independent claims 1, 9 and 17, respectively, are rejected under 35 U.S.C. 112(b) as being indefinite under the same rationale as claims 1, 9 and 17.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-4, 6, 8-9, 11-13, 15, 17-19 and 21-22 are rejected under 35 U.S.C. 103 as being unpatentable over Molchanov et al. (U.S. Patent Application Pub. No. 2018/0114114 A1, hereinafter “Molchanov”) in view of non-patent literature Li, Hao, et al. ("Pruning filters for efficient convnets." arXiv preprint arXiv:1608.08710v3 (2017): 1-13, hereinafter “Li”).

With respect to claim 1, Molchanov discloses the invention as claimed including a computer-implemented method (see, e.g., paragraph 17, “a method for neural network pruning … the method 100 may also be performed by a program, custom circuitry, or by a combination of custom circuitry and a program. For example, the method 100 may be executed by a GPU (graphics processing unit), CPU (central processing unit), neural network, or any processor capable of implementing a neural network” [i.e., a computer-implemented method]) comprising:
deactivating each of a plurality of neurons within … a plurality of layers within a neural network only if deactivating each of the plurality of neurons does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric (as indicated above, “does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric” has been interpreted as having a negligible effect on a metric associated with a neural network - i.e., having low importance or a relatively small effect, change or impact) (see, e.g., paragraphs 20 “at least one neuron having a lowest importance is identified. … the at least one neuron … in a convolutional layer … the at least one neuron includes neurons having importances below a threshold value” [i.e., neurons in convolutional layers within a neural network with importances/metrics below/not beyond a first value/threshold], 21 “at least one neuron is removed from the trained neural network to produce a pruned neural network. … criteria-based pruning … to iteratively remove neurons from the trained neural network”, 27 “a criteria-based pruning technique … pruning … by iteratively identifying and removing at least one least important layer parameter” and 42 “The pruning criteria are represented as importance values that are provided to a neuron removal unit 265. The neuron removal unit 265 indicates to the trained neural network 225 that one or more neurons should be removed from the trained neural network 225.” [i.e., deactivating/pruning/removing low importance neurons from layers of the neural network only if a criteria/parameter/importance value/performance metric associated with the network is below a threshold/first value – low importance neurons do not affect performance of the network beyond a first value/threshold]).
Although Molchanov substantially discloses the claimed invention, Molchanov is not relied on for explicitly disclosing deactivating each of a plurality of neurons within corresponding locations of a plurality of layers within a neural network.
In the same field, analogous art Li teaches deactivating each of a plurality of neurons within corresponding locations of a plurality of layers within a neural network (Paragraphs 32 and 34 of applicant’s specification state “corresponding neurons are located at the same location within the respective input layers.” and “corresponding neurons are those neurons that are located at the same location (e.g., coordinates or index) within the respective input layers.” Therefore, “neurons within corresponding locations of a plurality of layers within a neural network”, under its broadest reasonable interpretation (BRI) in light of the specification, encompasses neurons at any corresponding location, coordinate or index in the network layers) (see, e.g., FIGs. 3 and 4 – depicting “Pruning filters across consecutive layers” and “Pruning residual blocks with … filters to be pruned for the second layer … are determined by the pruning result” where neurons/feature maps in a same index/location in network layers are pruned and pages 3, section 3.1, “The kernels in the next convolutional layer corresponding to the pruned feature maps are also removed.”, 4, section 3.3, “pruning … the second layer of each residual block results in additional pruning of other layers. To prune filters across multiple layers to prune the second convolutional layer of the residual block, the corresponding projected feature maps must also be pruned.” and 5, section 3.3, “Since the identical feature maps are more important …, the feature maps to be pruned should be determined by the pruning results of the shortcut layer. To determine which identity feature maps are to be pruned, we use the same selection criterion based on the filters of the shortcut convolutional layers (with 1 x 1 kernels). The second layer of the residual block is pruned with the same filter index as selected by the pruning of the shortcut layer.” [i.e., pruning/deactivating neurons in a same index/location in different, consecutive layers of the neural network]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “method, computer readable medium, and system … for neural network pruning” of Molchanov (See, Molchanov, Abstract) to incorporate the teachings of Li to provide “an acceleration method for CNNs, where we prune filters from CNNs that are identified as having a small effect on the output accuracy.” (See, e.g., Li, Abstract, page 1). Doing so would have allowed Molchanov to use Li’s method that includes “removing whole filters in the network together with their connecting feature maps” so that “computation costs are reduced significantly” and “to reduce computation costs without incurring additional overheads”, as suggested by Li (See, e.g., Li, Abstract and pages 1 and 2). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.
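For illustration only, the cross-layer pruning described in the cited passages of Li (removing a filter and also removing the kernels in the next convolutional layer that correspond to the pruned feature map) can be sketched as follows. The function name, the nested-list layout and the string placeholders are assumptions made for this sketch and are not taken from either reference.

```python
# Illustrative sketch only; not code from Molchanov or Li.
# layer_a[f] stands for output filter f of a first convolutional layer.
# layer_b[g][c] stands for the kernel of filter g in the next layer that reads
# input channel c, i.e., the feature map produced by layer_a[c].

def prune_across_layers(layer_a, layer_b, pruned_indices):
    """Drop the selected filters from layer_a and the kernels at the same
    index in every filter of layer_b (hypothetical helper)."""
    drop = set(pruned_indices)
    kept_a = [f for i, f in enumerate(layer_a) if i not in drop]
    kept_b = [[k for c, k in enumerate(filt) if c not in drop] for filt in layer_b]
    return kept_a, kept_b


layer_a = ["a0", "a1", "a2", "a3"]
layer_b = [
    ["b0k0", "b0k1", "b0k2", "b0k3"],
    ["b1k0", "b1k1", "b1k2", "b1k3"],
]
new_a, new_b = prune_across_layers(layer_a, layer_b, [1, 3])
print(new_a)
print(new_b)
```

In this sketch the same index is removed in both layers, which is the sense in which the pruned elements sit at corresponding locations of consecutive layers.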

Regarding new claim 21, as discussed above, Molchanov in view of Li teaches the method of claim 1.
Molchanov further discloses wherein deactivating each of the plurality of neurons is based, at least in part, on metrics of at least two of the plurality of neurons (see, e.g., paragraphs 20, “neuron includes neurons having importances below a threshold value … comprises a predetermined percentage of all of the neurons in the trained neural network.”, 42, “the pruning criterion for a layer parameter, Θss(Wi) is computed using the second criterion. The pruning criteria are represented as importance values that are provided to a neuron removal unit 265.” [i.e., pruning criteria/importance values/metrics of at least two neurons] and 49, “regularization may be applied to compute importances and prune less important neurons … regularization … may be applied to compute importances based on other conditions.” [i.e., pruning/deactivating neurons is based on pruning criteria/importance values/metrics of at least two less important neurons]).
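As a non-authoritative illustration of the two selection rules quoted from Molchanov paragraph 20 (importances below a threshold value, or a predetermined percentage of all neurons), a minimal sketch might look like the following. The function names, the example importance values and the choice to truncate the percentage to a whole number of neurons are assumptions of this sketch, not details of the reference.

```python
# Illustrative sketch only; names and values are hypothetical.

def select_below_threshold(importances, threshold):
    """Indices of neurons whose importance is below the threshold."""
    return [i for i, value in enumerate(importances) if value < threshold]


def select_lowest_fraction(importances, fraction):
    """Indices of the least important neurons, up to the given fraction
    of all neurons (truncated to a whole number; an assumption here)."""
    count = int(len(importances) * fraction)
    order = sorted(range(len(importances)), key=lambda i: importances[i])
    return sorted(order[:count])


importances = [0.9, 0.05, 0.4, 0.01, 0.7]
below = select_below_threshold(importances, 0.1)
lowest = select_lowest_fraction(importances, 0.4)
print(below, lowest)
```

Either rule compares the importance values of several neurons against one another or against a common value, so the selection depends on the metrics of at least two neurons.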

Regarding new claim 22, as discussed above, Molchanov in view of Li teaches the method of claim 1.
Molchanov further discloses wherein the plurality of neurons is deactivated based, at least in part, on a calculated metric associated with a plurality of metrics of the plurality of neurons (see, e.g., paragraphs 42, “the pruning criterion for a layer parameter, Θss(Wi) is computed using the second criterion. The pruning criteria are represented as importance values that are provided to a neuron removal unit 265.” [i.e., neurons are pruned/deactivated based on a computed/calculated pruning criterion associated with a plurality of importance values/metrics of the neurons] and 49, “regularization may be applied to compute importances and prune less important neurons … regularization … may be applied to compute importances based on other conditions.” [i.e., pruning/deactivating neurons based on a computed/calculated importance/metric associated with a plurality of importance values/metrics of the neurons]).

With regard to independent claim 9, Molchanov discloses the invention as claimed including a computer-implemented method (see, e.g., paragraph 17, “a method for neural network pruning … the method 100 may also be performed by a program, custom circuitry, or by a combination of custom circuitry and a program. For example, the method 100 may be executed by a GPU (graphics processing unit), CPU (central processing unit), neural network, or any processor capable of implementing a neural network” [i.e., a computer-implemented method]) comprising:
deactivating each of a plurality of neurons … from a plurality of layers within a neural network only if deactivating each of the plurality of neurons does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric (as indicated above, “does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric” has been interpreted as having a negligible effect on a metric associated with a neural network - i.e., having low importance or a relatively small effect, change or impact) (see, e.g., paragraphs 20 “at least one neuron having a lowest importance is identified. … the at least one neuron … in a convolutional layer … the at least one neuron includes neurons having importances below a threshold value” [i.e., neurons in convolutional layers within a neural network with importances/metrics below/not beyond a first value/threshold], 21 “at least one neuron is removed from the trained neural network to produce a pruned neural network. … criteria-based pruning … to iteratively remove neurons from the trained neural network”, 27 “a criteria-based pruning technique … pruning … by iteratively identifying and removing at least one least important layer parameter” and 42 “The pruning criteria are represented as importance values that are provided to a neuron removal unit 265. The neuron removal unit 265 indicates to the trained neural network 225 that one or more neurons should be removed from the trained neural network 225.” [i.e., deactivating/pruning/removing low importance neurons from layers of the neural network only if a criteria/parameter/importance value/performance metric associated with the network is below a threshold/first value – low importance neurons do not affect performance of the network beyond a first value/threshold]).
Although Molchanov substantially discloses the claimed invention, and Molchanov discloses “at least one neuron having a lowest importance is identified. … the at least one neuron corresponds to a feature map in a convolutional layer” (see, e.g., Molchanov, paragraph 20), Molchanov is not relied on for explicitly disclosing deactivating each of a plurality of neurons with a matching feature type.
In the same field, analogous art Li teaches deactivating each of a plurality of neurons with a matching feature type (see, e.g., FIGs. 3 and 4 – depicting “Pruning filters across consecutive layers” and “Pruning residual blocks with … filters to be pruned for the second layer … are determined by the pruning result” where nodes/neurons in a same kernel index/location in different layers are pruned, and pages 3, section 3.1, “Prune m filters with the smallest sum values and their corresponding feature maps. The kernels in the next convolutional layer corresponding to the pruned feature maps are also removed.”, 4, section 3.3, “we observe that layers in the same stage (with the same feature map size) [i.e., feature maps with the same size – a matching feature type] have a similar sensitivity to pruning. … pruning the identity feature maps … of each residual block results in additional pruning of other layers. To prune filters across multiple layers to prune the second convolutional layer of the residual block, the corresponding projected feature maps must also be pruned.” and 5, section 3.3, “Since the identical feature maps are more important …, the feature maps to be pruned should be determined by the pruning results of the shortcut layer. To determine which identity feature maps are to be pruned, we use the same selection criterion based on the filters” [i.e., neurons to be pruned/deactivated have a matching feature type in a corresponding, identical feature map]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “method, computer readable medium, and system … for neural network pruning” of Molchanov (See, Molchanov, Abstract) to incorporate the teachings of Li to provide “an acceleration method for CNNs, where we prune filters from CNNs that are identified as having a small effect on the output accuracy.” (See, e.g., Li, Abstract, page 1). Doing so would have allowed Molchanov to use Li’s method that includes “removing whole filters in the network together with their connecting feature maps” so that “computation costs are reduced significantly” and “to reduce computation costs without incurring additional overheads”, as suggested by Li (See, e.g., Li, Abstract and pages 1 and 2). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.

Regarding claims 2 and 11, as discussed above, Molchanov in view of Li teaches the methods of claims 1 and 9.
Molchanov further discloses computing the performance metric associated with the neural network based on one or more weights associated with each neuron in the plurality of neurons (see, e.g., paragraphs 18, “In one embodiment, the layer input parameter is a weight … a layer input parameter for one layer in a neural network is output by a previous layer, so that a ‘layer parameter’ refers to either a layer input parameter or a layer output parameter.”, 25, “The neural network's parameters W are optimized to minimize a cost value C(W). In one embodiment, a parameter (w,b)ϵW may represent an individual weight, a convolutional kernel, or an entire set of kernels that compute a feature map.” 30, “Neurons (or feature maps) for a particular layer are represented as circles and connections between the neurons are each associated with a weight.” and 49 “regularization may be applied to compute importances and prune less important neurons … regularization … may be applied to compute importances based on other conditions.” [i.e., computing the performance parameter/metric associated with the neural network is based on one or more weights and conditions associated with each neuron]).
Alternatively, Li also teaches computing the performance metric associated with the neural network based on one or more weights associated with each neuron in the plurality of neurons (see, e.g., page 3, section 3.1, “We measure the relative importance of a filter in each layer by calculating the sum of its absolute weight” [i.e., computing the performance metric/measure of relative importance associated with layers of the neural network based on one or more weights associated with each neuron]).
The motivation to combine Molchanov and Li is the same as discussed above with respect to claims 1 and 9.
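For illustration only, the weight-based importance measure quoted from Li, section 3.1 (the sum of a filter's absolute weights, with the m filters having the smallest sums being pruned) can be sketched as below. The flat-list representation of a filter, the function names and the example weights are assumptions of this sketch.

```python
# Illustrative sketch only; not code from Li.

def filter_abs_weight_sums(filters):
    """filters[f] is a flat list of the weights of filter f (assumed layout).
    Returns the sum of absolute weights for each filter."""
    return [sum(abs(w) for w in weights) for weights in filters]


def smallest_m(scores, m):
    """Indices of the m filters with the smallest scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[:m])


filters = [[0.5, -0.5], [0.01, -0.02], [1.0, 2.0], [-0.1, 0.1]]
scores = filter_abs_weight_sums(filters)
to_prune = smallest_m(scores, 2)
print(to_prune)
```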

Regarding claims 3 and 12, as discussed above, Molchanov in view of Li teaches the methods of claims 2 and 11.
Molchanov further discloses wherein computing the performance metric comprises performing one or more equalization operations on the one or more weights associated with each neuron in the plurality of neurons to generate the performance metric (as indicated above, “the metric” recited in claim 12 has been interpreted as the previously introduced “the performance metric”) (see, e.g., paragraphs 33, “A scale of the first criteria values varies with the depth, in terms of layers within the network. Therefore, a layer-wise l2-normalization is computed to rescale the first criterion across the layers”, 34, “Scaling a criterion across layers is very important for pruning. … Without normalization, a conventional weight magnitude criterion tends to rank feature maps from the first layers more important than last layers … After l2 normalization, each layer has some feature maps that are highly important and others that are unimportant.” and 42 “The pruning criteria are represented as importance values that are provided to a neuron removal unit 265.” [i.e., performing equalization operations on the weights for normalizing and scaling of the importance value/criteria/performance metric]).
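As a hedged illustration of the layer-wise normalization quoted above, the sketch below rescales the criterion values of each layer by that layer's Euclidean norm. The exact formula of the reference is not reproduced in this action, so the standard definition of an l2 norm is assumed here; the function name and the example values are likewise hypothetical.

```python
import math

# Illustrative sketch only; assumes the standard l2 norm, which may differ
# in detail from the formula in the cited reference.

def l2_normalize_layer(values):
    """Divide each criterion value in one layer by the layer's l2 norm."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0.0:
        return list(values)  # nothing to rescale in an all-zero layer
    return [v / norm for v in values]


# Two layers whose raw criterion values differ in scale by a factor of 100.
criteria_by_layer = {"early": [30.0, 40.0], "late": [0.3, 0.4]}
rescaled = {name: l2_normalize_layer(vals) for name, vals in criteria_by_layer.items()}
print(rescaled)
```

After rescaling, the values of both layers fall on a common scale, so a single ranking across layers no longer favors the layer with the larger raw values.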

Regarding claims 4 and 13, as discussed above, Molchanov in view of Li teaches the methods of claims 3 and 12.
Molchanov further discloses wherein performing the one or more equalization operations comprises applying an equalization operator to one or more weights assigned to a first neuron in the plurality of neurons and one or more weights assigned to a second neuron in the plurality of neurons (see, e.g., paragraphs 25, “The neural network's parameters W={(w11,b11), (w12, b12), ... (wLCl, bLCl)} are optimized to minimize a cost value C(W). In one embodiment, a parameter (w,b)ϵW may represent an individual weight”, 30, “Neurons (or feature maps) for a particular layer are represented as circles and connections between the neurons are each associated with a weight.” [i.e., one or more weights are assigned to first and second neurons in the plurality of neurons], and 33-34 “A scale of the first criteria values varies with the depth, in terms of layers within the network. Therefore, a layer-wise l2-normalization is computed to rescale the first criterion across the layers: [equation image media_image3.png]”, “Scaling a criterion across layers is very important for pruning. … Without normalization, a conventional weight magnitude criterion tends to rank feature maps from the first layers more important than last layers … After l2 normalization, each layer has some feature maps that are highly important and others that are unimportant.” [i.e., applying an equalization operator to one or more weights assigned to the first and second neurons]).

Regarding claims 6 and 15, as discussed above, Molchanov in view of Li teaches the methods of claims 3 and 12.
Molchanov further discloses wherein performing the one or more equalization operations comprises:
determining that at least one neuron in the plurality of neurons is associated with an individual metric that is at or above a threshold (see, e.g., paragraph 20, “At step 130, at least one neuron having a lowest importance is identified. … the at least one neuron includes neurons having importances below a threshold value.” [i.e., determining that at least one neuron in the neurons besides the neuron having a lowest importance is associated with an individual importance metric that is greater than or equal to the threshold value]); and 
setting the metric associated with the plurality of neurons to the threshold (see, e.g., paragraphs 19, “a pruning criterion is computed for each layer parameter based on the first-order gradient corresponding to the layer parameter, where the pruning criterion indicates an importance of each neuron … and is associated with the layer” [i.e., setting/computing the pruning criterion/importance value associated with the neurons in the layer] and 45, “Pruning may be considered to be complete when a threshold number of neurons are removed. In one embodiment, neurons corresponding to a single feature map is pruned during each iteration, allowing fine-tuning and re-evaluation of the criterion” and claim 10, “The neural network pruning system of claim 1, wherein the at least one neuron includes neurons having importances below a threshold value.” [i.e., tuning the criterion/setting the importance metric associated with the neurons to the threshold]).

Regarding claim 8, as discussed above, Molchanov in view of Li teaches the method of claim 1.
Molchanov further discloses wherein each of the plurality of neurons produces a different input into a given computational component of the neural network (see, e.g., paragraphs 18, 24 and 41, “a layer input parameter for one layer in a neural network is output by a previous layer, so that a ‘layer parameter’ refers to either a layer input parameter or a layer output parameter.”, “The feature maps can either be the input to the neural network, z0, or the output from a convolutional layer z1”, “neural network 225 processes the input data 215 and generates prediction data 135 (i.e., output data).” [i.e., each neuron from a previous layer produces its output, which is a different input into a given computational component in a subsequent layer of the neural network]).

With regard to independent claim 17, Molchanov discloses the invention as claimed including a processor comprising:
a plurality of computational logic units (see, e.g., paragraphs 17, “for neural network pruning, … method 100 may be executed by a GPU (graphics processing unit), CPU (central processing unit), neural network, or any processor capable of implementing a neural network” [i.e., a processor] and 76, “processing cores 550. Each core 550 may include a fully-pipelined, single-precision processing unit that includes a floating point arithmetic logic unit and an integer arithmetic logic unit. The core 550 may also include a double-precision processing unit including a floating point arithmetic logic unit. In one embodiment, the floating point arithmetic logic units” [i.e., a processor comprising computational logic units]) to be deactivated only if deactivating each of the plurality of computational logic units does not substantially affect a performance metric associated with a neural network to beyond a first value of the performance metric (as indicated above, “does not substantially affect a performance metric associated with the neural network to beyond a first value of the performance metric” has been interpreted as having a negligible effect on a metric associated with a neural network - i.e., having low importance or a relatively small effect, change or impact) (see, e.g., paragraphs 20 “at least one neuron having a lowest importance is identified. … the at least one neuron … in a convolutional layer … the at least one neuron includes neurons having importances below a threshold value” [i.e., neurons in convolutional layers within a neural network with importances/metrics below/not beyond a first value/threshold], 21 “at least one neuron is removed from the trained neural network to produce a pruned neural network. … criteria-based pruning … to iteratively remove neurons from the trained neural network”, 27 “a criteria-based pruning technique … pruning … by iteratively identifying and removing at least one least important layer parameter” and 42 “The pruning criteria are represented as importance values that are provided to a neuron removal unit 265. The neuron removal unit 265 indicates to the trained neural network 225 that one or more neurons should be removed from the trained neural network 225.” [i.e., deactivating/pruning/removing low importance neurons/units from layers of the neural network only if a criteria/parameter/importance value/performance metric associated with the network is below a threshold/first value – low importance neurons do not affect performance of the network beyond a first value/threshold]), 
wherein each of the plurality of computational logic units corresponds to a … layer in a plurality of layers within the neural network (see, e.g., paragraphs 29, “Each layer also typically has less valuable neuron. Therefore, pruning should scale across all layers”, 44, “GPU, CPU, neural network, or any processor capable of implementing a neural network” [i.e., each layer has neurons and the computational logic units of the processor implementing the neural network and its neurons], 45, “neurons corresponding to a single feature map is pruned during each iteration” and 49, “trained neural networks may be iteratively pruned using either a first criterion or a second criterion that are each computed based on a first-order gradient of the cost function w.r.t. the layer parameter hi.” [i.e., each neuron of the neurons/corresponding to a single feature map, is located in a layer in the network layers]).
Although Molchanov substantially discloses the claimed invention, Molchanov is not relied on for explicitly disclosing wherein each of the plurality of computational logic units corresponds to a different layer in a plurality of layers within the neural network.
In the same field, analogous art Li teaches wherein each of the plurality of computational logic units corresponds to a different layer in a plurality of layers within the neural network (see, e.g., FIGs. 3 and 4 – depicting “Pruning filters across consecutive layers” and “Pruning residual blocks with … filters to be pruned for the second layer … are determined by the pruning result” where nodes/neurons in different, consecutive layers are pruned and pages 3, section 3.1, “The kernels in the next convolutional layer corresponding to the pruned feature maps are also removed.”, 4, section 3.3, “pruning … the second layer of each residual block results in additional pruning of other layers. To prune filters across multiple layers to prune the second convolutional layer of the residual block, the corresponding projected feature maps must also be pruned.” and 5, section 3.3, “Since the identical feature maps are more important …, the feature maps to be pruned should be determined by the pruning results of the shortcut layer. To determine which identity feature maps are to be pruned, we use the same selection criterion based on the filters of the shortcut convolutional layers (with 1 x 1 kernels). The second layer of the residual block is pruned with the same filter index as selected by the pruning of the shortcut layer.” [i.e., each of the computational logic units/neurons to be pruned are in different, consecutive layers within the neural network]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “method, computer readable medium, and system … for neural network pruning” of Molchanov (See, Molchanov, Abstract) to incorporate the teachings of Li to provide “an acceleration method for CNNs, where we prune filters from CNNs that are identified as having a small effect on the output accuracy.” (See, e.g., Li, Abstract, page 1). Doing so would have allowed Molchanov to use Li’s method that includes “removing whole filters in the network together with their connecting feature maps” so that “computation costs are reduced significantly” and “to reduce computation costs without incurring additional overheads”, as suggested by Li (See, e.g., Li, Abstract and pages 1 and 2). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.

Regarding claim 18, as discussed above, Molchanov in view of Li teaches the processor of claim 17.
Molchanov further discloses wherein each of the plurality of computational logic units is associated with a … feature type (see, e.g., paragraphs 29, “Each layer also typically has less valuable neuron. Therefore, pruning should scale across all layers” and 45, “neurons corresponding to a single feature map is pruned during each iteration” [i.e., each of the neurons/computational logic units is associated with a feature type of a single feature map]).
Although Molchanov substantially discloses the claimed invention, Molchanov is not relied on for explicitly disclosing wherein each of the plurality of computational logic units is associated with a matching feature type.
In the same field, analogous art Li teaches wherein each of the plurality of computational logic units is associated with a matching feature type (see, e.g., FIGs. 3 and 4 – depicting “Pruning filters across consecutive layers” and “Pruning residual blocks with … filters to be pruned for the second layer … are determined by the pruning result” where nodes/neurons in a same kernel index/location in different layers are pruned and pages 3, section 3.1, “Prune m filters with the smallest sum values and their corresponding feature maps. The kernels in the next convolutional layer corresponding to the pruned feature maps are also removed.” 4, section 3.3, “we observe that layers in the same stage (with the same feature map size) [i.e., feature maps with the same size – a matching feature type] have a similar sensitivity to pruning. … pruning the identity feature maps … of each residual block results in additional pruning of other layers. To prune filters across multiple layers to prune the second convolutional layer of the residual block, the corresponding projected feature maps must also be pruned.” and 5, section 3.3, “Since the identical feature maps are more important …, the feature maps to be pruned should be determined by the pruning results of the shortcut layer. To determine which identity feature maps are to be pruned, we use the same selection criterion based on the filters” [i.e., computational logic units/neurons to be pruned have a matching feature type in a corresponding, identical feature map]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “method, computer readable medium, and system … for neural network pruning” of Molchanov (See, Molchanov, Abstract) to incorporate the teachings of Li to provide “an acceleration method for CNNs, where we prune filters from CNNs that are identified as having a small effect on the output accuracy.” (See, e.g., Li, Abstract, page 1). Doing so would have allowed Molchanov to use Li’s method that includes “removing whole filters in the network together with their connecting feature maps” so that “computation costs are reduced significantly” and “to reduce computation costs without incurring additional overheads”, as suggested by Li (See, e.g., Li, Abstract and pages 1 and 2). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.

Regarding claim 19, as discussed above, Molchanov in view of Li teaches the processor of claim 17.
Molchanov further discloses wherein the performance metric is computed based on an equalization operation performed on one or more weight values associated with the plurality of computational logic units (see, e.g., paragraphs 33, “A scale of the first criteria values varies with the depth, in terms of layers within the network. Therefore, a layer-wise l2-normalization is computed to rescale the first criterion across the layers” and 34, “Scaling a criterion across layers is very important for pruning. … Without normalization, a conventional weight magnitude criterion tends to rank feature maps from the first layers more important than last layers … After l2 normalization, each layer has some feature maps that are highly important and others that are unimportant.” [i.e., the criteria/performance metric is computed, normalized and scaled based on an equalization operation performed on weights associated with the computational logic units of the neural network layers]).

Claims 5 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Molchanov in view of Li as applied to claims 3 and 12 and further in view of Chen et al. (U.S. Patent Application Pub. No. 2021/0182077 A1, hereinafter “Chen”). Chen was filed as a national stage application of PCT application no. PCT/CN2018/105463 filed September 13, 2018, and this date is before the effective filing date of this application, i.e., November 21, 2018. Therefore, Chen constitutes prior art under 35 U.S.C. 102(a)(2). Chen also claims foreign priority to Chinese application no. 201711036374.9, filed October 30, 2017, which is also before the effective filing date of this application.
Regarding claims 5 and 14, as discussed above, Molchanov in view of Li teaches the methods of claims 3 and 12.
Although Molchanov in view of Li substantially teaches the claimed invention, Molchanov in view of Li is not relied on for teaching wherein the one or more equalization operations comprises at least one of an arithmetic mean operation, a geometric mean operation, a union operation, and an intersection operation.
In the same field, analogous art Chen teaches wherein the one or more equalization operations comprises at least one of an arithmetic mean operation, a geometric mean operation, a union operation, and an intersection operation (see, e.g., paragraphs 588, “pruning unit configured to perform a coarse-grained pruning operation on weights of a neural network” and 593, “the amount of information of the M weights is an arithmetic mean of absolute values of the M weights, a geometric mean of the absolute values of the M weights, or a maximum value of the M weights” [i.e., the operations include an arithmetic mean or a geometric mean operation]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Chen with Molchanov in view of Li to provide “a processing device including: a coarse-grained pruning unit configured to perform a coarse-grained pruning operation on weights of a neural network to obtain pruned weights” where “The coarse-grained pruning unit is configured to: select M weights from the weights of the neural network” and “the amount of information of the M weights is an arithmetic mean of absolute values of the M weights, a geometric mean of the absolute values of the M weights, or a maximum value of the M weights.” (See, e.g., Chen, paragraphs 587-588, 590-591 and 593). Doing so would have allowed Molchanov in view of Li to use Chen’s pruning unit “to repeatedly perform a coarse-grained pruning operation on the weights of the neural network and train the neural network according to the pruned weights until no weight satisfies the preset condition under the premise that precision does not suffer a loss of a preset amount”, “thereby improving the information processing efficiency”, as suggested by Chen (See, e.g., Chen, Abstract and paragraph 595). 
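For illustration only, two of the aggregate measures quoted from Chen paragraph 593 (the arithmetic mean and the geometric mean of the absolute values of M weights) can be computed as in the sketch below. The function names, the handling of zero-valued weights and the example group are assumptions of this sketch and are not drawn from Chen.

```python
import math

# Illustrative sketch only; not code from Chen.

def arithmetic_mean_abs(weights):
    """Arithmetic mean of the absolute values of the weights."""
    return sum(abs(w) for w in weights) / len(weights)


def geometric_mean_abs(weights):
    """Geometric mean of the absolute values of the weights.
    A zero weight makes the product zero, so 0.0 is returned directly
    (an assumption of this sketch, which also avoids log(0))."""
    magnitudes = [abs(w) for w in weights]
    if any(m == 0.0 for m in magnitudes):
        return 0.0
    return math.exp(sum(math.log(m) for m in magnitudes) / len(magnitudes))


group = [1.0, -4.0]  # hypothetical group of M = 2 weights
am = arithmetic_mean_abs(group)
gm = geometric_mean_abs(group)
print(am, gm)
```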

Claims 7, 10, 16 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Molchanov in view of Li as applied to claims 1 and 9 and further in view of Theodorakopoulos et al. (U.S. Patent Application Pub. No. 2018/0137417 A1, hereinafter “Theodorakopoulos”).
Regarding claims 7, 16 and 20, as discussed above, Molchanov in view of Li teaches the methods of claims 1 and 9, and the processor of claim 17.
Although Molchanov substantially discloses the claimed invention, Molchanov is not relied on for explicitly disclosing wherein the neural network comprises a residual network, and wherein the plurality of layers include a convolutional layer of the residual network.
In the same field, analogous art Li teaches wherein the neural network comprises a residual network, and wherein the plurality of layers include a convolutional layer of the residual network (see, e.g., page 6, section 4, “We prune two types of networks: simple CNNs … and Residual networks (ResNet-56/110 on CIFAR-10 … a convolutional layer is pruned, the weights of the subsequent batch normalization layer are also removed” and page 7, section 4.2, “ResNets for CIFAR-10 have three stages of residual blocks … the shortcut layer provides an identity mapping” [i.e., the neural network includes a residual network and the layers include an identity layer of the residual network]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “method, computer readable medium, and system … for neural network pruning” of Molchanov (See, Molchanov, Abstract) to incorporate the teachings of Li to provide “an acceleration method for CNNs, where we prune filters from CNNs that are identified as having a small effect on the output accuracy.” (See, e.g., Li, Abstract, page 1). Doing so would have allowed Molchanov to use Li’s method that includes “removing whole filters in the network together with their connecting feature maps” so that “computation costs are reduced significantly” and “to reduce computation costs without incurring additional overheads”, as suggested by Li (See, e.g., Li, Abstract and pages 1 and 2). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.
Although Molchanov in view of Li substantially teaches the claimed invention, Molchanov in view of Li is not relied on for teaching wherein the neural network comprises a residual network, and wherein the plurality of network layers include a convolutional layer of the residual network.
In the same field, analogous art Theodorakopoulos teaches wherein the neural network comprises a residual network, and wherein the plurality of network layers include a convolutional layer of the residual network (see, e.g., paragraphs 100-101, “in the event that a residual CNN architecture is employed. In residual CNNs [He], each convolutional layer is only responsible for, in effect, fine-tuning the output from a previous layer by just adding a learned ‘residual’ to the input.”, “Residual CNN architectures utilize layer bypass connections (68,69 in FIG. 7), which offer an alternative path for information flow between consecutive convolutional layers.” [i.e., the neural network includes a residual convolutional neural network/CNN with a convolutional layer]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Theodorakopoulos with Molchanov in view of Li to provide a “Learning Kernel Activation Module (LKAM), serving the purpose of enforcing the utilization of less convolutional kernels by learning kernel activation rules and by actually controlling the engagement of various computing elements: The module activates/deactivates a sub-set of filtering kernels, groups of kernels, or groups of full connected neurons” (See, e.g., Theodorakopoulos, paragraph 18). Doing so would have allowed Molchanov in view of Li to use Theodorakopoulos’ Learning Kernel Activation Module (LKAM) so that a “CNN essentially learns how to reduce its initial size on-the-fly (e.g. for every input image or datum), through an optimization process which guides the network to learn which kernel need to be engaged for a specific input datum. This results in the selective engagement of a subset of computing elements for every specific input datum” to achieve “a reduction in the number of applied kernels in any layer” and a “reduction of the overall computational load” in order “to produce additional processing optimization”, as suggested by Theodorakopoulos (See, e.g., Theodorakopoulos, paragraphs 19-21). 

Regarding claim 10, as discussed above, Molchanov in view of Li teaches the method of claim 9.
Although Molchanov in view of Li substantially teaches the claimed invention, Molchanov in view of Li is not relied on for teaching wherein each of the plurality of neurons computationally determines a probability of a feature having the feature type being present in given input data.
In the same field, analogous art Theodorakopoulos teaches wherein each of the plurality of neurons computationally determines a probability of a feature having the feature type being present in given input data (see, e.g., paragraphs 41-42, “a number L of convolutional layers there may be any number of fully connected layers (42 in FIG. 1). These densely connected layers are identical to the layers in a standard fully connected multilayer neural network”, “The outputs of such a network is a vector of numbers, from which the probability that a specific input image belongs to the specific class (e.g. the face of a specific person) … the output layer (43 in FIG. 1) of the CNN … maps the network output vector to class probabilities. But the required type of output should be a single binary decision for the specific image (e.g. is it this specific person) …This is achieved through thresholding on class probabilities” and 96, “every single neuron (e.g. a computing element usually corresponding to a linear function performed on its inputs), in a fully connected layer is linked to a number of neurons in the next layer.” [i.e., the neurons of layers of the neural network/CNN each computationally determine a probability of a specific class/feature type being present in a given input image]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Theodorakopoulos with Molchanov in view of Li to provide a “Learning Kernel Activation Module (LKAM), serving the purpose of enforcing the utilization of less convolutional kernels by learning kernel activation rules and by actually controlling the engagement of various computing elements: The module activates/deactivates a sub-set of filtering kernels, groups of kernels, or groups of full connected neurons” (See, e.g., Theodorakopoulos, paragraph 18). Doing so would have allowed Molchanov in view of Li to use Theodorakopoulos’ Learning Kernel Activation Module (LKAM) so that a “CNN essentially learns how to reduce its initial size on-the-fly (e.g. for every input image or datum), through an optimization process which guides the network to learn which kernel need to be engaged for a specific input datum. This results in the selective engagement of a subset of computing elements for every specific input datum” to achieve “a reduction in the number of applied kernels in any layer” and a “reduction of the overall computational load” in order “to produce additional processing optimization”, as suggested by Theodorakopoulos (See, e.g., Theodorakopoulos, paragraphs 19-21). 

Conclusion
Applicant's amendment and new claims necessitated the new grounds of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 
The prior art made of record, listed on form PTO-892, and not relied upon, is considered pertinent to applicant's disclosure.
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
The examiner requests, in response to this office action, support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.
When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RANDY K BALDWIN whose telephone number is (571)270-5222. The examiner can normally be reached on Mon - Fri 9:00-6:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/R.K.B./ Examiner, Art Unit 2125

/KAMRAN AFSHAR/ Supervisory Patent Examiner, Art Unit 2125