DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to the application and claims filed 8/23/2019.
Claims 1-24 are pending and have been examined.

Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The Examiner has noted applicant’s claim for foreign priority based on Indian Provisional Application No. 201841031680, filed on August 23, 2018, Indian Patent Application No. 201841031680, filed on August 20, 2019 (hereinafter “Indian non-provisional application”), and Korean Patent Application No. 10-2019-0103841, filed on August 23, 2019. The examiner acknowledges that a certified copy of Indian Provisional Application No. 201841031680 was received on 10/9/2019, as required by 37 CFR 1.55. The examiner acknowledges that a certified copy of Indian non-provisional application No. 201841031680 was retrieved on 12/24/2019, and a certified copy of Korean Patent Application No. 10-2019-0103841 was retrieved on 12/11/2019, as required by 37 CFR 1.55.
The disclosure of one of the prior-filed applications, Indian Provisional Application No. 201841031680, filed on August 23, 2018, fails to provide adequate support or enablement in the manner provided by 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph for one or more claims of this application. Independent claims 1 and 14 both recite, inter alia, generating “a plurality of intermediate deep learning models by generating a respective intermediate deep learning model corresponding to each of the plurality of pruned neural networks; and” selecting “one of the plurality of intermediate deep learning models, having a determined greatest accuracy among the plurality of intermediate deep learning models, to be an optimized deep learning model.”
The above-noted provisional application does not mention or disclose any intermediate deep learning models, let alone generating or creating any group, set or plurality of such “intermediate deep learning models by generating a respective intermediate deep learning model corresponding to each of” any group, set or plurality of “pruned neural networks” as claimed in claims 1 and 14. The above-noted provisional application mentions a “conventional approach to generation of Neural Architectures from scratch takes input from a teacher network and uses network distillation to generate more optimized network. However, this approach misses the opportunity of generating models from scratch which limit their applicability to the deep learning models” in paragraph 7 of the background section and generally mentions “generation of optimized CNN models which can help the widespread deployment of CNN models in the embedded devices including smart phone, watches and IoT devices” in paragraph 30. However, the above-noted provisional application fails to mention or disclose any selection of one of the claimed “intermediate deep learning models, to be an optimized deep learning model” as recited in claims 1 and 14. Accordingly, the above-noted provisional does not provide support for the above recited features.
Also, dependent claims 2 and 15 both recite “pruning of the different sets of the one or more of the plurality of connections … based on predetermined pruning policies”. Claims 3-5 and 16-18, which each depend directly from claims 2 and 15, respectively, each recite further limitations regarding “the predetermined pruning policies”. Further, dependent claims 5 and 18 both recite, inter alia, selecting, “at random, respective combinations of two or more connections for pruning” and pruning “each of the respective combinations based on the predetermined pruning policies.”
The above-noted provisional application does not mention or disclose any “pruning policies”, pruning criteria or pruning rules, much less any “predetermined pruning policies” as claimed in claims 2-5 and 15-18. The above-noted provisional application also fails to mention or disclose any random selection of combinations of two or more connections for pruning and then pruning such randomly-selected combinations based on the predetermined pruning policies, as claimed in claims 5 and 18. Accordingly, the above-noted provisional does not provide support for the above recited features.
The disclosure of the prior-filed Indian non-provisional application No. 201841031680, filed on August 20, 2019, appears to provide support for at least the above features recited in claims 1-5 and 14-18 (see, e.g., Abstract, FIGs. 1-4 depicting “predetermined pruning policies”, “intermediate deep learning models” and an “optimized deep learning model” and pages 2-3, 9 and 11, and claims 1-3, 5-9, 11-13 and 16). Accordingly, the effective filing date of independent claims 1 and 14 and dependent claims 2-13 and 15-24 of the present application is the filing date of the above-noted Indian non-provisional application, August 20, 2019.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 9/24/2019 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement has been considered by the examiner.

Claim Objections
Claims 3 and 16 are objected to because of the following informalities:  
In line 2 of both claims 3 and 16, the recitation of “based a determined accuracy level” is grammatically incorrect and appears to be missing the word “on” between “based” and “a”.  Appropriate correction is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-4, 6, 9-17, 19 and 22-24 are rejected under 35 U.S.C. 103 as being unpatentable over Molchanov et al. (U.S. Patent Application Pub. No. 2018/0114114 A1, hereinafter “Molchanov”) in view of Dai et al. (U.S. Patent Application Pub. No. 2021/0182683 A1, hereinafter “Dai”). As discussed above, the earliest effective filing date for claims 1-24 is the filing date of the above-noted Indian non-provisional application, August 20, 2019. Dai was filed as a national stage application of PCT application no. PCT/US2018/057485, filed October 25, 2018, and this date is before the effective filing date of this application, i.e., August 20, 2019. Therefore, Dai constitutes prior art under 35 U.S.C. 102(a)(2). Dai also claims priority to U.S. Provisional application No. 62/580,525, filed on November 2, 2017, which is also before the effective filing date of this application.
With respect to claim 1, Molchanov discloses the invention as claimed including a processor-implemented method (see, e.g., paragraph 17, “a method for neural network pruning … the method 100 may also be performed by a program, custom circuitry, or by a combination of custom circuitry and a program. For example, the method 100 may be executed by a GPU (graphics processing unit), CPU (central processing unit), neural network, or any processor capable of implementing a neural network” [i.e., a processor-implemented method]) comprising:
identifying a plurality of connections in a neural network that is pre-associated with a deep learning model (see, e.g., FIG. 1D – depicting identified connections between neurons in layers in a neural network, and paragraphs 17, 30 and 81, “modern deep CNNs [convolutional neural networks] are composed of a variety of layer types, … convolutional layers. With the goal of speeding up inference, entire feature maps may be pruned so the resulting networks may be run efficiently, even on embedded devices. In one embodiment, greedy criteria-based pruning is interleaved with fine-tuning, resulting in a computationally efficient procedure that maintains good generalization in the pruned network. A pruning criterion is computed to evaluate the importance of neurons” [i.e., a neural network/CNN that is associated with a deep learning model/deep CNN], “FIG. 1D illustrates a conceptual diagram of removing neurons from a neural network, in accordance with one embodiment. Neurons (or feature maps) for a particular layer are represented as circles and connections between the neurons are each associated with a weight … When a neuron is removed, all connections to and from the neuron are removed” [i.e., identify connections between neurons in a neural network], “the PPU 300 comprises a deep learning or machine learning processor. The PPU 300 is configured to receive commands that specify programs for modeling neural networks and processing data according to a neural network.” [i.e., the neural network is pre-associated with a deep learning model]);
generating a plurality of pruned neural networks by pruning different sets of one or more of the plurality of connections to respectively generate each of the plurality of pruned neural networks (see, e.g., FIG. 1D – showing generation of pruned networks by pruning different sets of connections between neurons in layers, and paragraphs 30, 45 and 49, “Neurons (or feature maps) for a particular layer are represented as circles and connections between the neurons are each associated with a weight. After fine-pruning, connections between the neurons (or feature maps) are removed. For example, connections corresponding to small weight values may be removed. … In coarse pruning, entire neurons (or feature maps) are removed. As shown in FIG. 1D, the patterned neuron is removed during coarse pruning. When a neuron is removed, all connections to and from the neuron are removed.”, “the pruned neural network is fine-tuned … a determination is made whether pruning should continue or not … Pruning may be considered to be complete when a threshold number of neurons are removed. In one embodiment, neurons corresponding to a single feature map is pruned during each iteration” [i.e., pruning different sets of connections between different sets of neurons to generate pruned networks], “trained neural networks may be iteratively pruned using either a first criterion or a second criterion” [i.e., generating a plurality of pruned networks]).
Although Molchanov substantially discloses the claimed invention, Molchanov is not relied on for explicitly disclosing generating a plurality of intermediate deep learning models by generating a respective intermediate deep learning model corresponding to each of the plurality of pruned neural networks; and
selecting one of the plurality of intermediate deep learning models, having a determined greatest accuracy among the plurality of intermediate deep learning models, to be an optimized deep learning model.
In the same field, analogous art Dai teaches generating a plurality of intermediate deep learning models by generating a respective intermediate deep learning model corresponding to each of the plurality of pruned neural networks (see, e.g., paragraphs 23, 46, 49, 51, 62 and 64, “NeST starts with a randomly initialized sparse network called the seed architecture. It iteratively tunes the architecture with … pruning of neurons and connections. Experimental results show that NeST yields accurate, yet very compact DNNs, with a wide range of seed architecture selection.”, “insignificant connections and neurons are pruned away”, “Connections with small effective weights are treated as insignificant. Pruning of insignificant weights is an iterative process. In each iteration, the most insignificant weights (e.g., top 1 %) are only pruned for each layer, and then the whole DNN [deep neural network] is retrained”, “connections are pruned away. The whole DNN is retrained after each pruning iteration.”, “For the pruning phase, next, the post-growth LeNet DNNs are pruned … It is shown the post-pruning DNN sizes and compression ratios for LeNet-300-100 and LeNet … in FIG. 10.”, “The larger the pre-pruning DNN, the larger is its post-pruning DNN” [i.e., iteratively generating intermediate DNNs/deep neural networks/deep learning models by generating a respective intermediate DNN/LeNet corresponding to each pruned neural network/post-pruning DNN]); and
selecting one of the plurality of intermediate deep learning models, having a determined greatest accuracy among the plurality of intermediate deep learning models, to be an optimized deep learning model (see, e.g., paragraphs 8, 23, 27, 59 and 73, and claim 10, “a neural network synthesis tool (NeST) that automatically generates one or more optimal neural network architectures”, “A DNN synthesis tool (referred to herein as ‘NeST’) is disclosed that automates the generation of compact and accurate DNNs.” [i.e., the plurality of deep learning models/DNNs], “the pruning phase 28, the DNN inherits the synthesized architecture and weights from the growth phase and iteratively removes redundant connections 34 and neurons 36, based on their magnitudes. Finally, NeST comes to rest at a lightweight DNN model 38 that incurs no accuracy degradation relative to a fully connected model.”, “LeNet-300-100 (LeNet-5) … and then the DNN architectures were grown from these seeds. The impact of these seeds was studied … and post-growth DNN sizes under the same target accuracy (this accuracy is typically a reference value for the architecture).”, “a synthesis tool, NeST, to synthesize compact yet accurate DNNs. NeST starts from a sparse seed architecture, adaptively adjusts the architecture through … pruning, and finally arrives at a compact DNN with high accuracy.”, “A neural network synthesis tool (NeST) that automatically generates one or more optimal neural network architectures” [i.e., select the most accurate LeNet/deep learning model/DNN model 38 having a high/greatest accuracy, of the pruned DNNs generated by the iterations of pruning, to be the optimal/optimized deep learning model]).
Molchanov and Dai are analogous art because they are both directed to techniques for pruning neural networks (see, e.g., Molchanov, Abstract and Dai, paragraph 7). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “method, computer readable medium, and system … for neural network pruning” of Molchanov (See, Molchanov, Abstract) to incorporate the teachings of Dai to provide a “DNN [deep neural network] synthesis tool (referred to herein as ‘NeST’) … that automates the generation of compact and accurate DNNs.” (See, e.g., Dai, paragraph 23). Doing so would have allowed Molchanov to use Dai’s NeST tool to create “a lightweight DNN model 38 that incurs no accuracy degradation relative to a fully connected model” and “to synthesize compact yet accurate DNNs” where “NeST starts from a sparse seed architecture, adaptively adjusts the architecture through … pruning, and finally arrives at a compact DNN with high accuracy”, as suggested by Dai (See, e.g., Dai, paragraphs 27 and 73). 
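For illustration only, and not as a characterization of the actual implementation of either Molchanov or Dai, the sequence of operations recited in claim 1 can be sketched in Python as follows. The function names, the toy linear classifier, the sample data and the candidate pruning sets are all hypothetical assumptions introduced for this sketch; neither reference nor the present application discloses this code.

```python
import random

def accuracy(weights, samples):
    """Fraction of samples whose sign(w . x) matches the label (+1 / -1)."""
    correct = 0
    for x, label in samples:
        score = sum(w * xi for w, xi in zip(weights, x))
        predicted = 1 if score >= 0 else -1
        correct += predicted == label
    return correct / len(samples)

def prune(weights, connection_ids):
    """Return a copy of the weights with the given connections zeroed."""
    return [0.0 if i in connection_ids else w for i, w in enumerate(weights)]

def select_optimized(weights, candidate_sets, samples):
    """Prune each candidate set, score each result, keep the most accurate."""
    intermediates = [prune(weights, s) for s in candidate_sets]
    scored = [(accuracy(m, samples), m) for m in intermediates]
    return max(scored, key=lambda pair: pair[0])

# Hypothetical toy data: labels depend only on inputs 0 and 2.
rng = random.Random(0)
samples = []
for _ in range(200):
    x = [rng.uniform(-1.0, 1.0) for _ in range(4)]
    label = 1 if 2.0 * x[0] - 1.5 * x[2] >= 0 else -1
    samples.append((x, label))

weights = [2.0, 0.01, -1.5, 0.02]        # assumed pre-trained weights
candidate_sets = [{0}, {1, 3}, {2}]      # different sets of connections to prune
best_accuracy, optimized = select_optimized(weights, candidate_sets, samples)
```

In this sketch each entry of `candidate_sets` yields one pruned network and one corresponding intermediate model, and the model with the highest measured accuracy is returned as the optimized model. The candidate that removes only the two near-zero connections is expected to be selected on this toy data.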

With respect to independent claim 14, claim 14 is substantially similar to claim 1 and therefore is rejected on the same ground as claim 1, discussed above. In particular, claim 14 is a system claim that performs operations corresponding to the method steps of claim 1. 
In addition, Molchanov further discloses a computing system, the system comprising: one or more processors; and a memory storing instructions, which when executed by the one or more processors, configure the one or more processors (see, e.g., FIG. 6, paragraph 88 and claim 20, “a system 600 is provided including at least one central processor 601 … The system 600 also includes a main memory 604. Control logic (software) and data are stored in the main memory 604”, “A non-transitory, computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform steps” [i.e., executable logic/software instructions stored in memory that, when executed by the processor, configure/cause the processor to perform operations/steps]).

With respect to claims 2 and 15, as discussed above, Molchanov in view of Dai teaches the method of claim 1 and the system of claim 14.
Molchanov further discloses wherein the pruning of the different sets of the one or more of the plurality of connections is performed based on predetermined pruning policies (see, e.g., FIG. 1A – showing block 120 where “a pruning criterion for each layer parameter [is] based on the first-order gradient corresponding to the layer parameters, where the pruning criterion indicates an importance of each neuron that is included in the trained neural network” prior to subsequent blocks 130 and 140 where lowest importance neurons are identified and the identified neurons are removed, based on the pre-determined pruning criterion, “from the trained neural network to produce a pruned neural network” and paragraphs 20-21 and claims 1 and 11, “At step 130, at least one neuron having a lowest importance is identified. In one embodiment, the at least one neuron corresponds to a feature map in a convolutional layer. In one embodiment, the at least one neuron includes neurons having importances below a threshold value [i.e., a predetermined threshold]. In one embodiment, the at least one neuron comprises a predetermined percentage of all of the neurons in the trained neural network. At step 140, the at least one neuron is removed from the trained neural network to produce a pruned neural network.”, “A neural network pruning system comprising: a processor configured to: … identify at least one neuron having a lowest importance … wherein the at least one neuron comprises a predetermined percentage of all of the neurons in the trained neural network.” [i.e., predetermined pruning criteria/importance threshold/policies/percentages established prior to pruning]). 
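As a minimal sketch of what a pruning policy fixed before pruning begins might look like, the following Python function selects neurons either by a predetermined percentage or by a predetermined importance threshold, mirroring the two alternatives quoted from Molchanov above. The function name, parameter names and importance values are assumptions made for illustration and do not appear in Molchanov.

```python
def lowest_importance(importances, fraction=None, threshold=None):
    """Pick neuron indices to prune under a policy fixed before pruning starts.

    fraction:  prune that share of all neurons, least important first.
    threshold: prune every neuron whose importance is below the value.
    """
    if (fraction is None) == (threshold is None):
        raise ValueError("choose exactly one policy")
    if threshold is not None:
        return sorted(i for i, v in enumerate(importances) if v < threshold)
    count = int(len(importances) * fraction)
    ranked = sorted(range(len(importances)), key=lambda i: importances[i])
    return sorted(ranked[:count])

scores = [0.9, 0.05, 0.4, 0.01, 0.7]                      # hypothetical importances
by_fraction = lowest_importance(scores, fraction=0.4)     # -> [1, 3]
by_threshold = lowest_importance(scores, threshold=0.5)   # -> [1, 2, 3]
```

Either parameter is set once, ahead of the pruning pass, which is the sense in which the criterion is treated as predetermined in the mapping above.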

Regarding claims 3 and 16, as discussed above, Molchanov in view of Dai teaches the method of claim 2 and the system of claim 15.
Molchanov further discloses updating the predetermined pruning policies based [on] a determined accuracy level of the optimized deep learning model (see, e.g., FIG. 1A – showing step 120 to compute/update pruning criteria/policies based on a first-order gradient and neural network accuracy determined in step 110, and paragraphs 18-19 and 26, “At step 110, first-order gradients of a cost function with respect to layer parameters are received for a trained neural network. A cost value is the value of the cost function at the current state of the network that indicates the accuracy of the neural network.” [i.e., a determined accuracy level of the neural network/deep learning model], “a first pruning criterion is based on a first-order Taylor expansion including the first-order gradient (i.e., first derivative) that approximates a change in the cost function induced by pruning network parameters. The change in the cost value indicates the accuracy of the neural network … a second criterion is based on a sum of squares including the first-order gradient of the cost function relative to the layer input parameter.”, “During pruning, a subset of parameters is refined. During pruning the accuracy of the adapted neural network, C(W')≈C(W), is preserved. The accuracy corresponds to a combinatorial optimization” [i.e., updating pruning criteria/policies based on a determined accuracy level of the optimized deep learning model]). 
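The quoted passage describes a criterion built from a first-order Taylor expansion that approximates the change in cost caused by removing a parameter. One common way to write such a criterion, offered here only as an assumption and not as Molchanov's exact formula or normalization, is the absolute value of the averaged product of a neuron's activation and the gradient of the cost with respect to that activation. The function and variable names below are hypothetical.

```python
def taylor_importance(activations, gradients):
    """Estimate per-neuron importance from first-order terms.

    activations[n][i]: value of neuron i for sample n.
    gradients[n][i]:   gradient of the cost with respect to that value.
    """
    n_samples = len(activations)
    n_neurons = len(activations[0])
    scores = []
    for i in range(n_neurons):
        total = sum(activations[n][i] * gradients[n][i] for n in range(n_samples))
        scores.append(abs(total / n_samples))
    return scores

acts = [[1.0, 0.5, 2.0], [3.0, 0.5, 0.0]]      # hypothetical activations
grads = [[0.5, 0.0, -1.0], [0.5, 0.0, 1.0]]    # hypothetical gradients
importance = taylor_importance(acts, grads)     # -> [1.0, 0.0, 1.0]
least_important = importance.index(min(importance))
```

Because the scores are recomputed from the current gradients, rerunning the function after each pruning and fine-tuning pass produces a criterion that tracks the current cost value of the network.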

Regarding claims 4 and 17, as discussed above, Molchanov in view of Dai teaches the method of claim 2 and the system of claim 15.
Molchanov further discloses wherein the predetermined pruning policies comprise at least one of a policy of pruning one or more connections for a predetermined time period or a policy of pruning connections until a threshold number of connections are pruned (see, e.g., paragraph 45, “a determination is made whether pruning should continue or not. … Pruning may be considered to be complete when a threshold number of neurons are removed.” [i.e., pruning criteria/policies comprise pruning until a threshold number of neurons/neural connections are pruned]).
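The two alternative stopping policies recited in claims 4 and 17 (a predetermined time period, or a threshold number of pruned connections) can be sketched as a single loop with two optional limits. This is a hedged illustration only; the function name, the magnitude-based choice of which connection to prune next, and the sample weights are assumptions and are not taken from Molchanov.

```python
import time

def prune_until(weights, max_pruned=None, time_budget_s=None):
    """Zero the smallest-magnitude remaining weight until a stop policy triggers."""
    weights = list(weights)
    pruned = 0
    start = time.monotonic()
    while True:
        # Policy A: stop once a threshold number of connections is pruned.
        if max_pruned is not None and pruned >= max_pruned:
            break
        # Policy B: stop once the predetermined time period has elapsed.
        if time_budget_s is not None and time.monotonic() - start >= time_budget_s:
            break
        remaining = [i for i, w in enumerate(weights) if w != 0.0]
        if not remaining:
            break
        target = min(remaining, key=lambda i: abs(weights[i]))
        weights[target] = 0.0
        pruned += 1
    return weights, pruned

result, removed = prune_until([0.8, -0.1, 0.3, 0.05, -0.6], max_pruned=2)
```

With `max_pruned=2` the loop removes the two smallest-magnitude connections and stops; passing `time_budget_s` instead bounds the loop by elapsed time.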

Regarding claims 6 and 19, as discussed above, Molchanov in view of Dai teaches the method of claim 1 and the system of claim 14.
Although Molchanov substantially discloses the claimed invention, Molchanov is not relied on for explicitly disclosing determining accuracy levels of each of the plurality of intermediate deep learning models to determine the greatest accuracy among the plurality of intermediate deep learning models.
In the same field, analogous art Dai teaches determining accuracy levels of each of the plurality of intermediate deep learning models to determine the greatest accuracy among the plurality of intermediate deep learning models (see, e.g., paragraphs 23, 27, 59 and 73, “A DNN synthesis tool (referred to herein as ‘NeST’) is disclosed that automates the generation of compact and accurate DNNs. … NeST … iteratively tunes the architecture with … pruning of neurons and connections. Experimental results show that NeST yields accurate, yet very compact DNN”, “the pruning phase 28, the DNN inherits the synthesized architecture and weights from the growth phase and iteratively removes redundant connections 34 and neurons 36, based on their magnitudes. Finally, NeST comes to rest at a lightweight DNN model 38 that incurs no accuracy degradation relative to a fully connected model.”, “LeNet-300-100 (LeNet-5) … and then the DNN architectures were grown from these seeds. The impact of these seeds was studied … and post-growth DNN sizes under the same target accuracy (this accuracy is typically a reference value for the architecture).”, “a synthesis tool, NeST, to synthesize compact yet accurate DNNs. NeST starts from a sparse seed architecture, adaptively adjusts the architecture through … pruning, and finally arrives at a compact DNN with high accuracy.” [i.e., determine accuracy levels of each of the LeNets/DNNs/deep learning models to determine a DNN model 38 having a high/greatest accuracy among the pruned DNNs generated by the iterations of pruning]).
Molchanov and Dai are analogous art because they are both directed to techniques for pruning neural networks (see, e.g., Molchanov, Abstract and Dai, paragraph 7). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “method, computer readable medium, and system … for neural network pruning” of Molchanov (See, Molchanov, Abstract) to incorporate the teachings of Dai to provide a “DNN [deep neural network] synthesis tool (referred to herein as ‘NeST’) … that automates the generation of compact and accurate DNNs.” (See, e.g., Dai, paragraph 23). Doing so would have allowed Molchanov to use Dai’s NeST tool to create “a lightweight DNN model 38 that incurs no accuracy degradation relative to a fully connected model” and “to synthesize compact yet accurate DNNs” where “NeST starts from a sparse seed architecture, adaptively adjusts the architecture through … pruning, and finally arrives at a compact DNN with high accuracy”, as suggested by Dai (See, e.g., Dai, paragraphs 27 and 73). 

Regarding claims 9 and 22, as discussed above, Molchanov in view of Dai teaches the method of claim 1 and the system of claim 14.
Molchanov further discloses wherein the pruning includes assigning a zero value to each weight corresponding to each pruned connection (see, e.g., paragraphs 21, 32 and 36, “At step 140, the at least one neuron is removed from the trained neural network to produce a pruned neural network. In one embodiment, a neuron may be removed by setting a layer parameter to zero. In one embodiment, a neuron may be removed by setting a corresponding pruning gate to zero.”, “The first criterion that is used for pruning is an approximation of C(D, hi=0), where the remainder R1 (hi=0) is ignored (i.e., set to zero)”, “pruning techniques regard y as equal to zero” [i.e., pruning includes assigning/setting parameters/weights to zero for neuron connections]).
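The idea of removing a neuron by setting a gate or parameter to zero can be illustrated with a short sketch. The names `apply_gates` and `count_connections`, the row-per-neuron layout and the weight values are assumptions made for this example, not features drawn from Molchanov.

```python
def apply_gates(weight_matrix, gates):
    """Scale each neuron's outgoing weights (one row per neuron) by its gate, 0 or 1."""
    return [[w * g for w in row] for row, g in zip(weight_matrix, gates)]

def count_connections(weight_matrix):
    """Count connections whose weight is non-zero."""
    return sum(1 for row in weight_matrix for w in row if w != 0.0)

layer = [[0.2, -0.4], [0.7, 0.1], [-0.3, 0.9]]   # hypothetical weights
gates = [1, 0, 1]                                 # gate 0 removes the second neuron
pruned_layer = apply_gates(layer, gates)
before, after = count_connections(layer), count_connections(pruned_layer)
```

Here the second neuron's weights all become zero, and the count of remaining connections drops from 6 to 4, so the pruned layer never has more connections than the layer it was derived from.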

Regarding claims 10 and 23, as discussed above, Molchanov in view of Dai teaches the method of claim 1 and the system of claim 14.
Although Molchanov substantially discloses the claimed invention, Molchanov is not relied on for explicitly disclosing wherein each of the plurality of intermediate deep learning models is a subset of the deep learning model.
In the same field, analogous art Dai teaches wherein each of the plurality of intermediate deep learning models is a subset of the deep learning model (see, e.g., FIGs. 2, 10A and 10B showing that deep learning “Seed Architecture” 20 and seed model sizes “Before Pruning” and before “Pruning Phase” 28 are larger than the intermediate subsets “After Pruning”, and paragraphs 23, 46, 49, 51, 62 and 64, “NeST starts with a randomly initialized sparse network called the seed architecture. It iteratively tunes the architecture with … pruning of neurons and connections. Experimental results show that NeST yields accurate, yet very compact DNNs”, “insignificant connections and neurons are pruned away”, “Connections with small effective weights are treated as insignificant. Pruning of insignificant weights is an iterative process. In each iteration, the most insignificant weights (e.g., top 1 %) are only pruned for each layer, and then the whole DNN [deep neural network] is retrained”, “connections are pruned away. The whole DNN is retrained after each pruning iteration.”, “For the pruning phase, next, the … DNNs are pruned … It is shown the post-pruning DNN sizes and compression ratios for LeNet-300-100 and LeNet … in FIG. 10.”, “The larger the pre-pruning DNN, the larger is its post-pruning DNN” [i.e., each of the intermediate DNNs/deep neural networks/deep learning models is a subset of neurons and connections of the pre-pruning deep learning model/DNN]).
The motivation to combine Molchanov and Dai is the same as discussed above with respect to claims 1 and 14.

Regarding claims 11 and 24, as discussed above, Molchanov in view of Dai teaches the method of claim 10 and the system of claim 23.
Although Molchanov substantially discloses the claimed invention, Molchanov is not relied on for explicitly disclosing wherein a total number of connections in an intermediate deep learning model, of the plurality of intermediate deep learning models, is less than or equal to a total number of connections in the deep learning model.
In the same field, analogous art Dai teaches wherein a total number of connections in an intermediate deep learning model, of the plurality of intermediate deep learning models, is less than or equal to a total number of connections in the deep learning model (see, e.g., FIG. 2 - showing that the total number of connections in the intermediate “Pruning Phase” 28 deep learning model is less than the number of connections in the “Seed Architecture” 20 deep learning model, and paragraphs 23, 46, 49, 51, 62 and 64, “NeST starts with a randomly initialized sparse network called the seed architecture. It iteratively tunes the architecture with … pruning of neurons and connections. … NeST yields accurate, yet very compact DNNs”, “insignificant connections and neurons are pruned away”, “Connections with small effective weights are treated as insignificant. Pruning of insignificant weights is an iterative process. In each iteration, the most insignificant weights (e.g., top 1 %) are only pruned for each layer, and then the whole DNN [deep neural network] is retrained”, “connections are pruned away.”, “For the pruning phase, next, the … DNNs are pruned … It is shown the post-pruning DNN sizes and compression ratios for LeNet-300-100 and LeNet … in FIG. 10.”, “The larger the pre-pruning DNN, the larger is its post-pruning DNN” [i.e., the total number of connections and neurons in each of the intermediate DNNs/deep neural networks/deep learning models is less than the total number of neurons and connections in the pre-pruning deep learning model/DNN]).
The motivation to combine Molchanov and Dai is the same as discussed above with respect to claims 1 and 14.

Regarding claim 12, as discussed above, Molchanov in view of Dai teaches the method of claim 1.
Molchanov further discloses implementing the … deep learning model (see, e.g., paragraphs 17, 50 and 81, “the method 100 may be executed by a GPU (graphics processing unit), CPU (central processing unit), neural network, or any processor capable of implementing a neural network.”, “FIG. 3 illustrates a parallel processing unit (PPU) 300, in accordance with one embodiment. The PPU 300 may be configured to implement neural network pruning … PPU 300 is configured to implement the neural network pruning system 250.”, “the PPU 300 comprises a deep learning or machine learning processor. The PPU 300 is configured to receive commands that specify programs for modeling neural networks and processing data according to a neural network.” [i.e., implementing the neural network/deep learning model with GPU, CPU or PPU processors]).
Although Molchanov substantially discloses the claimed invention, Molchanov is not relied on for explicitly disclosing the optimized deep learning model.
In the same field, analogous art Dai teaches the optimized deep learning model (see, e.g., paragraphs 8, 23, 27, 59 and 73 and claim 10, “a neural network synthesis tool (NeST) that automatically generates one or more optimal neural network architectures”, “method for generating one or more optimal neural network architectures”, “NeST comes to rest at a lightweight DNN model 38 that incurs no accuracy degradation relative to a fully connected model.”, “the DNN architectures were grown from these seeds. The impact of these seeds was studied … and post-growth DNN sizes under the same target accuracy (this accuracy is typically a reference value for the architecture).”, “a synthesis tool, NeST, to synthesize compact yet accurate DNNs. NeST starts from a sparse seed architecture, adaptively adjusts the architecture through … pruning, and finally arrives at a compact DNN with high accuracy.”, “A neural network synthesis tool (NeST) that automatically generates one or more optimal neural network architectures” [i.e., the most accurate deep learning model/DNN model 38 having a high/greatest accuracy is the optimal/optimized deep learning model]).
The motivation to combine Molchanov and Dai is the same as discussed above with respect to claim 1.

Regarding claim 13, as discussed above, Molchanov in view of Dai teaches the method of claim 12.
Molchanov further discloses implementing of the one of the plurality of intermediate deep learning models (see, e.g., paragraphs 17, 50 and 81, “the method 100 may be executed by a GPU (graphics processing unit), CPU (central processing unit), neural network, or any processor capable of implementing a neural network.”, “FIG. 3 illustrates a parallel processing unit (PPU) 300, in accordance with one embodiment. The PPU 300 may be configured to implement neural network pruning … PPU 300 is configured to implement the neural network pruning system 250.”, “the PPU 300 comprises a deep learning or machine learning processor. The PPU 300 is configured to receive commands that specify programs for modeling neural networks and processing data according to a neural network.” [i.e., implementing one of the intermediate neural network/deep learning models being pruned with a GPU, CPU or PPU processor]).
Although Molchanov substantially discloses the claimed invention, Molchanov is not relied on for explicitly disclosing determining the greatest accuracy based on … one of the plurality of intermediate deep learning models.
In the same field, analogous art Dai teaches determining the greatest accuracy based on … one of the plurality of intermediate deep learning models (see, e.g., paragraphs 8, 23, 27, 59 and 73 and claim 10, “a neural network synthesis tool (NeST) that automatically generates one or more optimal neural network architectures”, “method for generating one or more optimal neural network architectures”, “NeST comes to rest at a lightweight DNN model 38 that incurs no accuracy degradation relative to a fully connected model.”, “the DNN architectures were grown from these seeds. The impact of these seeds was studied … and post-growth DNN sizes under the same target accuracy (this accuracy is typically a reference value for the architecture).”, “a synthesis tool, NeST, to synthesize compact yet accurate DNNs. NeST starts from a sparse seed architecture, adaptively adjusts the architecture through … pruning, and finally arrives at a compact DNN with high accuracy.”, “A neural network synthesis tool (NeST) that automatically generates one or more optimal neural network architectures” [i.e., determine the highest/greatest accuracy/target accuracy based on the optimal/most accurate deep learning model/DNN model 38 having a high accuracy]).
Molchanov and Dai are analogous art because they are both directed to techniques for pruning neural networks (see, e.g., Molchanov, Abstract and Dai, paragraph 7). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “method, computer readable medium, and system … for neural network pruning” of Molchanov (See, Molchanov, Abstract) to incorporate the teachings of Dai to provide a “DNN [deep neural network] synthesis tool (referred to herein as ‘NeST’) … that automates the generation of compact and accurate DNNs.” (See, e.g., Dai, paragraph 23). Doing so would have allowed Molchanov to use Dai’s NeST tool to create “a lightweight DNN model 38 that incurs no accuracy degradation relative to a fully connected model” and “to synthesize compact yet accurate DNNs” where “NeST starts from a sparse seed architecture, adaptively adjusts the architecture through … pruning, and finally arrives at a compact DNN with high accuracy”, as suggested by Dai (See, e.g., Dai, paragraphs 27 and 73). 
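For illustration only, the selection of a most accurate model from a set of pruned candidates, as discussed above, can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Molchanov or Dai: the function name `select_most_accurate`, the `candidates` mapping, and the `evaluate` callback are hypothetical, and accuracy is assumed to be a single number where higher is better.

```python
def select_most_accurate(candidates, evaluate):
    """Pick the candidate with the greatest measured accuracy.

    candidates: dict mapping a label to a model object (hypothetical).
    evaluate:   callable taking a model and returning its accuracy (hypothetical).
    Returns the winning label and the full label -> accuracy mapping.
    """
    # Score every candidate once so the comparison is made on equal footing.
    scored = {label: evaluate(model) for label, model in candidates.items()}
    # max() returns the first label in insertion order when accuracies tie.
    best_label = max(scored, key=scored.get)
    return best_label, scored


if __name__ == "__main__":
    # Stand-in "models" and accuracies, invented purely for the example.
    fake_accuracy = {"pruned_a": 0.80, "pruned_b": 0.91, "pruned_c": 0.85}
    models = {name: name for name in fake_accuracy}
    best, scores = select_most_accurate(models, lambda m: fake_accuracy[m])
    print(best, scores[best])
```

In this sketch the accuracies are supplied by the caller; how each accuracy is actually measured is outside what the sketch assumes.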

Claims 5, 7-8, 18 and 20-21 are rejected under 35 U.S.C. 103 as being unpatentable over Molchanov in view of Dai as applied to claims 2 and 15 and further in view of non-patent literature Molchanov et al. (“Pruning convolutional neural networks for resource efficient inference.” arXiv preprint arXiv:1611.06440 v2 (2017): 1-17, hereinafter “Molchanov NPL”).
Regarding claims 5 and 18, as discussed above, Molchanov in view of Dai teaches the method of claim 2 and the system of claim 15.
Molchanov further discloses pruning … based on the predetermined pruning policies (see, e.g., FIG. 1A – showing block 120 with “a pruning criterion for each layer parameter based on the first-order gradient corresponding to the layer parameters, where the pruning criterion indicates an importance of each neuron that is included in the trained neural network” prior to subsequent blocks 130 and 140 where lowest importance neurons are identified and the identified neurons are removed/pruned, based on the pre-determined pruning criterion, “from the trained neural network to produce a pruned neural network” and paragraphs 20-21, “At step 130, at least one neuron having a lowest importance is identified. In one embodiment, the at least one neuron corresponds to a feature map in a convolutional layer. In one embodiment, the at least one neuron includes neurons having importances below a threshold value [i.e., a predetermined threshold]. In one embodiment, the at least one neuron comprises a predetermined percentage of all of the neurons in the trained neural network. At step 140, the at least one neuron is removed from the trained neural network to produce a pruned neural network.” [i.e., pruning based on the predetermined pruning criteria/policies]). 
Although Molchanov in view of Dai substantially teaches the claimed invention, Molchanov in view of Dai is not relied on for teaching selecting, at random, respective combinations of two or more connections for pruning; and
pruning each of the respective combinations.
In the same field, analogous art Molchanov NPL teaches selecting, at random, respective combinations of two or more connections for pruning (see, e.g., FIG. 5 depicting random “Pruning of feature maps” for neural networks, and page 9, section 3.5, “updates are used between pruning iterations. When results are displayed w.r.t. FLOPs, the difference with random pruning is only 0%-4%, … Increasing the number of updates from 100 to 1000 improves performance of pruning significantly for both the Taylor criterion and random pruning” [i.e., random pruning/random selection of connections for pruning]); and
pruning each of the respective combinations (see, e.g., pages 2 and 6, sections 2 and 3.1, “Alternate iterations of pruning and further fine-tuning; 3) Stop pruning after reaching the target trade-off between accuracy and pruning objective”, “we opt for encouraging a balanced pruning that distributes selection across all layers. Next, we iteratively prune the network” [i.e., pruning each combination via pruning iterations]).
Molchanov, Dai and Molchanov NPL are analogous art because they are each directed to techniques for pruning neural networks (see, e.g., Molchanov, Abstract, Dai, paragraph 7, and Molchanov NPL, Abstract). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Molchanov NPL with Molchanov in view of Dai to provide “a new formulation for pruning convolutional kernels in neural networks to enable efficient inference” and to provide “a new [pruning] criterion based on Taylor expansion that approximates the change in the cost function induced by pruning network parameters.” (See, e.g., Molchanov NPL, Abstract). Doing so would have allowed Molchanov in view of Dai to use Molchanov NPL’s new formulation and pruning criterion to produce a “computationally efficient procedure that maintains good generalization in the pruned network” and to achieve “superior performance compared to other criteria” for pruning neural networks, as suggested by Molchanov NPL (See, e.g., Molchanov NPL, Abstract). 
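For illustration only, the two pruning selections discussed above (removing the lowest-importance neurons by a predetermined percentage, and selecting combinations of connections at random for pruning) can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce the implementation of Molchanov or Molchanov NPL: all function names are hypothetical, neurons and connections are represented as plain integer indices, and the importance scores are assumed to be given.

```python
import random


def prune_lowest_importance(importance, fraction):
    """Return the indices kept after dropping the lowest-importance fraction.

    importance: list of per-neuron importance scores (assumed precomputed).
    fraction:   predetermined share of neurons to remove, between 0 and 1.
    """
    n_remove = int(len(importance) * fraction)
    # Order neuron indices from least to most important.
    order = sorted(range(len(importance)), key=lambda i: importance[i])
    removed = set(order[:n_remove])
    return [i for i in range(len(importance)) if i not in removed]


def random_combinations(connections, k, n_combos, seed=0):
    """Draw n_combos random combinations of k distinct connections each."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    return [tuple(sorted(rng.sample(connections, k))) for _ in range(n_combos)]


def prune_combination(connections, combo):
    """Return the connections that remain after removing one combination."""
    drop = set(combo)
    return [c for c in connections if c not in drop]


if __name__ == "__main__":
    print(prune_lowest_importance([0.9, 0.1, 0.5, 0.05], fraction=0.5))
    conns = list(range(10))
    for combo in random_combinations(conns, k=2, n_combos=3):
        print(combo, prune_combination(conns, combo))
```

Each random combination yields its own pruned candidate, so three combinations produce three candidates that could then be compared against one another.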

Regarding claims 7 and 20, as discussed above, Molchanov in view of Dai teaches the method of claim 6 and the system of claim 19.
Although Molchanov in view of Dai substantially teaches the claimed invention, Molchanov in view of Dai is not relied on for teaching wherein determining of the accuracy levels includes using a predetermined validation technique to determine the accuracy levels.
In the same field, analogous art Molchanov NPL teaches wherein determining of the accuracy levels includes using a predetermined validation technique to determine the accuracy levels (see, e.g., pages 9-10, “We also test our pruning scheme on the large-scale ImageNet classification task. In the first experiment, we begin with a trained CaffeNet implementation of AlexNet with 79.2% top-5 validation accuracy.”, “Results for ImageNet dataset are reported as top-5 accuracy on validation set. … Fine-tuning after pruning significantly improves results: the network pruned to 11.5 GFLOPs improves from 83% to 87% top-5 validation accuracy” [i.e., determine the accuracy level of the pruning scheme/pruned neural networks using a predetermined validation set/task/technique]).
Molchanov, Dai and Molchanov NPL are analogous art because they are each directed to techniques for pruning neural networks (see, e.g., Molchanov, Abstract, Dai, paragraph 7, and Molchanov NPL, Abstract). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Molchanov NPL with Molchanov in view of Dai to provide “a new formulation for pruning convolutional kernels in neural networks to enable efficient inference” and to provide “a new [pruning] criterion based on Taylor expansion that approximates the change in the cost function induced by pruning network parameters.” (See, e.g., Molchanov NPL, Abstract). Doing so would have allowed Molchanov in view of Dai to use Molchanov NPL’s new formulation and pruning criterion to produce a “computationally efficient procedure that maintains good generalization in the pruned network” and to achieve “superior performance compared to other criteria” for pruning neural networks, as suggested by Molchanov NPL (See, e.g., Molchanov NPL, Abstract). 
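For illustration only, a top-5 validation accuracy of the kind quoted above, together with the corresponding error level, can be computed as sketched below. This is a minimal sketch under stated assumptions rather than the evaluation code of Molchanov NPL: the function names are hypothetical, `scores` is assumed to hold one list of per-class scores for each validation example, and `labels` is assumed to hold the true class index for each example.

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of examples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score, truncated to k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


def top_k_error(scores, labels, k=5):
    """Error level as the complement of the top-k accuracy."""
    return 1.0 - top_k_accuracy(scores, labels, k)


if __name__ == "__main__":
    # Two made-up validation examples over three classes.
    val_scores = [[0.1, 0.2, 0.7], [0.5, 0.3, 0.2]]
    val_labels = [2, 1]
    print(top_k_accuracy(val_scores, val_labels, k=1))
    print(top_k_error(val_scores, val_labels, k=1))
```

The same held-out validation set would be applied to every pruned candidate so that the resulting accuracy and error levels are directly comparable.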

Regarding claims 8 and 21, as discussed above, Molchanov in view of Dai and Molchanov NPL teaches the method of claim 7 and the system of claim 20.
Although Molchanov in view of Dai substantially teaches the claimed invention, Molchanov in view of Dai is not relied on for teaching wherein the predetermined validation technique comprises determining an error level corresponding to each of the plurality of intermediate deep learning models.
In the same field, analogous art Molchanov NPL teaches wherein the predetermined validation technique comprises determining an error level corresponding to each of the plurality of intermediate deep learning models (see, e.g., pages 2 and 8-10, “During pruning, we refine a subset of parameters which preserves the accuracy of the adapted network … we reach the global minimum of the error function”, “While pruning increases classification error by nearly 6%, additional fine-tuning restores much of the lost accuracy, yielding a final pruned network with … only a 2.5% loss in accuracy”, “test our pruning scheme … with a trained CaffeNet implementation of AlexNet with 79.2% top-5 validation accuracy.”, “Results for ImageNet dataset are reported as top-5 accuracy on validation set. … Fine-tuning after pruning significantly improves results: the network pruned to 11.5 GFLOPs improves from 83% to 87% top-5 validation accuracy” [i.e., validation technique includes determining a classification error level/inaccuracy]).
The motivation to combine Molchanov, Dai and Molchanov NPL is the same as discussed above with respect to claims 7 and 20.

Conclusion
The prior art made of record, listed on form PTO-892, and not relied upon, is considered pertinent to applicant's disclosure. 
For example, non-patent literature Li, Hao, et al. ("Pruning filters for efficient convnets." arXiv preprint arXiv:1608.08710v3 (2017): 1-13, hereinafter “Li”) discloses “Pruning filters across consecutive layers” and “Pruning residual blocks with … filters to be pruned for the second layer … are determined by the pruning result” in FIGs. 3 and 4. Pages 3-5, sections 3.1 and 3.3 of Li disclose that “The kernels in the next convolutional layer corresponding to the pruned feature maps are also removed.”, “pruning … the second layer of each residual block results in additional pruning of other layers. To prune filters across multiple layers to prune the second convolutional layer of the residual block, the corresponding projected feature maps must also be pruned.” and “Since the identical feature maps are more important …, the feature maps to be pruned should be determined by the pruning results of the shortcut layer. To determine which identity feature maps are to be pruned, we use the same selection criterion based on the filters of the shortcut convolutional layers (with 1 x 1 kernels). The second layer of the residual block is pruned with the same filter index as selected by the pruning of the shortcut layer.”

The examiner requests, in response to this office action, support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.
When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RANDY K BALDWIN whose telephone number is (571)270-5222. The examiner can normally be reached on Mon - Fri 9:00-6:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/R.K.B./Examiner, Art Unit 2125

/KAMRAN AFSHAR/Supervisory Patent Examiner, Art Unit 2125