Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
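
For illustration only, the three-prong test and the two rebuttable presumptions described above can be sketched as a simple decision function. This is a minimal sketch, not an official procedure: the `Limitation` class, its field names, and the function `interpreted_under_112f` are hypothetical, and each boolean field stands in for a judgment the examiner makes when reading the claim in light of the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limitation:
    """Hypothetical record of the examiner's findings for one claim limitation."""
    text: str
    uses_means_or_step: bool            # literal "means" or "step"
    uses_generic_placeholder: bool      # nonce term such as "module"
    has_functional_language: bool       # e.g. "configured to ..."
    recites_sufficient_structure: bool  # structure, material, or acts


def interpreted_under_112f(lim: Limitation) -> bool:
    """Apply prongs (A), (B), and (C); all three must be met."""
    prong_a = lim.uses_means_or_step or lim.uses_generic_placeholder
    prong_b = lim.has_functional_language
    prong_c = not lim.recites_sufficient_structure
    return prong_a and prong_b and prong_c
```

Under this sketch, a limitation that uses a generic placeholder with functional language and no sufficient structure is treated under 35 U.S.C. 112(f) even though the word "means" is absent, while reciting sufficient structure rebuts the presumption in either direction.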
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: 
a performance estimating module configured to in claims 1 and 10
a portion selecting module configured to in claims 1 and 10
a new neural network generating module configured to in claims 1 and 10
a final neural network output module configured to in claims 1 and 10
a neural network input module configured to in claim 2
an analyzing module configured to in claim 2
a portion determining module configured to in claim 2
a subset generating module configured to in claims 4 and 13
a subset learning module configured to in claims 4 and 14
a subset performance check module configured to in claim 4
a reward module configured to in claim 4
a final neural network performance check module configured to in claims 5 and 15
a final output module configured to in claims 5 and 15
a neural network sampling module configured to in claims 6 and 10
a performance check module configured to in claims 6 and 10
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.  Structure for the claim modules is provided in at least [¶0110] of the instant specification. 
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 4 and 14 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Regarding claims 4 and 14, "a subset learning module configured to learn the subset generated by the subset generating module" is indefinite.  One of ordinary skill in the art would not be able to determine from the claim or the instant specification whether the subset learning module is simply determining the generated subset or training the subset.  In the interest of further examination, "learn the subset generated by the subset generating module" is interpreted as determining the subset.

Claim Rejections - 35 USC § 101
101 Rejection
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 USC § 101 because the claimed invention is directed to non-statutory subject matter.

Regarding Claim 1:  Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 1 is directed to a device, which is a product and therefore falls within one of the statutory categories.
Step 2A Prong One Analysis:  Claim 1 recites a computer implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes and mathematical calculations.  For example, but for the generic computer components language, the above limitations in the context of this claim encompass neural network processing, including the following: 
select a portion of the neural network whose operation deviates from the limitation requirement (observation, evaluation, and judgement),
 generate a subset by changing a layer structure included in the portion of the neural network (mathematical relationship)
determine an optimal layer structure based on the estimated performance (observation, evaluation, and judgement)
change the portion to the optimal layer structure to generate a new neural network (mathematical relationship)
Therefore, claim 1 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 1 recites additional elements “performance estimating module”, “portion selecting module”, “new neural network generating module”, and “final neural network output module”. However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Claim 1 also recites additional insignificant extra-solution activity “to output estimated performance based on operations of a neural network and limitation requirements of resources used to perform the operations of the neural network”, “to receive the estimated performance from the performance estimating module”, and “to output the new neural network generated by the new neural network generating module as a final neural network”, which amounts to gathering and outputting data (See Mayo, 566 U.S. at 79, 101 USPQ2d at 1968; OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1092-93 (Fed. Cir. 2015) (presenting offers and gathering statistics amounted to mere data gathering)).  Therefore, claim 1 is directed to a judicial exception.
Step 2B Analysis:  Claim 1 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 1 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
For the reasons above, claim 1 is rejected as being directed to non-patentable subject matter under §101. This rejection applies equally to independent claims 10 and 19, which recite a system and a computer program product, respectively, as well as to dependent claims 2-9, 11-18, and 20. The additional limitations of the dependent claims are addressed briefly below:
Dependent claims 2 and 11 recite additional generic computer components “a neural network input module”, “an analyzing module”, and “a portion determining module”.  Claims 2, 11, and 20 recite additional insignificant extra-solution activity of gathering and outputting data “to receive information of the neural network” as well as additional observation, evaluation, and judgement “to search the information of the neural network and analyze whether the estimated performance deviates from the limitation requirements”, and “to determine a layer in which the estimated performance deviates from the limitation requirements as the portion”.
Dependent claims 3 and 12 recite additional observation, evaluation, and judgement “the analyzing module sets a threshold reflecting the limitation requirements and then analyzes whether the estimated performance exceeds the threshold”.
Dependent claim 4 recites additional generic computer components “subset generating module”, “subset learning module”, “subset performance check module”, and “a reward module”.  Claim 4 also recites additional mathematical relationships “to generate the subset including at least one change layer structure generated by changing the layer structure of the portion” as well as additional observation, evaluation, and judgement “to learn the subset generated by the subset generating module”, and “to check the performance of the subset using the estimated performance and determine the optimal layer structure to generate the new neural network” as well as additional insignificant extra-solution activity of outputting data “to provide a reward to the subset generating module based on the subset learned by the subset learning module and the performance of the subset checked by the subset performance check module”.
Dependent claims 5 and 15 recite additional generic computer components “final neural network performance check module” and “final output module”.  Claims 5 and 15 also recite additional observation, evaluation, and judgement “to check the performance of the final neural network” as well as insignificant extra-solution activity of outputting data “to output the final neural network”.
Dependent claim 6 recites additional generic computer components “neural network sampling module” and “performance check module” as well as additional observation, evaluation, and judgement “to sample the subset generated by the new neural network generating module” and “to check the performance of the neural network sampled in the subset” as well as insignificant extra-solution activity of outputting data “provide update information to the performance estimating module based on a result of the check executed by the performance check module”.
Dependent claims 7 and 16 recite additional insignificant extra-solution activity of outputting data “the performance estimating module outputs the estimated performance for a single indicator.”
Dependent claims 8 and 17 recite additional insignificant extra-solution activity of outputting data “the performance estimating module outputs the estimated performance for a composite indicator.”
Dependent claims 9 and 18 recite additional insignificant extra-solution activity “the limitation requirements include a first limitation requirement and a second limitation requirement different from the first limitation requirement, and the estimated performance includes first estimated performance according to the first limitation requirement and second estimated performance according to the second limitation requirement” which amounts to selection of a data type.  Claims 9 and 18 also recite additional observation, evaluation, and judgement “the portion selecting module selects a first portion in which the first estimated performance deviates from the first limitation requirement in the neural network and a second portion in which the second estimated performance deviates from the second limitation requirement” as well as additional mathematical relationships “the new neural network generating module changes the first portion to a first optimal layer structure and changes the second portion to a second optimal layer structure to generate the new neural network, the first optimal layer structure is a layer structure determined through the reinforcement learning from the layer structure included in the first portion, and the second optimal layer structure is a layer structure determined through the reinforcement learning from the layer structure included in the second portion”.
Dependent claim 14 recites additional generic computer components “subset learning module” and “reward module”.  Claim 14 also recites additional mathematical relationships “to generate the subset and determine the optimal layer structure” as well as additional observation, evaluation, and judgement “to learn the subset generated by the new neural network generating module” as well as additional insignificant extra-solution activity of outputting data “to provide a reward to the subset generating module based on the subset learned by the subset learning module and the performance of the subset checked by the subset performance check module”.
Therefore, the additional elements, considered separately and in combination, do not amount to significantly more than the judicial exception. Accordingly, claims 1-20 are rejected under 35 U.S.C. § 101.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-4, 6-14, and 16-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by He (“AMC: AutoML for Model Compression and Acceleration on Mobile Devices”, 2019).

	Regarding claim 1, He teaches A neural network optimizing device comprising: ([p. 4 §2] "Many works on searching models with reinforcement learning and genetic algorithms [46, 42, 5, 37] greatly improve the performance of neural networks...AMC engine optimizes for both accuracy and latency, requires a simple non-RNN controller, can do fast exploration with fewer GPU hours, and also support continuous action space")
	a performance estimating module configured to output estimated performance based on operations of a neural network ([p. 9 §4.1] "As shown in Table 2, policies that obtain higher validation accuracy correspondingly have higher fine-tuned accuracy. This enables us to predict final model accuracy without fine-tuning, which results in an efficient and faster policy exploration." Predicting final model accuracy interpreted as synonymous with estimating performance based on operations of a neural network.)
	and limitation requirements of resources used to perform the operations of the neural network; ([Abstract] "Model compression is an effective technique to efficiently deploy neural network models on mobile devices which have limited computation resources and tight power budgets" [p. 3 §1] "We achieve resource-constrained compression by constraining the search space, in which the action space (pruning ratio) is constrained such that the model compressed by the agent is always below the resources budget")
	a portion selecting module configured to receive the estimated performance from the performance estimating module and select a portion of the neural network whose operation deviates from the limitation requirements; ([p. 3 §1] "We achieve resource-constrained compression by constraining the search space, in which the action space (pruning ratio) is constrained such that the model compressed by the agent is always below the resources budget" [p. 4 §3] "We train an reinforcement learning agent to predict the action and give the sparsity, then perform form the pruning. We quickly evaluate the accuracy after pruning but before fine-tuning as an effective delegate of final accuracy." [p. 4 §3.1] "Fine-grained pruning [19] aims to prune individual unimportant elements in weight tensors, which is able to achieve very high compression rate with no loss of accuracy." Pruning interpreted as synonymous with selecting a portion of the neural network whose operation deviates from the limitation requirements.)
	a new neural network generating module configured to, through reinforcement learning, generate a subset by changing a layer structure included in the portion of the neural network, determine an optimal layer structure based on the estimated performance, and change the portion to the optimal layer structure to generate a new neural network; and ([p. 4 §3] "We train an reinforcement learning agent to predict the action and give the sparsity, then perform form the pruning. We quickly evaluate the accuracy after pruning but before fine-tuning as an effective delegate of final accuracy." Generating a subset interpreted as synonymous with generating a new neural network by pruning.)
	a final neural network output module configured to output the new neural network generated by the new neural network generating module as a final neural network. ([p. 9 §4.1] "Fine-tuning a pruned model usually takes a very long time. We observe a correlation between the pre-fine-tune accuracy and the post fine-tuning accuracy [20, 22]. As shown in Table 2, policies that obtain higher validation accuracy correspondingly have higher fine-tuned accuracy. This enables us to predict final model accuracy without fine-tuning, which results in an efficient and faster policy exploration." Fine-tuning module interpreted as synonymous with final neural network output module.  Final model interpreted as synonymous with final neural network.). 

	Regarding claim 2, He teaches The neural network optimizing device of claim 1, wherein the portion selecting module includes: a neural network input module configured to receive information of the neural network; ([p. 5 §3.2] "For each layer t, we have 11 features that characterize the state st: where t is the layer index, the dimension of the kernel is n×c×k×k, and the input is c × h × w. FLOPs[t] is the FLOPs of layer Lt. Reduced is the total number of reduced FLOPs in previous layers. Rest is the number of remaining FLOPs in the following layers. Before being passed to the agent, they are scaled within [0, 1]. Such features are essential for the agent to distinguish one convolutional layer from another." State features interpreted as synonymous with information of the neural network.  Receiving interpreted as synonymous with being passed to the agent.)
	an analyzing module configured to search the information of the neural network and analyze whether the estimated performance deviates from the limitation requirements; and ([p. 3 §1] "we propose resource-constrained compression to achieve the best accuracy given the maximum amount of hardware resources (e.g., FLOPs, latency, and model size)" [p. 5 §3.2] "AMC leverages reinforcement learning for efficient search over action space. Here we introduce the detailed setting of reinforcement learning framework...For each layer t, we have 11 features that characterize the state st: where t is the layer index, the dimension of the kernel is n×c×k×k, and the input is c × h × w. FLOPs[t] is the FLOPs of layer Lt. Reduced is the total number of reduced FLOPs in previous layers. Rest is the number of remaining FLOPs in the following layers. Before being passed to the agent, they are scaled within [0, 1]. Such features are essential for the agent to distinguish one convolutional layer from another.")
	a portion determining module configured to determine a layer in which the estimated performance deviates from the limitation requirements as the portion. ([p. 3 §1] "We achieve resource-constrained compression by constraining the search space, in which the action space (pruning ratio) is constrained such that the model compressed by the agent is always below the resources budget"  [p. 5 §3.2] "As illustrated in Figure 1, the agent receives an embedding state st of layer Lt from the environment and then outputs a sparsity ratio as action at. The underlying layer is compressed with at (rounded to the nearest feasible fraction) using a specified compression algorithm (e.g., channel pruning). Then the agent moves to the next layer Lt+1, and receives state st+1" [p. 10 §4.1] "we follow the settings in [16] to conduct 4-iteration pruning & fine-tuning experiments, where the overall density of the full model is set to [50%, 35%, 25% and 20%] in each iteration. For each stage, we run AMC to determine the sparsity ratio of each layer given the overall sparsity. The model is then pruned and fine-tuned for 30 epochs following common protocol." He explicitly teaches that the model is pruned on a layer by layer basis based on the action space (pruning ratio) which is a function of the resource budget (limitation requirements).). 
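
For illustration only, the cited statement that the per-layer state features are "scaled within [0, 1]" before being passed to the agent can be sketched as follows. This is a minimal sketch under stated assumptions: the quoted passage does not say how the scaling is done, so per-feature min-max scaling across layers is assumed here, and the function name `scale_features` and the sample values are hypothetical.

```python
def scale_features(rows):
    """Min-max scale each feature column to [0, 1] across all layers.

    rows: one list of numeric features per layer (assumed layout).
    A constant column is mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*rows))
    scaled_columns = []
    for column in columns:
        low, high = min(column), max(column)
        span = high - low
        scaled_columns.append(
            [(value - low) / span if span else 0.0 for value in column]
        )
    # Transpose back so each row again describes one layer.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical features per layer: [layer index, FLOPs of the layer]
layers = [[0, 10], [1, 30], [2, 20]]
scaled = scale_features(layers)
```

Each row of `scaled` corresponds to one layer's state as it would be presented to the agent under this assumed scaling.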

	Regarding claim 3, He teaches The neural network optimizing device of claim 2, wherein the analyzing module sets a threshold reflecting the limitation requirements and then analyzes whether the estimated performance exceeds the threshold. ([p. 6 §3.2] "By limiting the action space (the sparsity ratio for each layer), we can accurately arrive at the target compression ratio...we allow arbitrary action a at the first few layers; we start to limit the action a when we find that the budget is insufficient even after compressing all the following layers with most aggressive strategy." Budget is interpreted as synonymous with threshold.). 
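
For illustration only, the budget-as-threshold reading of the passage quoted above can be sketched as a function that leaves the agent's proposed sparsity ratio alone while the budget is still reachable and raises it once the budget would otherwise be exceeded even with the most aggressive pruning of all following layers. This is a simplified sketch and is not asserted to be He's exact algorithm; the function name `limit_action`, the per-layer cost list, and the `a_max` default of 0.8 are assumptions.

```python
def limit_action(proposed, layer_costs, t, kept_so_far, budget, a_max=0.8):
    """Return a sparsity ratio for layer t that keeps the budget reachable.

    proposed:     sparsity ratio suggested by the agent for layer t
    layer_costs:  cost (e.g. FLOPs) of each uncompressed layer
    kept_so_far:  cost retained by the already-compressed layers 0..t-1
    budget:       total cost allowed for the compressed model
    a_max:        most aggressive sparsity ratio allowed (assumed value)
    """
    rest = sum(layer_costs[t + 1:])
    # Cost layer t may retain if every later layer is pruned at a_max.
    allowance = budget - kept_so_far - (1.0 - a_max) * rest
    # Smallest sparsity ratio for layer t that respects that allowance.
    a_min = 1.0 - allowance / layer_costs[t]
    return min(max(proposed, a_min, 0.0), a_max)
```

With a loose budget the proposed action passes through unchanged, which corresponds to allowing an arbitrary action at the first few layers; with a tight budget the returned ratio is raised to the minimum that still allows the target to be met.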

	Regarding claim 4, He teaches The neural network optimizing device of claim 1, wherein the new neural network generating module includes: a subset generating module configured to generate the subset including at least one change layer structure generated by changing the layer structure of the portion; ([p. 5 §3.2] "As illustrated in Figure 1, the agent receives an embedding state st of layer Lt from the environment and then outputs a sparsity ratio as action at. The underlying layer is compressed with at (rounded to the nearest feasible fraction) using a specified compression algorithm (e.g., channel pruning)" Compressing layer by channel pruning interpreted as synonymous with generating a subset including at least one change layer structure generated by changing the layer structure of the portion.)
	a subset learning module configured to learn the subset generated by the subset generating module; ([p. 6 §3.2] "Following Block-QNN [54], which applies a variant form of Bellman’s Equation [50], each transition in an episode is (st, at, R, st+1), where R is the reward after the network is compressed. During the update, the baseline reward b is subtracted to reduce the variance of gradient estimation, which is an exponential moving average of the previous rewards" Determining a reward based on the compressed network is interpreted as synonymous with learning the subset in light of the instant specification.)
	a subset performance check module configured to check the performance of the subset using the estimated performance and determine the optimal layer structure to generate the new neural network; and ([p. 4 §3] "We aim to automatically find the redundancy for each layer, characterized by sparsity. We train an reinforcement learning agent to predict the action and give the sparsity, then perform form the pruning. We quickly evaluate the accuracy after pruning but before fine-tuning as an effective delegate of final accuracy")
	a reward module configured to provide a reward to the subset generating module based on the subset learned by the subset learning module and the performance of the subset checked by the subset performance check module. ([p. 6 §3.2] "Following Block-QNN [54], which applies a variant form of Bellman’s Equation [50], each transition in an episode is (st, at, R, st+1), where R is the reward after the network is compressed. During the update, the baseline reward b is subtracted to reduce the variance of gradient estimation, which is an exponential moving average of the previous rewards"). 
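
For illustration only, the cited baseline reward b, described as "an exponential moving average of the previous rewards" that is subtracted during the update, can be sketched as follows. This is a minimal sketch under stated assumptions: the class name `EmaBaseline`, the decay value, and seeding the baseline with the first reward are hypothetical choices that the quoted passage does not specify.

```python
class EmaBaseline:
    """Track an exponential moving average of rewards as a baseline."""

    def __init__(self, decay=0.95):
        self.decay = decay   # assumed smoothing factor
        self.value = None    # baseline b; unset until the first reward

    def advantage(self, reward):
        """Return reward minus the current baseline, then update the baseline."""
        if self.value is None:
            self.value = reward  # assumption: seed with the first reward
        result = reward - self.value
        self.value = self.decay * self.value + (1.0 - self.decay) * reward
        return result


baseline = EmaBaseline(decay=0.5)
advantages = [baseline.advantage(r) for r in (1.0, 3.0, 2.0)]
```

The value returned by `advantage` is what would be fed back to the agent in place of the raw reward R; subtracting a slowly moving baseline is a common way to reduce the variance of the gradient estimate without changing which actions are preferred.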

	Regarding claim 6, He teaches The neural network optimizing device of claim 1, further comprising: a neural network sampling module configured to sample the subset generated by the new neural network generating module; and ([p. 2 §1] "we propose AutoML for Model Compression (AMC) which leverages reinforcement learning to efficiently sample the design space and greatly improve the model compression quality." Sampling the design space to improve model compression quality is interpreted as synonymous with sampling the subset generated by the new neural network generating module.)
	a performance check module configured to check the performance of the neural network sampled in the subset and provide update information to the performance estimating module based on a result of the check executed by the performance check module. ([p. 4 §3] "We present an overview of our AutoML for Model Compression(AMC) engine in Figure 1. We aim to automatically find the redundancy for each layer, characterized by sparsity. We train an reinforcement learning agent to predict the action and give the sparsity, then perform form the pruning. We quickly evaluate the accuracy after pruning but before fine-tuning as an effective delegate of final accuracy. Then we update the agent by encouraging smaller, faster and more accurate models."). 

	Regarding claim 7, He teaches The neural network optimizing device of claim 1, wherein the performance estimating module outputs the estimated performance for a single indicator. ([p. 9 §4.1] "As shown in Table 2, policies that obtain higher validation accuracy correspondingly have higher fine-tuned accuracy. This enables us to predict final model accuracy without fine-tuning, which results in an efficient and faster policy exploration." Predicting final model accuracy interpreted as synonymous with estimating performance based on operations of a neural network.  Model accuracy interpreted as a single indicator. Table 2 interpreted as output of estimating.  Val Acc interpreted as single indicator.). 

	Regarding claim 8, He teaches The neural network optimizing device of claim 1, wherein the performance estimating module outputs the estimated performance for a composite indicator. ([p. 9 §4.1] "As shown in Table 2, policies that obtain higher validation accuracy correspondingly have higher fine-tuned accuracy. This enables us to predict final model accuracy without fine-tuning, which results in an efficient and faster policy exploration." Predicting final model accuracy interpreted as synonymous with estimating performance based on operations of a neural network.  Table 2 interpreted as output of estimating. Ratio interpreted as composite indicator.). 

	Regarding claim 9, He teaches The neural network optimizing device of claim 1, wherein: the limitation requirements include a first limitation requirement and a second limitation requirement different from the first limitation requirement, and the estimated performance includes first estimated performance according to the first limitation requirement and second estimated performance according to the second limitation requirement, ([p. 3 §1] "AMC distinguishes from other works by getting reward without fine-tuning, continuous search space control, and can produce both accuracy-guaranteed and hardware resource-constrained models" [p. 4 §3] "We train an reinforcement learning agent to predict the action and give the sparsity, then perform form the pruning." [p. 9 §4.1] "As shown in Table 2, policies that obtain higher validation accuracy correspondingly have higher fine-tuned accuracy. This enables us to predict final model accuracy without fine-tuning, which results in an efficient and faster policy exploration" Model accuracy interpreted as synonymous with first limitation requirement and resource constraint interpreted as synonymous with second limitation requirement. He teaches estimating/predicting both the accuracy as well as the action space which is dependent on the resource constraint.)
	the portion selecting module selects a first portion in which the first estimated performance deviates from the first limitation requirement in the neural network and a second portion in which the second estimated performance deviates from the second limitation requirement, and ([p. 3 §1] "We achieve resource-constrained compression by constraining the search space, in which the action space (pruning ratio) is constrained such that the model compressed by the agent is always below the resources budget"  [p. 5 §3.2] "As illustrated in Figure 1, the agent receives an embedding state st of layer Lt from the environment and then outputs a sparsity ratio as action at. The underlying layer is compressed with at (rounded to the nearest feasible fraction) using a specified compression algorithm (e.g., channel pruning). Then the agent moves to the next layer Lt+1, and receives state st+1" [p. 10 §4.1] "we follow the settings in [16] to conduct 4-iteration pruning & fine-tuning experiments, where the overall density of the full model is set to [50%, 35%, 25% and 20%] in each iteration. For each stage, we run AMC to determine the sparsity ratio of each layer given the overall sparsity. The model is then pruned and fine-tuned for 30 epochs following common protocol." He explicitly teaches that the model is pruned on a layer by layer basis based on the action space (pruning ratio) which is a function of the resource budget (limitation requirements).  Pruned layer Lt interpreted as synonymous with first portion.  Pruned layer Lt+1 interpreted as synonymous with second portion.)
	the new neural network generating module changes the first portion to a first optimal layer structure and changes the second portion to a second optimal layer structure to generate the new neural network, the first optimal layer structure is a layer structure determined through the reinforcement learning from the layer structure included in the first portion, and the second optimal layer structure is a layer structure determined through the reinforcement learning from the layer structure included in the second portion. ([p. 4 §3] "We train an reinforcement learning agent to predict the action and give the sparsity, then perform form the pruning. We quickly evaluate the accuracy after pruning but before fine-tuning as an effective delegate of final accuracy." [p. 5 §3.2] "As illustrated in Figure 1, the agent receives an embedding state st of layer Lt from the environment and then outputs a sparsity ratio as action at. The underlying layer is compressed with at (rounded to the nearest feasible fraction) using a specified compression algorithm (e.g., channel pruning). Then the agent moves to the next layer Lt+1, and receives state st+1" See also FIG. 1. Generating a subset interpreted as synonymous with generating a new neural network by pruning.  Pruned layer Lt interpreted as synonymous with first portion changed to a first optimal layer structure.  Pruned layer Lt+1 interpreted as synonymous with second portion changed to a second optimal layer structure.). 
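
For illustration only, the step quoted from He in which a layer is compressed with an action "rounded to the nearest feasible fraction" can be sketched as below. This is a minimal sketch under stated assumptions, not He's implementation: the function name round_to_feasible, the treatment of a feasible fraction as a whole number of pruned channels, and the choice to always keep at least one channel are all hypothetical.

```python
def round_to_feasible(sparsity, num_channels):
    """Round a continuous sparsity ratio to one realizable by pruning whole channels.

    Assumption (not from He): a "feasible fraction" is k / num_channels for an
    integer k, and at least one channel is always kept.
    """
    pruned = round(sparsity * num_channels)
    # Clamp so that between 0 and num_channels - 1 channels are removed.
    pruned = min(max(pruned, 0), num_channels - 1)
    return pruned / num_channels


# The agent visits layers in order (Lt, then Lt+1), one action per layer.
layer_channels = [8, 4]          # hypothetical channel counts
actions = [0.3, 1.0]             # hypothetical sparsity ratios output by the agent
applied = [round_to_feasible(a, c) for a, c in zip(actions, layer_channels)]
```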

	Regarding claim 10, claim 10 is substantially similar to claim 6.  Therefore, the rejection applied to claim 6 also applies to claim 10.  

	Regarding claim 11, He teaches The neural network optimizing device of claim 10, wherein the portion selecting module includes: a neural network input module configured to receive information of the neural network; ([p. 5 §3.2] "For each layer t, we have 11 features that characterize the state st: where t is the layer index, the dimension of the kernel is n×c×k×k, and the input is c × h × w. FLOPs[t] is the FLOPs of layer Lt. Reduced is the total number of reduced FLOPs in previous layers. Rest is the number of remaining FLOPs in the following layers. Before being passed to the agent, they are scaled within [0, 1]. Such features are essential for the agent to distinguish one convolutional layer from another." State features interpreted as synonymous with information of the neural network.  Receiving interpreted as synonymous with being passed to the agent.)
	an analyzing module configured to search the information of the neural network and analyze whether the estimated performance deviates from the limitation requirements; and ([p. 3 §1] "we propose resource-constrained compression to achieve the best accuracy given the maximum amount of hardware resources (e.g., FLOPs, latency, and model size)" [p. 5 §3.2] "AMC leverages reinforcement learning for efficient search over action space. Here we introduce the detailed setting of reinforcement learning framework...For each layer t, we have 11 features that characterize the state st: where t is the layer index, the dimension of the kernel is n×c×k×k, and the input is c × h × w. FLOPs[t] is the FLOPs of layer Lt. Reduced is the total number of reduced FLOPs in previous layers. Rest is the number of remaining FLOPs in the following layers. Before being passed to the agent, they are scaled within [0, 1]. Such features are essential for the agent to distinguish one convolutional layer from another.")
	a portion determining module configured to determine a layer in which the estimated performance deviates from the limitation requirements as the portion. ([p. 3 §1] "We achieve resource-constrained compression by constraining the search space, in which the action space (pruning ratio) is constrained such that the model compressed by the agent is always below the resources budget"  [p. 5 §3.2] "As illustrated in Figure 1, the agent receives an embedding state st of layer Lt from the environment and then outputs a sparsity ratio as action at. The underlying layer is compressed with at (rounded to the nearest feasible fraction) using a specified compression algorithm (e.g., channel pruning). Then the agent moves to the next layer Lt+1, and receives state st+1" [p. 10 §4.1] "we follow the settings in [16] to conduct 4-iteration pruning & fine-tuning experiments, where the overall density of the full model is set to [50%, 35%, 25% and 20%] in each iteration. For each stage, we run AMC to determine the sparsity ratio of each layer given the overall sparsity. The model is then pruned and fine-tuned for 30 epochs following common protocol." He explicitly teaches that the model is pruned on a layer by layer basis based on the action space (pruning ratio) which is a function of the resource budget (limitation requirements).). 
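
For illustration only, the quoted statement that the per-layer state features "are scaled within [0, 1]" before being passed to the agent can be sketched as below. He does not state the scaling method in the quoted passage; min-max scaling per feature across layers is an assumption made here, and the function name scale_features and the sample values are hypothetical.

```python
def scale_features(states):
    """Min-max scale each feature column across layers into [0, 1].

    states: list of per-layer feature lists, all of the same length.
    Assumption (not from He): scaling is min-max per feature; a constant
    feature is mapped to 0.0.
    """
    scaled_columns = []
    for column in zip(*states):
        low, high = min(column), max(column)
        span = high - low
        scaled_columns.append(
            [0.0 if span == 0 else (value - low) / span for value in column]
        )
    # Transpose back to one row of features per layer.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical two-feature states (layer index, layer FLOPs) for three layers.
scaled = scale_features([[0, 10], [1, 30], [2, 20]])
```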

	Regarding claim 12, He teaches The neural network optimizing device of claim 11, wherein the analyzing module sets a threshold reflecting the limitation requirements and then analyzes whether the estimated performance exceeds the threshold. ([p. 6 §3.2] "By limiting the action space (the sparsity ratio for each layer), we can accurately arrive at the target compression ratio...we allow arbitrary action a at the first few layers; we start to limit the action a when we find that the budget is insufficient even after compressing all the following layers with most aggressive strategy." Budget is interpreted as synonymous with threshold.). 
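
For illustration only, the budget-limited action space quoted above (arbitrary actions at the first layers, limited once the budget would otherwise be unreachable even with the most aggressive compression of the following layers) can be sketched as below. This is a simplified sketch, not He's algorithm: it assumes that a layer's FLOPs fall linearly with that layer's own sparsity and ignores coupling between adjacent layers. The name constrain_action and all parameter names and values are hypothetical.

```python
def constrain_action(action, layer_idx, flops, reduced_so_far,
                     target_reduction, max_sparsity=0.8):
    """Raise the proposed sparsity only when the FLOPs budget would be missed.

    flops: per-layer FLOPs of the uncompressed model.
    reduced_so_far: FLOPs already removed from earlier layers.
    target_reduction: total FLOPs that must be removed (the budget/threshold).
    """
    # FLOPs that could still be removed from the following layers under the
    # most aggressive strategy.
    rest_max = max_sparsity * sum(flops[layer_idx + 1:])
    # FLOPs this layer must remove for the target to remain reachable.
    required = target_reduction - reduced_so_far - rest_max
    min_action = max(0.0, required / flops[layer_idx])
    return min(max_sparsity, max(action, min_action))
```

With three layers of 100 FLOPs each and a target reduction of 200, a proposed action of 0.1 at the first layer is raised to 0.4 in this sketch; with a target of 50 the same action is left unchanged.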

	Regarding claim 13, He teaches The neural network optimizing device of claim 10, wherein the new neural network generating module includes: a subset generating module configured to generate the subset including at least one change layer structure generated by changing the layer structure of the portion; and ([p. 5 §3.2] "As illustrated in Figure 1, the agent receives an embedding state st of layer Lt from the environment and then outputs a sparsity ratio as action at. The underlying layer is compressed with at (rounded to the nearest feasible fraction) using a specified compression algorithm (e.g., channel pruning)" Compressing layer by channel pruning interpreted as synonymous with generating a subset including at least one change layer structure generated by changing the layer structure of the portion.)
	a subset performance check module configured to check the performance of the subset using the estimated performance and determine the optimal layer structure to generate the new neural network; and ([p. 4 §3] "We aim to automatically find the redundancy for each layer, characterized by sparsity. We train an reinforcement learning agent to predict the action and give the sparsity, then perform form the pruning. We quickly evaluate the accuracy after pruning but before fine-tuning as an effective delegate of final accuracy"). 

	Regarding claim 14, He teaches The neural network optimizing device of claim 13, wherein: the new neural network generating module performs reinforcement learning to generate the subset and determine the optimal layer structure, and ([p. 4 §3] "We train an reinforcement learning agent to predict the action and give the sparsity, then perform form the pruning. We quickly evaluate the accuracy after pruning but before fine-tuning as an effective delegate of final accuracy." Generating a subset interpreted as synonymous with generating a new neural network by pruning.)
	the neural network optimizing device further comprises: a subset learning module configured to learn the subset generated by the subset generating module; ([p. 6 §3.2] "Following Block-QNN [54], which applies a variant form of Bellman’s Equation [50], each transition in an episode is (st, at, R, st+1), where R is the reward after the network is compressed. During the update, the baseline reward b is subtracted to reduce the variance of gradient estimation, which is an exponential moving average of the previous rewards" Determining a reward based on the compressed network is interpreted as synonymous with learning the subset in light of the instant specification.)
	a reward module configured to provide a reward to the subset generating module based on the subset learned by the subset learning module and the performance of the subset checked by the subset performance check module. ([p. 6 §3.2] "Following Block-QNN [54], which applies a variant form of Bellman’s Equation [50], each transition in an episode is (st, at, R, st+1), where R is the reward after the network is compressed. During the update, the baseline reward b is subtracted to reduce the variance of gradient estimation, which is an exponential moving average of the previous rewards"). 
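
For illustration only, the quoted reward handling (a baseline reward b, "an exponential moving average of the previous rewards", subtracted during the update) can be sketched as below. This is a minimal sketch, not He's implementation: the class name MovingBaseline, the decay value, and the choice to initialize the baseline to the first reward are assumptions.

```python
class MovingBaseline:
    """Exponential moving average of previous rewards, used as a baseline."""

    def __init__(self, decay=0.95):
        self.decay = decay   # hypothetical smoothing factor
        self.value = None    # baseline b; unset until the first reward

    def advantage(self, reward):
        """Return reward minus the baseline, then fold the reward into the baseline."""
        if self.value is None:
            # Assumption: the first reward initializes the baseline.
            self.value = reward
        adv = reward - self.value
        self.value = self.decay * self.value + (1.0 - self.decay) * reward
        return adv


baseline = MovingBaseline(decay=0.5)
advantages = [baseline.advantage(r) for r in (1.0, 3.0, 2.0)]
```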

	Regarding claim 16, He teaches The neural network optimizing device of claim 10, wherein the performance estimating module outputs the estimated performance for a single indicator. ([p. 9 §4.1] "As shown in Table 2, policies that obtain higher validation accuracy correspondingly have higher fine-tuned accuracy. This enables us to predict final model accuracy without fine-tuning, which results in an efficient and faster policy exploration." Predicting final model accuracy interpreted as synonymous with estimating performance based on operations of a neural network.  Model accuracy interpreted as a single indicator. Table 2 interpreted as output of estimating.  Val Acc interpreted as single indicator.). 

	Regarding claim 17, He teaches The neural network optimizing device of claim 10, wherein the performance estimating module outputs the estimated performance for a composite indicator. ([p. 9 §4.1] "As shown in Table 2, policies that obtain higher validation accuracy correspondingly have higher fine-tuned accuracy. This enables us to predict final model accuracy without fine-tuning, which results in an efficient and faster policy exploration." Predicting final model accuracy interpreted as synonymous with estimating performance based on operations of a neural network.  Table 2 interpreted as output of estimating. Ratio interpreted as composite indicator.). 

	Regarding claim 18, He teaches The neural network optimizing device of claim 10, wherein: the limitation requirements include a first limitation requirement and a second limitation requirement different from the first limitation requirement, and the estimated performance includes first estimated performance according to the first limitation requirement and second estimated performance according to the second limitation requirement, ([p. 3 §1] "AMC distinguishes from other works by getting reward without fine-tuning, continuous search space control, and can produce both accuracy-guaranteed and hardware resource-constrained models" [p. 4 §3] "We train an reinforcement learning agent to predict the action and give the sparsity, then perform form the pruning." [p. 9 §4.1] "As shown in Table 2, policies that obtain higher validation accuracy correspondingly have higher fine-tuned accuracy. This enables us to predict final model accuracy without fine-tuning, which results in an efficient and faster policy exploration" Model accuracy interpreted as synonymous with first limitation requirement and resource constraint interpreted as synonymous with second limitation requirement. He teaches estimating/predicting both the accuracy as well as the action space which is dependent on the resource constraint.)
	the portion selecting module selects a first portion in which the first estimated performance deviates from the first limitation requirement in the neural network and a second portion in which the second estimated performance deviates from the second limitation requirement, and ([p. 3 §1] "We achieve resource-constrained compression by constraining the search space, in which the action space (pruning ratio) is constrained such that the model compressed by the agent is always below the resources budget"  [p. 5 §3.2] "As illustrated in Figure 1, the agent receives an embedding state st of layer Lt from the environment and then outputs a sparsity ratio as action at. The underlying layer is compressed with at (rounded to the nearest feasible fraction) using a specified compression algorithm (e.g., channel pruning). Then the agent moves to the next layer Lt+1, and receives state st+1" [p. 10 §4.1] "we follow the settings in [16] to conduct 4-iteration pruning & fine-tuning experiments, where the overall density of the full model is set to [50%, 35%, 25% and 20%] in each iteration. For each stage, we run AMC to determine the sparsity ratio of each layer given the overall sparsity. The model is then pruned and fine-tuned for 30 epochs following common protocol." He explicitly teaches that the model is pruned on a layer by layer basis based on the action space (pruning ratio) which is a function of the resource budget (limitation requirements).  Pruned layer Lt interpreted as synonymous with first portion.  Pruned layer Lt+1 interpreted as synonymous with second portion.).
	the new neural network generating module changes the first portion to a first optimal layer structure and changes the second portion to a second optimal layer structure to generate the new neural network, the first optimal layer structure is a layer structure determined through reinforcement learning from the layer structure included in the first portion, and the second optimal layer structure is a layer structure determined through reinforcement learning from the layer structure included in the second portion. ([p. 4 §3] "We train an reinforcement learning agent to predict the action and give the sparsity, then perform form the pruning. We quickly evaluate the accuracy after pruning but before fine-tuning as an effective delegate of final accuracy." [p. 5 §3.2] "As illustrated in Figure 1, the agent receives an embedding state st of layer Lt from the environment and then outputs a sparsity ratio as action at. The underlying layer is compressed with at (rounded to the nearest feasible fraction) using a specified compression algorithm (e.g., channel pruning). Then the agent moves to the next layer Lt+1, and receives state st+1" See also FIG. 1. Generating a subset interpreted as synonymous with generating a new neural network by pruning.  Pruned layer Lt interpreted as synonymous with first portion changed to a first optimal layer structure.  Pruned layer Lt+1 interpreted as synonymous with second portion changed to a second optimal layer structure.). 

Regarding claims 19-20, claims 19-20 are directed towards the method performed by the device of claims 1-2, respectively.  Therefore, the rejection applied to claims 1-2 also applies to claims 19-20.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

	Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over He in view of Yan (US20200234130A1). 

	Regarding claim 5, He teaches The neural network optimizing device of claim 1, wherein the final neural network output module includes: a final neural network performance check module configured to check the performance of the final neural network; and ([p. 9 §4.1] "Fine-tuning a pruned model usually takes a very long time. We observe a correlation between the pre-fine-tune accuracy and the post fine-tuning accuracy [20, 22]. As shown in Table 2, policies that obtain higher validation accuracy correspondingly have higher fine-tuned accuracy. This enables us to predict final model accuracy without fine-tuning, which results in an efficient and faster policy exploration" Observing post-fine tune accuracy interpreted as synonymous with checking the performance of the final neural network.).
	However, He does not explicitly teach a final output module configured to output the final neural network.  

Yan, in the same field of endeavor, teaches a final output module configured to output the final neural network. ([¶0232] " in one embodiment, training/fine-tuning logic 2111 may then be triggered to train or fine-tune the narrow CNN and, if necessitated, continue to repeat one or more the above operations. In one embodiment, communication/compatibility logic 2107 may be used to output the final narrow CNN network structure and models."). 

	He and Yan are both directed towards autonomously pruning neural networks for acceleration and are therefore analogous art in the same field of endeavor.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of He with the teachings of Yan by outputting the final optimized neural network, because one of ordinary skill in the art would have recognized that any form of processed data could be output.  Yan provides additional motivation for the combination ([¶035] “this novel technique allows for yielding of three times (3×) the parameters savings of model size, 3× the runtime memory savings during inference procedures, and two times (2×) the speedup with standard floating-point hardware.”).  This motivation for combination also applies to the remaining claims which depend on this combination. 

	Regarding claim 15, He teaches The neural network optimizing device of claim 10, wherein the final neural network output module includes: a final neural network performance check module configured to check the performance of the final neural network; and ([p. 9 §4.1] "Fine-tuning a pruned model usually takes a very long time. We observe a correlation between the pre-fine-tune accuracy and the post fine-tuning accuracy [20, 22]. As shown in Table 2, policies that obtain higher validation accuracy correspondingly have higher fine-tuned accuracy. This enables us to predict final model accuracy without fine-tuning, which results in an efficient and faster policy exploration" Observing post-fine tune accuracy interpreted as synonymous with checking the performance of the final neural network.).
	However, He does not explicitly teach a final output module configured to output the final neural network.  

Yan, in the same field of endeavor, teaches a final output module configured to output the final neural network. ([¶0232] " in one embodiment, training/fine-tuning logic 2111 may then be triggered to train or fine-tune the narrow CNN and, if necessitated, continue to repeat one or more the above operations. In one embodiment, communication/compatibility logic 2107 may be used to output the final narrow CNN network structure and models."). 

	He and Yan are both directed towards autonomously pruning neural networks for acceleration and are therefore analogous art in the same field of endeavor.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of He with the teachings of Yan by outputting the final optimized neural network, because one of ordinary skill in the art would have recognized that any form of processed data could be output.  Yan provides additional motivation for the combination ([¶035] “this novel technique allows for yielding of three times (3×) the parameters savings of model size, 3× the runtime memory savings during inference procedures, and two times (2×) the speedup with standard floating-point hardware.”).  This motivation for combination also applies to the remaining claims which depend on this combination. 

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Dong (“DPP-Net: Device-aware Progressive Search for Pareto-optimal Neural Architectures”, 2018) is directed towards autonomously optimizing neural networks with respect to both accuracy and device efficiency.  Marculescu (“Hardware-Aware Machine Learning: Modeling and Optimization”, 2018) is directed towards autonomously optimizing neural networks with respect to hardware.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720.  The examiner can normally be reached on M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached on (571)270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/SB/Examiner, Art Unit 2124
/LUIS A SITIRICHE/Primary Examiner, Art Unit 2126