DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This office action is in response to an Amendment/Argument submitted on 09/09/2022. The applicant did not submit an Information Disclosure Statement. Applicant amends claims 1-8 and 10-20. The Section 101 rejection regarding Non-Transitory is withdrawn. The Section 112 rejection regarding the use of the word “if” is withdrawn.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-10 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea of a mental concept of evaluation or observation without significantly more. The claims recite a neural network training method. The claims fail the first prong of the 2019 Subject Matter Eligibility Guidance. The independent claim features of training a neural network using sample data, determining an indicator parameter, determining an update manner, and updating a parameter of a batch normalization layer are broad. The features do not recite with specificity any structure that gathers specific sample data. The claims do not further identify what the neural network is trained to do. Example 39 of the USPTO guidance shows the requirements for claiming the training of a neural network. The example is specific: digital facial images are collected from a database, the images are processed through transformation operations, and the network is trained through the creation of training sets.
This judicial exception is not integrated into a practical application because the claims do not identify what the network is trained to do. 
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the claims do not identify any structure such as a database, any operations that are performed on the data, or how the neural network is trained to perform a specific operation. Thus, the claims fail the second prong of the 2019 Subject Matter Eligibility Guidance and are not patentable.
Claims 11-19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea of a mental concept of evaluation or observation without significantly more. The claims recite a neural network training apparatus. The claims fail the first prong of the 2019 Subject Matter Eligibility Guidance. The independent claim features of training a neural network using sample data, determining an indicator parameter, determining an update manner, and updating a parameter of a batch normalization layer are broad. The features do not recite with specificity any structure that gathers specific sample data. The claims do not further identify what the neural network is trained to do. Example 39 of the USPTO guidance shows the requirements for claiming the training of a neural network. The example is specific: digital facial images are collected from a database, the images are processed through transformation operations, and the network is trained through the creation of training sets.
This judicial exception is not integrated into a practical application because the claims do not identify what the network is trained to do. The independent claim identifies generic structural features but does not specifically identify structure that collects particular data. The operation is generic data gathering, and the claim does not identify what the network is trained to perform.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the claims do not identify any structure such as a database, any operations that are performed on the data, or how the neural network is trained to perform a specific operation. Thus, the claims fail the second prong of the 2019 Subject Matter Eligibility Guidance and are not patentable.
Claim 20 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea of a mental concept of evaluation or observation without significantly more. The claim recites a computer readable medium on which a computer program instruction is stored. The claim fails the first prong of the 2019 Subject Matter Eligibility Guidance. The independent claim features of training a neural network using sample data, determining an indicator parameter, determining an update manner, and updating a parameter of a batch normalization layer are broad. The features do not recite with specificity any structure that gathers specific sample data. The claim does not further identify what the neural network is trained to do. Example 39 of the USPTO guidance shows the requirements for claiming the training of a neural network. The example is specific: digital facial images are collected from a database, the images are processed through transformation operations, and the network is trained through the creation of training sets.
This judicial exception is not integrated into a practical application because the claims do not identify what the network is trained to do. 
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the claim does not identify any structure such as a database, any operations that are performed on the data, or how the neural network is trained to perform a specific operation. Thus, the claim fails the second prong of the 2019 Subject Matter Eligibility Guidance and is not patentable.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. The claims do not identify any specific features of the sample data, parameters, or threshold for which the operations are to be performed. Claims 7, 10, 17, and 19 recite a feature map; however, that feature is not determinative in the training of the neural network.
Claims 1, 11, and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. The claims are amended to add the feature: “wherein the update manner is configured to increase a quantity of zero elements in a feature map output by the first neural network”. The clause is indefinite in that the update is zero to a feature map. Therefore, as written, the clause does not update a feature map.


As per claim 1, A neural network training method, comprising:
training a first neural network to be trained by using sample image data; (Towal paragraph 0071 discloses, “As an example, the network may be tasked with discriminating between a dog and a cat. In this example, a limited number of training samples or an error in training may be present.”)
determining an indicator parameter of the first neural network in a current training process; (Towal paragraph 0076 discloses, “FIG. 8 illustrates an example of filters 800 trained from a first training iteration (epoch 1) and the same filters 800 after a ninetieth training iteration (epoch 90). The training iterations may sometimes be referred to as training passes. In this example, a data set may have a specific number of images, such as ten thousand. The training uses the images from the data sets to adjust the weights of the filters based on the weight update equation (EQUATION 3). The weights of the filters may be adjusted after training on a specific number of images from the data set, such as one hundred images.”)
determining an update manner corresponding to a preset condition if the indicator parameter meets the preset condition, wherein the update manner is configured to increase a quantity of zero elements in a feature map output by the first neural network; (Towal paragraph 0057 discloses, “The results of the deep neural network may then be thresholded 522 and passed through an exponential smoothing block 524 in the classify application 510.”) and
updating a parameter of a batch normalization layer in the first neural network based on the update manner. (Towal paragraph 0063 discloses, “The first convolution layer 604 outputs the results of the convolution to the second convolution layer 606. Furthermore, the second convolution layer 606 outputs the results of the convolution to a third convolution layer 608. Finally, a predicted label 610 is output from the third convolution layer 608. Of course, aspects of the present disclosure are not limited to three convolution layers and more or less convolution layers may be specified as desired.”)
As per claim 2, The neural network training method of claim 1, wherein the determining an update manner corresponding to a preset condition when the indicator parameter meets the preset condition comprises:
When the indicator parameter meets the preset condition, determining that the update manner is to reduce a translation parameter of the batch normalization layer by a sum of a penalty parameter and a product of a gradient and a learning rate that are updated when each training is performed through backpropagation. (Towal paragraph 0043 discloses, “In lower layers, the gradient may depend on the value of the weights and on the computed error gradients of the higher layers. The weights may then be adjusted so as to reduce the error. This manner of adjusting the weights may be referred to as “back propagation” as it involves a “backward pass” through the neural network.”)
As per claim 3, The neural network training method of claim 2, wherein the determining an update manner corresponding to a preset condition when the indicator parameter meets the preset condition further comprises:
When the indicator parameter does not meet the preset condition, determining that the update manner is to reduce a translation parameter of the batch normalization layer by a product of a gradient and a learning rate that are updated when each training is performed through backpropagation. (Towal paragraph 0083 discloses, “In another configuration, the training of filters that have a particular specificity is terminated to reduce computation costs. That is, the learning of filters that have a specificity that is greater than or equal to a threshold is stopped so that the weights of the filters are no longer updated.”)
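For illustration only, the two update manners recited in claims 2 and 3 can be sketched as follows. This is a minimal sketch and is not taken from the application or from Towal; the function name update_translation and the parameter names beta, grad, lr, and penalty are hypothetical.

```python
def update_translation(beta, grad, lr, penalty, condition_met):
    """Return the updated translation (shift) parameter of a batch normalization layer.

    All names are hypothetical and chosen only to mirror the claim language.
    """
    step = lr * grad  # product of the gradient and the learning rate
    if condition_met:
        # Claim 2: reduce by the sum of the penalty parameter and lr * grad.
        return beta - (penalty + step)
    # Claim 3: reduce by lr * grad only.
    return beta - step
```

Under this reading, the only difference between the two manners is the additional penalty term subtracted when the preset condition is met.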
As per claim 4, The neural network training method of claim 1, wherein the determining an indicator parameter of the first neural network in a current training process comprises:
determining a translation parameter of the batch normalization layer of the first neural network in the current training process, wherein the determining an update manner corresponding to a preset condition when the indicator parameter meets the preset condition comprises: (Towal paragraph 0059 discloses, “According to certain aspects of the present disclosure, each local processing unit 202 may be configured to determine parameters of the model based upon desired one or more functional features of the model, and develop the one or more functional features towards the desired functional features as the determined parameters are further adapted, tuned and updated.”)
determining the update manner corresponding to the preset condition when the translation parameter is greater than a predetermined translation threshold. (Towal paragraph 0059 discloses, “According to certain aspects of the present disclosure, each local processing unit 202 may be configured to determine parameters of the model based upon desired one or more functional features of the model, and develop the one or more functional features towards the desired functional features as the determined parameters are further adapted, tuned and updated.”)
As per claim 5, The neural network training method of claim 1, wherein the determining an indicator parameter of the first neural network in a current training process comprises:
determining times of training of the first neural network in the current training process, wherein the determining an update manner corresponding to a preset condition when the indicator parameter meets the preset condition comprises: (Towal paragraph 0093 discloses, “In one configuration, if the determined specificity of a filter is greater than a threshold, the network stops training the filter (block 1208). Additionally, or alternatively, the network stops training the filter (block 1208) when a change in the specificity of the specific filter is less than a threshold after the predetermined number of training iterations. In another configuration, as shown in block 1210, a filter is eliminated from the neural network model when the specificity of the specific filter is less than a threshold after the predetermined number of training iterations.”)
determining the update manner corresponding to the preset condition when the times of training is greater than a predetermined times threshold. (Towal paragraph 0093 discloses, “In one configuration, if the determined specificity of a filter is greater than a threshold, the network stops training the filter (block 1208). Additionally, or alternatively, the network stops training the filter (block 1208) when a change in the specificity of the specific filter is less than a threshold after the predetermined number of training iterations. In another configuration, as shown in block 1210, a filter is eliminated from the neural network model when the specificity of the specific filter is less than a threshold after the predetermined number of training iterations.”)
As per claim 6, The neural network training method of claim 1, wherein the determining an indicator parameter of the first neural network in a current training process comprises:
determining training precision of the first neural network in the current training process, wherein the determining an update manner corresponding to a preset condition when the indicator parameter meets the preset condition comprises: (Towal paragraph 0093 discloses, “In one configuration, if the determined specificity of a filter is greater than a threshold, the network stops training the filter (block 1208). Additionally, or alternatively, the network stops training the filter (block 1208) when a change in the specificity of the specific filter is less than a threshold after the predetermined number of training iterations. In another configuration, as shown in block 1210, a filter is eliminated from the neural network model when the specificity of the specific filter is less than a threshold after the predetermined number of training iterations.”)
determining the update manner corresponding to the preset condition when the training precision is greater than a predetermined precision threshold. (Towal paragraph 0093 discloses, “In one configuration, if the determined specificity of a filter is greater than a threshold, the network stops training the filter (block 1208). Additionally, or alternatively, the network stops training the filter (block 1208) when a change in the specificity of the specific filter is less than a threshold after the predetermined number of training iterations. In another configuration, as shown in block 1210, a filter is eliminated from the neural network model when the specificity of the specific filter is less than a threshold after the predetermined number of training iterations.”)
As per claim 7, The neural network training method of claim 1, wherein the determining an indicator parameter of the first neural network in a current training process comprises:
determining a ratio of zero elements to all elements in a feature map output from one or more layers of the first neural network in the current training process, wherein the determining an update manner corresponding to a preset condition when the indicator parameter meets the preset condition comprises: (Towal paragraph 0070 discloses, “FIG. 7 illustrates a set of weak filters 702 compared to a set of strong filters 704. As shown in FIG. 7, the weak filters 702 do not have specific definitions. For example, each of the weak filters 702 is generalized and does not have a well-defined outline. In contrast, the definition of the strong filters 704 is greater than the definition of the weak filters 702, such that various lines and angles are visible. The strong filters 704 improve the detection of specific features of an input, such as whether one or more horizontal lines are present in an image.”)
determining the update manner corresponding to the preset condition when the ratio of zero elements to all elements is less than a first ratio threshold. (Towal paragraph 0070 discloses, “FIG. 7 illustrates a set of weak filters 702 compared to a set of strong filters 704. As shown in FIG. 7, the weak filters 702 do not have specific definitions. For example, each of the weak filters 702 is generalized and does not have a well-defined outline. In contrast, the definition of the strong filters 704 is greater than the definition of the weak filters 702, such that various lines and angles are visible. The strong filters 704 improve the detection of specific features of an input, such as whether one or more horizontal lines are present in an image.”)
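For illustration only, the indicator parameter and preset condition recited in claim 7 can be sketched as follows. This is a minimal sketch under the assumption that a feature map is represented as a list of rows of numbers; the function names zero_ratio and meets_preset_condition are hypothetical and do not appear in the application or in Towal.

```python
def zero_ratio(feature_map):
    """Return the ratio of zero elements to all elements in a 2-D feature map."""
    values = [v for row in feature_map for v in row]
    if not values:
        raise ValueError("feature map is empty")
    return sum(1 for v in values if v == 0) / len(values)


def meets_preset_condition(feature_map, first_ratio_threshold):
    # Claim 7: the condition is met when the zero ratio is less than the threshold.
    return zero_ratio(feature_map) < first_ratio_threshold
```

For example, a feature map [[0, 1.5], [0, 2.0]] has a zero ratio of 0.5, so the condition is met for a threshold of 0.6 but not for a threshold of 0.5.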
As per claim 8, The neural network training method of claim 7, wherein the determining an update manner corresponding to a preset condition when the indicator parameter meets the preset condition comprises:
determining whether the ratio of zero elements to all elements is less than a second ratio threshold when the ratio of zero elements to all elements is greater than the first ratio threshold, wherein the second ratio threshold is greater than the first ratio threshold; (Towal paragraph 0077 discloses, “As shown in FIG. 8 at the first training pass, each filter has a specific entropy. For example, a first filter 802 has an entropy of 2.006 and a second filter 804 has an entropy of 2.018. The filters in the first training pass are ordered from low entropy to high entropy. Furthermore, as shown in FIG. 8, the entropy of each filter is modified after the ninetieth training pass (epoch 90). The filters in the ninetieth training pass are ordered from low entropy to high entropy. It should be noted that because the filters in both epoch 1 and epoch 90 are ordered from low entropy to high entropy, the same filters do not have the same positions in each figure. That is, the first filter 808 of epoch 1 may or may not be the first filter 808 of epoch 90. In other words, the first filter 802 of epoch 1 may have had a greater change in entropy in comparison to neighboring filters such that the first filter 802 of epoch 1 may be, for example, an eleventh filter 814 of epoch 90.”) and
determining the update manner corresponding to the preset condition when the ratio of zero elements to all elements is less than the second ratio threshold. (Towal paragraph 0077 discloses, “As shown in FIG. 8 at the first training pass, each filter has a specific entropy. For example, a first filter 802 has an entropy of 2.006 and a second filter 804 has an entropy of 2.018. The filters in the first training pass are ordered from low entropy to high entropy. Furthermore, as shown in FIG. 8, the entropy of each filter is modified after the ninetieth training pass (epoch 90). The filters in the ninetieth training pass are ordered from low entropy to high entropy. It should be noted that because the filters in both epoch 1 and epoch 90 are ordered from low entropy to high entropy, the same filters do not have the same positions in each figure. That is, the first filter 808 of epoch 1 may or may not be the first filter 808 of epoch 90. In other words, the first filter 802 of epoch 1 may have had a greater change in entropy in comparison to neighboring filters such that the first filter 802 of epoch 1 may be, for example, an eleventh filter 814 of epoch 90.”)
As per claim 9, The neural network training method of claim 7, wherein the first ratio threshold is updated as a number of iterations increases. (Towal paragraph 0030 discloses, “Specifically, in one configuration, when training a neural network model, a specificity of one or more filters is determined after a predetermined number of training iterations. Furthermore, in this configuration, the network determines whether to continue training each filter based on the specificity.” and paragraph 0063 discloses various iterations)
As per claim 10, The neural network training method of claim 1, wherein the determining an indicator parameter of the first neural network in a current training process comprises:
outputting a first feature map of the sample data through a predetermined layer of the first neural network in the current training process; (Towal paragraph 0049 discloses, “The outputs of the convolutional connections may be considered to form a feature map in the subsequent layer 318 and 320, with each element of the feature map (e.g., 320) receiving input from a range of neurons in the previous layer (e.g., 318) and from each of the multiple channels. The values in the feature map may be further processed with a non-linearity, such as a rectification, max(0,x). Values from adjacent neurons may be further pooled, which corresponds to down sampling, and may provide additional local invariance and dimensionality reduction. Normalization, which corresponds to whitening, may also be applied through lateral inhibition between neurons in the feature map.”)
outputting a second feature map of the sample data through a corresponding predetermined layer of a trained second neural network; (Towal paragraph 0049 discloses, “The outputs of the convolutional connections may be considered to form a feature map in the subsequent layer 318 and 320, with each element of the feature map (e.g., 320) receiving input from a range of neurons in the previous layer (e.g., 318) and from each of the multiple channels. The values in the feature map may be further processed with a non-linearity, such as a rectification, max(0,x). Values from adjacent neurons may be further pooled, which corresponds to down sampling, and may provide additional local invariance and dimensionality reduction. Normalization, which corresponds to whitening, may also be applied through lateral inhibition between neurons in the feature map.”) and
determining the indicator parameter of the first neural network in the current training process based on a loss function value between the first feature map and the second feature map. (Towal paragraph 0049 discloses, “The outputs of the convolutional connections may be considered to form a feature map in the subsequent layer 318 and 320, with each element of the feature map (e.g., 320) receiving input from a range of neurons in the previous layer (e.g., 318) and from each of the multiple channels. The values in the feature map may be further processed with a non-linearity, such as a rectification, max(0,x). Values from adjacent neurons may be further pooled, which corresponds to down sampling, and may provide additional local invariance and dimensionality reduction. Normalization, which corresponds to whitening, may also be applied through lateral inhibition between neurons in the feature map.”)
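For illustration only, the loss function value between the first feature map and the second feature map recited in claim 10 can be sketched as follows. The claim does not name a particular loss function; mean squared error is assumed here purely as an example, and the function name feature_map_loss is hypothetical.

```python
def feature_map_loss(first_map, second_map):
    """Return the mean squared error between two equally sized 2-D feature maps.

    Mean squared error is an assumption; the claim does not specify the loss.
    """
    a = [v for row in first_map for v in row]
    b = [v for row in second_map for v in row]
    if not a or len(a) != len(b):
        raise ValueError("feature maps must be non-empty and the same size")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
```

In this sketch, first_map would be output by the predetermined layer of the first neural network and second_map by the corresponding layer of the trained second neural network.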
As per claim 11, A neural network training apparatus, comprising:
a processor; (Towal paragraph 0013 discloses, “Another aspect of the present disclosure is directed to an apparatus for training a neural network model having a memory and one or more processors coupled to the memory. The processor(s) is configured to determine a specificity of multiple filters after a predetermined number of training iterations. The processor(s) is also configured to train each of the filters based on the specificity.”) and
a memory on which a computer program instruction is stored, wherein when the computer program instruction is executed by the processor, the processor performs the following steps: (Towal paragraph 0013 discloses, “Another aspect of the present disclosure is directed to an apparatus for training a neural network model having a memory and one or more processors coupled to the memory. The processor(s) is configured to determine a specificity of multiple filters after a predetermined number of training iterations. The processor(s) is also configured to train each of the filters based on the specificity.”)
training a first neural network to be trained by using sample data; (Towal paragraph 0071 discloses, “As an example, the network may be tasked with discriminating between a dog and a cat. In this example, a limited number of training samples or an error in training may be present.”)
determining an indicator parameter of the first neural network in a current training process; (Towal paragraph 0093 discloses, “In one configuration, if the determined specificity of a filter is greater than a threshold, the network stops training the filter (block 1208). Additionally, or alternatively, the network stops training the filter (block 1208) when a change in the specificity of the specific filter is less than a threshold after the predetermined number of training iterations. In another configuration, as shown in block 1210, a filter is eliminated from the neural network model when the specificity of the specific filter is less than a threshold after the predetermined number of training iterations.”)
determining an update manner corresponding to a preset condition if the indicator parameter meets the preset condition, wherein the update manner is configured to increase a quantity of zero elements in a feature map output by the first neural network; (Towal paragraph 0057 discloses, “The results of the deep neural network may then be thresholded 522 and passed through an exponential smoothing block 524 in the classify application 510.”) and
updating a parameter of a batch normalization layer in the first neural network based on the update manner. (Towal paragraph 0063 discloses, “The first convolution layer 604 outputs the results of the convolution to the second convolution layer 606. Furthermore, the second convolution layer 606 outputs the results of the convolution to a third convolution layer 608. Finally, a predicted label 610 is output from the third convolution layer 608. Of course, aspects of the present disclosure are not limited to three convolution layers and more or less convolution layers may be specified as desired.”)
As per claim 12, The neural network training apparatus of claim 11, wherein the determining an update manner corresponding to a preset condition if the indicator parameter meets the preset condition comprises:
if the indicator parameter meets the preset condition, determining that the update manner is to reduce a translation parameter of the batch normalization layer by a sum of a penalty parameter and a product of a gradient and a learning rate that are updated when each training is performed through backpropagation. (Towal paragraph 0043 discloses, “In lower layers, the gradient may depend on the value of the weights and on the computed error gradients of the higher layers. The weights may then be adjusted so as to reduce the error. This manner of adjusting the weights may be referred to as “back propagation” as it involves a “backward pass” through the neural network.”)
As per claim 13, The neural network training apparatus of claim 12, wherein the determining an update manner corresponding to a preset condition if the indicator parameter meets the preset condition further comprises:
if the indicator parameter does not meet the preset condition, determining that the update manner is to reduce a translation parameter of the batch normalization layer by a product of a gradient and a learning rate that are updated when each training is performed through backpropagation. (Towal paragraph 0083 discloses, “In another configuration, the training of filters that have a particular specificity is terminated to reduce computation costs. That is, the learning of filters that have a specificity that is greater than or equal to a threshold is stopped so that the weights of the filters are no longer updated.”)
As per claim 14, The neural network training apparatus of claim 11, wherein the determining an indicator parameter of the first neural network in a current training process comprises:
determining a translation parameter of the batch normalization layer of the first neural network in the current training process; wherein the determining an update manner corresponding to a preset condition if the indicator parameter meets the preset condition comprises: (Towal paragraph 0059 discloses, “According to certain aspects of the present disclosure, each local processing unit 202 may be configured to determine parameters of the model based upon desired one or more functional features of the model, and develop the one or more functional features towards the desired functional features as the determined parameters are further adapted, tuned and updated.”)
determining the update manner corresponding to the preset condition if the translation parameter is greater than a predetermined translation threshold. (Towal paragraph 0059 discloses, “According to certain aspects of the present disclosure, each local processing unit 202 may be configured to determine parameters of the model based upon desired one or more functional features of the model, and develop the one or more functional features towards the desired functional features as the determined parameters are further adapted, tuned and updated.”)
As per claim 15, The neural network training apparatus of claim 11, wherein the determining an indicator parameter of the first neural network in a current training process comprises:
determining times of training of the first neural network in the current training process; wherein the determining an update manner corresponding to a preset condition if the indicator parameter meets the preset condition comprises: (Towal paragraph 0093 discloses, “In one configuration, if the determined specificity of a filter is greater than a threshold, the network stops training the filter (block 1208). Additionally, or alternatively, the network stops training the filter (block 1208) when a change in the specificity of the specific filter is less than a threshold after the predetermined number of training iterations. In another configuration, as shown in block 1210, a filter is eliminated from the neural network model when the specificity of the specific filter is less than a threshold after the predetermined number of training iterations.”)
determining the update manner corresponding to the preset condition if the times of training is greater than a predetermined times threshold. (Towal paragraph 0093 discloses, “In one configuration, if the determined specificity of a filter is greater than a threshold, the network stops training the filter (block 1208). Additionally, or alternatively, the network stops training the filter (block 1208) when a change in the specificity of the specific filter is less than a threshold after the predetermined number of training iterations. In another configuration, as shown in block 1210, a filter is eliminated from the neural network model when the specificity of the specific filter is less than a threshold after the predetermined number of training iterations.”)
As per claim 16, The neural network training apparatus of claim 11, wherein the determining an indicator parameter of the first neural network in a current training process comprises:
determining training precision of the first neural network in the current training process; wherein the determining an update manner corresponding to a preset condition if the indicator parameter meets the preset condition comprises: (Towal paragraph 0093 discloses, “In one configuration, if the determined specificity of a filter is greater than a threshold, the network stops training the filter (block 1208). Additionally, or alternatively, the network stops training the filter (block 1208) when a change in the specificity of the specific filter is less than a threshold after the predetermined number of training iterations. In another configuration, as shown in block 1210, a filter is eliminated from the neural network model when the specificity of the specific filter is less than a threshold after the predetermined number of training iterations.”)
determining the update manner corresponding to the preset condition if the training precision is greater than a predetermined precision threshold. (Towal paragraph 0093 discloses, “In one configuration, if the determined specificity of a filter is greater than a threshold, the network stops training the filter (block 1208). Additionally, or alternatively, the network stops training the filter (block 1208) when a change in the specificity of the specific filter is less than a threshold after the predetermined number of training iterations. In another configuration, as shown in block 1210, a filter is eliminated from the neural network model when the specificity of the specific filter is less than a threshold after the predetermined number of training iterations.”)
As per claim 17, The neural network training apparatus of claim 11, wherein the determining an indicator parameter of the first neural network in a current training process comprises:
determining a ratio of zero elements to all elements in a feature map output from one or more layers of the first neural network in the current training process; wherein the determining an update manner corresponding to a preset condition if the indicator parameter meets the preset condition comprises: (Towal paragraph 0070 discloses, “FIG. 7 illustrates a set of weak filters 702 compared to a set of strong filters 704. As shown in FIG. 7, the weak filters 702 do not have specific definitions. For example, each of the weak filters 702 is generalized and does not have a well-defined outline. In contrast, the definition of the strong filters 704 is greater than the definition of the weak filters 702, such that various lines and angles are visible. The strong filters 704 improve the detection of specific features of an input, such as whether one or more horizontal lines are present in an image.”)
determining the update manner corresponding to the preset condition if the ratio of zero elements to all elements is less than a first ratio threshold. (Towal paragraph 0070 discloses, “FIG. 7 illustrates a set of weak filters 702 compared to a set of strong filters 704. As shown in FIG. 7, the weak filters 702 do not have specific definitions. For example, each of the weak filters 702 is generalized and does not have a well-defined outline. In contrast, the definition of the strong filters 704 is greater than the definition of the weak filters 702, such that various lines and angles are visible. The strong filters 704 improve the detection of specific features of an input, such as whether one or more horizontal lines are present in an image.”)
As per claim 18, The neural network training apparatus of claim 17, wherein the determining an update manner corresponding to a preset condition if the indicator parameter meets the preset condition comprises:
determining whether the ratio of zero elements to all elements is less than a second ratio threshold if the ratio of zero elements to all elements is greater than the first ratio threshold, wherein the second ratio threshold is greater than the first ratio threshold; (Towal paragraph 0077 discloses, “As shown in FIG. 8 at the first training pass, each filter has a specific entropy. For example, a first filter 802 has an entropy of 2.006 and a second filter 804 has an entropy of 2.018. The filters in the first training pass are ordered from low entropy to high entropy. Furthermore, as shown in FIG. 8, the entropy of each filter is modified after the ninetieth training pass (epoch 90). The filters in the ninetieth training pass are ordered from low entropy to high entropy. It should be noted that because the filters in both epoch 1 and epoch 90 are ordered from low entropy to high entropy, the same filters do not have the same positions in each figure. That is, the first filter 808 of epoch 1 may or may not be the first filter 808 of epoch 90. In other words, the first filter 802 of epoch 1 may have had a greater change in entropy in comparison to neighboring filters such that the first filter 802 of epoch 1 may be, for example, an eleventh filter 814 of epoch 90.”) and
determining the update manner corresponding to the preset condition the ratio of zero elements to all elements is less than the second ratio threshold. (Towal paragraph 0077 discloses, “As shown in FIG. 8 at the first training pass, each filter has a specific entropy. For example, a first filter 802 has an entropy of 2.006 and a second filter 804 has an entropy of 2.018. The filters in the first training pass are ordered from low entropy to high entropy. Furthermore, as shown in FIG. 8, the entropy of each filter is modified after the ninetieth training pass (epoch 90). The filters in the ninetieth training pass are ordered from low entropy to high entropy. It should be noted that because the filters in both epoch 1 and epoch 90 are ordered from low entropy to high entropy, the same filters do not have the same positions in each figure. That is, the first filter 808 of epoch 1 may or may not be the first filter 808 of epoch 90. In other words, the first filter 802 of epoch 1 may have had a greater change in entropy in comparison to neighboring filters such that the first filter 802 of epoch 1 may be, for example, an eleventh filter 814 of epoch 90.”)
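For illustration only, the two-threshold comparison recited in claims 17 and 18 may be sketched as follows. This is a minimal sketch and is not taken from the application or from Towal; the function names, the nested-list representation of a feature map, and the handling of a ratio exactly equal to the first threshold are assumptions.

```python
def zero_ratio(feature_map):
    """Ratio of zero elements to all elements in a 2-D feature map (nested lists)."""
    flat = [value for row in feature_map for value in row]
    if not flat:
        raise ValueError("feature map is empty")
    return sum(1 for value in flat if value == 0) / len(flat)


def update_manner_is_determined(ratio, first_threshold, second_threshold):
    """Hypothetical reading of the threshold logic in claims 17 and 18."""
    if not first_threshold < second_threshold:
        raise ValueError("second threshold must be greater than the first")
    if ratio < first_threshold:
        # Claim 17: ratio is less than the first ratio threshold.
        return True
    if ratio > first_threshold:
        # Claim 18: ratio exceeds the first threshold, so compare it
        # against the second (larger) threshold.
        return ratio < second_threshold
    # Assumption: the quoted claims do not address ratio == first_threshold.
    return False


example_map = [[0, 1], [2, 0]]
example_ratio = zero_ratio(example_map)  # 2 zeros out of 4 elements
```

Under these assumptions, a ratio below the second threshold (other than one exactly equal to the first threshold) leads to the update manner being determined, and a ratio at or above the second threshold does not.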
As per claim 19, The neural network training apparatus of claim 11, wherein the determining an indicator parameter of the first neural network in a current training process comprises:
outputting a first feature map of the sample data through a predetermined layer of the first neural network in the current training process; (Towal paragraph 0049 discloses, “The outputs of the convolutional connections may be considered to form a feature map in the subsequent layer 318 and 320, with each element of the feature map (e.g., 320) receiving input from a range of neurons in the previous layer (e.g., 318) and from each of the multiple channels. The values in the feature map may be further processed with a non-linearity, such as a rectification, max(0,x). Values from adjacent neurons may be further pooled, which corresponds to down sampling, and may provide additional local invariance and dimensionality reduction. Normalization, which corresponds to whitening, may also be applied through lateral inhibition between neurons in the feature map.”)
outputting a second feature map of the sample data through a corresponding predetermined layer of a trained second neural network; (Towal paragraph 0049 discloses, “The outputs of the convolutional connections may be considered to form a feature map in the subsequent layer 318 and 320, with each element of the feature map (e.g., 320) receiving input from a range of neurons in the previous layer (e.g., 318) and from each of the multiple channels. The values in the feature map may be further processed with a non-linearity, such as a rectification, max(0,x). Values from adjacent neurons may be further pooled, which corresponds to down sampling, and may provide additional local invariance and dimensionality reduction. Normalization, which corresponds to whitening, may also be applied through lateral inhibition between neurons in the feature map.”) and
determining the indicator parameter of the first neural network in the current training process based on a loss function value between the first feature map and the second feature map. (Towal paragraph 0049 discloses, “The outputs of the convolutional connections may be considered to form a feature map in the subsequent layer 318 and 320, with each element of the feature map (e.g., 320) receiving input from a range of neurons in the previous layer (e.g., 318) and from each of the multiple channels. The values in the feature map may be further processed with a non-linearity, such as a rectification, max(0,x). Values from adjacent neurons may be further pooled, which corresponds to down sampling, and may provide additional local invariance and dimensionality reduction. Normalization, which corresponds to whitening, may also be applied through lateral inhibition between neurons in the feature map.”)
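For illustration only, an indicator parameter based on a loss function value between two feature maps, as recited in claim 19, may be sketched as follows. Claim 19 as quoted does not name a particular loss function; the mean squared error used here, the function name, and the nested-list feature maps are assumptions and are not taken from the application or from Towal.

```python
def feature_map_mse(first_map, second_map):
    """Mean squared error between two equally sized 2-D feature maps.

    The choice of mean squared error is an assumption made for this sketch.
    """
    first = [value for row in first_map for value in row]
    second = [value for row in second_map for value in row]
    if not first or len(first) != len(second):
        raise ValueError("feature maps must be non-empty and the same size")
    squared_errors = [(a - b) ** 2 for a, b in zip(first, second)]
    return sum(squared_errors) / len(first)


# Hypothetical outputs of the corresponding layers of the network being
# trained (first) and of an already trained network (second).
first_feature_map = [[1.0, 2.0]]
second_feature_map = [[1.0, 4.0]]
indicator = feature_map_mse(first_feature_map, second_feature_map)
```

A smaller value indicates that the first feature map is closer to the second feature map under the assumed loss.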
As per claim 20, A computer readable media on which a computer program instruction is stored, wherein when the computer program instruction is executed by a processor, the processor performs the following steps:
training a first neural network to be trained by using sample data; (Towal paragraph 0071 discloses, “As an example, the network may be tasked with discriminating between a dog and a cat. In this example, a limited number of training samples or an error in training may be present.”)
determining an indicator parameter of the first neural network in a current training process; (Towal paragraph 0093 discloses, “In one configuration, if the determined specificity of a filter is greater than a threshold, the network stops training the filter (block 1208). Additionally, or alternatively, the network stops training the filter (block 1208) when a change in the specificity of the specific filter is less than a threshold after the predetermined number of training iterations. In another configuration, as shown in block 1210, a filter is eliminated from the neural network model when the specificity of the specific filter is less than a threshold after the predetermined number of training iterations.”)
determining an update manner corresponding to a preset condition the indicator parameter meets the preset condition, wherein the update manner is configured to increase a quantity of zero elements in a feature map output by the first neural network; (Towal paragraph 0057 discloses, “The results of the deep neural network may then be thresholded 522 and passed through an exponential smoothing block 524 in the classify application 510.”) and
updating a parameter of a batch normalization layer in the first neural network based on the update manner. (Towal paragraph 0063 discloses, “The first convolution layer 604 outputs the results of the convolution to the second convolution layer 606. Furthermore, the second convolution layer 606 outputs the results of the convolution to a third convolution layer 608. Finally, a predicted label 610 is output from the third convolution layer 608. Of course, aspects of the present disclosure are not limited to three convolution layers and more or less convolution layers may be specified as desired.”)
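For illustration only, one way in which updating a batch normalization parameter could increase the quantity of zero elements in a feature map may be sketched as follows. This is a hypothetical sketch and is not the method of the application or of Towal; it assumes that the batch normalization output passes through a rectification max(0, x), and that the updated parameter is the shift term, here called beta. All names and values are assumptions.

```python
import math


def batch_norm(values, gamma, beta, eps=1e-5):
    """Normalize values to zero mean and unit variance, then scale and shift."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(variance + eps) + beta for v in values]


def relu(values):
    """Rectification max(0, x) applied element-wise."""
    return [max(0.0, v) for v in values]


def count_zeros(values):
    """Number of elements that are exactly zero."""
    return sum(1 for v in values if v == 0.0)


activations = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Assumed update manner: lower the shift parameter beta so that more
# normalized values fall below zero and are rectified to zero.
zeros_before = count_zeros(relu(batch_norm(activations, gamma=1.0, beta=0.5)))
zeros_after = count_zeros(relu(batch_norm(activations, gamma=1.0, beta=-0.5)))
```

In this example, lowering beta from 0.5 to -0.5 raises the number of zero elements after rectification from two to three.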

Response to Arguments
Applicant's arguments filed 09/09/2022 have been fully considered but they are not persuasive. The applicant alleges the claims are compliant with the 2019 Subject Matter Eligibility Guidance. However, the claims do not specify what operation the neural network is trained to perform. The claims do not identify what constitutes an indicator parameter, a preset condition, an update manner, a feature map, or a batch normalization layer. The claims do not identify what data is collected or what collects the data to perform the training. The claims do not identify with specificity what the neural network is trained to perform. Applicant amends the claims to include the feature of “image” to specify the data. However, the claims do not identify what the image actually constitutes or how it is collected. The applicant states the specification gives examples of possible uses for the invention; however, the claims do not recite those features and, thus, those features cannot be read into the claims.
The applicant alleges the art of record does not disclose the claimed invention. Applicant explains features of the art of record. Applicant does not show the functional distinction between the art of record and the claimed invention. Therefore, as applicant has not shown the functional differences, the office concludes the art of record anticipates the claimed invention. The amendment made to the claims does not show how the invention is meaningfully distinguished or different. It appears the amendment is a convoluted way to articulate that the operation maintains the same result based upon the implemented update.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TYLER D PAIGE whose telephone number is (571)270-5425. The examiner can normally be reached M-F 7:00am - 6:00pm (MST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Thomas Black, can be reached at 571-272-6956. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/TYLER D PAIGE/ Primary Examiner, Art Unit 3666