DETAILED ACTION
Status of the Claims
This action is in response to the applicant's amendment filed on 8/24/2022 for application 15/982,756, filed on 5/17/2018. Claims 1 – 9 and 11 – 21 are pending and have been examined.

Claims 1, 3, 5, 11, and 12 are amended.

Claim 10 is canceled.

Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA . 

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1 – 9 and 11 – 21 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.  It is also further noted that dependent claims based upon the rejected claims are also rejected based upon dependency.

Claims 1 and 12 recite the limitation “known importance”. The term “known” is subjective, and the claims do not provide a standard for ascertaining the requisite degree, i.e., whether the “known” importance is based on a measurement of an experimental result or on a prediction over a scaled . One of ordinary skill in the art would not be reasonably apprised of the scope of the invention. All the dependent claims, including Claims 2 – 9, 11, and 13 – 21, are rejected for the same reason.

Claim 3 recites the limitation “the one or more candidate operations”. There is insufficient antecedent basis for this limitation in the claim. For examination purposes, the limitation is interpreted as “the set of candidate operations”.

Response to Argument 
Applicant’s arguments with respect to prior art rejection have been considered but they are not persuasive. 
Applicant states that Loshchilov does not describe a stress indicator that is set to a value from among a range of values based on a known importance of an instance of input data, and thus does not disclose the amended limitation of Claims 1 and 12 (examiner notes these limitations are substantially similar to those previously recited in now canceled Claim 10). Examiner respectfully disagrees. Loshchilov discloses an approach based on a probability that is calculated from the known, most recently computed ψk(x) (Loshchilov, page 4, paragraph 3). The measurement is the accumulation of probabilities, which lies within the range of the summation of the individual probabilities.
 
Claim Objection
Claim 11 is objected to because of the following informalities: the claim is amended without the proper markings required by 37 CFR 1.121. Appropriate correction is required.
 
Claim Rejections - 35 USC § 103 
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.   
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: 
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. 
 
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows: 
1. Determining the scope and contents of the prior art. 
2. Ascertaining the differences between the prior art and the claims at issue. 
3. Resolving the level of ordinary skill in the pertinent art. 
4. Considering objective evidence present in the application indicating obviousness or nonobviousness. 
 
Claims 1 – 3, 5 – 7, 12 – 14, 16 – 18 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Black US6269351 Method and System for Training an Artificial Neural Network, 2001, in view of Loshchilov, Online Batch Selection for Faster Training of Neural Networks, International Conference on Learning Representations, ICLR 2016.

Regarding Claim 1, Black discloses an electronic device, comprising: a computational functional block circuitry and a controller functional block circuitry (Black, col. 1, ln. 5 – 13, where system … automatically select a representative training dataset; i.e., Black discloses an electronic/computational system that performs and controls the programmed logic of a data processing task), the controller functional block circuitry configured to: 
receive a stress indicator associated with an instance of input data for training a neural network to perform a specified task (Black, col. 4, ln. 10 – 15, where an artificial neural network training method and system has an adaptive learning rate with the capability to take progressively larger or smaller steps in subsequent training iterations based on the error [received stress indicator associated with input data] in previous training iterations; fig. 11, where step 72 performs prediction [specified task]), 
select, based on a value of the stress indicator, one or more operations to be performed when training the neural network from among a set of candidate operations that may be performed for training the neural network; and cause the computational functional block to perform the one or more operations when training the neural network (Black, col. 4, ln. 21 – 49, where in the event … achieve the intermediate error goal … add additional representative data records to the training dataset … ; adaptively adding nodes and layers during the training process … when the neural network training method and system fails to accomplish a predetermined error goal; col. 5, ln. 16 – 21, where if the error ratio is less than a predetermined threshold value, the adaptive learning rate can be multiplied by a step up factor to increase the learning rate, and if the error ratio is greater than the threshold value, the adaptive learning rate can be multiplied by a step down factor to reduce the learning rate; i.e., based on the error [stress indicator], the system can add training data, add nodes, or multiply the learning rate by a step up/down factor [set of candidate operations] during the training of the neural network); 
	Black does not explicitly disclose: 
the stress indicator being set to a value from among a range of values based on a known importance of the instance of input data among multiple different instances of input data for training the neural network to perform the specified task; 
	Loshchilov explicitly discloses:
the stress indicator being set to a value (Loshchilov, alg. 3, ln. 24, where aisel [stress indicator]) from among a range of values (Loshchilov, alg. 3, where the cumulative selection probability ai lies within the range of the summation of the individual probabilities pi) based on a known importance of the instance of input data among multiple different instances of input data for training the neural network to perform the specified task (Loshchilov, alg. 3, ln. 9, where aj is calculated based on the cumulative probability of being selected [importance], and the probability is based on the computed [known] ψk(x)); 
Black and Loshchilov both teach fast neural network training methods utilizing a computed loss and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Black’s teaching of the fast neural network learning method with Loshchilov’s teaching of the importance ranking indicator to achieve the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to speed up the training process (Loshchilov, abs. ln. 14). 
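For illustration only, the rank-based selection relied upon above can be sketched as follows. This is a minimal sketch under stated assumptions, not a reproduction of Loshchilov's Algorithm 3: the function names, the `selection_pressure` parameter, and the exact exponential decay formula are hypothetical. It shows datapoints ranked by their latest known loss, a per-rank probability, and a cumulative probability from which the lowest qualifying index is selected.

```python
import math
import random
from bisect import bisect_right
from itertools import accumulate


def selection_probabilities(num_points, selection_pressure):
    """Probabilities that decay exponentially with rank (assumed form).

    Rank 0 is the datapoint with the largest latest-known loss.
    """
    decay = math.exp(math.log(selection_pressure) / num_points)
    raw = [1.0 / decay ** rank for rank in range(num_points)]
    total = sum(raw)
    return [p / total for p in raw]


def select_index(latest_losses, selection_pressure, rng):
    """Return (datapoint index, cumulative probability at the selected rank)."""
    # Rank datapoints by latest known loss, largest first.
    order = sorted(range(len(latest_losses)),
                   key=lambda i: latest_losses[i], reverse=True)
    probs = selection_probabilities(len(order), selection_pressure)
    cumulative = list(accumulate(probs))  # a_i = p_1 + ... + p_i
    r = rng.uniform(0.0, cumulative[-1])
    # Lowest rank whose cumulative probability exceeds r (clamped for safety).
    rank = min(bisect_right(cumulative, r), len(order) - 1)
    return order[rank], cumulative[rank]


if __name__ == "__main__":
    rng = random.Random(0)
    losses = [0.1, 5.0, 0.3, 2.0]  # hypothetical latest-known losses
    print(select_index(losses, 100.0, rng))
```

In this sketch the returned cumulative value always falls between zero and the sum of all per-rank probabilities, which corresponds to the "range of values" reading applied above.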

Regarding Claim 2, depending on Claim 1, Black further discloses: wherein selecting the one or more operations comprises: 
for each candidate operation from among the set of candidate operations comparing the value of the stress indicator to a threshold associated with the candidate operation and when the value of the stress indicator has a specified relationship with the threshold associated with the candidate operation, selecting the candidate operation as one of the one or more operations (Black, col. 4, ln. 21 – 49, where in the event … achieve the intermediate error goal … add additional representative data records to the training dataset … ; adaptively adding nodes and layers during the training process … when the neural network training method and system fails to accomplish a predetermined error goal; col. 5, ln. 16 – 21, where if the error ratio is less than a predetermined threshold value, the adaptive learning rate can be multiplied by a step up factor to increase the learning rate, and if the error ratio is greater than the threshold value, the adaptive learning rate can be multiplied by a step down factor to reduce the learning rate; i.e., the received error [stress indicator] is compared [specified relationship] to the intermediate error goal, the predetermined error goal and the threshold values [thresholds], and the system determines whether to add training data, add nodes, or change the learning rate). 
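For illustration only, the per-operation threshold comparison recited in Claim 2 can be sketched as follows. This is a minimal sketch under stated assumptions: the operation names, threshold values, and comparison directions are hypothetical and are taken from neither the claims nor Black.

```python
import operator
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CandidateOperation:
    name: str
    threshold: float
    # The "specified relationship" between stress indicator and threshold.
    relation: Callable[[float, float], bool]


# Hypothetical candidate set, for illustration only.
CANDIDATES: List[CandidateOperation] = [
    CandidateOperation("add_training_records", 0.5, operator.gt),
    CandidateOperation("add_nodes", 0.8, operator.gt),
    CandidateOperation("step_up_learning_rate", 0.2, operator.lt),
]


def select_operations(stress: float,
                      candidates: List[CandidateOperation]) -> List[str]:
    """Select every candidate whose relationship with its threshold holds."""
    return [c.name for c in candidates if c.relation(stress, c.threshold)]


if __name__ == "__main__":
    print(select_operations(0.9, CANDIDATES))
```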
 
Regarding Claim 3, depending on Claim 2, Black further discloses: wherein a candidate operation from among the one or more candidate operations comprises: adjusting, based at least in part on the value of the stress indicator, a training coefficient that is used for computing weighting values for inputs to nodes in the neural network (Black, col. 5, ln. 16 – 21, where if the error ratio [based on the value of the stress indicator] is less than a predetermined threshold value, the adaptive learning rate [training coefficient used for computing weighting values] can be multiplied by a step up factor to increase the learning rate, and if the error ratio is greater than the threshold value, the adaptive learning rate can be multiplied by a step down factor to reduce the learning rate). 
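For illustration only, the step up/step down adjustment cited from Black can be sketched as follows. This is a minimal sketch under stated assumptions: the error ratio is taken here to be the current error divided by the previous error, and the default threshold and factor values are hypothetical. None of these specifics is quoted from Black.

```python
def adapt_learning_rate(learning_rate, current_error, previous_error,
                        threshold=1.0, step_up=1.1, step_down=0.5):
    """Scale the learning rate according to an error ratio.

    The ratio definition and the default values are illustrative assumptions.
    """
    if previous_error <= 0:
        raise ValueError("previous_error must be positive")
    error_ratio = current_error / previous_error
    if error_ratio < threshold:
        return learning_rate * step_up    # error shrinking: larger steps
    if error_ratio > threshold:
        return learning_rate * step_down  # error growing: smaller steps
    return learning_rate                  # otherwise leave unchanged


if __name__ == "__main__":
    print(adapt_learning_rate(0.1, current_error=0.8, previous_error=1.0))
```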
 
Regarding Claim 5, depending on Claim 2, Black discloses the electronic device of Claim 2. Black further discloses:  
wherein: the instance of input data is an instance of input data from a batch of input data that includes at least two different instances of input data (Black, col. 5, ln. 56 – 58, where representative training dataset from a group of data records [batch of input data including at least two different instances]); 
Black does not explicitly disclose:  
a candidate operation from among the set of candidate operations comprises: instead of performing training iterations for each of the at least two different instances of input data, performing a training iteration with only the instance of input data. 
Loshchilov explicitly discloses:  
a candidate operation from among the one or more candidate operations comprises: instead of performing training iterations for each of the at least two different instances of input data, performing a training iteration with only the instance of input data (Loshchilov, alg. 3, ln. 24 – 25 & sec. 4, para. 6, ln. 3 - 7, where only the datapoint of the lowest index isel is selected into the batch. In an implementation where the batch size b = 1, the next training iteration performs training with only the instance of input data). 
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Black’s teaching of the fast neural network learning method with Loshchilov’s teaching of selecting the training data sample based on importance to achieve a method and device for fast neural network training using an importance sampling technique. One of ordinary skill in the art would have been motivated to make this modification in order to speed up the training process (Loshchilov, abs. ln. 14). 
 
Regarding Claim 6, depending on Claim 2, Black discloses the electronic device of Claim 2. Black further discloses:  
wherein: the instance of input data is an instance of input data from a batch of input data that includes at least two different instances of input data (Black, col. 5, ln. 56 – 58, where representative training dataset from a group of data records [batch of input data including at least two different instances]); 
Black does not explicitly disclose:
a candidate operation from among the set of candidate operations comprises: instead of performing training iterations for each of the at least two different instances of input data, performing multiple training iteration with only the instance of input data. 
Loshchilov explicitly discloses:  
a candidate operation from among the one or more candidate operations comprises: instead of performing training iterations for each of the at least two different instances of input data, performing multiple training iteration with only the instance of input data (Loshchilov, alg. 3, ln. 24 – 25 & sec. 4, para. 6, ln. 3 - 7, where only the datapoint of the lowest index isel is selected into the batch. In an implementation where the batch size b = 1, a datapoint that has a constantly low ranking is selected again, so the datapoint is trained alone over multiple iterations). 
	The reason for the combination of Black and Loshchilov is the same as for Claim 5.  

Regarding Claim 7, depending on Claim 2, Black discloses the electronic device of Claim 2. Black further discloses:  
wherein: the instance of input data is an instance of input data from a batch of input data that includes a plurality of different instances of input data (Black, col. 5, ln. 56 – 58, where representative training dataset from a group of data records [batch of input data including at least two different instances]); 
Black does not explicitly disclose:  
a candidate operation from among the set of candidate operations comprises: instead of performing training iterations for each of the plurality of different instances of input data, performing one more training iterations with each of a subset of instances of input data from the batch of input data, the subset including the instance of input data. 
Loshchilov explicitly discloses:  
a candidate operation from among the one or more candidate operations comprises: instead of performing training iterations for each of the plurality of different instances of input data, performing one more training iterations with each of a subset of instances of input data from the batch of input data, the subset including the instance of input data (Loshchilov, alg. 3, ln. 24 – 25 & sec. 4, para. 6, ln. 3 - 7, where only the datapoint of the lowest index isel is selected into the batch. In an implementation where the batch size b > 1, a datapoint that has a constantly low ranking is selected multiple times, among a group of low ranking datapoints, into the batch [subset of instances of input data from batch of input data], so the datapoint is trained again with each of the instances in the batch). 
	The reason for the combination of Black and Loshchilov is the same as for Claim 5.  

Regarding Claim 12, Claim 12 is the method claim corresponding to Claim 1. Black further discloses: a computational functional block (Black, fig. 1 & col. 1, ln. 6 – 7, where f() neural network training system [computational functional block]). Claim 12 is rejected for the same reasons as Claim 1.
 
Regarding Claims 13 – 14, Claims 13 – 14 are the method claims corresponding to Claims 2 – 3. Claims 13 – 14 are rejected for the same reasons as Claims 2 – 3.  

Regarding Claims 16 – 18, Claims 16 – 18 are the method claims corresponding to Claims 5 – 7. Claims 16 – 18 are rejected for the same reasons as Claims 5 – 7.  

Regarding Claim 21, depending on Claim 1, Black further discloses: wherein the stress indicator is set to a value that identifies at least one operation that is to be performed for training the neural network using the instance of input data (Black, col. 4, ln. 21 – 49, where in the event … achieve the intermediate error goal … add additional representative data records to the training dataset … ; adaptively adding nodes and layers during the training process … when the neural network training method and system fails to accomplish a predetermined error goal; col. 5, ln. 13 – 21, where the adaptive rate is left unchanged, however … if the error ratio is less than a predetermined threshold value, the adaptive learning rate can be multiplied by a step up factor to increase the learning rate, and if the error ratio is greater than the threshold value, the adaptive learning rate can be multiplied by a step down factor to reduce the learning rate; i.e., based on the error [stress indicator], the system can add training data, add nodes, multiply the learning rate by a step up/down factor, or leave the learning rate unchanged [set of candidate operations] during the training of the neural network). 

Claims 4 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Black US6269351 Method and System for Training an Artificial Neural Network, 2001, in view of Loshchilov, Online Batch Selection for Faster Training of Neural Networks, International Conference on Learning Representations, ICLR 2016, further in view of Rezaie, A Novel Approach for Implementation of Adaptive Learning Rate Neural Networks, Proceeding Norchip Conference, 79-82, 2004.

Regarding Claim 4, depending on Claim 3, Black in view of Loshchilov does not explicitly disclose: 
wherein adjusting the training coefficient comprises: setting the training coefficient in proportion to the stress indicator, thereby causing, when computing the weighting values, relatively larger modifications of the weighting values for higher values of the stress indicator 
Rezaie explicitly discloses: wherein adjusting the training coefficient comprises: setting the training coefficient in proportion to the stress indicator (Rezaie, fig. 4, where learning rate [training coefficient] is in proportion to error e [stress indicator]), thereby causing, when computing the weighting values, relatively larger modifications of the weighting values for higher values of the stress indicator (Rezaie, fig. 4 & eq. 2, where a larger error e [stress indicator] results in a larger learning rate ηe and further results in a larger modification ηe:e for the weighting value wij). 
Black (in view of Loshchilov) and Rezaie both teach fast neural network training methods using error as an indicator and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Black (in view of Loshchilov)’s teaching of executing different operations based on the error as an indicator with Rezaie’s teaching of setting the learning rate in proportion to the error to achieve a method and device for fast neural network training using an importance sampling technique. One of ordinary skill in the art would have been motivated to make this modification because the combination yields predictable results. 
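For illustration only, the proportionality relied upon from Rezaie can be sketched as follows. This is a minimal sketch under stated assumptions: the linear form of the learning rate, the `gain` constant, and the single-weight update rule are hypothetical and are not taken from Rezaie's fig. 4 or eq. 2. It shows only that a learning rate set in proportion to the error produces a larger weight modification for a larger error.

```python
def proportional_update(weight, x, error, gain=0.5):
    """Update one weight with a learning rate proportional to |error|.

    gain and the linear form of the learning rate are illustrative assumptions.
    """
    learning_rate = gain * abs(error)          # training coefficient grows with error
    return weight + learning_rate * error * x  # modification scales with rate * error


if __name__ == "__main__":
    small = proportional_update(0.0, 1.0, 0.2)
    large = proportional_update(0.0, 1.0, 0.4)
    print(small, large)
```

Under these assumptions, doubling the error quadruples the modification, because the error enters both the learning rate and the update term.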
 
Regarding Claim 15, Claim 15 is the method claim corresponding to Claim 4. Claim 15 is rejected for the same reasons as Claim 4.  
 
Claims 8 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Black US6269351 Method and System for Training an Artificial Neural Network, 2001, in view of Loshchilov, Online Batch Selection for Faster Training of Neural Networks, International Conference on Learning Representations, ICLR 2016, further in view of Na, Speeding up Convolutional Neural Network Training with Dynamic Precision Scaling and Flexible Multiplier Accumulator, Proceedings of the 2016 International Symposium on Low Power Electronics and Design, ISLPED, Aug. 2016. 
 
Regarding Claim 8, depending on Claim 2, Black in view of Loshchilov does not explicitly disclose:  
wherein a candidate operation from among the set of candidate operations comprises: increasing a precision level of at least one of operands and results for specified computations for training the neural network. 
Na explicitly discloses:  
wherein a candidate operation from among the one or more candidate operations comprises: increasing a precision level of at least one of operands and results for specified computations for training the neural network (Na, alg. 1, sec. 2, para. 4, ln. 3 – 10 & sec. 3.2, para. 1, ln. 1 – 4, where, during training, the precision is dynamically scaled and hardware dynamic fixed-point operation is allowed based on the ma_loss[0] [stress indicator]. The change of precision applies to the per-layer and global parameters [operands and results] during training). 
Black (in view of Loshchilov) and Na both teach fast neural network training methods utilizing an indicator and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Black (in view of Loshchilov)’s teaching of selecting among multiple operations for fast training with Na’s teaching of dynamic training precision based on an indicator to achieve the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to speed up the training process (Na, abs. ln. 12) while saving computation energy (Na, abs. ln. 12 - 13). 
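For illustration only, loss-driven precision scaling of the general kind discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not Na's Algorithm 1: the function names, the fixed-point rounding scheme, the bit-width step, and the rule of raising precision when a moving-average loss stops improving are all hypothetical.

```python
def quantize(value, frac_bits):
    """Round a value to a fixed-point grid with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


def next_frac_bits(frac_bits, ma_loss_now, ma_loss_prev, max_bits=16, step=2):
    """Raise precision when the moving-average loss has not improved.

    The trigger rule, step, and max_bits are illustrative assumptions.
    """
    if ma_loss_now >= ma_loss_prev:
        return min(frac_bits + step, max_bits)
    return frac_bits


if __name__ == "__main__":
    bits = 2
    print(quantize(0.3, bits))
    bits = next_frac_bits(bits, ma_loss_now=0.9, ma_loss_prev=0.8)
    print(bits, quantize(0.3, bits))
```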
 
Regarding Claim 19, Claim 19 is the method claim corresponding to Claim 8. Claim 19 is rejected for the same reasons as Claim 8. 
 
Claims 9 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Black US6269351 Method and System for Training an Artificial Neural Network, 2001, in view of Loshchilov, Online Batch Selection for Faster Training of Neural Networks, International Conference on Learning Representations, ICLR 2016, further in view of Zhang, Neuron-adaptive higher order neural-network models for automated financial data modeling, IEEE Transactions on Neural Networks, Vol. 13, Iss. 1, 2002. 
 
Regarding Claim 9, depending on Claim 2, Black in view of Loshchilov does not explicitly disclose:  
wherein a candidate operation from among the set of candidate operations comprises: modifying or replacing, based at least on part on the value of the stress indicator, one or more activation functions in the neural network. 
Zhang explicitly discloses:  
wherein a candidate operation from among the one or more candidate operations comprises: modifying or replacing, based at least on part on the value of the stress indicator, one or more activation functions in the neural network (Zhang, sec. III, eq. 3.7, where during training, increase variable a1i,k [candidate operation] of activation function i,k based in part on E [stress indicator] when [equation image: media_image1.png] is greater than zero [threshold associated with the candidate operation]). 
Black (in view of Loshchilov) and Zhang both teach neural network training models and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Black (in view of Loshchilov)’s teaching of selecting among multiple operations to speed up neural network training with Zhang’s teaching of optimum model determination to achieve the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to automatically find the optimum model (Zhang, abs. 15 - 16). 
 
Regarding Claim 20, Claim 20 is the method claim corresponding to Claim 9. Claim 20 is rejected for the same reasons as Claim 9.  
 
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Black US6269351 Method and System for Training an Artificial Neural Network, 2001, in view of Loshchilov, Online Batch Selection for Faster Training of Neural Networks, International Conference on Learning Representations, ICLR 2016, further in view of Koosh, Analog VLSI Neural Network with Digital Perturbative Learning, IEEE Transactions on Circuits and System II. Analog and Digital Signal Processing. Vol 49, May 2002. 
 
Regarding Claim 11, depending on Claim 1, Black in view of Loshchilov further discloses: 
the stress indicator to be used for one or more computations involving the instance of input data when training the neural network (Black, fig. 4, where error [stress indicator] is used to adjust the learning rate for the current training data) 
Black in view of Loshchilov does not explicitly disclose:  
further comprises: a plurality of computational units in the computational functional block, each computational unit being configured to perform computations for training the neural network 
wherein each computational unit comprises circuit elements for at least one of receiving and storing the stress indicator 
Koosh explicitly discloses:  
further comprises: a plurality of computational units in the computational functional block, each computational unit being configured to perform computations for training the neural network (Koosh, fig. 2 & abs. ln. 2 – 5, where synapse circuits [plurality of computational units] in the neural network hardware [computational functional block] in a chip are used for the training of the neural network)  
wherein each computational unit comprises circuit elements for at least one of receiving and storing the stress indicator (Koosh, sec. II, para. 4, ln. 15 – 17, where there is one digital counter [circuit elements] per synapse [computational unit]; sec. III, para. 5, where the digital counter [circuit elements] receives and stores random bits [stress indicator] and uses them to update the weight)  
Black (in view of Loshchilov) and Koosh both teach neural network training devices that receive a parameter update indicator during training and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Black (in view of Loshchilov)’s teaching of a fast neural network learning device with Koosh’s teaching of a hardware implementation of multiple computational units that each receive a parameter update indicator to achieve the claimed invention. One of ordinary skill in the art would have been motivated to make this modification to allow the neural network parallelism to be integrated into a compact, high speed solution in VLSI (Koosh, intro. para. 1, ln. 9 – 11). 
     
Conclusion 
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHIEN MING CHOU whose telephone number is (571) 272-9354.  The examiner can normally be reached on Monday - Friday 9 am - 5 pm. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CHAKI KAKALI can be reached on (571) 272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

/S.C./ Examiner, Art Unit 2122
/KAKALI CHAKI/ Supervisory Patent Examiner, Art Unit 2122