DETAILED ACTION
1.	This office action is in response to the Application No. 16399766 filed on 4/5/2019. Claims 1-20 are presented for examination and are currently pending.
 	
Notice of Pre-AIA or AIA Status
2.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Double Patenting
3.	The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
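A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).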
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

4.	Claims 1, 5-7 and 17 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 14 and 16-18 of copending Application No. 16516229. 
This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.



Instant application 16399766:
1. A system comprising: a processor; and a non-transitory computer readable medium containing programming instructions that, when executed, will cause the processor to:
     determine weights of an artificial intelligence (AI) model;
     repeat in one or more iterations, until a stopping criteria is met, operations comprising:
          quantizing the weights into one or more quantization levels;
          determining output of the AI model based at least on a training data set and the quantized weights of the AI model;
          determining a change of weights based on the output of the AI model; and
          updating the weights of the AI model based on the change of weights;
     upon the stopping criteria being met, upload the quantized weights of the AI model to an AI chip for performing an AI task.

Copending application 16516229:
14. A system comprising: a processor; and a non-transitory computer readable medium containing programming instructions that when executed, will cause the processor to:
     determine weights of an artificial intelligence (AI) model comprising a plurality of convolution layers;
     repeat in one or more iterations, until a stopping criteria is met, operations comprising:
          quantizing the weights into one or more quantization levels;
          determining output of the AI model based at least on a training data set and the quantized weights of the AI model;
          determining a change of weights based on the output of the AI model; and
          updating the weights of the AI model based on the change of weights; and
     upload the quantized weights of the AI model to an AI chip for performing an AI task; wherein the weights of the AI model comprise respective weights of each of the plurality of convolution layers of the AI model, and wherein at least weights of first and second convolution layers of the plurality of convolution layers are duplicate.


	Claim 14 of copending application 16/516229 includes all the limitations of claim 1 except for the underlined limitation identified in the table above. That is, claim 14 of copending application 16/516229 does not specify that the 'upload the quantized weights' step is conditioned on the stopping criteria being met. The stopping criteria being met is interpreted as stopping AI model training. It would have been obvious to a person of ordinary skill in the art that the system in claim 14 of copending application 16/516229 would also wait until AI model training is complete before uploading the AI model to an AI chip for performing an AI task. This ensures that the uploaded model is complete and accurate prior to execution.

Instant application 16399766:
17. A system comprising: a processor; and a non-transitory computer readable [...]
     repeat in one or more iterations, until a stopping criteria is met, operations comprising:
          quantizing the weights into one or more quantization levels;
          determining output of the AI model based at least on a training data set and the quantized weights of the AI model;
          determining a change of weights based on the output of the AI model; and
          updating the weights of the AI model based on the change of weights;
     upon the stopping criteria being met, upload the quantized weights of the AI model to an embedded cellular neural network architecture in an AI chip configured to:
          [...] based on the quantized weights; and
          transmit the output of the AI task to an output device

Copending application 16516229:
[...]
     determine weights of an artificial intelligence (AI) model comprising a plurality of convolution layers;
     repeat in one or more iterations, until a stopping criteria is met, operations comprising:
          quantizing the weights into one or more quantization levels;
          determining output of the AI model based at least on a training data set and the quantized weights of the AI model;
          determining a change of weights based on the output of the AI model; and
          updating the weights of the AI model based on the change of weights; and
     upload the quantized weights of the AI model to an AI chip for performing an AI task; wherein the weights of the AI model comprise respective weights of each of the plurality of convolution layers of the AI [...]



	Claim 14 of copending application 16/516229 includes all the limitations of claim 17 except for the underlined limitations identified in the table above. That is, claim 14 of copending application 16/516229 does not specify that the 'upload the quantized weights' step is conditioned on the stopping criteria being met. The stopping criteria being met is interpreted as stopping AI model training. It would have been obvious to a person of ordinary skill in the art that the system in claim 14 of copending application 16/516229 would also wait until AI model training is complete before uploading the AI model to an AI chip for performing an AI task. This ensures that the uploaded model is complete and accurate prior to execution.
	Claim 14 of copending application 16/516229 also does not specify 'perform an AI task to generate output' based on the quantized weights, or 'transmit the output of the AI task to an output device'. It would be obvious to a person of ordinary skill in the art that the system 

Instant application 16399766:
5. The system of claim 1, wherein the weights of the AI model are stored in floating point and the quantized weights of the AI model are stored in fixed point.
6. The system of claim 1, wherein the programming instructions for determining the change of weights contain programming instructions configured to use a gradient descent method, wherein a loss function in the gradient descent method is based on a sum of loss values over a plurality of training instances in the training data set, wherein the loss value of each of the plurality of training instances is a difference between an output of the AI model for a training instance and a ground truth of the training instance.

Copending application 16516229:
16. The system of claim 14, wherein the weights of the AI model are stored in floating point and the quantized weights of the AI model are stored in fixed point.
17. The system of claim 14, wherein the programming instructions for determining the change of weights contain programming instructions configured to use a gradient descent method, wherein a loss function in the gradient descent method is based on a sum of loss values over a plurality of training instances in the training data set, wherein the loss value of each of the plurality of training instances is a difference between an output of the AI model for a training instance and a ground truth of the training instance.
18. The system of claim 17, wherein the stopping criteria is met when a value of the loss function at an iteration is greater than a value of the loss function at a preceding iteration.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.



5.	Claims 1, 2, 4, 5, 7-9, 11, 12 and 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Tung et al. ("Clip-q: Deep network compression learning by in-parallel pruning-quantization." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.) in view of El-Yaniv et al. (US20170286830).


	Regarding claim 1, Tung teaches a system comprising: a processor; and a non-transitory computer readable medium containing programming instructions that, when executed, will cause the processor to: (image classification network and AlexNet system (which includes the use of a computer which has a processor and a memory containing code), pg. 7877, left col, Experiments.)
determine weights of an artificial intelligence (AI) model; (learn the pruned network structure and quantized weights together, of compressed deep neural network, pg. 7875, right col, 3. CLIP-Q) 
	repeat in one or more iterations, (iteration as the network learns, pg. 7875, right col, 3. CLIP-Q)
	until a stopping criteria is met, (repeat (step1) … until maximum iterations reached, pg. 7876, right col, Algorithm 1, step 10)
	operations (we combine network pruning and weight quantization in a single operation, pg. 7875, right col, 3. CLIP-Q) comprising:
	quantizing the weights into one or more quantization levels; (we then quantize the weights by setting them to the new quantization levels, pg. 7876, right col, 3) Quantizing)
	determining output of the AI model based at least on a training data set and the quantized weights of the AI model; (the forward pass uses the quantized weights, simulating the output of the compressed network, pg. 7876, left col, second to the last para.)
	determining a change of weights based on the output of the AI model; (The full-precision weights are used in the pruning-quantization update as well as during backpropagation, the forward pass uses the quantized weights, simulating the output of the compressed network. pg. 7876, left col, second to the last para.)
	and updating the weights of the AI model based on the change of weights; (connections can be reassigned quantization levels and the quantization levels themselves evolve over time. The full-precision weights are fine-tuned during training, pg. 7876, right col, first para.)
	upon the stopping criteria being met, (maximum number of iterations, pg. 7877, right col, 4.1. Alexnet on ImageNet.)
	Tung does not explicitly teach upload the quantized weights of the AI model to an AI chip for performing an AI task.
	El-Yaniv teaches upload the quantized weights of the AI model (during a training phase, quantized weight values are stored [0051], and training set is uploaded [0064])
	to an AI chip for performing an AI task. (to an artificial intelligence system or device [0113] such as an integrated circuit chip [0114])
	It would have been obvious for a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Tung to incorporate the teachings of El-Yaniv for the benefit of reducing computational complexity and power consumption in a substantial manner (El-Yaniv, [0113]).
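
For illustration only, the iterative operations mapped above can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not taken from Tung, El-Yaniv, or the instant specification: the toy linear model, the uniform quantization step of 0.25, the learning rate, the iteration cap used as the stopping criteria, and the names quantize, predict and train_quantized are all hypothetical.

```python
def quantize(weights, step=0.25):
    """Snap each weight to the nearest multiple of `step` (assumed uniform levels)."""
    return [round(w / step) * step for w in weights]


def predict(weights, x):
    """Output of a toy linear model: dot product of weights and inputs."""
    return sum(w * xi for w, xi in zip(weights, x))


def train_quantized(data, weights, lr=0.05, max_iters=200):
    """Repeat quantize -> output -> change of weights -> update until the cap."""
    for _ in range(max_iters):                  # stopping criteria: iteration cap
        q = quantize(weights)                   # quantize the weights
        grads = [0.0] * len(weights)
        for x, target in data:
            err = predict(q, x) - target        # output uses the quantized weights
            for i, xi in enumerate(x):
                grads[i] += 2 * err * xi        # change of weights from the output
        # update the full-precision weights with the computed change
        weights = [w - lr * g / len(data) for w, g in zip(weights, grads)]
    return quantize(weights)                    # quantized weights to be uploaded


# Hypothetical training set: (input vector, ground truth) pairs.
data = [([1.0, 0.0], 0.5), ([0.0, 1.0], -0.25)]
final_q = train_quantized(data, [0.0, 0.0])
```

In this sketch the output is computed from the quantized weights while the update is applied to the full-precision weights, and the returned list stands in for the quantized weights that would be uploaded once the stopping criteria is met. With the data shown, final_q is [0.5, -0.25].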

	Regarding claim 2, Tung modified by El-Yaniv teaches the system of claim 1, Tung teaches wherein the programming instructions for quantizing the weights comprise programming instructions configured to: (Algorithm 1, pg. 7876, right col, steps 1 to 10)
	clip the weights of a convolution neural network (CNN) model (Clip using full-precision weights wi, pg. 7876, Algorithm 1, step 3)
	to a maximum value of a corresponding convolution layer in an embedded cellular neural network of the AI chip (we place two “clips”, scalars c− and c+, such that (p × 100)% of the positive weights in the layer are less than or equal to c+, pg. 7875, 3.1. 1. Clipping)
	and quantize the clipped weights of the AI model. (We then quantize the weights, pg. 7876, left col, 3. Quantizing)
	El-Yaniv teaches in an embedded cellular neural network of the AI chip; (to an artificial intelligence system or device [0113] such as an integrated circuit chip [0114])
	The same motivation to combine as independent claim 1 applies here.
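
For illustration only, clipping weights to a layer maximum and then quantizing the clipped weights can be sketched as below. The function name clip_and_quantize, the symmetric clipping range, and the signed num_bits level layout are assumptions made for this sketch; they are not drawn from Tung's percentile-based clips or from the instant specification.

```python
def clip_and_quantize(weights, max_value, num_bits=4):
    """Clip to [-max_value, max_value], then snap to evenly spaced levels."""
    levels = 2 ** (num_bits - 1) - 1            # positive levels for a signed format
    step = max_value / levels
    clipped = [max(-max_value, min(max_value, w)) for w in weights]
    return [round(w / step) * step for w in clipped]
```

Weights outside the range saturate at the layer maximum before quantization, so the largest quantization level never exceeds max_value.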

	Regarding claim 4, Tung modified by El-Yaniv teaches the system of claim 1, Tung teaches wherein the programming instructions for quantizing the weights comprise programming instructions configured to: (Algorithm 1, pg. 7876, steps 1 to 10)
	El-Yaniv teaches determine a distribution of the weights of the AI model; (during the training phase, each floating point connection weight value is optionally constrained between -1 and 1 [0083])
	and upon determining the distribution of the weights: if the distribution of the weights is symmetric (-1 to +1, [-1, 1], [0083] is a symmetric distribution)
	apply an uniform quantization over the weights of the AI model; (number of scaling calculations is the same as the number of neurons of the QNN (quantized neural network) [0067])
	otherwise, group the weights of the AI model and quantize the weights to the one or more quantization levels based on the grouping. (otherwise, group the weights of the AI model and quantize the weights to the one or more quantization levels based on the grouping. [0049])
	The same motivation to combine as independent claim 1 applies here.
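
For illustration only, the conditional recited in claim 4 (uniform quantization for a symmetric distribution, grouping otherwise) can be sketched as follows. The symmetry test based on mirrored minimum and maximum values, the equal-sized sorted groups, and every function name are assumptions introduced for this sketch; neither the claims nor El-Yaniv specify them.

```python
def is_symmetric(weights, tol=0.1):
    """Assumed test: min and max roughly mirror each other about zero."""
    lo, hi = min(weights), max(weights)
    return abs(lo + hi) <= tol * max(abs(lo), abs(hi), 1e-12)


def uniform_quantize(weights, num_levels):
    """Evenly spaced levels between the smallest and largest weight."""
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / (num_levels - 1)
    if step == 0:
        return list(weights)
    return [lo + round((w - lo) / step) * step for w in weights]


def grouped_quantize(weights, num_levels):
    """Sort weights, split into equal-sized groups, use each group mean as its level."""
    order = sorted(range(len(weights)), key=lambda i: weights[i])
    size = -(-len(weights) // num_levels)       # ceiling division
    out = [0.0] * len(weights)
    for start in range(0, len(order), size):
        group = order[start:start + size]
        level = sum(weights[i] for i in group) / len(group)
        for i in group:
            out[i] = level
    return out


def quantize_by_distribution(weights, num_levels=3):
    """Pick the quantization scheme from the shape of the weight distribution."""
    if is_symmetric(weights):
        return uniform_quantize(weights, num_levels)
    return grouped_quantize(weights, num_levels)
```

A set such as [-1.0, -0.4, 0.0, 0.4, 1.0] takes the uniform branch, while a one-sided set such as [0.0, 0.1, 2.0, 2.2] takes the grouping branch.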

	Regarding claim 5, Tung modified by El-Yaniv teaches the system of claim 1, El-Yaniv teaches wherein the weights of the AI model are stored in floating point (floating point weight values are stored [0051])
	and the quantized weights of the AI model are stored in fixed point. (continuous-valued inputs may be handled as fixed point numbers [0098])
	The same motivation to combine as independent claim 1 applies here.
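
For illustration only, the distinction between floating point storage and fixed point storage can be sketched as below. The 16-bit word with 8 fractional bits and the function names to_fixed_point and from_fixed_point are assumptions chosen for this sketch, not values taken from the claims or from El-Yaniv.

```python
def to_fixed_point(value, frac_bits=8, total_bits=16):
    """Encode a float as a signed integer with `frac_bits` fractional bits."""
    scaled = round(value * (1 << frac_bits))
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, scaled))             # saturate instead of overflowing


def from_fixed_point(raw, frac_bits=8):
    """Decode the stored integer back to a float."""
    return raw / (1 << frac_bits)
```

Under these assumptions a weight of 0.5 is stored as the integer 128, and values too large for the word saturate at the largest representable integer.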

	Regarding claim 8, Tung modified by El-Yaniv teaches the system of claim 1, El-Yaniv further teaches wherein the programming instructions comprise additional programming instructions configured to (additionally or alternatively, the one or more processors 201 executes an inference code 204 stored in the memory 203 for using a trained neural network [0055])
	by the AI chip: (to an artificial intelligence system or device [0113] such as an integrated circuit chip [0114])
	perform the AI task to generate output based on the quantized weights of the AI model; (quantization function that outputs a finite set of outcomes based on a floating point value of a neuron and/or a connection weight value [0065])
	and transmit the output of the AI task to an output device; (computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network [0041])
	wherein the quantized weights of the AI model are uploaded into an embedded cellular neural network architecture in the AI chip. (during a training phase, quantized weights are stored [0051], and training set is uploaded [0064] to an artificial intelligence system or device [0113] such as an integrated circuit chip [0114])
	The same motivation to combine as independent claim 1 applies here.

	Regarding claim 9, Tung teaches a method comprising, at a processing device: (We have presented a new method for deep network compression that combines weight pruning and quantization in a single learning framework, pg. 7880, left col, Conclusion; image classification network and AlexNet system (which includes the use of a computer which has a processor and a memory containing code), pg. 7877, left col, Experiments)
	determining initial weights of a convolution neural network (CNN) model; (learn the pruned network structure and quantized weights together, of compressed deep neural network, pg. 7875, right col, 3. CLIP-Q)
	repeating in one or more iterations, (iteration as the network learns, pg. 7875, right col, 3. CLIP-Q)
	until a stopping criteria is met, (repeat (step1) … until maximum iterations reached, pg. 7876, right col, Algorithm 1, step 10)
	operations (we combine network pruning and weight quantization in a single operation, pg. 570, right col, first para) comprising:
	quantizing the weights into one or more quantization levels; (We then quantize the weights by setting them to the new quantization levels, pg. 7876, right col, 3) Quantizing)
	determining output of the CNN model based at least on a training data set and the quantized weights of the CNN model; (the forward pass uses the quantized weights, simulating the output of the compressed network, pg. 7876, left col, second to the last para)
	determining a change of weights based on the output of the CNN model;  (The full-precision weights are used in the pruning-quantization update as well as during backpropagation, the forward pass uses the quantized weights, simulating the output of the compressed network. pg. 7876, left col, second to the last para.)
	and updating the weights of the CNN model based on the change of weights; (connections can be reassigned quantization levels and the quantization levels themselves evolve over time. The full-precision weights are fine-tuned during training, pg. 7876, right col, first para.)
	upon the stopping criteria being met, (maximum number of iterations, pg. 7877, right col, 4.1. Alexnet on ImageNet)
	Tung does not explicitly teach uploading the quantized weights of the CNN model to an artificial intelligence (AI) chip configured to perform an AI task.
	El-Yaniv teaches uploading the quantized weights of the CNN model (during a training phase, quantized weight values are stored [0051], and training set is uploaded [0064])
	to an AI chip configured to perform an AI task. (to an artificial intelligence system or device [0113] such as an integrated circuit chip [0114])
	The same motivation to combine as independent claim 1 applies here.

	Regarding claim 11, Tung modified by El-Yaniv teaches the method of claim 9, wherein quantizing the weights comprise (Algorithm 1, pg. 7876, steps 1 to 10)
	El-Yaniv teaches determining a distribution of the weights of the CNN model; (during the training phase, each floating point connection weight value is optionally constrained between -1 and 1 [0083])
	and upon determining the distribution of the weights: if the distribution of the weights is symmetric (-1 to +1, [-1, 1], [0083] is a symmetric distribution)
	apply an uniform quantization over the weights of the AI model; (number of scaling calculations is the same as the number of neurons of the QNN (quantized neural network) [0067])
	otherwise, group the weights of the AI model and quantize the weights to the one or more quantization levels based on the grouping. (otherwise, group the weights of the AI model and quantize the weights to the one or more quantization levels based on the grouping. [0049])
	The same motivation to combine as independent claim 1 applies here.

(floating point weight values are stored [0051])
	and the quantized weights of the CNN model are stored in fixed point. (continuous-valued inputs may be handled as fixed point numbers [0098])
	The same motivation to combine as independent claim 1 applies here.

	Regarding claim 16, Tung modified by El-Yaniv teaches the method of claim 9, El-Yaniv teaches further comprising, by the AI chip: (to an artificial intelligence system or device [0113] such as an integrated circuit chip [0114])
	performing the AI task to generate output based on the quantized weights of the CNN model; (quantization function that outputs a finite set of outcomes based on a floating point value of a neuron and/or a connection weight value [0065])
	and transmitting the output of the AI task to an output device; (computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network [0041])
	wherein the quantized weights of the CNN model are uploaded into an embedded cellular neural network architecture in the AI chip. (during a training phase, quantized weights are stored [0051], and training set is uploaded [0064] to an artificial intelligence system or device [0113] such as an integrated circuit chip [0114])
	The same motivation to combine as independent claim 1 applies here.

(image classification network and AlexNet system (which includes the use of a computer which has a processor and a memory containing code), pg. 7877, left col, Experiments.)
	repeat in one or more iterations, (iteration as the network learns, pg. 7875, right col, 3. CLIP-Q)
	until a stopping criteria is met, (repeat (step1) … until maximum iterations reached, pg. 7876, right col, Algorithm 1, step 10)
	operations (we combine network pruning and weight quantization in a single operation, pg. 7875, right col, 3. CLIP-Q) comprising:
	quantizing the weights of an artificial intelligence (AI) model into one or more quantization levels; (we then quantize the weights by setting them to the new quantization levels, pg. 7876, right col, 3) Quantizing)
	determining output of the AI model based at least on a training data set and the quantized weights of the AI model; (the forward pass uses the quantized weights, simulating the output of the compressed network, pg. 7876, left col, second to the last para.)
	determining a change of weights based on the output of the AI model; (The full-precision weights are used in the pruning-quantization update as well as during backpropagation, the forward pass uses the quantized weights, simulating the output of the compressed network. pg. 7876, left col, second to the last para.)
	and updating the weights of the AI model based on the change of weights; (connections can be reassigned quantization levels and the quantization levels themselves evolve over time. The full-precision weights are fine-tuned during training, pg. 7876, right col, first para.)
	upon the stopping criteria being met, (until maximum iterations reached, pg. 571, left col, step 10)
	perform an AI task to generate output based on the quantized weights; (the forward pass uses the quantized weights, simulating the output of the compressed network. pg. 7876, left col, second to the last para.)
	to an embedded cellular neural network (Efficient use of weights built into the GoogLeNet architecture with low-dimensional embeddings, pg. 7878, left col, second to the last para.)
	Tung does not explicitly teach 'upload the quantized weights of the AI model', 'architecture in an AI chip configured to:', or 'and transmit the output of the AI task to an output device'.
	El-Yaniv teaches upload the quantized weights of the AI model (during a training phase, quantized weight values are stored [0051], and training set is uploaded [0064])
	in an AI chip configured to: (to an artificial intelligence system or device [0113] such as an integrated circuit chip [0114])
	and transmit the output of the AI task to an output device. (computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network [0041])
	The same motivation to combine as independent claim 1 applies here.

	Regarding claim 18, Tung modified by El-Yaniv teaches the system of claim 17, Tung teaches wherein the programming instructions for quantizing the weights comprise programming instructions configured to: (Algorithm 1, pg. 7876, steps 1 to 10)
	El-Yaniv teaches determine a distribution of the weights of the AI model; (during the training phase, each floating point connection weight value is optionally constrained between -1 and 1 [0083])
	and upon determining the distribution of the weights: if the distribution of the weights is symmetric (-1 to +1, [-1, 1], [0083] is a symmetric distribution)
	apply an uniform quantization over the weights of the AI model; (number of scaling calculations is the same as the number of neurons of the QNN (quantized neural network) [0067])
	otherwise, group the weights of the AI model and quantize the weights to the one or more quantization levels based on the grouping. (otherwise, group the weights of the AI model and quantize the weights to the one or more quantization levels based on the grouping. [0049])
	The same motivation to combine as independent claim 1 applies here.

(floating point weight values are stored [0051])
	and the quantized weights of the AI model are stored in fixed point. (continuous-valued inputs may be handled as fixed point numbers [0098])
	The same motivation to combine as independent claim 1 applies here.

6.	Claims 3 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Tung et al. ("Clip-q: Deep network compression learning by in-parallel pruning-quantization." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.) in view of El-Yaniv et al. (US20170286830) and further in view of Zhang et al. (US20160358075).

	Regarding claim 3, Tung modified by El-Yaniv teaches the system of claim 1, Tung teaches wherein the programming instructions for quantizing the weights comprise (Algorithm 1, pg. 7876, steps 1 to 10)
	and assign a weight to a quantization level of the one or more quantization levels  (we then quantize the weights by setting them to the new quantization levels, pg. 7876, right col, 3) Quantizing)
	Tung modified by El-Yaniv does not explicitly teach programming instructions configured to cluster the weights of the AI model, based on which cluster to which the weight belongs.
	Zhang teaches programming instructions configured to cluster the weights of the AI model (cluster 16 to calculate the Q weight update based on equation (5) [0062])
	based on which cluster to which the weight belongs. (based on a Small Q update logic is placed inside each grid/cluster 16 [0062])
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Tung modified by El-Yaniv to incorporate the teachings of Zhang for the benefit of resilience to errors in the stored memory weights and the ability to correct errors through on-line learning (Zhang, [0069]).
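
For illustration only, the limitation of clustering the weights and assigning each weight to a quantization level based on its cluster can be sketched as below. The one-dimensional k-means procedure, the evenly spaced initial centroids, and the names nearest and cluster_quantize are assumptions introduced for this sketch; they do not describe the cluster logic of Zhang or the method of the instant specification.

```python
def nearest(value, centroids):
    """Index of the centroid closest to `value`."""
    return min(range(len(centroids)), key=lambda k: abs(value - centroids[k]))


def cluster_quantize(weights, num_clusters=2, iters=10):
    """Cluster scalar weights, then replace each weight by its cluster centroid."""
    lo, hi = min(weights), max(weights)
    span = max(num_clusters - 1, 1)
    # Assumed initialisation: centroids spread evenly over the weight range.
    centroids = [lo + (hi - lo) * k / span for k in range(num_clusters)]
    for _ in range(iters):
        assign = [nearest(w, centroids) for w in weights]
        for k in range(num_clusters):
            members = [w for w, a in zip(weights, assign) if a == k]
            if members:                         # leave empty clusters unchanged
                centroids[k] = sum(members) / len(members)
    quantized = [centroids[nearest(w, centroids)] for w in weights]
    return quantized, centroids
```

Here each centroid plays the role of one quantization level, and a weight receives the level of the cluster to which it belongs.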

	Regarding claim 10, Tung modified by El-Yaniv teaches the method of claim 9, Tung teaches wherein quantizing the weights comprise (Algorithm 1, pg. 7876, steps 1 to 10)
	and quantizing a weight to a quantization level of the one or more quantization levels (we then quantize the weights by setting them to the new quantization levels, pg. 7876, right col, 3) Quantizing)
	Tung modified by El-Yaniv does not explicitly teach clustering the weights of the CNN model based on which cluster to which the weight belongs.
	Zhang teaches clustering the weights of the CNN model (cluster 16 to calculate the Q weight update based on equation (5) [0062])
	based on which cluster to which the weight belongs. (based on a Small Q update logic is placed inside each grid/cluster 16 [0062]) 
	The same motivation to combine as dependent claim 3 applies here.

7.	Claims 6, 7, 13-15 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Tung et al. ("Clip-q: Deep network compression learning by in-parallel pruning-quantization." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.) in view of El-Yaniv et al. (US20170286830) and further in view of Dognin et al. (US20150310329).

	Regarding claim 6, Tung modified by El-Yaniv teaches the system of claim 1, Tung teaches wherein the programming instructions for determining the change of weights contain programming instructions (Algorithm 1, pg. 7876, steps 1 to 10)
	El-Yaniv teaches configured to use a gradient descent method, (using floating point connection weight values for gradient descent calculated during the training phase [0064])
	However, they do not explicitly teach wherein a loss function in the gradient descent method is based on a sum of loss values over a plurality of training instances in the training data set, wherein the loss value of each of the plurality of training instances is a difference between an output of the AI model for a training instance and a ground truth of the training instance.
	Dognin teaches wherein a loss function in the gradient descent method is based on a sum of loss values over a plurality of training instances in the training data set (improving over conventional stochastic gradient (SG) techniques by writing the loss function as a sum of loss values over training samples [0044])
	wherein the loss value of each of the plurality of training instances is a difference between an output of the AI model for a training instance and a ground truth of the training instance. (provide a loss function gradient [0049], based on a difference between a held-out loss of a current iteration and a held-out of at least one previous iteration [0050])
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Tung modified by El-Yaniv to incorporate the teachings of Dognin for the benefit of faster convergence in computation time and for faster held-out loss improvements, especially in the early updates (Dognin, [0006]).
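
For illustration only, a loss function formed as a sum of per-instance loss values can be sketched as below. Measuring each instance's difference as a squared error is an assumption of this sketch, as is the function name total_loss; the claim language recites only "a difference" between the output and the ground truth.

```python
def total_loss(outputs, ground_truths):
    """Sum, over training instances, of the squared output/ground-truth difference."""
    return sum((o - t) ** 2 for o, t in zip(outputs, ground_truths))
```

For outputs [1.0, 2.0] and ground truths [0.0, 4.0], the per-instance losses are 1.0 and 4.0, and their sum is 5.0.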

	Regarding claim 7, Tung modified by El-Yaniv modified by Dognin teaches the system of claim 6, Dognin further teaches wherein the stopping criteria is met (limited to a maximum number of iterations [0036])
	when a value of the loss function at an iteration is greater than a value of the loss function at a preceding iteration. (a gradient of the loss is computed on a sample portion of the training data and a solution is found. Then, on a next iteration, using another subset of the training data, a gradient of the loss is computed on that subset of the training data, and both the gradient from the first sample portion, and the gradient from the current sample portion are integrated to find the solution in the second iteration. [0047])
	The same motivation to combine as dependent claim 6 applies here.

Algorithm 1, pg. 7876, steps 1 to 10)
	El-Yaniv teaches configured based on a gradient descent method, (using floating point connection weight values for gradient descent calculated during the training phase [0064])
	However, they do not explicitly teach wherein a loss function in the gradient descent method is based on a sum of loss values over a plurality of training instances in the training data set, wherein the loss value of each of the plurality of training instances is a difference between an output of the CNN model for a training instance and a ground truth of the training instance.
	Dognin teaches wherein a loss function in the gradient descent method is based on a sum of loss values over a plurality of training instances in the training data set (improving over conventional stochastic gradient (SG) techniques by writing the loss function as a sum of loss values over training samples [0044])
	wherein the loss value of each of the plurality of training instances is a difference between an output of the CNN model for a training instance and a ground truth of the training instance. (provide a loss function gradient [0049], based on a difference in between a held-out loss of a current iteration and a held-out of at least one previous iteration [0050])
	The same motivation to combine as dependent claim 6 applies here.

Algorithm 1,  pg. 7876, steps 1 to 10)
	of the quantized weights of the CNN model. (We then quantize the weights, pg. 7876, left col, 3. Quantizing)
	Dognin teaches is further based on a stochastic gradient (improving over conventional stochastic gradient (SG) techniques by writing the loss function as a sum of loss values over training samples [0044])
	The same motivation to combine as dependent claim 6 applies here.

	Regarding claim 15, Tung modified by El-Yaniv modified by Dognin teaches the method of claim 13, Dognin further teaches wherein the stopping criteria is met (limited to a maximum number of iterations [0036])
	when a value of the loss function at an iteration is greater than a value of the loss function at a preceding iteration. (a gradient of the loss is computed on a sample portion of the training data and a solution is found. Then, on a next iteration, using another subset of the training data, a gradient of the loss is computed on that subset of the training data, and both the gradient from the first sample portion, and the gradient from the current sample portion are integrated to find the solution in the second iteration. [0047])
	The same motivation to combine as dependent claim 6 applies here.

Algorithm 1, pg. 7876, steps 1 to 10)
	El-Yaniv teaches configured to use a gradient descent method, (using floating point connection weight values for gradient descent calculated during the training phase [0064])
	However, they do not explicitly teach wherein a loss function in the gradient descent method is based on a sum of loss values over a plurality of training instances in the training data set, wherein the loss value of each of the plurality of training instances is a difference between an output of the AI model for a training instance and a ground truth of the training instance.
	Dognin teaches wherein a loss function in the gradient descent method is based on a sum of loss values over a plurality of training instances in the training data set (improving over conventional stochastic gradient (SG) techniques by writing the loss function as a sum of loss values over training samples [0044])
	wherein the loss value of each of the plurality of training instances is a difference between an output of the AI model for a training instance and a ground truth of the training instance. (provide a loss function gradient [0049], based on a difference in between a held-out loss of a current iteration and a held-out of at least one previous iteration [0050])
	The same motivation to combine as dependent claim 6 applies here.

Conclusion
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to MORIAM MOSUNMOLA GODO whose telephone number is (571)272-8670. The examiner can normally be reached Monday-Friday 7:30am-5:30pm EST. 
	Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. 
	If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B. Zhen can be reached on (571)272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patentcenter for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121