DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Specification
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.


Drawings
The applicant’s submitted drawings appear to be acceptable for examination purposes. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the drawings.


Information Disclosure Statement
As required by M.P.E.P. 609(c), the applicant's submission of the Information Disclosure Statement, dated 18 November 2019, is acknowledged by the examiner. As required by M.P.E.P. 609 C(2), a copy of the PTOL-1449, initialed and dated by the examiner, is attached to the instant office action.


Claim Objections
Claim 5 is objected to because of the following informalities:  “satisfy quality requirement” appears as though it should be “satisfy quality requirements” (based on dependent claims referring to the plural). Appropriate correction is required.
Claims 6-11 depend upon claim 5, and thus include the aforementioned limitation(s).

Claim 8 is objected to because of the following informalities:  “units configured to” appears as though it should be “units are configured to”. Appropriate correction is required.

Claim 20 is objected to because of the following informalities:  “one of boundaries” appears as though it should be “one boundary” or “one of the boundaries” or similar.  Appropriate correction is required.

Claim 25 is objected to because of the following informalities:  “obtain result of the ANN computation” appears as though it should be “obtain a result of the ANN computation”.  Appropriate correction is required.


Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claims 1-7, 9, 12, 13, 16-21, 23, and 24 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 2, 5, 6, 15, and 16 of U.S. Patent No. 11,068,784. Although the claims at issue are not identical, they are not patentably distinct from each other for the reasons highlighted below.

Claim 1 is compared with claim 1 of U.S. Patent No. 11,068,784 as follows, with the differences between the instant application and the patent highlighted:
Instant Application: A system for monitoring a quality of a result of computations of an artificial neural network (ANN), the system comprising one or more processing units
U.S. Patent No. 11,068,784: A system for performing a quantization of artificial neural networks (ANNs), the system comprising one or more processors

U.S. Patent No. 11,068,784: configured to: receive a description of an ANN and input data associated with the ANN

Instant Application: perform, for the input data, ANN computations of the ANN to obtain the result of the ANN computations for the input data
U.S. Patent No. 11,068,784: perform, based on the modified input data and the modified description of the ANN, the computations of outputs of one or more neurons of the ANN

Instant Application: and while performing the ANN computations, monitor a measure of quality of the ANN computations
U.S. Patent No. 11,068,784: determine, based on the outputs of one or more neurons of the ANN, a measure of a quantity of saturations among the outputs of the one or more neurons of the ANN [where the quantity of saturations is a measure of quality – see claim 3 of the instant application]


	As per claim 2, see claim 5 of U.S. Patent No. 11,068,784.

	As per claim 3, see claim 1 of U.S. Patent No. 11,068,784.

	As per claim 4, see claim 1 of U.S. Patent No. 11,068,784.

	As per claim 5, see claim 5 of U.S. Patent No. 11,068,784.

	As per claim 6, see claim 1 of U.S. Patent No. 11,068,784.

	As per claim 7, see claim 2 of U.S. Patent No. 11,068,784.

	As per claim 9, see claim 6 of U.S. Patent No. 11,068,784.
	
	As per claim 12, see claim 6 of U.S. Patent No. 11,068,784.



	As per claim 16, see claim 14 of U.S. Patent No. 11,068,784.

	As per claim 17, see claim 15 of U.S. Patent No. 11,068,784.

As per claim 18, see claim 15 of U.S. Patent No. 11,068,784.

As per claim 19, see claim 15 of U.S. Patent No. 11,068,784.

As per claim 20, see claim 15 of U.S. Patent No. 11,068,784.

As per claim 21, see claim 16 of U.S. Patent No. 11,068,784.

As per claim 23, see claim 5 of U.S. Patent No. 11,068,784.

As per claim 24, see claim 15 of U.S. Patent No. 11,068,784.


Claims 8, 10, 11, and 22 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 5, 6, and 15 of U.S. Patent No. 11,068,784 in view of well-known practices in the art. 

As per claim 8, as illustrated above, claims 1 and 5 of U.S. Patent No. 11,068,784 claim all of the limitations set forth in the instant application, except that U.S. Patent No. 11,068,784 claims adjusting the quantization in response to not meeting the requirement and does not explicitly teach what happens when the requirement is met, and therefore does not explicitly claim: determine the measure of quality satisfies the quality requirements; in response to the determination: return the result of the ANN computations of the quantized ANN for the input data; and keep the quantization scheme to be used for the further input data.
However, the examiner takes official notice that stopping adjustments to a NN after the set quality requirements are met is old and well known within the art as the completion of NN training/optimization.  Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to determine the measure of quality satisfies the quality requirements; in response to the determination: return the result of the ANN computations of the quantized ANN for the input data; and keep the quantization scheme to be used for the further input data, in the system claimed by U.S. Patent No. 11,068,784, to achieve the predictable result of providing the adjusted NN for use on actual input once the quality desired has been achieved, to avoid infinite training/optimization.


As per claim 10, as illustrated above, claims 1 and 6 of U.S. Patent No. 11,068,784 claim all of the limitations set forth in the instant application, except that U.S. Patent No. 11,068,784 claims adjusting the quantization in response to not meeting the requirement and does not explicitly teach what happens when the requirement is met, and therefore does not explicitly claim: determine that the further measure of quality satisfies the quality requirements; and in response to the determination that the further measure of quality satisfies the quality requirements, keep the further quantization scheme to be used for the ANN computations for the further input data.
However, the examiner takes official notice that stopping adjustments to a NN after the set quality requirements are met is old and well known within the art as the completion of NN training/optimization.  Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to determine that the further measure of quality satisfies the quality requirements; and in response to the determination that the further measure of quality satisfies the quality requirements, keep the further quantization scheme to be used for the ANN computations for the further input data in the system claimed by U.S. Patent No. 11,068,784, to achieve the predictable result of providing the adjusted NN for use on actual input once the quality desired has been achieved, to avoid infinite training/optimization.

As per claim 11, as illustrated above, claims 1 and 6 of U.S. Patent No. 11,068,784 claim all of the limitations set forth in the instant application, except that U.S. Patent No. 11,068,784 claims adjusting the quantization in response to not meeting the requirement and does not explicitly teach what happens when the requirement is met, and therefore does not explicitly claim: determine that the further measure of quality satisfies the quality requirements; and in response to the determination that the further measure of quality satisfies the quality requirements, return the result of the ANN computation of the further quantized ANN for the input data.
However, the examiner takes official notice that stopping adjustments to a NN after the set quality requirements are met is old and well known within the art as the completion of NN training/optimization.  Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to determine that the further measure of quality satisfies the quality requirements; and in response to the determination that the further measure of quality satisfies the quality requirements, return the result of the ANN computation of the further quantized ANN for the input data in the system claimed by U.S. Patent No. 11,068,784, to achieve the predictable result of providing the adjusted NN for use on actual input once the quality desired has been achieved, to avoid infinite training/optimization.

As per claim 22, see the rejections of claims 10-11, above, and claim 15 of U.S. Patent No. 11,068,784.


Claims 14, 15, and 25 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 and 20 of U.S. Patent No. 11,068,784 in view of Ross (US 2018/0232663).

As per claim 14, Patent ‘784 claims the system of claim 1, as described above.
While Patent ‘784 claims using specified quality requirements, receiving input data and the ANN and determining the quantity of saturations/quality (see above) it does not explicitly teach how the requirements are derived, and thus wherein the one or more processing units are configured to: receive the description of the ANN and the input data associated with the ANN from an external system, the external system being in communications with the one or more processing units; and based on the measure of the quality, issue a message concerning the quality of the ANN computations to the external system or a user associated with the external system.
Ross teaches wherein the one or more processing units are configured to: receive the description of the ANN and the input data associated with the ANN from an external system, the external system being in communications with the one or more processing units; and based on the measure of the quality, issue a message concerning the quality of the ANN computations to the external system or a user associated with the external system [the server may receive a machine learning model and requirements from an external source, such as a user (paras. 0064-76, etc.); for the determining quality from computations in Patent ‘784].
Patent ‘784 and Ross are analogous art as they are within the same field of endeavor, namely machine learning.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to receive the predefined requirements and a model for the neural network from a remote user, as taught by Ross, for setting the predefined requirements for the neural network in the system taught by Patent ‘784.
Ross provides motivation as [the system may be able to create a more accurate model based on specific user requirements (paras. 0005-7, etc.)].

As per claim 15, Patent ‘784/Ross teaches wherein the external system is configured to, in response to receiving the message, send an instruction to the one or more processing units, the instruction causing the one or more processing units to perform an operation concerning the quality of the ANN computations [the server may receive a machine learning model, including as a neural network, and requirements from an external source, such as a user, and train the NN (Ross: paras. 0064-76, etc.)].

	As per claim 25, see the rejection of claim 14, above, and claim 20 of U.S. Patent No. 11,068,784.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 8-11, 13, and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claim 8 recites the limitation “the quality requirements” in line 2.  There is insufficient antecedent basis for this limitation in the claim.

Claim 9 recites the limitation “the quality requirements” in lines 9-10.  There is insufficient antecedent basis for this limitation in the claim.
Claims 10-11 depend upon claim 9, and thus include the aforementioned limitation(s), as well as further recitations of the same.

Claim 13 recites the limitation “the further quantized ANN” in line 4.  There is insufficient antecedent basis for this limitation in the claim.

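The term “substantially close to” in claim 20 is a relative term which renders the claim indefinite.  The term “substantially close to” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.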


Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1, 3-13, 16-22, and 24 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Park et al. (Weighted-Entropy-based Quantization for Deep Neural Networks, July 2017, pgs. 5456-5464).

As per claim 1, Park teaches a system for monitoring a quality of a result of computations of an artificial neural network (ANN), the system comprising one or more processing units [the proposed system is run on GPUs using CNNs for several datasets (pg. 5461, section 5.1; etc.)] configured to: receive a description of the ANN and an input data associated with the ANN [the system is provided a training data input and a neural network and desired precision (pgs. 5458-5459, section 4.1), as well as running the NN on the training data to update the weights of the NN (pg. 5460, section 4.3)]; perform, for the input data, ANN computations of the ANN to obtain the result of the ANN computations for the input data [the system is provided a training data input and a neural network and desired precision (pgs. 5458-5459, section 4.1), as well as running the NN on the training data to update the weights of the NN (pg. 5460, section 4.3)]; and while performing the ANN computations, monitor a measure of quality of the ANN computations [the system is provided a training data input and a neural network and desired precision (pgs. 5458-5459, section 4.1), as well as running the NN on the training data to update the weights of the NN (pg. 5460, section 4.3) while measuring a quality metric of the weight quantization, an entropy of the activation quantization, and accuracy of the NN for updating the weights (pgs. 5458-5460, section 4.1 for the weight quantization quality metric and section 4.2 for the activation quantization entropy; and pgs. 5461-5462, sections 5.1.1-5.1.4, for network accuracy)].

As per claim 3, Park teaches wherein the monitoring of the measure of quality includes counting a number of neuron saturations [measuring the quantization quality metric includes counting the number of neurons that are zero-value or very large (pgs. 5457-5459, section 3 discusses the zero-value and large value weights and activations, section 4.1 describes counting them and providing a quality metric)].
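For illustration only, the claimed "counting a number of neuron saturations" can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Park or from the applicant's specification: the function name `count_saturations`, the 8-bit bounds, and the sample outputs are hypothetical, and an output is treated as saturated when it sits at or beyond a representable bound.

```python
def count_saturations(outputs, lower, upper):
    """Count neuron outputs that sit at or beyond the representable bounds.

    `lower` and `upper` are the assumed limits of the output data type
    (for example -128 and 127 for a signed 8-bit integer).
    """
    return sum(1 for y in outputs if y <= lower or y >= upper)


# Hypothetical outputs of six neurons after 8-bit quantization.
outputs = [-128, -5, 0, 42, 127, 127]
saturated = count_saturations(outputs, lower=-128, upper=127)
```

Under these assumptions, the count (or its ratio to the number of outputs) could serve as one possible measure of quality.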

As per claim 4, Park teaches wherein the performing the ANN computations includes: performing, based on a quantization scheme, quantization of the ANN to obtain a quantized ANN [the weights and activations of the NN are quantized using a specific policy (pgs. 5458-5460, sections 4.1-4.2)]; and performing, using the quantized ANN and based on the input data, the ANN computations to obtain the result of the ANN computations for the input data [the system is provided a training data input and a neural network and desired precision (pgs. 5458-5459, section 4.1), as well as running the NN on the training data to update the weights of the NN using the quantized weights and activations (pg. 5460, section 4.3)].

As per claim 5, Park teaches wherein the one or more processing units are configured to: in response to the monitoring of the measure of the quality of the ANN computations, determine that the measure of quality does not satisfy quality requirement; and in response to the determination, adjust, based on the measure of quality, the quantization scheme to be used in the ANN computations for further input data [quantization of the weights and activations are performed in each training batch and forward/backward pass, respectively, where updates are made to the weights based upon the measure of the neural network (training) (pg. 5460, section 4.3) and updates to the weight quantization are made based upon the updated weights (pgs. 5458-5459, section 4.1 for how weight quantization is determined and applied based on the weights; and pg. 5460, section 4.3 for how the weights are updated and re-quantized for each mini-batch in training) all of which is attempting to meet an accuracy requirement for the network and quantization requirements (pgs. 5458-5460, sections 4.1-4.2 for quantization requirements; and pgs. 5461-5462, sections 5.1.1-5.1.4 for network accuracy); therefore the weight quantization scheme, which is based on the weights, is updated based on the updated weights, which have changed the quality metric of the quantization, and the updating of the weights is itself based on the quality measure of the network during training, all of which is based on the requirements for accuracy and quantization; as well as during training the weighted entropy of the activation quantization is maximized by changing the quantization of the activations during each pass in training (pgs. 5459-5460, section 4.2 for how activation quantization is determined and applied including maximizing entropy; and pg. 5460, section 4.3 for how the activations are updated and re-quantized for each pass in training)].
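For illustration only, the claimed sequence of determining that a measure of quality does not satisfy a requirement and, in response, adjusting the quantization scheme can be sketched as a simple loop. This is a hedged sketch, not Park's weighted-entropy method and not the applicant's implementation: the saturation-ratio metric, the names `saturation_ratio` and `adjust_bound`, the doubling step, and the sample values are all assumptions made for the example.

```python
def saturation_ratio(values, bound):
    """Fraction of values whose magnitude reaches the clipping bound."""
    return sum(1 for v in values if abs(v) >= bound) / len(values)


def adjust_bound(values, bound, max_ratio, growth=2.0, max_steps=16):
    """Widen the clipping bound until the assumed quality requirement holds.

    The requirement here is hypothetical: at most `max_ratio` of the
    values may saturate. The bound is multiplied by `growth` each time
    the requirement is not satisfied.
    """
    for _ in range(max_steps):
        if saturation_ratio(values, bound) <= max_ratio:
            break  # requirement satisfied: keep the current scheme
        bound *= growth  # requirement not satisfied: adjust the scheme
    return bound


# Hypothetical activations observed for one batch of input data.
activations = [0.1, -0.4, 0.9, 2.5, -3.0, 0.2, 1.5, -0.7]
bound = adjust_bound(activations, bound=1.0, max_ratio=0.125)
```

In this sketch the adjusted bound would then be used when quantizing further input data, and a bound that already satisfies the requirement is returned unchanged.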

As per claim 6, Park teaches wherein the quantization of the ANN includes mapping data from a first interval of a first data type into data from a second interval of a second data type; and adjusting the quantization scheme includes modifying a boundary of at least one of the first interval or the second interval [the updating of the quantization occurs during training of the network (pg. 5460, section 4.3) including modifying the quantization of the weights and activations, including clustering into different clusters of different intervals for the weights to the required precision (pgs. 5459-5460, sections 4.1-4.2)].
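For illustration only, the claimed mapping of data from a first interval of a first data type into a second interval of a second data type can be sketched as a linear (affine) quantizer. This is an assumption chosen for clarity: Park's cluster-based quantization is not linear, and the function name `quantize`, the float and 8-bit intervals, and the sample values are hypothetical.

```python
def quantize(x, first, second):
    """Map x from the float interval `first` to the integer interval `second`.

    Values outside `first` are clamped to its boundaries, which is where
    saturation would occur.
    """
    (f_lo, f_hi), (s_lo, s_hi) = first, second
    x = min(max(x, f_lo), f_hi)            # clamp to the first interval
    scale = (s_hi - s_lo) / (f_hi - f_lo)  # integer steps per unit of x
    return s_lo + round((x - f_lo) * scale)


int8 = (-128, 127)
inside = quantize(0.5, (-1.0, 1.0), int8)    # value within the interval
clipped = quantize(3.0, (-1.0, 1.0), int8)   # value beyond the boundary
widened = quantize(3.0, (-4.0, 4.0), int8)   # same value after moving the boundary
```

Modifying a boundary of the first interval, as in the last call, changes the scale so that a value which previously saturated at the top of the second interval is instead represented within it.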

As per claim 7, Park teaches wherein adjusting the quantization scheme results in improving the measure of quality [the system is provided a training data input and a neural network and desired precision (pgs. 5458-5459, section 4.1), as well as running the NN on the training data to update the weights of the NN (pg. 5460, section 4.3) while measuring a quality metric of the weight quantization, an entropy of the activation quantization, and accuracy of the NN for updating the weights, and attempting to maximize the quality/entropy/accuracy (pgs. 5458-5460, section 4.1 for the weight quantization quality metric and section 4.2 for the activation quantization entropy; and pgs. 5461-5462, sections 5.1.1-5.1.4, for network accuracy)].

As per claim 8, Park teaches wherein the one or more processing units configured to: determine the measure of quality satisfies the quality requirements; in response to the determination: return the result of the ANN computations of the quantized ANN for the input data; and keep the quantization scheme to be used for the further input data [quantization of the weights and activations are performed in each training batch and forward/backward pass, respectively, where updates are made to the weights based upon the measure of the neural network (training) (pg. 5460, section 4.3) and updates to the weight quantization are made based upon the updated weights (pgs. 5458-5459, section 4.1 for how weight quantization is determined and applied based on the weights; and pg. 5460, section 4.3 for how the weights are updated and re-quantized for each mini-batch in training) all of which is attempting to meet an accuracy requirement for the network and quantization requirements (pgs. 5458-5460, sections 4.1-4.2 for quantization requirements; and pgs. 5461-5462, sections 5.1.1-5.1.4 for network accuracy); therefore the weight quantization scheme, which is based on the weights, is updated based on the updated weights, which have changed the quality metric of the quantization, and the updating of the weights is itself based on the quality measure of the network during training, all of which is based on the requirements for accuracy and quantization being met; as well as during training the weighted entropy of the activation quantization is maximized by changing the quantization of the activations during each pass in training (pgs. 5459-5460, section 4.2 for how activation quantization is determined and applied including maximizing entropy; and pg. 5460, section 4.3 for how the activations are updated and re-quantized for each pass in training)].

As per claim 9, Park teaches wherein the one or more processing units are configured to repeat: performing, based on the adjusted quantization scheme, the quantization of the ANN to obtain a further quantized ANN [quantization of the weights and activations are performed in each training batch and forward/backward pass, respectively, where updates are made to the weights based upon the measure of the neural network (training) (pg. 5460, section 4.3) and updates to the weight quantization are made based upon the updated weights (pgs. 5458-5459, section 4.1 for how weight quantization is determined and applied based on the weights; and pg. 5460, section 4.3 for how the weights are updated and re-quantized for each mini-batch in training)]; performing ANN computations of the further quantized ANN for the input data [quantization of the weights and activations are performed in each training batch and forward/backward pass, respectively, where updates are made to the weights based upon the measure of the neural network (training) (pg. 5460, section 4.3) and updates to the weight quantization are made based upon the updated weights (pgs. 5458-5459, section 4.1 for how weight quantization is determined and applied based on the weights; and pg. 5460, section 4.3 for how the weights are updated and re-quantized for each mini-batch in training)]; monitoring a further measure of quality of the ANN computations of the further quantized ANN [the system is provided a training data input and a neural network and desired precision (pgs. 5458-5459, section 4.1), as well as running the NN on the training data to update the weights of the NN (pg. 5460, section 4.3) while measuring a quality metric of the weight quantization, an entropy of the activation quantization, and accuracy of the NN for updating the weights (pgs. 5458-5460, section 4.1 for the weight quantization quality metric and section 4.2 for the activation quantization entropy; and pgs. 5461-5462, sections 5.1.1-5.1.4, for network accuracy)]; determining that the further measure of quality does not satisfy the quality requirements; and in response to the determination that the further measure of quality does not satisfy the quality requirements, adjusting the further quantization scheme to be used for the further input data [quantization of the weights and activations are performed in each training batch and forward/backward pass, respectively, where updates are made to the weights based upon the measure of the neural network (training) (pg. 5460, section 4.3) and updates to the weight quantization are made based upon the updated weights (pgs. 5458-5459, section 4.1 for how weight quantization is determined and applied based on the weights; and pg. 5460, section 4.3 for how the weights are updated and re-quantized for each mini-batch in training) all of which is attempting to meet an accuracy requirement for the network and quantization requirements (pgs. 5458-5460, sections 4.1-4.2 for quantization requirements; and pgs. 5461-5462, sections 5.1.1-5.1.4 for network accuracy); therefore the weight quantization scheme, which is based on the weights, is updated based on the updated weights, which have changed the quality metric of the quantization, and the updating of the weights is itself based on the quality measure of the network during training, all of which is based on the requirements for accuracy and quantization; as well as during training the weighted entropy of the activation quantization is maximized by changing the quantization of the activations during each pass in training (pgs. 5459-5460, section 4.2 for how activation quantization is determined and applied including maximizing entropy; and pg. 5460, section 4.3 for how the activations are updated and re-quantized for each pass in training)].

As per claim 10, Park teaches wherein the one or more processing units are configured to: determine that the further measure of quality satisfies the quality requirements; and in response to the determination that the further measure of quality satisfies the quality requirements, keep the further quantization scheme to be used for the ANN computations for the further input data [quantization of the weights and activations are performed in each training batch and forward/backward pass, respectively, where updates are made to the weights based upon the measure of the neural network (training) (pg. 5460, section 4.3) and updates to the weight quantization are made based upon the updated weights (pgs. 5458-5459, section 4.1 for how weight quantization is determined and applied based on the weights; and pg. 5460, section 4.3 for how the weights are updated and re-quantized for each mini-batch in training) all of which is attempting to meet an accuracy requirement for the network and quantization requirements (pgs. 5458-5460, sections 4.1-4.2 for quantization requirements; and pgs. 5461-5462, sections 5.1.1-5.1.4 for network accuracy); therefore the weight quantization scheme, which is based on the weights, is updated based on the updated weights, which have changed the quality metric of the quantization, and the updating of the weights is itself based on the quality measure of the network during training, all of which is based on the requirements for accuracy and quantization being met; as well as during training the weighted entropy of the activation quantization is maximized by changing the quantization of the activations during each pass in training (pgs. 5459-5460, section 4.2 for how activation quantization is determined and applied including maximizing entropy; and pg. 5460, section 4.3 for how the activations are updated and re-quantized for each pass in training)].

As per claim 11, Park teaches wherein the one or more processing units are configured to: determine that the further measure of quality satisfies the quality requirements; and in response to the determination that the further measure of quality satisfies the quality requirements, return the result of the ANN computation of the further quantized ANN for the input data [once the training and quantization has completed, based on the accuracy/quantization requirements, the NN may be used for various tasks such as object detection and language modeling, providing a result based on the input data (pg. 5463, sections 5.2-5.3, etc.)].

As per claim 12, Park teaches wherein the one or more processing units are configured to: receive further input data [the system is provided a training data input in multiple mini-batches, a neural network and desired precision (pgs. 5458-5459, section 4.1), as well as running the NN on the training data to update the weights of the NN (pg. 5460, section 4.3)]; perform ANN computations of the quantized ANN for the further input data while keeping the quantized ANN unchanged [weight quantization remains the same within a mini-batch, changing between mini-batches, while activation quantization may change with each training pass (pg. 5460, section 4.3)]; determine, based on the ANN computations for the further input data, a further measure of the quality; and based on the further measure of quality, adjust the quantization scheme and perform quantization of the ANN based on the [quantization of the weights and activations are performed in each training batch and forward/backward pass, respectively, where updates are made to the weights based upon the measure of the neural network (training) (pg. 5460, section 4.3) and updates to the weight quantization are made based upon the updated weights (pgs. 5458-5459, section 4.1 for how weight quantization is determined and applied based on the weights; and pg. 5460, section 4.3 for how the weights are updated and re-quantized for each mini-batch in training) all of which is attempting to meet an accuracy requirement for the network and quantization requirements (pgs. 5458-5460, sections 4.1-4.2 for quantization requirements; and pgs. 5461-5462, sections 5.1.1-5.1.4 for network accuracy); therefore the weight quantization scheme, which is based on the weights, is updated based on the updated weights, which have changed the quality metric of the quantization, and the updating of the weights is itself based on the quality measure of the network during training, all of which is based on the requirements for accuracy and quantization; as well as during training the weighted entropy of the activation quantization is maximized by changing the quantization of the activations during each pass in training (pgs. 5459-5460, section 4.2 for how activation quantization is determined and applied including maximizing entropy; and pg. 5460, section 4.3 for how the activations are updated and re-quantized for each pass in training)].
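
For illustration only, the following is a minimal sketch of how a weighted entropy over a clustering of values might be computed. The formula used (the negative sum, over clusters, of mean importance times relative cluster size times the log of relative cluster size) and the squared-value importance mapping are assumptions made for this sketch; this Office action does not recite the formula, and the function name weighted_entropy is hypothetical.

```python
import math

def weighted_entropy(clusters, importance=lambda w: w * w):
    """Weighted entropy of a clustering (assumed form, for illustration).

    clusters   -- list of lists of values, one inner list per cluster
    importance -- assumed per-value importance mapping (here: value squared)
    """
    total = sum(len(c) for c in clusters)
    s = 0.0
    for c in clusters:
        if not c:
            continue  # an empty cluster contributes nothing
        p = len(c) / total                              # relative cluster size
        i_n = sum(importance(w) for w in c) / len(c)    # mean importance of the cluster
        s -= i_n * p * math.log(p)
    return s
```

Under this assumed form, a clustering that places every value in a single cluster scores zero, so a search over cluster boundaries that maximizes the score would tend to spread important values across several clusters.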

As per claim 13, Park teaches a storage unit configured to store the input data and an information concerning the quantization of the ANN, and wherein the one or [the proposed system is run on GPUs using CNNs for several datasets (pg. 5461, section 5.1; etc.)].

As per claim 16, Park teaches wherein the ANN computations of the ANN are performed on an integrated circuit, wherein the integrated circuit is configured to collect information concerning quality of the ANN computations [the proposed system is run on GPUs using CNNs for several datasets (pg. 5461, section 5.1; etc.); and the system is provided a training data input and a neural network and desired precision (pgs. 5458-5459, section 4.1), as well as running the NN on the training data to update the weights of the NN (pg. 5460, section 4.3) while measuring a quality metric of the weight quantization, an entropy of the activation quantization, and accuracy of the NN for updating the weights (pgs. 5458-5460, section 4.1 for the weight quantization quality metric and section 4.2 for the activation quantization entropy; and pgs. 5461-5462, sections 5.1.1-5.1.4, for network accuracy)].

As per claim 17, see the rejection of claim 1, above.

As per claim 18, see the rejections of claims 4-5, above.

As per claim 19, see the rejection of claim 6, above.

As per claim 20, Park teaches wherein the monitoring of the measure of quality includes counting a number of neuron saturations and wherein a neuron of the ANN is saturated when a result of computation of the neuron is substantially close to one of boundaries of the second interval of the second data type [measuring the quantization quality metric includes counting the number of neurons that are zero-value or very large (pgs. 5457-5459, section 3 discusses the zero-value and large value weights and activations, section 4.1 describes counting them and providing a quality metric)].
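
For illustration only, the saturation count recited in claim 20 can be sketched as follows. The function name count_saturations, the tolerance parameter tol, and the reading of "substantially close" as "within a fraction tol of the interval width" are assumptions made for this sketch and are not drawn from the claims or from Park.

```python
def count_saturations(outputs, lo, hi, tol=0.01):
    """Count neuron outputs lying near either boundary of the interval [lo, hi].

    An output is treated as saturated when it falls within tol * (hi - lo)
    of lo or of hi (an assumed interpretation of "substantially close").
    """
    margin = tol * (hi - lo)
    return sum(1 for y in outputs if y - lo <= margin or hi - y <= margin)
```

For an 8-bit signed interval of [-128, 127] and the default tolerance, outputs of -128, 127, and 126 would be counted as saturated, while outputs of 0 and 50 would not.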

As per claim 21, see the rejection of claim 7, above.

As per claim 22, see the rejections of claims 10-11, above.

As per claim 24, see the rejection of claim 16, above.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to 

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 2, 14, 15, 23, and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Park et al. (Weighted-Entropy-based Quantization for Deep Neural Networks, July 2017, pgs. 5456-5464) in view of Ross (US 2018/0232663).

As per claim 2, Park teaches the system of claim 1, as described above.
While Park teaches using specified quality requirements (see above) it does not explicitly teach how they are derived, and thus wherein the one or more processing 
Ross teaches wherein the one or more processing units are configured to receive a user input, the user input including a quality requirement related to the measure of quality of the ANN computations [a server receives and trains a machine learning model to improve its accuracy to a predefined requirement which may be set by a user (para. 0069, etc.)].
Park and Ross are analogous art as they are within the same field of endeavor, namely machine learning.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to receive the predefined requirements for the neural network from a user, as taught by Ross, for setting the predefined requirements for the neural network in the system taught by Park.
Ross provides motivation as [the system may be able to create a more accurate model based on specific user requirements (paras. 0005-7, etc.)].

As per claim 14, Park teaches the system of claim 1, as described above.
While Park teaches using specified quality requirements, receiving input data and the ANN and determining the measure of quality (see above) it does not explicitly teach how the requirements are derived, and thus wherein the one or more processing units are configured to: receive the description of the ANN and the input data associated with the ANN from an external system, the external system being in communications with the one or more processing units; and based on the measure of the quality, issue a 
Ross teaches wherein the one or more processing units are configured to: receive the description of the ANN and the input data associated with the ANN from an external system, the external system being in communications with the one or more processing units; and based on the measure of the quality, issue a message concerning the quality of the ANN computations to the external system or a user associated with the external system [the server may receive a machine learning model and requirements from an external source, such as a user (paras. 0064-76, etc.); see Park, as described above, for determining the quality from the computations and providing results].
Park and Ross are analogous art as they are within the same field of endeavor, namely machine learning.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to receive the predefined requirements and a model for the neural network from a remote user, as taught by Ross, for setting the predefined requirements for the neural network in the system taught by Park.
Ross provides motivation as [the system may be able to create a more accurate model based on specific user requirements (paras. 0005-7, etc.)].

As per claim 15, Park/Ross teaches wherein the external system is configured to, in response to receiving the message, send an instruction to the one or more processing units, the instruction causing the one or more processing units to perform an operation concerning the quality of the ANN computations [the server may receive a machine learning model, such as a neural network, and requirements from an external source, such as a user, and train the NN (Ross: paras. 0064-76, etc.)].

As per claim 23, Park/Ross teaches, prior to determining that the measure of quality does not satisfy the quality requirements, receiving, by the one or more processing units, a user input, the user input including the quality requirements [the server may receive a machine learning model, such as a neural network, and requirements from an external source, such as a user, and train the NN (Ross: paras. 0064-76, etc.)].
Examiner’s Note: the reasoning and motivation for the combination is provided, above, in the rejection of claim 2.

As per claim 25, Park/Ross teaches a system for quality monitoring and hidden quantization of artificial neural networks (ANN), the system comprising: one or more processing units; and a memory communicatively coupled with the one or more processors [the proposed system is run on GPUs, including GPU memory, using CNNs for several datasets (Park: pg. 5461, section 5.1; etc.) and/or the system may include a processor and memory (Ross: fig. 1, etc.)], the memory storing instructions which when executed by the one or more processing units perform a method comprising: receiving, from an external system, a description of an ANN and input data associated with the ANN [the server may receive a machine learning model, such as a neural network, and requirements from an external source, such as a user, and train the NN (Ross: paras. 0064-76, etc.)]; performing, based on [the weights and activations of the NN are quantized using a specific policy (Park: pgs. 5458-5460, sections 4.1-4.2)], wherein the quantization of the ANN includes mapping data from a first interval of a first data type into data from a second interval of a second data type [quantization includes clustering into different clusters of different intervals for the weights to the required precision (Park: pgs. 5459-5460, sections 4.1-4.2)]; performing, based on the set of input data, ANN computations of the quantized ANN to obtain result of the ANN computation for the input data [the system is provided a training data input and a neural network and desired precision (Park: pgs. 5458-5459, section 4.1), as well as running the NN on the training data to update the weights of the NN using the quantized weights and activations (Park: pg. 5460, section 4.3)]; while performing the ANN computations, monitoring a measure of quality of the ANN computations of the quantized ANN [the system is provided a training data input and a neural network and desired precision (Park: pgs. 5458-5459, section 4.1), as well as running the NN on the training data to update the weights of the NN using quantized weights and activations (Park: pg. 5460, section 4.3) while measuring a quality metric of the weight quantization, an entropy of the activation quantization, and accuracy of the NN for updating the weights (Park: pgs. 5458-5460, section 4.1 for the weight quantization quality metric and section 4.2 for the activation quantization entropy; and pgs. 5461-5462, sections 5.1.1-5.1.4, for network accuracy)], wherein the measure of quality is based on count of neuron saturations in the ANN computations [measuring the quantization quality metric includes counting the number of neurons that are zero-value or very large (Park: pgs. 5457-5459, section 3 discusses the zero-value and large value weights and activations, section 4.1 describes counting them and providing a quality metric)]; determining that the measure of quality does not satisfy quality requirements; and in response to the determination, adjusting, by the one or more processing units and based on the measure of quality, the quantization scheme to be used in the ANN computations for further input data, wherein the adjusting the quantization scheme includes modifying the at least one boundary of one of the first interval and the second interval [quantization of the weights and activations are performed in each training batch and forward/backward pass, respectively, where updates are made to the weights based upon the measure of the neural network (training) (Park: pg. 5460, section 4.3) and updates to the weight quantization are made based upon the updated weights (Park: pgs. 5458-5459, section 4.1 for how weight quantization is determined and applied based on the weights; and pg. 5460, section 4.3 for how the weights are updated and re-quantized for each mini-batch in training) all of which is attempting to meet an accuracy requirement for the network and quantization requirements (Park: pgs. 5458-5460, sections 4.1-4.2 for quantization requirements; and pgs. 5461-5462, sections 5.1.1-5.1.4 for network accuracy); therefore the weight quantization scheme, which is based on the weights, is updated based on the updated weights, which have changed the quality metric of the quantization, and the updating of the weights is itself based on the quality measure of the network during training, all of which is based on the requirements for accuracy and quantization; as well as during training the weighted entropy of the activation quantization is maximized by changing the quantization of the activations during each pass in training (Park: pgs. 5459-5460, section 4.2 for how activation quantization is determined and applied including maximizing entropy; and pg. 5460, section 4.3 for how the activations are updated and re-quantized for each pass in training)].
Examiner’s Note: the reasoning and motivation for the combination is provided, above, in the rejection of claim 14.
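
For illustration only, the sequence of steps recited in claim 25 (quantize by mapping a first interval onto a second interval, monitor a saturation-based measure of quality, and modify an interval boundary when the measure fails a requirement) can be sketched as follows. This is not Park's weighted-entropy method; it substitutes a simple linear quantizer with a widening rule. All names and parameters (quantize, saturation_rate, monitor_and_adjust, max_rate, growth) are hypothetical and chosen for this sketch.

```python
def quantize(values, lo, hi, qmin=-128, qmax=127):
    """Map floats from the first interval [lo, hi] to integers in [qmin, qmax].

    Values outside [lo, hi] are clamped to qmin or qmax.
    """
    scale = (hi - lo) / (qmax - qmin)
    out = []
    for v in values:
        q = round((v - lo) / scale) + qmin
        out.append(max(qmin, min(qmax, q)))
    return out

def saturation_rate(q_values, qmin=-128, qmax=127):
    """Fraction of quantized values sitting exactly on a boundary of [qmin, qmax]."""
    hits = sum(1 for q in q_values if q == qmin or q == qmax)
    return hits / len(q_values)

def monitor_and_adjust(batches, lo, hi, max_rate=0.05, growth=2.0):
    """Quantize each batch, record the saturation rate, and widen [lo, hi]
    for the following batches whenever the rate exceeds max_rate."""
    history = []
    for batch in batches:
        rate = saturation_rate(quantize(batch, lo, hi))
        history.append((lo, hi, rate))
        if rate > max_rate:
            # Assumed adjustment rule: scale the interval about its center.
            center = (lo + hi) / 2
            half = (hi - lo) / 2 * growth
            lo, hi = center - half, center + half
    return lo, hi, history

batch = [-3.0, -0.5, 0.0, 0.5, 3.0]
final_lo, final_hi, history = monitor_and_adjust([batch, batch, batch], -1.0, 1.0)
```

In this usage, the interval is widened after each of the first two batches because two of the five values saturate, and the third batch is quantized over the widened interval without saturation.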


Conclusion
The following is a summary of the treatment and status of all claims in the application as recommended by M.P.E.P. 707.07(i): claims 1-25 are rejected.

The examiner requests, in response to this Office action, that support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line number(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.

When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections.  See 37 CFR 1.111(c).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GEORGE GIROUX whose telephone number is (571)272-9769.  The examiner can normally be reached on M-F 10am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.