DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 7 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
The term “balanced” in claim 7 is indefinite because it is unclear from the claims or the specification in what manner the input data is balanced.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-12 are rejected under 35 U.S.C. 103 as being unpatentable over Migacz et al. (US 2021/0256348 A1 – hereinafter “Migacz”) in view of Jang (US 2021/0058653 A1 – hereinafter “Jang”).
Claim 1:
 Migacz discloses a method for quantizing an image (¶5 discloses “the memory storage and computation of 32-bit values requires considerable memory and processing resources … reduced precision format … use a 16 bit floating-point (float16) representation or 8 bit integer (INT8)”; ¶26 discloses “image classification”), comprising: 

creating histograms by (¶10 discloses “the process can be performed for each layer of a neural network”; ¶11 discloses “Activation data is generated for one or more layers of the neural network.  The activation (output) data is then collected, stored, and a histogram of the data is subsequently created.”; ¶24 discloses “step 101 by referencing activation data generated from an execution of one or more layers of a neural network … flowchart 100 may be performed independently for multiple (all) layers of the neural network.”; ¶25 discloses “At step 103, the activation data referenced at step 101 is collected, and a histogram is created that collects like in values multiple bins”); 
merging the histograms for each of the batches of images into a merged histogram (Fig. 3 and ¶30 discloses the Candidate Conversion Generation step of Fig. 1, 105; ¶32 discloses “At step 305 … candidate conversion is sequentially compressed (merged) into a plurality of distribution intervals … data values within each interval from the histogram between 0 and the saturation threshold are merged with the other data values in the same interval until the remaining number of bins is equal to the highest observed absolute value (e.g., 127).”); 
obtaining a minimum value from all minimum values of the M merged histograms and a maximum value from all maximum values of the M merged histograms (¶12 discloses “the range of values between 0 and the highest (maximum) value observed in the activation data”; ¶32 discloses “between 0 and the saturation threshold are merged with the other data values in the same interval until the remaining number of bins is equal to the highest observed absolute value ( e.g., 127).”; ¶33 discloses “then the histogram bins between 0 and 1000 are divided into the maximum positive value expressible (e.g., 127 for Int8)”); 
defining ranges of new bins of a new histogram according to the obtained minimum value, the obtained maximum value, and the number of the new bins (¶33 discloses “then the histogram bins between 0 and 1000 are divided into the maximum positive value expressible (e.g., 127 for Int8) … Discrete sequences of consecutive histogram bins between 0 and the saturation threshold (e.g., 1000) are then sequentially compressed (e.g., merged) until the number of histogram bins remaining corresponds to the highest observed absolute value.”); and 
estimating a distribution of each of the new bins by adding up frequencies falling into the ranges of the new bins to create the new histogram (¶10 discloses “identifying the distribution with the least divergence from the reference distribution”; ¶¶33-34 discloses “16 consecutive histogram bins are merged … each discrete sequence of 4 consecutive histogram bins are merged … The resulting merged and/or clamped data values are collected and stored as the candidate conversions at step 307 … accuracy (inversely proportional to the divergence) of each candidate conversion to original data values from the calibration data set”).
Migacz discloses all of the subject matter as described above except for specifically teaching  “training based” and “obtaining M batches of images, wherein the amount of the images in each of the M batches of images is N, M is an integer and equal to or larger than two, and N is an integer and equal to or larger than two.”  However, Jang in the same field of endeavor teaches “training based” (¶119 discloses “training a student model … student model may be generated through … quantization”) and “obtaining M batches of images, wherein the amount of the images in each of the M batches of images is N, M is an integer and equal to or larger than two, and N is an integer and equal to or larger than two” (¶49 discloses “a plurality of frame bundles based on image identity … one image is displayed over 100 frames, the 100 frames may be set as one frame bundle, and a plurality of such frame bundles may be set”).
Therefore, it would have been obvious to one of ordinary skill in the art to combine Migacz and Jang before the effective filing date of the claimed invention.  The motivation for this combination of references would have been to require less memory and less processing time for a neural network (Migacz ¶6; Jang ¶¶6-7).  This motivation for the combination of Migacz and Jang is supported by KSR exemplary rationale (G) Some teaching, suggestion, or motivation in the prior art that would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention. MPEP 2141 (III).  
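For orientation only, the histogram limitations recited in claim 1 (per-batch histograms, a minimum and maximum taken across them, new bin ranges, and summed frequencies) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the applicant's or Migacz's implementation: the names `make_histogram` and `rebin`, the equal-width binning, and the rule that credits an old bin's frequency to the new bin containing its centre are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# A histogram is represented as (min_value, max_value, counts) with equal-width bins.
Histogram = Tuple[float, float, List[float]]


def make_histogram(values: List[float], n_bins: int) -> Histogram:
    """Build an equal-width histogram over the observed range of one batch."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 when all values are equal
    counts = [0.0] * n_bins
    for v in values:
        # The largest value would index one past the end, so clamp it into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1.0
    return lo, hi, counts


def rebin(histograms: List[Histogram], n_new_bins: int) -> Histogram:
    """Combine several histograms into one new histogram on a shared range."""
    # Smallest of all minimums and largest of all maximums.
    g_min = min(h[0] for h in histograms)
    g_max = max(h[1] for h in histograms)
    # New bin ranges follow from the global range and the number of new bins.
    new_width = (g_max - g_min) / n_new_bins or 1.0
    new_counts = [0.0] * n_new_bins
    for lo, hi, counts in histograms:
        old_width = (hi - lo) / len(counts)
        for i, c in enumerate(counts):
            # Assumption: an old bin's frequency goes to the new bin holding its centre.
            centre = lo + (i + 0.5) * old_width
            j = min(int((centre - g_min) / new_width), n_new_bins - 1)
            new_counts[j] += c
    return g_min, g_max, new_counts


batches = [[0.0, 1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0]]  # M = 2 batches, N = 4 values each
per_batch = [make_histogram(b, 4) for b in batches]
new_hist = rebin(per_batch, 4)
```

The total frequency is preserved by construction, since every old bin count is added to exactly one new bin.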
Claims 2 and 6:
The combination of Migacz and Jang discloses the method for quantizing the image of claim 1, further comprising: quantizing (Migacz ¶5 discloses “the memory storage and computation of 32-bit values requires considerable memory and processing resources … reduced precision format … use a 16 bit floating-point (float16) representation or 8 bit integer (INT8)”; ¶37 pseudo code discloses “For i in range(128, 2048): Candidate_distribution_Q = take bins from bins[0], …, bins[i-1] and quantize into 128 levels (emphasis added)”) activations (Migacz ¶24 discloses “step 101 by referencing activation data generated from an execution of one or more layers of a neural network”) according to the created new histogram (Migacz ¶10 discloses “identifying the distribution with the least divergence from the reference distribution”; ¶¶33-34 discloses “16 consecutive histogram bins are merged … each discrete sequence of 4 consecutive histogram bins are merged … The resulting merged and/or clamped data values are collected and stored as the candidate conversions at step 307 … accuracy (inversely proportional to the divergence) of each candidate conversion to original data values from the calibration data set”).
Claims 3 and 8:
The combination of Migacz and Jang discloses the method for quantizing the image of claim 1, wherein the distribution of each of the new bins is selected by the group of Gaussian, Rayleigh, normal distribution or others by characteristic data of images (Migacz ¶37 pseudo code discloses normalizing the reference and candidate distributions “normalize reference distribution_P (sum equal to 1) … normalize candidate distribution_Q (sum equal to 1)”).
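The Migacz ¶37 pseudo code quoted above (candidate bins quantized into a fixed number of levels, reference and candidate distributions each normalized to sum to 1, and the candidate with the least divergence selected) can be illustrated with the sketch below. It is an approximation written for this discussion, not a reproduction of Migacz: the function names are hypothetical, the use of Kullback-Leibler divergence is an assumption (the quotations say only "divergence"), and so are the choices to spread each merged group evenly back over its source bins and to fold counts beyond the candidate threshold into the last kept reference bin.

```python
import math
from typing import List, Optional, Tuple


def merge_to_levels(bins: List[float], levels: int) -> List[float]:
    """Merge consecutive bins into `levels` groups, then spread each group's
    average back over its source bins (assumed expansion rule)."""
    n = len(bins)
    out = [0.0] * n
    for k in range(levels):
        start, stop = k * n // levels, (k + 1) * n // levels
        if stop == start:
            continue  # fewer bins than levels: this group is empty
        share = sum(bins[start:stop]) / (stop - start)
        for i in range(start, stop):
            out[i] = share
    return out


def normalize(dist: List[float]) -> List[float]:
    """Scale a distribution so that it sums to 1 (left unchanged if empty)."""
    total = sum(dist)
    return [d / total for d in dist] if total > 0 else list(dist)


def kl_divergence(p: List[float], q: List[float]) -> float:
    """KL(p || q); infinite when q has no mass where p does."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi <= 0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total


def best_threshold(bins: List[float], levels: int) -> Tuple[Optional[int], float]:
    """Try each candidate threshold i and keep the one with least divergence."""
    best_i, best_div = None, math.inf
    for i in range(levels, len(bins)):
        reference = list(bins[:i])
        reference[-1] += sum(bins[i:])  # assumption: clipped outliers land in the last bin
        candidate = merge_to_levels(bins[:i], levels)
        div = kl_divergence(normalize(reference), normalize(candidate))
        if div < best_div:
            best_i, best_div = i, div
    return best_i, best_div
```

With the figures in the quoted pseudo code, the call would be `best_threshold(histogram, 128)` on a 2048-bin histogram, which iterates `i` over `range(128, 2048)`.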
Claims 4 and 9:
The combination of Migacz and Jang discloses the method for quantizing the image of claim 1, wherein the step of defining the ranges of the new bins of the new histogram according to the obtained minimum value, the obtained maximum value, and the number of the new bins (Migacz ¶31 discloses 8-bit integer (Int8) whose typical range of values is between -127 and 127) comprises deciding the ranges of the new bins of the new histogram by subtracting the obtained maximum value from the obtained minimum value and then dividing the number of the new bins (Migacz ¶33 discloses “bins between 0 and 1000 are divided into the maximum positive value expressible (e.g., 127 for Int8)”).
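As a worked illustration of the bin-range computation recited in claims 4 and 9, assume the recited subtraction is read as the span between the obtained maximum and the obtained minimum, divided by the number of new bins. The symbols below are introduced here only for illustration and do not appear in the record, and the figures 0, 1000, and 127 are borrowed from the Migacz ¶33 quotation purely as sample numbers:

```latex
w = \frac{x_{\max} - x_{\min}}{K}, \qquad
\text{bin}_k = \bigl[\, x_{\min} + k\,w,\; x_{\min} + (k+1)\,w \,\bigr), \quad k = 0, \dots, K-1

x_{\min} = 0,\; x_{\max} = 1000,\; K = 127
\;\Longrightarrow\;
w = \frac{1000 - 0}{127} \approx 7.87
```

Under this reading, each new bin covers an interval of roughly 7.87 units, and the last bin is typically closed on the right so that the maximum value itself is counted.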
Claim 5:
Migacz discloses a method for (¶10 discloses “the process can be performed for each layer of a neural network”), comprising: 
receiving a plurality of input data (¶25 discloses “At step 103, the activation data referenced at step 101 is collected, and a histogram is created that collects like in values multiple bins”); 

performing a training of a neural network based on each of the (¶11 discloses “Activation data is generated for one or more layers of the neural network.  The activation (output) data is then collected, stored, and a histogram of the data is subsequently created.”); 
creating histograms of the output data for each of the M batches of input data (¶11; ¶24 discloses “flowchart 100 may be performed independently for multiple (all) layers of the neural network”; ¶25 discloses “At step 103, the activation data referenced at step 101 is collected, and a histogram is created that collects like in values multiple bins”); 
merging the histograms of the output data for each of the M batches of input data into a merged histogram (Fig. 3 and ¶30 discloses the Candidate Conversion Generation step of Fig. 1, 105; ¶32 discloses “At step 305 … candidate conversion is sequentially compressed (merged) into a plurality of distribution intervals … data values within each interval from the histogram between 0 and the saturation threshold are merged with the other data values in the same interval until the remaining number of bins is equal to the highest observed absolute value (e.g., 127).”; ¶¶33-34 discloses “16 consecutive histogram bins are merged … each discrete sequence of 4 consecutive histogram bins are merged”); 
obtaining a minimum value from all minimum values of the M merged histograms and a maximum value from all maximum values of the M merged histograms (¶12 discloses “the range of values between 0 and the highest (maximum) value observed in the activation data”; ¶32 discloses “between 0 and the saturation threshold are merged with the other data values in the same interval until the remaining number of bins is equal to the highest observed absolute value (e.g., 127).”; ¶33 discloses “then the histogram bins between 0 and 1000 are divided into the maximum positive value expressible (e.g., 127 for Int8)”); 
defining ranges of new bins of a new histogram according to the obtained minimum value, the obtained maximum value, and the number of the new bins (¶33 discloses “then the histogram bins between 0 and 1000 are divided into the maximum positive value expressible (e.g., 127 for Int8) … Discrete sequences of consecutive histogram bins between 0 and the saturation threshold (e.g., 1000) are then sequentially compressed (e.g., merged) until the number of histogram bins remaining corresponds to the highest observed absolute value.”); and estimating a distribution of each of the new bins by adding up frequencies falling into the ranges of the new bins to create the new histogram (¶10 discloses “identifying the distribution with the least divergence from the reference distribution”; ¶¶33-34 discloses “16 consecutive histogram bins are merged … each discrete sequence of 4 consecutive histogram bins are merged … The resulting merged and/or clamped data values are collected and stored as the candidate conversions at step 307 … accuracy (inversely proportional to the divergence) of each candidate conversion to original data values from the calibration data set”).
Migacz discloses all of the subject matter as described above except for specifically teaching  “training” and “dividing the plurality of input data into M batches of input data, wherein M is an integer and equal to or larger than two.”  However, Jang in the same field of endeavor teaches “training based” (¶119 discloses “training a student model … student model may be generated through … quantization”) and “dividing the plurality of input data into M batches of input data, wherein M is an integer and equal to or larger than two” (¶49 discloses “a plurality of frame bundles based on image identity … one image is displayed over 100 frames, the 100 frames may be set as one frame bundle, and a plurality of such frame bundles may be set”).
Therefore, it would have been obvious to one of ordinary skill in the art to combine Migacz and Jang before the effective filing date of the claimed invention.  The motivation for this combination of references would have been to require less memory and less processing time for a neural network (Migacz ¶6; Jang ¶¶6-7).  This motivation for the combination of Migacz and Jang is supported by KSR exemplary rationale (G) Some teaching, suggestion, or motivation in the prior art that would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention. MPEP 2141 (III).  


Claim 7:
The combination of Migacz and Jang discloses the method for training a neural network of claim 6, further comprising: performing the training of the neural network based on the quantized data (Jang ¶119 discloses “training a student model … student model may be generated through … quantization”).
Claim 10:
The combination of Migacz and Jang discloses the method for training a neural network of claim 5, wherein the amount of the data in each of the M batches of input data is equal to or larger than 100 (Migacz ¶25 discloses “the histogram may consist of 2048 bins, or a number of bins approximating 2048 bins.”).
Claim 11:
The combination of Migacz and Jang discloses the method for training a neural network of claim 5, wherein data type of the data in each of the M batches of input data is balanced (Migacz ¶38 discloses “input distribution: [5 5 4 0 2 4 7 3 1 4 6 0 3 2 1 3], size: 16 bins” which is an even number).
Claim 12:
Migacz discloses a non-transitory computer-readable storage medium (¶39 and Fig. 4 disclose memory) including instructions (¶41 discloses “computer readable instructions”) that, when executed by at least one processor (¶42 discloses “processor 401”) of a computing system, cause the computing system to perform … 
The combination of Migacz and Jang discloses the remaining elements recited in claim 12 for at least the reasons discussed in claim 1 above.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ross Varndell whose telephone number is (571)270-1922.  The examiner can normally be reached on M-F, 9-5 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Emily Terrell, can be reached at (571)270-3717.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/Ross Varndell/Primary Examiner, Art Unit 2666