Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


3. 	Claims 1, 11, and 16 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims are directed toward the abstract idea of “quantizing one or more tensors of the trained AI model based on the offline data distributions to generate a low-bit representation AI model, wherein each layer of the AI model includes the one or more tensors, wherein the one or more tensors include activation, weights, or bias tensors”. Under the broadest reasonable interpretation of the claim, this limitation, i.e., reducing/decreasing the number of bits via quantizing, is a mathematical calculation that falls under the mathematical concepts grouping of abstract ideas (See MPEP 2106.04(a)(2)(I)(C)). With regard to whether the abstract idea is integrated into a practical application, Applicant's claims do not comprise any additional elements that, individually or in combination, integrate the judicial exception into a practical application. Since the abstract idea in Applicant’s claims 1, 11, and 16 is implemented on a computer and there are no further limitations or structural elements that go beyond the computer/processor, the abstract idea of generating a low-bit representation via quantizing is merely implemented on a computer. With regard to whether the claims recite additional elements that provide significantly more than the recited judicial exception, Applicant's claims do not recite any such additional elements.
The limitations “receiving a trained AI model having one or more layers” and “receiving first input data for offline inferencing” amount to insignificant extra-solution activity (mere data gathering: See MPEP 2106.05(g)(3)), and the limitation “applying offline inferencing to the trained AI model based on the first input data to generate offline data distributions for the trained AI model” is a mere instruction to implement the abstract idea on a computer (See MPEP 2106.05(f)). Thus, these limitations do not integrate the abstract idea into a practical application.
Please note that, for determining whether a claim is directed to non-statutory subject matter, the examination guidelines released by the USPTO on January 7, 2019 provide the following exemplary considerations that are indicative that an additional element (or combination of elements) may have integrated the judicial exception into a practical application:
an additional element reflects an improvement in the functioning of a computer, or an improvement to other technology or technical field;
an additional element that applies or uses a judicial exception to effect a particular treatment or prophylaxis for a disease or medical condition;
an additional element implements a judicial exception with, or uses a judicial exception in conjunction with, a particular machine or manufacture that is integral to the claim;
an additional element effects a transformation or reduction of a particular article to a different state or thing; and
an additional element applies or uses the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort designed to monopolize the exception.
It is clear that Applicant's claims do not comprise any of the above additional elements that, individually or in combination, have integrated the judicial exception into a practical application. 
At least one of the claims requires a memory (or a computer-readable medium) storing instructions and a processor/computer to execute the instructions. These generic computer components are claimed to perform their basic functions, at least storing instructions for quantizing. The recitation of the memory and computer/processor limitations amounts to a mere instruction to implement the abstract idea on the computer. Accordingly, claims 1, 11, and 16 are not patent eligible.
Claims 2-10, 12-15, and 17-20 are dependent claims that recite limitations adding to the abstract idea, but fail to include language that integrates the abstract idea into a practical application. Accordingly, these claims are not patent eligible.
Notice re prior art available under both pre-AIA and AIA
4. 	In the event the determination of the status of the application as subject to AIA 35
U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction
of the statutory basis for the rejection will not be considered a new ground of rejection if the
prior art relied upon, and the rationale supporting the rejection, would be the same under either
status.
					Examiner’s Note
5. 	Examiner has cited particular columns and line numbers or figures in the references as
applied to the claims below for the convenience of the applicant. Although the specified citations
are representative of the teachings in the art and are applied to the specific limitations within the
individual claim, other passages and figures may apply as well. The applicant is respectfully
requested, in preparing the response, to fully consider the references in their entirety as potentially
teaching all or part of the claimed invention, as well as the context of the passage as taught by
the prior art or disclosed by the examiner.

Claim Rejections - 35 USC § 102
6. 	The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


7. 	Claims 1-5 and 9-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Desappan, et al. (US 2019/0012559 A1).
8. 	With regard to claim 1, Desappan, et al. (hereinafter “Desappan”) discloses a computer-implemented method, the method comprising: receiving a trained artificial intelligence (AI) model, i.e., DNN/CNN, having one or more layers; receiving first input data for offline inferencing; applying offline inferencing to the trained AI model based on the first input data to generate offline data distributions for the trained AI model (paragraphs 0005, 0023-0028; and Figs. 3 and 4 and the associated text: Prior to the quantization process); and quantizing one or more tensors of the trained AI model based on the offline data distributions to generate a low-bit representation, i.e., 8-bit representation, AI model, wherein each layer of the AI model includes the one or more tensors, wherein the one or more tensors include activation, weights, or bias tensors (See for example, paragraph 0028; and item 402, in Fig. 4 and the associated text).
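For illustration only, and not as a characterization of the disclosure of Desappan or of Applicant's specification, the sequence of steps recited in claim 1 (inferencing with the unquantized model to gather data distributions, then deriving quantization parameters from them) may be sketched as follows. The function names, the two-layer toy "model", the use of a (min, max) range as the recorded distribution, and the 8-bit width are all assumptions made for this sketch.

```python
def collect_ranges(layers, calibration_inputs):
    """Run the unquantized model and record the (min, max) seen at each layer output."""
    ranges = [(float("inf"), float("-inf"))] * len(layers)
    for x in calibration_inputs:
        activations = x
        for i, layer in enumerate(layers):
            activations = layer(activations)
            lo, hi = ranges[i]
            ranges[i] = (min(lo, min(activations)), max(hi, max(activations)))
    return ranges


def scale_from_range(lo, hi, num_bits=8):
    """Derive a single scale factor from a recorded range (assumed symmetric scheme)."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(lo), abs(hi))
    return max_abs / qmax if max_abs > 0 else 1.0


# Hypothetical two-layer "model": a scaling layer followed by a ReLU.
layers = [
    lambda v: [2.0 * a for a in v],
    lambda v: [max(0.0, a) for a in v],
]
calibration_inputs = [[-1.0, 0.5], [3.0, -2.0]]

ranges = collect_ranges(layers, calibration_inputs)
scales = [scale_from_range(lo, hi) for lo, hi in ranges]
```

In this sketch the recorded ranges are gathered before any quantization takes place, which corresponds to the "offline" stage as that term is used in the claim.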
With regard to claim 2, the method of claim 1, further comprising: receiving second input data for online inferencing; applying online inferencing, i.e., realtime inferencing, using the low-bit representation, i.e., 8-bit representation, AI model, i.e., quantized CNN model, based on the second input data to generate online data distributions for feature maps; and quantizing one or more feature map tensors for the low-bit representation AI model based on the online data distributions for the feature maps (See for example, paragraph 0028; and Fig. 4).
With regard to claim 3, the method of claim 2, wherein a feature map tensor is quantized for each of the one or more layers of the low-bit representation AI model (See for example, paragraph 0028).
With regard to claim 4, the method of claim 2, wherein the one or more feature map tensors are quantized by a two dimensional array of processing elements, i.e., accumulator and multiplication/division (See for example, paragraphs 0025 and 0029 0031).
With regard to claim 5, the method of claim 1, wherein at least one of the one or more tensors includes an 8-bit representation (See for example, paragraphs 0025, 0028, and 0030).
With regard to claim 10, the method of claim 1, wherein the one or more tensors of the trained AI model are quantized offline and the quantized tensor information is stored as a model blob (See for example, paragraph 0050).
Claim 11 is rejected the same as claim 1 except claim 11 is an apparatus claim.  Thus, argument similar to that presented above for claim 1 is applicable to claim 11.
Claims 12, 13, 14, and 15 are rejected the same as claims 2, 3, 4, and 5 respectively, except claims 12, 13, 14, and 15 are apparatus claims. Thus, arguments similar to those presented above for claims 2, 3, 4, and 5 are respectively applicable to claims 12, 13, 14, and 15.
With regard to claim 15, the system of claim 11, wherein at least one of the one or more tensors includes an 8-bit integer representation (See for example, paragraphs 0040 and 0051).
Claim 16 is rejected the same as claim 1.  Thus, argument similar to that presented above for claim 1 is applicable to claim 16. Claim 16 distinguishes from claim 1 only in that it recites a non-transitory machine-readable medium having instructions stored therein.  Desappan teaches this feature (See for example, Fig. 9: CPU).
Claims 17, 18, and 19 are rejected the same as claims 2, 3, and 4 respectively. Thus, arguments similar to those presented above for claims 2, 3, and 4 are respectively applicable to claims 17, 18, and 19.
With regard to claim 20, the system of claim 11, wherein at least one of the one or more tensors includes an 8-bit integer representation (See for example, paragraphs 0040 and 0051).
9. 	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

10. 	Claims 1, 11, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Wu, et al. (US 2020/0193270 A1).
11. 	With regard to claim 1, Wu, et al. (hereinafter “Wu”) discloses a computer-implemented method (See for example, Fig. 1), the method comprising: receiving a trained artificial intelligence (AI) model, i.e., CNN model, having one or more layers; receiving first input data for offline inferencing; applying offline inferencing to the trained AI model based on the first input data to generate offline data distributions for the trained AI model (See for example, paragraphs 0012-0014); and quantizing one or more tensors of the trained AI model based on the offline data distributions to generate a low-bit representation, i.e., fractional bit-width, AI model, wherein each layer of the AI model includes the one or more tensors, wherein the one or more tensors include activation, weights, or bias tensors (See for example, paragraph 0015).  While Wu does not expressly refer to offline inferencing, before the effective filing date of the claimed invention, and given the description made at paragraph 0029 of the specification “. . . offline inferencing refers to inferencing using an AI model before quantization. . .”, it would have been obvious, if not inherent, to a person of ordinary skill in the art to interpret the data generated before the quantization process and prior to deploying the quantized output to the inference engine as offline inferencing. Thus, each of the requirements of claim 1 is met.
Claim 11 is rejected the same as claim 1 except claim 11 is an apparatus claim.  Thus, argument similar to that presented above for claim 1 is applicable to claim 11.
Claim 16 is rejected the same as claim 1.  Thus, argument similar to that presented above for claim 1 is applicable to claim 16. Claim 16 distinguishes from claim 1 only in that it recites a non-transitory machine-readable medium having instructions stored therein.  Given the considerable level of skill in the art of quantization/inferencing, one skilled in the art would have found it obvious, if not inherent, to make and use computer programs/instructions to implement the convolutional neural network and to perform the operations defined by the claim.
12. 	Claims 6 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Desappan ‘559 in view of Krishnamoorthi (Quantizing deep convolutional networks for efficient inference: A whitepaper).
13. 	With regard to claim 6, Desappan discloses all of the claimed subject matter as already addressed above in paragraph 8, and incorporated herein by reference. Desappan does not expressly call for wherein at least one of the tensors is symmetrically quantized for a range for a data distribution corresponding to the tensor based on a scale factor for the range. However, Krishnamoorthi (See for example, section 2.1, page 4 – section 2.2, page 6) teaches this feature. Desappan and Krishnamoorthi are combinable because they are from the same field of endeavor, i.e., quantizing convolutional neural networks for inference with integer weights (See for example, the abstract).  Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to incorporate the teaching of Krishnamoorthi into the system of Desappan, and to do so would at least allow a quantization scheme that is symmetric and, as a result, may improve the accuracy of quantization.
With regard to claim 7, Desappan does not expressly call for wherein at least one of the tensors is asymmetrically quantized for a range for a data distribution corresponding to the tensor based on an offset and a scale factor for the asymmetrically range.  However, Krishnamoorthi (See for example, section 2.6, page 8 – section 3.1.3, page 11) teaches this feature. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to incorporate the teaching of Krishnamoorthi into the system of Desappan, if for no other reason than to provide an asymmetric quantization that produces accuracies close to floating point across a wide range of networks.
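For illustration only, the distinction between the symmetric quantization of claim 6 (a scale factor for the range) and the asymmetric quantization of claim 7 (an offset and a scale factor) may be sketched as follows. This is a generic sketch of the two schemes and is not asserted to be the implementation of Krishnamoorthi, Desappan, or Applicant; the function names, the 8-bit width, and the choice of signed versus unsigned integer ranges are assumptions.

```python
def symmetric_quantize(values, num_bits=8):
    """Map values to signed integers using one scale factor; real 0.0 maps to integer 0."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def asymmetric_quantize(values, num_bits=8):
    """Map values to unsigned integers using a scale factor and an integer offset (zero point)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # The range is extended to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point=0):
    """Recover approximate real values from the integer representation."""
    return [(x - zero_point) * scale for x in q]
```

With a skewed data distribution, the asymmetric scheme spends its integer levels only on the observed range, whereas the symmetric scheme covers an equal span on both sides of zero.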
With regard to claim 8, Desappan does not expressly call for identifying outlier points in the offline data distributions; and removing a predetermined number of the identified outlier points. However, Krishnamoorthi (See for example, page 26, Fig. 14) teaches this feature.  Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to incorporate the teaching of Krishnamoorthi into the system of Desappan such that data degradation may be prevented.
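For illustration only, the limitation of claim 8 (identifying outlier points and removing a predetermined number of them) may be sketched as follows. The criterion used here, distance from the mean of the samples, is an assumption made for the sketch; neither the claim language quoted above nor this Office action specifies how outliers are identified.

```python
from statistics import mean


def remove_outliers(samples, num_to_remove):
    """Drop the `num_to_remove` samples farthest from the mean (hypothetical criterion)."""
    if num_to_remove <= 0:
        return list(samples)
    center = mean(samples)
    # Rank sample indices by distance from the centre, farthest first.
    ranked = sorted(
        range(len(samples)),
        key=lambda i: abs(samples[i] - center),
        reverse=True,
    )
    drop = set(ranked[:num_to_remove])
    return [s for i, s in enumerate(samples) if i not in drop]


def value_range(samples):
    """Return the (min, max) range from which a scale factor would be derived."""
    return min(samples), max(samples)
```

Removing a single extreme sample before computing the range narrows the interval that the integer levels must cover, which is the usual motivation for this step.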
With regard to claim 9, the method of claim 1, wherein the input data includes one or more channels and the offline data distributions are generated for each of the one or more channels and the one or more tensors are quantized on a per-channel basis (See for example, the Abstract (1 and 7), section 2.6, page 8, and section 3.1.3, page 11 of Krishnamoorthi).
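For illustration only, quantization on a per-channel basis as recited in claim 9 may be sketched as follows. The representation of a tensor as a list of channels, the function names, and the symmetric 8-bit scheme are assumptions made for the sketch and are not drawn from the cited references.

```python
def per_channel_scales(channels, num_bits=8):
    """Compute one scale factor per channel from that channel's own value range."""
    qmax = 2 ** (num_bits - 1) - 1
    scales = []
    for ch in channels:
        max_abs = max(abs(v) for v in ch)
        scales.append(max_abs / qmax if max_abs > 0 else 1.0)
    return scales


def quantize_per_channel(channels, num_bits=8):
    """Quantize each channel with its own scale instead of one scale for the whole tensor."""
    qmax = 2 ** (num_bits - 1) - 1
    scales = per_channel_scales(channels, num_bits)
    quantized = [
        [max(-qmax, min(qmax, round(v / s))) for v in ch]
        for ch, s in zip(channels, scales)
    ]
    return quantized, scales
```

When channels differ widely in magnitude, a single per-tensor scale would map the small-valued channel to only a few integer levels, while per-channel scales let each channel use the full integer range.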
Conclusion
14. 	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: US Patent Application Publication No. 2020/0257966 A1 (See Figs. 3A, 3B, and 4B).
15. 	Any inquiry concerning this communication or earlier communications from the examiner should be directed to DANIEL G MARIAM whose telephone number is (571)272-7394. The examiner can normally be reached M-F 7:30-5:00 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, EDWARD F URBAN can be reached on 571-272-7899. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DANIEL G MARIAM/Primary Examiner, Art Unit 2665